Skip to contents

Define a cohort to include in a parameter estimation

Usage

system_define_cohort(cfg, cohort)

Arguments

cfg

ubiquity system object

cohort

list with cohort information

Value

ubiquity system object with cohort defined

Details

Each cohort has a name (eg d5mpk), and the dataset containing the information for this cohort is identified (the name defined in system_load_data)

cohort = list(
  name         = "d5mpk",
  dataset      = "pm_data",
  inputs       = NULL,
  outputs      = NULL)

Next if only a portion of the dataset applies to the current cohort, you can define a filter (cf field). This will be applied to the dataset to only return values relevant to this cohort. For example, if we only want records where the column DOSE is 5 (for the 5 mpk cohort). We can use the following:

cohort[["cf"]]   = list(DOSE   = c(5))

If the dataset has the headings ID, DOSE and SEX and cohort filter had the following format:

cohort[["cf"]]   = list(ID    = c(1:4),
                        DOSE  = c(5,10),
                        SEX   = c(1))

It would be translated into the boolean filter:

(ID==1) | (ID==2) | (ID==3) | (ID==4)) & ((DOSE == 5) | (DOSE==10)) & (SEX == 1)

Optionally you may want to fix a system parameter to a different value for a given cohort. This can be done using the cohort parameter (cp) field. For example if you had the body weight defined as a system parameter (BW), and you wanted to fix the body weight to 70 for the current cohort you would do the following:

cohort[["cp"]]   = list(BW        = c(70))

Note that you can only fix parameters that are not being estimated.

By default the underlying simulation output times will be taken from the general output_times option (see system_set_option). However It may also be necessary to specify simulation output times for a specific cohort. The output_times field can be used for this. Simply provide a vector of output times:

cohort[["output_times"]]   = seq(0,100,2)

Next we define the dosing for this cohort. It is only necessary to define those inputs that are non-zero. So if the data here were generated from animals given a single 5 mpk IV at time 0. Bolus dosing is defined using <B:times> and <B:events>. If Cp is the central compartment, you would pass this information to the cohort in the following manner:

cohort[["inputs"]][["bolus"]] = list()
cohort[["inputs"]][["bolus"]][["Cp"]] = list(TIME=NULL, AMT=NULL)
cohort[["inputs"]][["bolus"]][["Cp"]][["TIME"]] = c( 0) 
cohort[["inputs"]][["bolus"]][["Cp"]][["AMT"]]  = c( 5)

Inputs can also include any infusion rates (infusion_rates) or covariates (covariates). Covariates will have the default value specified in the system file unless overwritten here. The units here are the same as those in the system file

Next we need to map the outputs in the model to the observation data in the dataset. Under the outputs field there is a field for each output. Here the field ONAME can be replaced with something more useful (like PK).

cohort[["outputs"]][["ONAME"]] = list()

If you want to further filter the dataset. Say for example you have two outputs and the cf applied above reduces your dataset down to both outputs. Here you can use the "of" field to apply an "output filter" to further filter the records down to those that apply to the current output ONAME.

cohort[["outputs"]][["ONAME"]][["of"]] = list(
       COLNAME          = c(),
       COLNAME          = c())

If you do not need further filtering of data, you can you can just omit the field.

Next you need to identify the columns in the dataset that contain your times and observations. This is found in the obs field for the current observation:

cohort[["outputs"]][["ONAME"]][["obs"]] = list(
         time           = "TIMECOL",
         value          = "OBSCOL",
         missing        = -1)

The times and observations in the dataset are found in the ’TIMECOL’ column and the ’OBSCOL’ column (optional missing data option specified by -1).

These observations in the dataset need to be mapped to the appropriate elements of your model defined in the system file. This is done with the model field:

cohort[["outputs"]][["ONAME"]][["model"]] = list(
         time           = "TS",       
         value          = "MODOUTPUT",
         variance       = "PRED^2")

First the system time scale indicated by the TS placeholder above must be specfied. The time scale must correspond to the data found in TIMECOL above. Next the model output indicated by the MODOUTPUT placeholder needs to be specified. This is defined in the system file using <O> and should correspond to OBSCOL from the dataset. Lastly the variance field specifies the variance model. You can use the keyword PRED (the model predicted output) and any variance parameters. Some examples include:

  • variance = "1" - Least squares

  • variance = "PRED^2" - Weighted least squares proportional to the prediction squared

  • variance = "(SLOPE*PRED)^2" Maximum likelihood estimation where SLOPE is defined as a variance parameter (<VP>)

The following controls the plotting aspects associated with this output. The color, shape and line values are the values used by ggplot functions.

cohort[["outputs"]][["ONAME"]][["options"]] = list(
        marker_color   = "black",
        marker_shape   = 16,
        marker_line    = 1 )

If the cohort has multiple outputs, simply repeat the process above for the. additional cohorts. The estimation vignettes contains examples of this.

Note: Output names should be consistent between cohorts so they will be grouped together when plotting results.

See also