Perform Data Fitting with PKPD Models

Data Fitting Workflow

The following steps show one of the workflows you can use at the command line to fit a PK model and estimate parameters:

  1. Import data as shown in Importing Data.

  2. Specify the structural model by creating a PK model as shown in Create a Pharmacokinetic Model Using the Command Line. Alternatively, if you have a SimBiology® model that you want to use in fitting, see Prerequisites for Using Custom SimBiology Models in Data Fitting.

  3. Classify the data set to use in fitting. See Specify and Classify the Data to Fit.

  4. Specify the initial guesses for the parameters to be estimated, as shown in Set Initial Estimates.

  5. Perform individual or population fits:

    • For individual fits:

      • (Optional) Specify an error model or weights. See Specify an Error Model.

      • (Optional) Set tolerances.

      • (Optional) Specify maximum iterations.

      • (Optional) Specify to pool the data, which simultaneously fits data from multiple dose levels using the same model parameters for each dose.

    • For population fits:

  6. Obtain and visualize results.

Specify and Classify the Data to Fit

In order to use the imported data in fitting, you must identify required columns in the data set that was previously imported as shown in Importing Data.

Use the PKData object to specify the data set containing the observed data to use in fitting. The properties of the PKData object specify what each column in the data represents.

To create the PKData object:

  1. Create the PKData object for the data set data.

    pkDataObject = PKData(data);

    PKData assigns the data set data to the read-only DataSet property.

  2. Use the column headers in the data set to specify the following properties for the column in the data set.

    Column in Data Set RepresentsPKData Object Property
    Group identification labelsGroupLabel
    Independent variable
    (For example, time)
    IndependentVarLabel
    Dependent variable
    (For example, measured response)
    DependentVarLabel
    Amount of dose givenDoseLabel
    Rate of infusion (when applicable). Data must contain rate (amount/time) and not infusion time.RateLabel
    Covariates
    (For example, age, gender, weight)
    CovariateLabels

    For example, for the tobramycin data set [1]:

    pkDataObject.GroupLabel          = 'ID'; 
    pkDataObject.IndependentVarLabel = 'Time';
    pkDataObject.DependentVarLabel   = 'Response';
    pkDataObject.DoseLabel           = 'Dose';
    pkDataObject.CovariateLabels     = {'WT','HT','AGE', 'SEX', 'CLCR'};

      Note:   For the subset of data belonging to a single group (as defined by the column in your data set that represents group identification labels, which you map to the GroupLabel property), the software allows multiple observations made at the same time. If this is true for your data, be aware that:

      • These data points are not averaged, but fitted individually.

      • Different number of observations at different times cause some time points to be weighted more.

      Tip   If dosing applies to more than one compartment in the model, specify the DoseLabel property as follows:

      pkDataObject.DoseLabel =  {'Dose1', 'Dose2'};

      Dose1 and Dose2 are names of columns containing dose information for compartments. A one-to-one relationship must exist between the number and order of elements in the DoseLabel property and the Dosed property of the corresponding PKModelMap object.

      Tip   If your model measures multiple responses, specify the DependentVarLabel property as follows:

      pkDataObject.DependentVarLabel =  {'Response1', 'Response2'};

      Response1 and Response2 are names of columns containing response measurements. A one-to-one relationship must exist between the number and order of elements in the DependentVarLabel property and the Observed property of the corresponding PKModelMap object.

    When you assign a column containing group identification labels to the GroupLabel property, PKData sets these read-only properties as follows:

    • The GroupNames property is set to the unique names found in the group column.

    • The GroupID property is set to an integer corresponding to the unique names found in the group column.

Specify Solver Type and Options for Fitting

If you specify a stochastic solver and options in the Configset object associated with your model, be aware that during fitting SimBiology temporarily changes:

  • SolverType property to the default solver of ode15s

  • SolverOptions property to the options last configured for a deterministic solver

Set Initial Estimates

    Caution   If your model includes active variants that specify alternate values for the parameters to estimate, the variants are ignored for those parameters during fitting.

To set the initial estimates (or initial guesses) for the parameters with fixed effects to estimate, first identify the sequence of the parameters in the model by querying the PKModelMap object. Next, construct a vector, beta0, containing the initial conditions. For information about PKModelMap objects, see step 4 in Create a Pharmacokinetic Model Using the Command Line.

  1. Query the Estimated property of the PKModelMap object:

    PKModelMapObj.Estimated

    MATLAB® returns the sequence of the parameters to be estimated. For example:

    ans = 
    
        'Central'
        'Cl_Central'
    
  2. Set the initial estimates for the parameters. For example:

    beta0 = [10.0, 1.0];
For information on ...See ...
The parameters added to the model
Default units for the above parametersUnit Conversion for Imported Data

Specify a Nonlinear, Mixed-Effects Model

Suppose data for a nonlinear regression model falls into one of m distinct groups i = 1, ..., m. (Specifically, suppose that the groups are not nested.) To specify a general, nonlinear, mixed-effects (NLME) model for this data:

  1. Define group-specific model parameters φi as linear combinations of fixed effects β and random effects bi.

  2. Define response values yi as a nonlinear function f of the parameters and group-specific covariate variables Xi.

The model is:

φi=Aiθ+Biηi, where ηiN(0,Ψ)yi=f(φi,Xi)+εiAlternatively, logyi=logf(φi,Xi)+εi

This formulation of the nonlinear, mixed-effects model uses the following notation:

φiA vector of group-specific model parameters
θA vector of fixed effects, modeling population parameters
ηiA vector of multivariate, normally distributed, group-specific, random effects
AiA group-specific design matrix for combining fixed effects
BiA group-specific design matrix for combining random effects
XiA data matrix of group-specific covariate values
yiA data vector of group-specific response values
fA general, real-valued function of φi and Xi
εi
ΨA covariance matrix for the random effects
σ2The error variance, assumed to be constant across observations

For example, consider a one-compartment model with first-order dosing and linear clearance. The group-specific parameters (φ) in the model are clearance (Cl), compartment volume (V), and absorption rate constant (ka). From the model:

(ClVka)=(100010001)(θClθVθka)+(100010001)(ηClηVηka)

In SimBiology, Bi is an identity matrix. That is, sbionlmefit does not support the specification of a different random-effects design matrix. You can alter the design matrices, as necessary, to introduce weighting of individual effects.

The Statistics Toolbox™ function nlmefit fits the general, nonlinear, mixed-effects model to data, estimating the fixed and random effects. The function also estimates the covariance matrix Ψ for the random effects. Additional diagnostic outputs allow you to assess trade-offs between the number of model parameters and the goodness of fit. See Mixed-Effects Models in the Statistics Toolbox documentation for more information.

Specify a Covariate Model

Construct a CovariateModel object to define the relationship between parameters and covariates. After constructing the object, modify the FixedEffectValues property of the object before using the object as an input argument to sbionlmefit or sbionlmefitsa, to estimate nonlinear mixed effects.

If the NLME model assumes a group-dependent covariate such as weight (w), the model becomes:

(ClVka)=(100wi01000010) (θClθVθkaθCl/w)+(100010001) (ηClηVηka)

Thus, the parameter for clearance (Cl) for an individual is Cli = θCl + θCl/w * wi + ηCli.

Use the following procedure to specify a covariate model. If you are using the tobramycin data set, make sure you first complete the following procedures:

  1. Use the CovariateModel constructor function to construct an empty CovariateModel object:

    covModel = CovariateModel;
  2. Set the Expression property of the object to define the relationship between parameters and covariates in the CovariateModel object, where Cl, v, and ka are parameters, weight is a covariate, theta1, theta2, theta3, and theta4 are fixed effects, and eta1, eta2, and eta3 are random effects.

    covModel.Expression = {'Cl = exp(theta1 + theta4*weight + eta1)',...
                           'v = exp(theta2 + eta2)',...
                           'ka = exp(theta3 + eta3)'};
  3. Display a list of the descriptions of the fixed effects (theta1 and theta2) in the CovariateModel object:

    disp('Fixed Effects Descriptions:');
    disp(covModel.FixedEffectDescription)
    

    Your output appears as follows, where each string describes the role of a fixed effect in the expression equation:

    Fixed Effects Descriptions:
        'Cl'
        'v'
        'ka'
        'Cl/weight'
  4. Use the constructDefaultFixedEffectValues method of the CovariateModel object to create a structure containing the initial estimates for the fixed effects in the object. The initial estimates in this structure are set to a default of zero:

    initialEstimates = covModel.constructDefaultFixedEffectValues

    Your output appears as:

    initialEstimates = 
    
        theta1: 0
        theta2: 0
        theta3: 0
        theta4: 0
    
  5. Edit the initialEstimates structure to set the initial estimates of the fixed effects:

    initialEstimates.theta1 = 1.408;
    initialEstimates.theta2 = 0.061;
    initialEstimates.theta3 = 0.31;

      Tip   Typically, these initial estimates are values you determine from a previous fit of the data.

  6. Use the modified initialEstimates structure to update the FixedEffectValues property of the CovariateModel object:

    covModel.FixedEffectValues = initialEstimates;

    Now covModel, the CovariateModel object, is ready to submit as an input argument to sbionlmefit or sbionlmefitsa.

Specify the Covariance Pattern of Random Effects

By default, the function you use to perform population fits (nlmefit or nlmefitsa) assumes a diagonal covariance matrix (no covariance among the random effects). To specify a different covariance pattern of random effects, use the 'CovPattern' option. In the previous example, assuming that each of the parameters has random effects and that Cl and V exhibit covariance, the covariance pattern of random effects would be a logical array:

(110110001)

  1. Create an options struct with the specified covariance pattern:

    options.CovPattern = [1, 1, 0; 1, 1, 0; 0, 0, 1]; 
    
  2. Specify the arguments for sbionlmefit or sbionlmefitsa:

    [results, simdataI, simdataP] = sbionlmefit(modelObj,...
    				 PKModelMapObj, pkDataObject, beta0, options)

If you are using the tobramycin data set [1], do the following:

  1. Create an options struct with the specified covariance pattern:

    options.CovPattern = [1, 0; 0, 1];
  2. Specify the arguments for sbionlmefit:

    [results, simdataI, simdataP] = sbionlmefit(modelObj,...
    				 PKModelMapObj, pkDataObject, beta0, options)
    results = 
    
      NLMEResults handle
    
      Properties:
                        FixedEffects: [2x3 dataset]
                       RandomEffects: [97x2 dataset]
        IndividualParameterEstimates: [97x2 dataset]
        PopulationParameterEstimates: [97x2 dataset]
        RandomEffectCovarianceMatrix: [2x2 dataset]
             EstimatedParameterNames: {2x1 cell}
                      CovariateNames: {'WT'  'HT'  'AGE'  'SEX'  'CLCR'}
                  FixedEffectsStruct: [1x1 struct]
                               stats: [1x1 struct]
    
    results.FixedEffects
    ans = 
    
        Description         Estimate    StandardError
        'Central'           3.0478      0.064369     
        'Cl_Central'        1.3054      0.061095

For more information, see nlmefit or nlmefitsa in the Statistics Toolbox documentation.

Fitting the model and estimating the covariance matrix Ψ often leads to further refinements. A relatively small estimate for the variance of a random effect suggests that it can be removed from the model. Similarly, relatively small estimates for covariances among certain random effects suggest that a full covariance matrix is unnecessary. Since random effects are unobserved, Ψ must be estimated indirectly. Specifying a diagonal or block-diagonal covariance pattern for Ψ can improve convergence and efficiency of the fitting algorithm.

Specify an Error Model

You can specify any of the following error models when using the sbionlinfit, sbionlmefit, or sbionlmefitsa function. Each model defines the error using a standard normal (Gaussian) variable e, the function value f, and one or two parameters, a and b. The default error model is 'constant'.

  • 'constant': y = f + a*e (default)

  • 'proportional': y = f + b*abs(f)*e

  • 'combined': y = f + (a+b*abs(f))*e

  • 'exponential': y = f*exp(a*e), or equivalently log(y) = log(f) + a*e

To define an error model:

  1. Create an optionStruct input argument and set the ErrorModel field to specify one of the above error models. For example:

    optionStruct.ErrorModel = 'proportional'; 
    
  2. Specify the optionStruct input argument for sbionlinfit, sbionlmefit, or sbionlmefitsa, as shown in Perform Individual Fitting or Perform Population Fitting.

See also or nlinfit, nlmefit, or nlmefitsa in the Statistics Toolbox documentation.

Specify Parameter Transformations

To specify parameter transformations, use the ParamTransform option in sbionlinfit, sbionlmefit and sbionlmefitsa. The ParamTransform option lets you specify either no transformation, or the log, probit, or logit transformation.

    Note:   Do not use the ParamTransform option to specify parameter transformations when providing a CovariateModel object to a fitting function. The CovariateModel object provides the parameter transformation.

The underlying algorithm in nlmefit assumes that parameters follow a normal distribution. This assumption may not hold for biological parameters that are constrained to be positive, such as volume and clearance. You may specify a transformation function for the estimated parameters, so that the transformed parameters follow a normal distribution.

By default, the SimBiology fitting functions choose a log transform for all estimated parameters. Parameters that are constrained between the values 0 and 1, like absorption fraction, can be transformed by the probit or logit transformations described below.

The probit function is the inverse cumulative distribution function (CDF) associated with the standard normal distribution. To apply the probit transform to a variable x in MATLAB, use the Statistics Toolbox function norminv: t = norminv(x). To untransform a variable t, use the function normcdf: x = normcdf(t).

The logit function is the inverse of the sigmoid function. To apply the logit transform to a variable x in MATLAB, use the following expression: t = log(x) - log(1-x). To untransform the variable t, use x = 1/(1+exp(-t)).

  1. For the ParamTransform option, specify a vector of values equal to the number of parameters to be estimated. The values must be one of the integer codes listed in nlmefitsa or nlmefit specifying the transformation for the corresponding value of the parameters to be estimated. For example

    options.ParamTransform = [0 1 2]; 
    

    See nlmefit and nlmefitsa for more information.

  2. Specify the arguments for sbionlmefit or sbionlmefitsa, as shown in Perform Population Fitting.

    For individual fitting, see Perform Individual Fitting.

Perform Population Fitting

The sbionlmefit and sbionlmefitsa functions let you specify a SimBiology model that you want to use in fitting. These functions use the nlmefit and nlmefitsa functions from the Statistics Toolbox to fit data with both fixed and random sources of variation using nonlinear mixed-effects and return the estimates. Both nlmefitand nlmefitsa fit the model by maximizing an approximation to the marginal likelihood with random effects integrated out assuming the following:

  • Random effects are multivariate, normally distributed, and independent between groups.

  • Observation errors are independent, identically normally distributed, and independent of random effects.

  1. (Optional) Set the tolerance or maximum iteration options. Use an options structure that is an input argument for sbionlmefit or sbionlmefitsa:

    optionStruct.Options.TolX = 1.0E-4;
    optionStruct.Options.TolFun = 1.0E-4;
    optionStruct.Options.MaxIter = 200;
  2. Specify the model object, the PKModelMap object, the PKData object, the PKCovariateModel object, a vector containing the initial estimates for the fixed effects, and the options:

    [results, simdataI, simdataP] = sbionlmefit(modelObj,...
    	PKModelMapObj, pkDataObject, CovariateModelObject, beta0, optionStruct);

      Note:   If your population fit uses multiple doses, make sure each element in the Dosed property of the PKModelMap object is unique.

      Note:   In your PKData object, for each subset of data belonging to a single group (as defined in the data column specified by the GroupLabel property), the software allows multiple observations made at the same time. If this is true for your data, be aware that:

      • These data points are not averaged, but fitted individually.

      • Different number of observations at different times cause some time points to be weighted more.

    sbionlmefit and sbionlmefitsa return the following:

    • results, an object containing estimated values and other statistics. For more information, see the sbionlmefit and sbionlmefitsa reference pages.

    • simdataI, a SimData object containing the data from simulating the model using the estimated parameter values for individuals, which includes both the fixed and random effects.

    • simdataP, SimData object containing the data from simulating the model using the estimated parameter values for the population, which includes only the fixed effects.

  3. Plot the data from the data set. For example, in the imported data set used for fitting, ds, ID, Time, and Response are the column headers for the columns containing group IDs, time, and the response variable, respectively.

    p = sbiotrellis(ds, 'ID', 'Time', 'Response')

      Note:   If your data set has multiple responses, with column headers Response1 and Response2 containing the response variables, you plot the data as follows:

      Response = {'Response1', 'Response2'}
      p = sbiotrellis(ds, 'ID', 'Time', Response)
      
  4. Use the plot method on the trellis plot object p, returned by sbiotrellis to overlay data, using default values for the second and third input arguments.

    p.plot(simdataP, [], '', PKModelMapObj.Observed);

For a description of the results, see sbionlmefit in the SimBiology documentation.

For more information, see the following topics in the Statistics Toolbox documentation:

Obtaining the Status of Fitting

The sbiofitstatusplot function dynamically plots the progress of the fitting task. During the task, the function plots the fixed effects (β), the estimates for the diagonal elements of the covariance matrix for the random effects (Ψ), and the log-likelihood. This functionality is useful for large and complex models when you expect the time to return the results to be longer than a few minutes. Use the options structure that is an argument for the sbionlmefit function:

% Create options structure with 'OutputFcn'.      
options.Options.OutputFcn = @sbiofitstatusplot;
% Pass options structure with OutputFcn to sbionlmefit function.
results = sbionlmefit(..., options);

The following figure shows the type of plots obtained.

Tips for interpreting status plots:

  • The fitting function tries to maximize the log-likelihood. When the plot begins to display a flat line, this might indicate that maximization is complete. Try setting the maximum iterations to a lower number to reduce the number of iterations you need and improve performance. For information on how to set iteration options, see Perform Population Fitting.

  • Plots for the fixed effects (β) and the estimates for the diagonal elements of the covariance matrix for the random effects (Ψ), should show convergence. If you see oscillations, or jumps without accompanying improvements in the log-likelihood, the model may be over-parameterized. Try the following:

    • Reduce the number of fixed effects.

    • Reduce the number of random effects.

    • Simplify the covariance matrix pattern of random effects.

See also sbiofitstatusplot in the SimBiology documentation.

Simultaneously Fitting Data from Multiple Dose Levels

When performing population fitting using nonlinear regression, you can simultaneously fit data from multiple dose levels by either:

  • Using sbionlmefit with a CovariateModel object input argument and omitting the random effect (eta) from the expressions in the CovariateModel object.

  • Using sbionlmefit with an InitEstimates input argument and setting the REParamsSelect field or name-value pair input argument to a 1-by-n logical vector, with all entries set to false, where n equals the number of fixed effects.

Perform Individual Fitting

The sbionlinfit function lets you specify a SimBiology model to fit using the nlinfit function (individual fit). The nlinfit function uses nonlinear least squares and returns parameter estimates, residuals, and the estimated coefficient covariance matrix.

  1. (Optional) Specify an error model, set the tolerance, set the maximum iteration, or set the data pooling option, which lets you simultaneously fit data from multiple dose levels using the same model parameters for each dose. Use an options structure that is an input argument for sbionlinfit:

    optionStruct.ErrorModel = 'proportional';
    optionStruct.TolX = 1.0E-8;
    optionStruct.TolFun = 1.0E-8;
    optionStruct.MaxIter = 100;
    optionStruct.Pooled = true;
  2. Specify the model object, the PKModelMap object, the PKData object, a vector containing the initial estimates for the fixed effects, and the options:

    [results, simdataI] = sbionlinfit(modelobj,...
    	PKModelMapObj, PKDataObj, beta0, optionStruct);

      Note:   If your individual fit uses multiple doses, ensure each element in the Dosed property of the PKModelMap object is unique.

      Note:   In your PKData object, for each subset of data belonging to a single group (as defined in the data column specified by the GroupLabel property), the software allows multiple observations made at the same time. If this is true for your data, be aware that:

      • These data points are not averaged, but fitted individually.

      • Different number of observations at different times cause some time points to be weighted more.

    sbionlinfit returns the following:

    • A results array of objects, with each object containing the following for one group:

      • ParameterEstimates — A dataset array containing fitted coefficients and their standard errors.

      • CovarianceMatrix — Estimated covariance matrix for the fitted coefficients.

      • beta — Vector of scalars specifying the fitted coefficients in transformed space.

      • R — Residuals.

      • J — Jacobian of modelObject.

      • COVB — Estimated covariance matrix for the transformed coefficients.

      • mse — Scalar specifying the estimate of the error of the variance term.

      • errorparam — Estimated parameters of the error model or an empty array if you specified weights using the 'Weights' name-value pair argument.

    • simdataI, a SimData object containing the data from simulating the model using the estimated parameter values, for individuals.

  3. Plot the data from the data set. For example, in the imported data set (ds), ID, Time and Response are the column headers for the columns containing group IDs, time, and the response variable respectively.

    p = sbiotrellis(ds, 'ID', 'Time', 'Response')

      Note:   If your data set has multiple responses, with column headers Response1 and Response2 containing the response variables, then plot the data as follows:

      Response = {'Response1', 'Response2'}
      p = sbiotrellis(ds, 'ID', 'Time', Response)
      
  4. Use the plot method on the trellis plot object p, returned by sbiotrellis to overlay data, using default values for the second and third input arguments.

    p.plot(simdataI, [], '', PKModelMapObj.Observed);

For more information, see Nonlinear Regression and nlinfit in the Statistics Toolbox documentation.

Was this topic helpful?