Perform nonlinear leastsquares regression
fitResults = sbiofit(sm,grpData,responseMap,estiminfo)
examplefitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing)
examplefitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName)
examplefitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options)
examplefitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options,variants)
examplefitResults = sbiofit(_,Name,Value)
example[fitResults,simdata]
= sbiofit(_)
example
estimates
parameters of a SimBiology model fitResults
= sbiofit(sm
,grpData
,responseMap
,estiminfo
)sm
using nonlinear
leastsquares regression.
grpData
is a groupedData
object
specifying the data to fit. responseMap
is
a string or cell array of strings that maps model components to response
data in grpData
. estimatedInfo
is
an estimatedInfo object
that
defines the estimated parameters in the model sm
. fitResults
is
a OptimResults object
or NLINResults object
or a vector of these
objects.
sbiofit
uses the first available estimation
function among the following: lsqnonlin
(Optimization Toolbox™ required), nlinfit
(Statistics and Machine Learning Toolbox™ required),
or fminsearch
.
By default, each group in grpData
is fit
separately, resulting in groupspecific parameter estimates. If the
model contains active doses and variants, they are applied during
the simulation.
uses
the dosing information specified by a matrix of SimBiology dose objects fitResults
= sbiofit(sm
,grpData
,responseMap
,estiminfo
,dosing
)dosing
instead
of using the active doses of the model sm
if
there is any.
uses
the estimation function specified by fitResults
= sbiofit(sm
,grpData
,responseMap
,estiminfo
,dosing
,functionName
)functionName
.
If the specified function is unavailable, a warning is issued and
the first available default function is used.
uses
the additional options specified by fitResults
= sbiofit(sm
,grpData
,responseMap
,estiminfo
,dosing
,functionName
,options
)options
for
the function functionName
.
applies
variant objects specified as fitResults
= sbiofit(sm
,grpData
,responseMap
,estiminfo
,dosing
,functionName
,options
,variants
)variants
instead
of using any active variants of the model.
uses
additional options specified by one or more fitResults
= sbiofit(_,Name,Value
)Name,Value
pair
arguments.
[
also returns a vector of SimData objects fitResults
,simdata
]
= sbiofit(_)simdata
using
any of the input arguments in the previous syntaxes.
Note:

Background
This example shows how to fit an individual's PK profile data to onecompartment model and estimate pharmacokinetic parameters.
Suppose you have drug plasma concentration data from an individual and want to estimate the volume of the central compartment and the clearance. Assume the drug concentration versus the time profile follows the monoexponential decline , where is the drug concentration at time t, is the initial concentration, and is the elimination rate constant that depends on the clearance and volume of the central compartment .
The synthetic data in this example was generated using the following model and parameters:
Onecompartment model with bolus dosing and firstorder elimination
Volume of the central compartment (Central
) = 1.70 liter
Clearance parameter (Cl_Central
) = 0.55 liter/hour
Constant error model
Load Data and Visualize
The data is stored as a table with variables Time
and Conc
that represent the time course of the plasma concentration of an individual after an intravenous bolus administration measured at 13 different time points. The variable units for Time
and Conc
are hour and milligram/liter, respectively.
clear all load(fullfile(matlabroot,'examples','simbio','data15.mat'))
plot(data.Time,data.Conc,'b+') xlabel('Time'); ylabel('Drug Concentration');
Convert to groupedData Format
Convert the data set to a groupedData
object, which is the required data format for the fitting function sbiofit
for later use. A groupedData
object also lets you set independent variable and group variable names (if they exist). Set the units of the Time
and Conc
variables. The units are optional and only required for the UnitConversion
feature, which automatically converts matching physical quantities to one consistent unit system.
gData = groupedData(data); gData.Properties.VariableUnits = {'hour','milligram/liter'}; gData.Properties
groupedData
automatically set the name of the IndependentVariableName
property to the Time
variable of the data.
ans = Description: '' VariableDescriptions: {} VariableUnits: {'hour' 'milligram/liter'} DimensionNames: {'Row' 'Variable'} UserData: [] RowNames: {} VariableNames: {'Time' 'Conc'} GroupVariableName: '' IndependentVariableName: 'Time'
Construct a OneCompartment Model
Use the builtin PK library to construct a onecompartment model with bolus dosing and firstorder elimination where the elimination rate depends on the clearance and volume of the central compartment. Use the configset
object to turn on unit conversion and increase the solver tolerances.
pkmd = PKModelDesign; pkc1 = addCompartment(pkmd,'Central'); pkc1.DosingType = 'Bolus'; pkc1.EliminationType = 'linearclearance'; pkc1.HasResponseVariable = true; model = construct(pkmd); configset = getconfigset(model); configset.CompileOptions.UnitConversion = true; configset.SolverOptions.AbsoluteTolerance = 1e9; configset.SolverOptions.RelativeTolerance = 1e5;
For details on creating compartmental PK models using the SimBiology® builtin library, see Create Pharmacokinetic Models.
Define Dosing
Define a single bolus dose of 10 milligram given at time = 0. For details on setting up different dosing schedules, see Doses.
dose = sbiodose('dose'); dose.TargetName = 'Drug_Central'; dose.StartTime = 0; dose.Amount = 10; dose.AmountUnits = 'milligram'; dose.TimeUnits = 'hour';
Map Response Data to the Corresponding Model Component
The data contains drug concentration data stored in the Conc
variable. This data corresponds to the Drug_Central
species in the model. Therefore, map the data to Drug_Central
as follows.
responseMap = {'Drug_Central = Conc'};
Specify Parameters to Estimate
The parameters to fit in this model are the volume of the central compartment (Central) and the clearance rate (Cl_Central). In this case, specify logtransformation for these biological parameters since they are constrained to be positive. The estimatedInfo
object lets you specify parameter transforms, initial values, and parameter bounds (optional).
paramsToEstimate = {'log(Central)','log(Cl_Central)'}; estimatedParams = estimatedInfo(paramsToEstimate,'InitialValue',[1 1]);
Estimate Parameters
Now that you have defined onecompartment model, data to fit, mapped response data, parameters to estimate, and dosing, use sbiofit
to estimate parameters. The default estimation function that sbiofit
uses will change depending on which toolboxes are available. To see which function was used during fitting, check the EstimationFunction
property of the corresponding results object.
fitConst = sbiofit(model,gData,responseMap,estimatedParams,dose)
fitConst = OptimResults with properties: ExitFlag: 3 Output: [1x1 struct] GroupName: [1x1 categorical] Beta: [2x3 table] ParameterEstimates: [2x3 table] J: [13x2 double] COVB: [2x2 double] CovarianceMatrix: [2x2 double] R: [13x1 double] MSE: 0.0374 SSE: 0.4119 Weights: [] EstimatedParameterNames: {'Central' 'Cl_Central'} ErrorModelInfo: [1x3 table] EstimationFunction: 'lsqnonlin' ErrorModel: 'constant' ErrorParameters: [1x1 table]
Display Estimated Parameters and Plot Results
Notice the parameter estimates were not far off from the true values (1.70 and 0.55) that were used to generate the data. You may also try different error models to see if they could further improve the parameter estimates.
fitConst.ParameterEstimates plot(fitConst);
ans = Name Estimate StandardError ____________ ________ _____________ 'Central' 1.6993 0.034805 'Cl_Central' 0.53352 0.019667
Use Different Error Models
Try three other supported error models (proportional, combination of constant and proportional error models, and exponential).
fitProp = sbiofit(model,gData,responseMap,estimatedParams,dose,... 'ErrorModel','proportional'); fitExp = sbiofit(model,gData,responseMap,estimatedParams,dose,... 'ErrorModel','exponential'); fitComb = sbiofit(model,gData,responseMap,estimatedParams,dose,... 'ErrorModel','combined');
Compare Sum of Squared Errors (SSE)
Compare the SSE of each fit to see which error model best fits the data. However, only the constant and exponential error models provide the raw residuals, that is, observed minus fitted values. The proportional and combined error models provide weighted residuals. Thus to compare SSE of error models, first calculate the raw (unweighted) residuals for the proportional and combined error models.
The results returned by sbiofit contain fitted values at both experimental and solver time points. Use the fitted
function of the results object to get all the simulated (fitted) data. Then use the resample
function to extract only those fitted values at experimental time points.
propSimData = fitted(fitProp); combSimData = fitted(fitComb); expTime = gData.Time; propSimData = resample(propSimData,expTime); combSimData = resample(combSimData,expTime);
Calculate the sum of squared errors.
nlinfitPropR = gData.Conc  propSimData.Data; nlinfitPropSSE = sum(nlinfitPropR.^2); nlinfitCombR = gData.Conc  combSimData.Data; nlinfitCombSSE = sum(nlinfitCombR.^2);
Construct a table of SSEs for all error models. Use the reported SSE of the constant and exponential error models by accessing the SSE property of the corresponding results object. For the proportional and combined error models, use the calculated SSEs.
ErrorModels = {'Constant','Proportional','Exponential','Combined'}'; SSE = [fitConst.SSE,nlinfitPropSSE,fitExp.SSE,nlinfitCombSSE]'; sseTable = table(ErrorModels,SSE)
sseTable = ErrorModels SSE ______________ _______ 'Constant' 0.41187 'Proportional' 1.2789 'Exponential' 0.63346 'Combined' 0.41196
As shown in the table, the constant and combined error models have the lowest (and same) SSE. However, the constant error model is a better model since it uses only one error parameter (a
) whereas the combined error model uses two (a
and b
).
In addition, notice that the value of b
from the combined error model is close to zero, which also suggests the constant error model is better.
fitConst.ErrorParameters
ans = a _____ Conc 0.178
fitComb.ErrorParameters
ans = a b _______ ________ Conc 0.17485 0.001897
Display Estimated Parameter Values
Show the estimated parameter values of each error model.
allResults = [fitConst,fitProp,fitExp,fitComb]; Error_Model = cell(4,1); Estimated_Central = zeros(4,1); Estimated_Cl_Central = zeros(4,1); t = table(Error_Model,Estimated_Central,Estimated_Cl_Central); for i = 1:height(t) t{i,1} = {allResults(i).ErrorModel}; t{i,2} = allResults(i).ParameterEstimates.Estimate(1); t{i,3} = allResults(i).ParameterEstimates.Estimate(2); end t
t = Error_Model Estimated_Central Estimated_Cl_Central ______________ _________________ ____________________ 'constant' 1.6993 0.53352 'proportional' 1.8833 0.5122 'exponential' 1.7996 0.51951 'combined' 1.6999 0.53255
Conclusion
This example showed how to estimate PK parameters, namely the volume of the central compartment and clearance parameter of an individual, by fitting the PK profile data to onecompartment model. You compared SSE and estimated parameter values of different error models to see which model best explained the data. Final fitted results suggested both the constant and combined error models provided the closest estimates to the parameter values used to generate the data. However, the constant error model is a better model since it uses one fewer parameter than the combined model.
This example shows how to estimate pharmacokinetic parameters of multiple individuals using a twocompartment model.
Suppose you have drug plasma concentration data from three individuals
that you want to use to estimate corresponding pharmacokinetic parameters,
namely the volume of central and peripheral compartment (Central
, Peripheral
),
the clearance rate (Cl_Central
), and intercompartmental
clearance (Q12
). Assume the drug concentration
versus the time profile follows the biexponential decline $${C}_{t}=A{e}^{at}+B{e}^{bt}$$, where C_{t} is
the drug concentration at time t, a and b are
slopes for corresponding exponential declines.
The synthetic data set contains drug plasma concentration data measured in both central and peripheral compartments. The data was generated using a twocompartment model with an infusion dose and firstorder elimination. These parameters were used for each individual.
Central  Peripheral  Q12  Cl_Central  

Individual 1  1.90  0.68  0.24  0.57 
Individual 2  2.10  6.05  0.36  0.95 
Individual 3  1.70  4.21  0.46  0.95 
The data is stored as a table with variables ID
, Time
, CentralConc
,
and PeripheralConc
. It represents the time course
of plasma concentrations measured at eight different time points for
both central and peripheral compartments after an infusion dose.
clear all load(fullfile(matlabroot,'examples','simbio','data10_32R.mat'))
Convert the data set to a groupedData
object
which is the required data format for the fitting function sbiofit
for
later use. A groupedData
object also lets you set
independent variable and group variable names (if they exist). Set
the units of the ID
, Time
, CentralConc
,
and PeripheralConc
variables. The units are optional
and only required for the UnitConversion
feature, which automatically converts
matching physical quantities to one consistent unit system.
gData = groupedData(data); gData.Properties.VariableUnits = {'','hour','milligram/liter','milligram/liter'}; gData.Properties
ans = Description: '' VariableDescriptions: {} VariableUnits: {1x4 cell} DimensionNames: {'Row' 'Variable'} UserData: [] RowNames: {} VariableNames: {'ID' 'Time' 'CentralConc' 'PeripheralConc'} GroupVariableName: 'ID' IndependentVariableName: 'Time'
Create a trellis plot that shows the PK profiles of three individuals.
sbiotrellis(gData,'ID','Time',{'CentralConc','PeripheralConc'});
Use the builtin PK library to construct a twocompartment model with infusion dosing and firstorder elimination where the elimination rate depends on the clearance and volume of central compartment. Use the configset object to turn on unit conversion.
pkmd = PKModelDesign; pkc1 = addCompartment(pkmd,'Central'); pkc1.DosingType = 'Infusion'; pkc1.EliminationType = 'linearclearance'; pkc1.HasResponseVariable = true; pkc2 = addCompartment(pkmd,'Peripheral'); model = construct(pkmd); configset = getconfigset(model); configset.CompileOptions.UnitConversion = true;
Assume every individual receives an infusion dose at time = 0, with a total infusion amount of 100 mg at a rate of 50 mg/hour. For details on setting up different dosing strategies, see Doses.
dose = sbiodose('dose','TargetName','Drug_Central'); dose.StartTime = 0; dose.Amount = 100; dose.Rate = 50; dose.AmountUnits = 'milligram'; dose.TimeUnits = 'hour'; dose.RateUnits = 'milligram/hour';
The data contains measured plasma concentrations in the
central and peripheral compartments. Map these variables to the appropriate
model species, which are Drug_Central
and Drug_Peripheral
.
responseMap = {'Drug_Central = CentralConc','Drug_Peripheral = PeripheralConc'};
The parameters to estimate in this model are the volumes
of central and peripheral compartments (Central
and Peripheral
),
intercompartmental clearance Q12
, and clearance
rate Cl_Central
. In this case, specify logtransform
for Central
and Peripheral
since
they are constrained to be positive. The estimatedInfo
object
lets you specify parameter transforms, initial values, and parameter
bounds (optional).
paramsToEstimate = {'log(Central)','log(Peripheral)','Q12','Cl_Central'}; estimatedParam = estimatedInfo(paramsToEstimate,'InitialValue',[1 1 1 1]);
Fit the model to all of the data pooled together, that
is, estimate one set of parameters for all individuals. The default
estimation method that sbiofit
uses will change
depending on which toolboxes are available. To see which estimation
function sbiofit
used for the fitting, check
the EstimationFunction
property of the corresponding
results object.
pooledFit = sbiofit(model,gData,responseMap,estimatedParam,dose,'Pooled',true)
pooledFit = OptimResults with properties: ExitFlag: 3 Output: [1x1 struct] GroupName: [] Beta: [4x3 table] ParameterEstimates: [4x3 table] J: [24x4x2 double] COVB: [4x4 double] CovarianceMatrix: [4x4 double] R: [24x2 double] MSE: 6.6353 SSE: 291.9549 Weights: [] EstimatedParameterNames: {'Central' 'Peripheral' 'Q12' 'Cl_Central'} ErrorModelInfo: [1x3 table] EstimationFunction: 'lsqnonlin' ErrorModel: 'constant' ErrorParameters: [1x1 table]
Plot the fitted results versus the original data. Although three separate plots were generated, the data was fitted using the same set of parameters (that is, all three individuals had the same fitted line).
plot(pooledFit);
Estimate one set of parameters for each individual and see if there is any improvement in the parameter estimates. In this example, since there are three individuals, three sets of parameters are estimated.
unpooledFit = sbiofit(model,gData,responseMap,estimatedParam,dose,'Pooled',false);
Plot the fitted results versus the original data. Each individual was fitted differently (that is, each fitted line is unique to each individual) and each line appeared to fit well to individual data.
plot(unpooledFit);
Display the fitted results of the first individual. The MSE was lower than that of the pooled fit. This is also true for the other two individuals.
unpooledFit(1)
ans = OptimResults with properties: ExitFlag: 3 Output: [1x1 struct] GroupName: [1x1 categorical] Beta: [4x3 table] ParameterEstimates: [4x3 table] J: [8x4x2 double] COVB: [4x4 double] CovarianceMatrix: [4x4 double] R: [8x2 double] MSE: 1.9834 SSE: 23.8007 Weights: [] EstimatedParameterNames: {'Central' 'Peripheral' 'Q12' 'Cl_Central'} ErrorModelInfo: [1x3 table] EstimationFunction: 'lsqnonlin' ErrorModel: 'constant' ErrorParameters: [1x1 table]
Generate a plot of the residuals over time to compare the pooled and unpooled fit results. The figure indicates unpooled fit residuals are smaller than those of pooled fit as expected. In addition to comparing residuals, other rigorous crieteria can be used to compare the fitted results.
t = [gData.Time;gData.Time]; res_pooled = vertcat(pooledFit.R); res_pooled = res_pooled(:); res_unpooled = vertcat(unpooledFit.R); res_unpooled = res_unpooled(:); plot(t,res_pooled,'o','MarkerFaceColor','w','markerEdgeColor','b') hold on plot(t,res_unpooled,'o','MarkerFaceColor','b','markerEdgeColor','b') refl = refline(0,0); % A reference line representing a zero residual title('Residuals versus Time'); xlabel('Time'); ylabel('Residuals'); legend({'Pooled','Unpooled'});
This example showed how to perform pooled and unpooled
estimations using sbiofit
. As illustrated, the
unpooled fit accounts for variations due to the specific subjects
in the study, and, in this case, the model fits better to the data.
However, the pooled fit returns populationwide parameters. If you
want to estimate populationwide parameters while considering individual
variations, use sbiofitmixed
.
This example shows how to estimate categoryspecific (such as young versus old, male versus female), individualspecific, and populationwide parameters using PK profile data from multiple individuals.
Background
Suppose you have drug plasma concentration data from 30 individuals and want to estimate pharmacokinetic parameters, namely the volumes of central and peripheral compartment, the clearance, and intercompartmental clearance. Assume the drug concentration versus the time profile follows the biexponential decline , where is the drug concentration at time t, and and are slopes for corresponding exponential declines.
Load Data
This synthetic data contains the time course of plasma concentrations of 30 individuals after a bolus dose (100 mg) measured at different times for both central and peripheral compartments. It also contains categorical variables, namely Sex
and Age
.
clear load(fullfile(matlabroot,'examples','simbio','sd5_302RAgeSex.mat'))
Convert to groupedData Format
Convert the data set to a groupedData
object, which is the required data format for the fitting function sbiofit
. A groupedData
object also allows you set independent variable and group variable names (if they exist). Set the units of the ID
, Time
, CentralConc
, PeripheralConc
, Age
, and Sex
variables. The units are optional and only required for the UnitConversion
feature, which automatically converts matching physical quantities to one consistent unit system.
gData = groupedData(data); gData.Properties.VariableUnits = {'','hour','milligram/liter','milligram/liter','',''}; gData.Properties
The IndependentVariableName
and GroupVariableName
properties have been automatically set to the Time
and ID
variables of the data.
ans = Description: '' VariableDescriptions: {} VariableUnits: {1x6 cell} DimensionNames: {'Row' 'Variable'} UserData: [] RowNames: {} VariableNames: {1x6 cell} GroupVariableName: 'ID' IndependentVariableName: 'Time'
Visualize Data
Display the response data for each individual.
sbiotrellis(gData,'ID','Time',{'CentralConc','PeripheralConc'});
Set Up a TwoCompartment Model
Use the builtin PK library to construct a twocompartment model with infusion dosing and firstorder elimination where the elimination rate depends on the clearance and volume of the central compartment. Use the configset
object to turn on unit conversion.
pkmd = PKModelDesign; pkc1 = addCompartment(pkmd,'Central'); pkc1.DosingType = 'Bolus'; pkc1.EliminationType = 'linearclearance'; pkc1.HasResponseVariable = true; pkc2 = addCompartment(pkmd,'Peripheral'); model = construct(pkmd); configset = getconfigset(model); configset.CompileOptions.UnitConversion = true;
For details on creating compartmental PK models using the SimBiology® builtin library, see Create Pharmacokinetic Models.
Define Dosing
Assume every individual receives a bolus dose of 100 mg at time = 0. For details on setting up different dosing strategies, see Doses.
dose = sbiodose('dose','TargetName','Drug_Central'); dose.StartTime = 0; dose.Amount = 100; dose.AmountUnits = 'milligram'; dose.TimeUnits = 'hour';
Map the Response Data to Corresponding Model Components
The data contains measured plasma concentration in the central and peripheral compartments. Map these variables to the appropriate model components, which are Drug_Central
and Drug_Peripheral
.
responseMap = {'Drug_Central = CentralConc','Drug_Peripheral = PeripheralConc'};
Specify Parameters to Estimate
Specify the volumes of central and peripheral compartments Central
and Peripheral
, intercompartmental clearance Q12
, and clearance Cl_Central
as parameters to estimate. The estimatedInfo
object lets you optionally specify parameter transforms, initial values, and parameter bounds. Since both Central
and Peripheral
are constrained to be positive, specify a logtransform for each parameter.
paramsToEstimate = {'log(Central)', 'log(Peripheral)', 'Q12', 'Cl_Central'}; estimatedParam = estimatedInfo(paramsToEstimate,'InitialValue',[1 1 1 1]);
Estimate IndividualSpecific Parameters
Estimate one set of parameters for each individual by setting the 'Pooled'
namevalue pair argument to false
.
unpooledFit = sbiofit(model,gData,responseMap,estimatedParam,dose,'Pooled',false);
Display Results
Plot the fitted results versus the original data for each individual (group).
plot(unpooledFit);
For an unpooled fit, sbiofit
always returns one results object for each individual.
Examine Parameter Estimates for Category Dependencies
Explore the unpooled estimates to see if there is any categoryspecific parameters, that is, if some parameters are related to one or more categories. If there are any category dependencies, it might be possible to reduce the number of degrees of freedom by estimating just categoryspecific values for those parameters.
First extract the ID and category values for each ID
catParamValues = unique(gData(:,{'ID','Sex','Age'}));
Add variables to the table containing each parameter's estimate.
allParamValues = vertcat(unpooledFit.ParameterEstimates); catParamValues.Central = allParamValues.Estimate(strcmp(allParamValues.Name, 'Central')); catParamValues.Peripheral = allParamValues.Estimate(strcmp(allParamValues.Name, 'Peripheral')); catParamValues.Q12 = allParamValues.Estimate(strcmp(allParamValues.Name, 'Q12')); catParamValues.Cl_Central = allParamValues.Estimate(strcmp(allParamValues.Name, 'Cl_Central'));
Plot estimates of each parameter for each category. gscatter
requires Statistics and Machine Learning Toolbox™. If you do not have it, use other alternative plotting functions such as plot
.
h = figure; ylabels = {'Central','Peripheral','Cl\_Central','Q12'}; plotNumber = 1; for i = 1:4 thisParam = estimatedParam(i).Name; % Plot for Sex category subplot(4,2,plotNumber); plotNumber = plotNumber + 1; gscatter(double(catParamValues.Sex), catParamValues.(thisParam), catParamValues.Sex); ax = gca; ax.XTick = []; ylabel(ylabels(i)); % Plot for Age category subplot(4,2,plotNumber); plotNumber = plotNumber + 1; gscatter(double(catParamValues.Age), catParamValues.(thisParam), catParamValues.Age); ax = gca; ax.XTick = []; ylabel(ylabels(i)); end
Based on the plot, it seems that young individuals tend to have higher volumes of central and peripheral compartments (Central
, Peripheral
) than old invididuals (that is, the volumes seem to be agespecific). In addition, males tend to have lower clearance rates (Cl_Central
) than females but the opposite for the Q12 parameter (that is, the clearance and Q12 seem to be sexspecific).
Estimate CategorySpecific Parameters
Use the 'CategoryVariableName'
property of the estimatedInfo
object to specify which category to use during fitting. Use 'Sex'
as the group to fit for the clearrance Cl_Central
and Q12
parameters. Use 'Age'
as the group to fit for the Central
and Peripheral
parameters.
estimatedParam(1).CategoryVariableName = 'Age'; estimatedParam(2).CategoryVariableName = 'Age'; estimatedParam(3).CategoryVariableName = 'Sex'; estimatedParam(4).CategoryVariableName = 'Sex'; categoryFit = sbiofit(model,gData,responseMap,estimatedParam,dose)
When fitting by category (or group), sbiofit
always returns one results object, not one for each category level. This is because both male and female individuals are considered to be part of the same optimization using the same error model and error parameters, similarly for the young and old individuals.
categoryFit = OptimResults with properties: ExitFlag: 3 Output: [1x1 struct] GroupName: [] Beta: [8x5 table] ParameterEstimates: [120x6 table] J: [240x8x2 double] COVB: [8x8 double] CovarianceMatrix: [8x8 double] R: [240x2 double] MSE: 0.4720 SSE: 222.7673 Weights: [] EstimatedParameterNames: {'Central' 'Peripheral' 'Q12' 'Cl_Central'} ErrorModelInfo: [1x3 table] EstimationFunction: 'lsqnonlin' ErrorModel: 'constant' ErrorParameters: [1x1 table]
Plot Results
Plot the categoryspecific estimated results.
plot(categoryFit);
For the Cl_Central
and Q12
parameters, all males had the same estimates, and similarly for the females. For the Central
and Peripheral
parameters, all young individuals had the same estimates, and similarly for the old individuals.
Estimate PopulationWide Parameters
To better compare the results, fit the model to all of the data pooled together, that is, estimate one set of parameters for all individuals by setting the 'Pooled'
namevalue pair argument to true
. The warning message tells you that this option will ignore any categoryspecific information (if they exist).
pooledFit = sbiofit(model,gData,responseMap,estimatedParam,dose,'Pooled',true);
Warning: You called SBIOFIT using the Pooled option. The CategoryVariableName values of the ESTIMINFO input will be ignored.
Plot Results
Plot the fitted results versus the original data. Although a separate plot was generated for each individual, the data was fitted using the same set of parameters (that is, all individuals had the same fitted line).
plot(pooledFit);
Compare Residuals
Compare residuals of CentralConc
and PeripheralConc
responses for each fit.
t = gData.Time; allResid(:,:,1) = pooledFit.R; allResid(:,:,2) = categoryFit.R; allResid(:,:,3) = vertcat(unpooledFit.R); figure; responseList = {'CentralConc', 'PeripheralConc'}; for i = 1:2 subplot(2,1,i); oneResid = squeeze(allResid(:,i,:)); plot(t,oneResid,'o'); refline(0,0); % A reference line representing a zero residual title(sprintf('Residuals (%s)', responseList{i})); xlabel('Time'); ylabel('Residuals'); legend({'Pooled','CategorySpecific','Unpooled'}); end
As shown in the plot, the unpooled fit produced the best fit to the data as it fit the data to each indivdual. This was expected since it used the most number of degrees of freedom. The categoryfit reduced the number of degrees of freedom by fitting the data to two categories (sex and age). As a result, the residuals were larger than the unpooled fit, but still smaller than the populationfit, which estimated just one set of parameters for all individuals. The categoryfit might be a good compromise between the unpooled and pooled fitting provided that any hierarchical model exists within your data.
This example uses the yeast heterotrimeric G protein model and experimental data reported by [1]. For details about the model, see the Background section in Parameter Scanning, Parameter Estimation, and Sensitivity Analysis in the Yeast Heterotrimeric G Protein Cycle.
Load the G protein model.
sbioloadproject gprotein
Store the experimental data containing the time course for the fraction of active G protein as reported in the reference paper [1].
time = [0 10 30 60 110 210 300 450 600]'; GaFracExpt = [0 0.35 0.4 0.36 0.39 0.33 0.24 0.17 0.2]';
Create a groupedData
object based on
the experimental data.
tbl = table(time,GaFracExpt); grpData = groupedData(tbl);
Map the appropriate model component to the experimental
data. In other words, indicate which species in the model corresponds
to which response variable in the data. In this example, map the model
parameter GaFrac
to the experimental data variable GaFracExpt
from grpData
.
responseMap = 'GaFrac = GaFracExpt';
Use an estimatedInfo
object to define
the model parameter kGd
as a parameter to be estimated.
estimatedParam = estimatedInfo('kGd');
Perform the parameter estimation.
fitResult = sbiofit(m1,grpData,responseMap,estimatedParam);
View the estimated parameter value of kGd
.
fitResult.ParameterEstimates
ans = Name Estimate StandardError _____ ________ _____________ 'kGd' 0.11 0.00037969
sm
— SimBiology modelSimBiology model objectSimBiology model, specified as a SimBiology model object
. The active configset object
of the model contains
solver settings for simulation. Any active doses and variants are
applied to the model during simulation unless specified otherwise
using the dosing
and variants
input
arguments, respectively.
grpData
— Data to fitgroupedData objectData to fit, specified as a groupedData
object
.
The name of the time variable must be defined in the IndependentVariableName
property
of grpData
. For instance, if the time variable
name is 'TIME'
, then specify it as follows.
grpData.Properties.IndependentVariableName = 'TIME';
If the data contains more than one group of measurements, the
grouping variable name must be defined in the GroupVariableName
property
of grpData
. For example, if the grouping variable
name is 'GROUP'
, then specify it as follows.
grpData.Properties.GroupVariableName = 'GROUP';
Note:

responseMap
— Mapping information of model components to grpData
string  cell array of stringsMapping information of model components to grpData
,
specified as a string or cell array of strings.
Each string is an equationlike expression, similar to assignment
rules in SimBiology. It contains the name (or qualified name) of a
quantity (species, compartment, or parameter) in the model sm
,
followed by the character '='
and the name of a
variable in grpData
. For clarity, white spaces
are allowed between names and '='
.
For example, if you have the concentration data 'CONC'
in grpData
for
a species 'Drug_Central'
, you can specify it as
follows.
responseMap = 'Drug_Central = CONC';
To unambiguously name a species, use the qualified name, which
includes the name of the compartment. To name a reactionscoped parameter,
use the reaction name to qualify the parameter. If the name is not
a valid MATLAB^{®} variable name, surround it by square brackets
such as [reaction 1].[parameter 1]
.
An error is issued if any (qualified) name matches two components of the same type. However, you can use a (qualified) name that matches two components of different types, and the function first finds the species with the given name, followed by compartments and then parameters.
estiminfo
— Estimated parametersestimatedInfo
object  vector of estimatedInfo
objectsEstimated parameters, specified as an estimatedInfo
object
or vector of estimatedInfo
objects
that defines the estimated parameters in the model sm
,
and other optional information such as their initial estimates, transformations,
bound constraints, and categories. Supported transforms are log
, logit
,
and probit
.
If you do not specify Pooled
namevalue
pair argument, sbiofit
uses CategoryVariableName
property
of estiminfo
to decide if parameters should be
estimated for each individual, group, category, or all individuals
as a whole. Use the Pooled
option to override
any CategoryVariableName
values. For details about CategoryVariableName
property,
see estimatedInfo object
.
Note:

dosing
— Dosing information[]
 2D matrix of dose objectsDosing information, specified as an empty array or 2D matrix
of dose objects (ScheduleDose object
or RepeatDose object
). If empty, no doses
are applied during simulation, even if the model has active doses.
If not empty, the matrix must have a single row or one row per group
in the data grpData
. If it has a single row,
same doses are applied to all groups during simulation. If it has
multiple rows, each row is applied to a separate group, in the same
order as the groups appear in grpData
.
Multiple columns are allowed so that you can apply multiple
dose objects to each group. Each column of doses must reference the
same components in the model sm
. Specifically,
all doses (except for empty doses) in a column must have the same
values for TargetName
, DurationParameterName
,
and LagParameterName
. If some groups require more
doses than others, then fill in the matrix with dummy doses that are
either default doses or empty doses.
A default dose has default values for all properties, except
for the Name
property. An empty dose has a dose
amount of 0, thus having no effect on the model. Create a default
dose as follows.
d1 = sbiodose('d1');
grpData
has
dosing information, you can use the createDoses
method
to construct doses.
functionName
— Estimation function namestringEstimation function name, specified as a string. Choices are as follows.
'nlinfit'
(Statistics and Machine Learning Toolbox is
required.)
'fminunc'
(Optimization Toolbox is
required.)
'fmincon'
(Optimization Toolbox is
required.)
'fminsearch'
'lsqcurvefit'
(Optimization Toolbox is
required.)
'lsqnonlin'
(Optimization Toolbox is
required.)
'patternserch'
(Global Optimization Toolbox is
required.)
'ga'
(Global Optimization Toolbox is
required.)
'particleswarm'
(Global Optimization Toolbox is
required.)
By default, sbiofit
uses the first available
estimation function among the following: lsqnonlin
, nlinfit
, or fminsearch
.
options
— Options specific to estimation functionstruct  optimoptions objectOptions specific to the estimation function, specified as a
struct or optimoptions
object.
optimset
struct for fminsearch
optimoptions
object for lsqcurvefit
, lsqnonlin
, fminunc
,
and particleswarm
gaoptimset
struct for ga
psoptimset
struct for patternsearch
statset
struct for nlinfit
See Default Options for Estimation Algorithms for more details and default options associated with each estimation function.
variants
— Variants[]
 vector of variant objectsVariants, specified as an empty array or vector of variant objects. If empty, no variants are used even if the model has active variants.
Specify optional commaseparated pairs of Name,Value
arguments.
Name
is the argument
name and Value
is the corresponding
value. Name
must appear
inside single quotes (' '
).
You can specify several name and value pair
arguments in any order as Name1,Value1,...,NameN,ValueN
.
'ErrorModel','constant','UseParallel',true
specifies
a constant error model and to run simulations in parallel during parameter
estimation.'ErrorModel'
— Error model'constant'
(default)  cell array  categorical vector  tableError model(s) used for estimation, specified as a string, cell array of strings, categorical vector, or a table.
If it is a cell array or categorical vector, its length must
match the number of responses in the responseMap
argument.
If it is a table, it must contain a single variable that is
a column vector of error model names. The names can be a cell array
of strings or a vector of categorical variables. If the table has
more than one row, then the RowNames
property must
match the response variable names specified in the right hand side
of responseMap
. If the table does not use the RowNames
property,
the nth error is associated with the nth
response.
If you specify only one error model, then sbiofit
estimates
one set of error parameter(s) for all responses.
You can specify separate error models only if you are using
these methods: lsqnonlin
, lsqcurvefit
, fmincon
, fminunc
, fminsearch
, patternsearch
, ga
,
and particleswarm
.
There are four builtin error models. Each model defines the error using a standard meanzero and unitvariance (Gaussian) variable e, simulation results f, and one or two parameters a and b.
'constant'
: $$y=f+ae$$
'proportional'
: $$y=f+b\leftf\righte$$
'combined'
: $$y=f+(a+b\leftf\right)e$$
'exponential'
: $$y=f\ast \mathrm{exp}(ae)$$
Note: If you specify an error model, you cannot specify weights except for the constant error model. 
'Weights'
— Weightsmatrix  function handleWeights used for estimation, specified as a matrix of real positive weights, where the number of columns corresponds to the number of responses, or a function handle that accepts a vector of predicted response values and returns a vector of real positive weights.
If you specify an error model, you cannot use weights except
for the constant error model. If neither the 'ErrorModel'
or 'Weights'
is
specified, by default the function uses the constant error model with
equal weights.
'Pooled'
— Fit option flagfalse
(default)  true
Fit option flag, specifying whether to fit each individual (false
)
or pool all individual data (true
).
When true
, the function performs fitting
for all individuals or groups simultaneously using the same parameter
estimates, and fitResults
is a scalar results
object.
When false
, the function fits each group
or individual separately using group or individualspecific parameters,
and fitResults
is a vector of results objects
with one result for each group.
Note:
Use this option to override any 
'UseParallel'
— Flag for parallel simulationsfalse
(default)  true
Flag for parallel simulations during fitting, specified as true
or false
.
If true
and Parallel Computing Toolbox™ is available,
the function runs simulations in parallel.
'SensitivityAnalysis'
— Flag to use parameter sensitivities to determine gradients
of the objective functiontrue
 false
Flag to use parameter sensitivities to determine gradients of
the objective function, specified as true
or false
.
By default, it is true
for fmincon
, fminunc
, lsqnonlin
,
and lsqcurvefit
methods. Otherwise, it is false
.
SimBiology uses the complexstep approximation method to calculate
parameter sensitivities. Such calculated sensitivities can be used
to determine gradients of the objective function during parameter
estimation to improve fitting. The default behavior of sbiofit
is
to use such sensitivities to determine gradients whenever the algorithm
is gradientbased and if the SimBiology^{®} model supports sensitivity
analysis. For details about the model requirements and sensitivity
analysis, see
Sensitivity
Calculation.
fitResults
— Estimation resultsOptimResults object
 NLINResults object
 Vector of results objectsEstimation results, returned as a OptimResults
object
or NLINResults object
or
a vector of these objects. Both results objects are subclasses of
the LeastSquaresResults object
.
If the function uses nlinfit
,
then fitResults
is a NLINResults
object
. Otherwise, fitResults
is an OptimResults object
.
When 'Pooled'
is set to false
,
the function fits each group separately using groupspecific parameters,
and fitResults
is a vector of results objects
with one results object for each group.
When 'Pooled'
is set to true
,
the function performs fitting for all individuals or groups simultaneously
using the same parameter estimates, and fitResults
is
a scalar results object.
When 'Pooled'
is not used, and CategoryVariableName
values
of estiminfo
are all <none>
, fitResults
is
a single result object. This is the same behavior as setting 'Pooled'
to true
.
When 'Pooled'
is not used, and CategoryVariableName
values
of estiminfo
are all <GroupVariableName>
, fitResults
is
a vector of results objects. This is the same behavior as setting 'Pooled'
to false
.
In all other cases, fitResults
is a scalar
object containing estimated parameter values for different groups
or categories specified by CategoryVariableName
.
simdata
— Simulation resultsVector of SimData
objectsSimulation results, returned as a vector of SimData
objects
representing simulation results for each group or individual.
If the 'Pooled'
option is false
,
then each simulation uses groupspecific parameter values. If true
,
then all simulations use the same (populationwide) parameter values.
The states reported in simdata
are the
states that were included in the responseMap
input
argument as well as any other states listed in the StatesToLog
property of the runtime options
(RuntimeOptions
) of the SimBiology model sm
.
The function sbiofit
uses maximum likelihood
estimation (MLE) for fitting. Mathematically, the problem is typically
formulated as the minimization of the negative of the logarithm of
the likelihood function. For the constant error model, this is equivalent
to the method of leastsquares, which minimizes the sum of squares
of the residuals. For other error models, the objective function has
additional terms. When the error models are normally distributed,
the negative loglikelihood can be written as follows.
$$\begin{array}{l}\mathrm{ln}\mathcal{L}={\displaystyle \sum}_{i}^{N}\frac{{\left({y}_{i}f\left({x}_{i};p\right)\right)}^{2}}{2{\sigma}_{i}^{2}}+{\displaystyle \sum}_{i}^{N}\mathrm{ln}\sqrt{2\pi {\sigma}_{i}^{2}}\\ \end{array}$$
$$\mathcal{L}$$  Log likelihood 
N  Number of experimental observations 
y_{i}  The ith experimental observation 
$f\left({x}_{i};p\right)$  Predicted value of the ith observation, which is a function of independent variables x_{i} and estimated parameters p 
${\sigma}_{i}$  Standard deviation of the ith observation. For instance, for the combined error model $$y=f+(a+b\leftf\right)e$$, the standard deviation is $\sigma =a+b\leftf\right$. 
When you use numeric weights or the weight function, the weights are assumed to be proportional to the variance of the error, i.e., $${\sigma}_{i}^{2}={a}^{2}\cdot {w}_{i}$$ where a is the constant error parameter. If you use weights, you cannot specify an error model except the constant error model.
Various parameter estimation methods have different requirements on the objective function. For some methods, the estimation of model parameters is performed independently of the estimation of the error model parameters. The following table summarizes the objective function F and any separate formulas used for the estimation of error model parameters. r_{i} is the ith residual, a and b are error model parameters, and f_{i} is the predicted value of the ith observation.
Method  Supported Error Model  Objective Function  Error Parameter Estimation Function 

lsqnonlin
 'constant'  $$\overrightarrow{F}={r}_{i}$$  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
lsqnonlin
 Numeric weights (w_{i})  $$\overrightarrow{F}=\frac{{r}_{i}}{\sqrt{{w}_{i}}}$$  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
lsqnonlin
 'exponential'  $$\overrightarrow{F}=\mathrm{ln}{r}_{i}$$  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
nlinfit  All  $$\overrightarrow{F}={f}_{i}$$  nlinfit also estimates error parameters. 
All others  'constant'  $$F=\mathrm{ln}\mathcal{L}$$  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
All others  'proportional'  $$F=\mathrm{ln}\mathcal{L}$$  ${b}^{2}=\frac{1}{N}{{\displaystyle \sum}}^{\text{}}{\left(\raisebox{1ex}{${r}_{i}$}\!\left/ \!\raisebox{1ex}{${f}_{i}$}\right.\right)}^{2}$ 
All others  'combined'  $$F=\mathrm{ln}\mathcal{L}$$  Error parameters are included in minimization. 
All others  'exponential'  $$F={\displaystyle \sum}_{i}^{N}\frac{{\left(\mathrm{ln}{y}_{i}\mathrm{ln}f\left({x}_{i};p\right)\right)}^{2}}{2{a}^{2}}+{\displaystyle \sum}_{i}^{N}\mathrm{ln}\sqrt{2\pi {a}^{2}}$$, where $$a$$ is the standard deviation of the logtransformed ith observation.  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
All others  Weights  $$F=\mathrm{ln}\mathcal{L}$$  $${a}^{2}=\frac{1}{N}{\displaystyle \sum \frac{{r}_{i}^{2}}{{w}_{i}}}$$ 
Function  Default Options  

nlinfit 
 
fminunc 
Display = 'off'; TolFun = 1e6*[Initial objective function value] Algorithm = 'quasinewton' FinDiffRelStep = max(eps^(1/3), SolverOptions.RelativeTolerance) TypicalX = 1e6*[Initial estimates]  
fminsearch 
 
lsqcurvefit  Requires Optimization Toolbox.
 
lsqnonlin  Requires Optimization Toolbox.
 
patternsearch  Requires Global Optimization Toolbox.
 
ga  Requires Global Optimization Toolbox.
 
particleswarm  Requires Global Optimization Toolbox. sbiofit uses
the following default options for the particleswarm algorithm,
except for:

[1] Yi, TM., Kitano, H., and Simon, M. (2003). A quantitative characterization of the yeast heterotrimeric G protein cycle. PNAS. 100, 10764–10769.
estimatedInfo object
 fmincon
 fminsearch
 fminunc
 ga
 groupedData object
 LeastSquaresResults object
 lsqcurvefit
 lsqnonlin
 nlinfit
 NLINResults object
 OptimResults object
 particleswarm
 patternsearch
 sbiofitmixed