Documentation Center

  • Trial Software
  • Product Updates

sbiofit

Perform nonlinear least-squares regression

Syntax

  • fitResults = sbiofit(sm,grpData,responseMap,estiminfo) example
  • fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing) example
  • fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName) example
  • fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options) example
  • fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options,variants) example
  • fitResults = sbiofit(_,Name,Value) example
  • [fitResults,simdata] = sbiofit(_) example

Description

example

fitResults = sbiofit(sm,grpData,responseMap,estiminfo) estimates parameters of a SimBiology model sm using nonlinear least-squares regression.

grpData is a groupedData object specifying the data to fit. responseMap is a string or cell array of strings that maps model components to response data in grpData. estimatedInfo is an estimatedInfo object that defines the estimated parameters in the model sm.

If Statistics Toolbox™ is available, the function uses nlinfit for estimation as the default algorithm. If not, but Optimization Toolbox™ is available, fminunc is used. If both are unavailable, fminsearch is used.

If the model contains active doses and variants, they are applied during the simulation.

example

fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing) uses the dosing information specified by a matrix of SimBiology dose objects dosing instead of using the active doses of the model sm if there is any.

example

fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName) uses the estimation function specified by functionName. If the specified function is unavailable, a warning is issued and the first available default function is used.

example

fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options) uses the additional options specified by options for the function functionName.

example

fitResults = sbiofit(sm,grpData,responseMap,estiminfo,dosing,functionName,options,variants) applies variant objects specified as variants instead of using any active variants of the model.

example

fitResults = sbiofit(_,Name,Value) uses additional options specified by one or more Name,Value pair arguments.

example

[fitResults,simdata] = sbiofit(_) returns a vector of result objects fitResults and a vector of SimData objects simdata using any of the input arguments in the previous syntaxes.

    Note:   sbiofit unifies sbionlinfit and sbioparamestim estimation functions. Use sbiofit to perform nonlinear least-squares regression.

Examples

expand all

Estimate a Parameter from the Yeast G protein Model

This example uses the yeast heterotrimeric G protein model and experimental data reported by [1]. For details about the model, see the Background section in Parameter Scanning, Parameter Estimation, and Sensitivity Analysis in the Yeast Heterotrimeric G Protein Cycle.

Load the G protein model.

sbioloadproject gprotein

Store the experimental data containing the time course for the fraction of active G protein as reported in the reference paper [1].

time = [0 10 30 60 110 210 300 450 600]';
GaFracExpt = [0 0.35 0.4 0.36 0.39 0.33 0.24 0.17 0.2]';

Create a groupedData object based on the experimental data.

tbl = table(time,GaFracExpt);
grpData = groupedData(tbl);

Map the appropriate model component to the experimental data. In other words, indicate which species in the model corresponds to which response variable in the data. In this example, map the model parameter GaFrac to the experimental data variable GaFracExpt from grpData.

responseMap = 'GaFrac = GaFracExpt';

Use an estimatedInfo object to define the model parameter kGd as a parameter to be estimated.

estimatedParam = estimatedInfo('kGd');

Perform the parameter estimation.

fitResult = sbiofit(m1,grpData,responseMap,estimatedParam);

View the estimated parameter value of kGd.

fitResult.ParameterEstimates
ans = 

    Name     Estimate    StandardError
    _____    ________    _____________

    'kGd'    0.11        0.00037969   

Input Arguments

expand all

sm — SimBiology modelSimBiology model object

SimBiology model, specified as a SimBiology model object. The active configset object of the model contains solver settings for simulation. Any active doses and variants are applied to the model during simulation unless specified otherwise using the dosing and variants input arguments, respectively.

grpData — Data to fitgroupedData object

Data to fit, specified as a groupedData object.

The name of the time variable must be defined in the IndependentVariableName property of grpData. For instance, if the time variable name is 'TIME', then specify it as follows.

grpData.Properties.IndependentVariableName = 'TIME';

If the data contains more than one group of measurements, the grouping variable name must be defined in the GroupVariableName property of grpData. For example, if the grouping variable name is 'GROUP', then specify it as follows.

grpData.Properties.GroupVariableName = 'GROUP';

A group usually refers to a set of measurements that represent a single time course, often corresponding to a particular individual or experimental condition.

responseMap — Mapping information of model components to grpDatastring | cell array of strings

Mapping information of model components to grpData, specified as a string or cell array of strings.

Each string is an equation-like expression, similar to assignment rules in SimBiology. It contains the name (or qualified name) of a quantity (species, compartment, or parameter) in the model sm, followed by the character '=' and the name of a variable in grpData. For clarity, white spaces are allowed between names and '='.

For example, if you have the concentration data 'CONC' in grpData for a species 'Drug_Central', you can specify it as follows.

responseMap = 'Drug_Central = CONC';

To unambiguously name a species, use the qualified name, which includes the name of the compartment. To name a reaction-scoped parameter, use the reaction name to qualify the parameter. If the name is not a valid MATLAB® variable name, surround it by square brackets such as [reaction 1].[parameter 1].

An error is issued if any (qualified) name matches two components of the same type. However, you can use a (qualified) name that matches two components of different types, and the function first finds the species with the given name, followed by compartments and then parameters.

estiminfo — Estimated parametersestimatedInfo object | vector of estimatedInfo objects

Estimated parameters, specified as an estimatedInfo object or vector of estimatedInfo objects that defines the estimated parameters in the model sm and their initial estimates (optional). You can also specify parameter transformations if necessary. Supported transforms are log, logit, and probit. For details, see estimatedInfo object.

dosing — Dosing information[] | 2-D matrix of dose objects

Dosing information, specified as an empty array or 2-D matrix of dose objects (ScheduleDose object or RepeatDose object). If empty, no doses are applied during simulation, even if the model has active doses. If not empty, the matrix must have a single row or one row per group in the data grpData. If it has a single row, same doses are applied to all groups during simulation. If it has multiple rows, each row is applied to a separate group, in the same order as the groups appear in grpData.

Multiple columns are allowed so that you can apply multiple dose objects to each group. Each column of doses must reference the same components in the model sm. Specifically, all doses (except for empty doses) in a column must have the same values for TargetName, DurationParameterName, and LagParameterName. If some groups require more doses than others, then fill in the matrix with dummy doses that are either default doses or empty doses.

A default dose has default values for all properties, except for the Name property. An empty dose has a dose amount of 0, thus having no effect on the model. Create a default dose as follows.

d1 = sbiodose('d1');

In addition to manually constructing dose objects, if grpData has dosing information, you can use the createDoses method to construct doses.

functionName — Estimation function namestring

Estimation function name, specified as a string. Choices are as follows.

  • 'nlinfit' (Statistics Toolbox is required.)

  • 'fminunc' (Optimization Toolbox is required.)

  • 'fminsearch'

  • 'lsqcurvefit' (Optimization Toolbox is required.)

  • 'lsqnonlin' (Optimization Toolbox is required.)

  • 'patternserch' (Global Optimization Toolbox is required.)

  • 'ga' (Global Optimization Toolbox is required.)

  • 'pso' (Global Optimization Toolbox is required.)

If Statistics Toolbox is available, the function uses the nlinfit algorithm for estimation as the default. If not, but Optimization Toolbox is available, fminunc is used. If both are unavailable, fminsearch is used as the default.

options — Options specific to estimation functionstruct | optimoptions object

Options specific to the estimation function, specified as a struct or optimoptions object.

  • optimset struct for fminsearch

  • optimoptions object for lsqcurvefit, lsqnonlin, fminunc, and pso

  • gaoptimset struct for ga

  • psoptimset struct for patternsearch

  • statset struct for nlinfit

See Default Options for Estimation Algorithms for more details and default options associated with each estimation function.

variants — Variants[] | vector of variant objects

Variants, specified as an empty array or vector of variant objects. If empty, no variants are used even if the model has active variants.

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: 'ErrorModel','constant','UseParallel',true specifies a constant error model and to run simulations in parallel during parameter estimation.

'ErrorModel' — Error model'constant' (default) | 'proportional' | 'combined' | 'exponential'

Error model used for estimation, specified as a string. There are four built-in error models. Each model defines the error using a standard mean-zero and unit-variance (Gaussian) variable e, simulation results f, and one or two parameters a and b.

  • 'constant':

  • 'proportional':

  • 'combined':

  • 'exponential':

    Note:   If you specify an error model, you cannot specify weights except for the constant error model.

'Weights' — Weightsmatrix | function handle

Weights used for estimation, specified as a matrix of real positive weights, where the number of columns corresponds to the number of responses, or a function handle that accepts a vector of predicted response values and returns a vector of real positive weights.

If you specify an error model, you cannot use weights except for the constant error model. If neither the 'ErrorModel' or 'Weights' is specified, by default the function uses the constant error model with equal weights.

'Pooled' — Fit option flagfalse (default) | true

Fit option flag, specifying whether to fit each individual (false) or pool all individual data (true).

When true, the function performs fitting for all individuals or groups simultaneously using the same parameter estimates, and fitResults is a scalar results object.

When false, the function fits each group or individual separately using group- or individual-specific parameters, and fitResults is a vector of results objects with one result for each group.

'UseParallel' — Flag for parallel simulationsfalse (default) | true

Flag for parallel simulations during fitting, specified as true or false. If true and Parallel Computing Toolbox™ is available, the function runs simulations in parallel.

Output Arguments

expand all

fitResults — Estimation resultsOptimResults object | NLINResults object | Vector of results objects

Estimation results, returned as a OptimResults object or NLINResults object or a vector of these objects. Both results objects are subclasses of the LeastSquaresResults object.

If the function uses nlinfit, then fitResults is a NLINResults object. Otherwise, fitResults is an OptimResults object.

When 'Pooled' is set to false, the function fits each group separately using group-specific parameters, and fitResults is a vector of results objects with one results object for each group.

simdata — Simulation resultsVector of SimData objects

Simulation results, returned as a vector of SimData objects representing simulation results for each group or individual.

If the 'Pooled' option is false, then each simulation uses group-specific parameter values. If true, then all simulations use the same (population-wide) parameter values.

The states reported in simdata are the states that were included in the responseMap input argument as well as any other states listed in the StatesToLog property of the runtime options (RuntimeOptions) of the SimBiology model sm.

More About

expand all

Objective Functions

The function sbiofit uses maximum likelihood estimation (MLE) for fitting. Mathematically, the problem is typically formulated as the minimization of the negative of the logarithm of the likelihood function. For the constant error model, this is equivalent to the method of least-squares, which minimizes the sum of squares of the residuals. For other error models, the objective function has additional terms. When the error models are normally distributed, the negative log-likelihood can be written as follows.

Log likelihood
NNumber of experimental observations
yiThe ith experimental observation
Predicted value of the ith observation, which is a function of independent variables xi and estimated parameters p
Standard deviation of the ith observation. For instance, for the combined error model , the standard deviation is .

When you use numeric weights or the weight function, the weights are assumed to be proportional to the variance of the error, i.e., where a is the constant error parameter. If you use weights, you cannot specify an error model except the constant error model.

Various parameter estimation methods have different requirements on the objective function. For some methods, the estimation of model parameters is performed independently of the estimation of the error model parameters. The following table summarizes the objective function F and any separate formulas used for the estimation of error model parameters. ri is the ith residual, a and b are error model parameters, and fi is the predicted value of the ith observation.

MethodSupported Error ModelObjective FunctionError Parameter Estimation Function
lsqnonlin

lsqcurvefit

'constant'
lsqnonlin

lsqcurvefit

Numeric weights (wi)
lsqnonlin

lsqcurvefit

'exponential'
nlinfitAll nlinfit also estimates error parameters.
All others'constant'
All others'proportional'
All others'combined' Error parameters are included in minimization.
All others'exponential' , where is the standard deviation of the log-transformed ith observation.
All othersWeights

Default Options for Estimation Algorithms

FunctionDefault Options
nlinfit

sbiofit uses the default options structure associated with nlinfit, except for:

FunValCheck = 'off'
DerivStep = max(eps^(1/3), min(1e-4,solverOptions.RelativeTolerance)), where solverOptions property corresponds to the model's active configset object.
fminsearch

sbiofit uses the default options structure associated with fminsearch, except for:

Display = 'off'
TolFun = 1e-6 * f0, where f0 is the initial value of objective function.
lsqcurvefit

Requires Optimization Toolbox.

sbiofit uses the default options structure associated with lsqcurvefit, except for:

Display = 'off'
TolFun = 1e-6 * norm(f0), where f0 is the initial value of objective function.
FinDiffRelStep = max(eps^(1/3), min(1e-4,solverOptions.RelativeTolerance)) , where solverOptions property corresponds to the model's active configset object.
TypicalX = 1e-6 * x0, where x0 is an array of transformed initial estimates.
lsqnonlin

Requires Optimization Toolbox.

sbiofit uses the default options structure associated with lsqnonlin, except for:

Display = 'off'
TolFun = 1e-6 * norm(f0), where f0 is the initial value of objective function.
FinDiffRelStep = max(eps^(1/3), min(1e-4,solverOptions.RelativeTolerance)) , where solverOptions property corresponds to the model's active configset object.
TypicalX = 1e-6 * x0, where x0 is an array of transformed initial estimates.
patternsearch

Requires Global Optimization Toolbox.

sbioparamestim uses the default options structure associated with patternsearch, except for:

Display = 'off'
TolFun = 1e-6 * f0, where f0 is the initial value of the objective function.
TolMesh = 1.0e-3
Cache = 'on'
MeshAccel = 'on'
ga

Requires Global Optimization Toolbox.

sbioparamestim uses the default options structure associated with ga, except for:

Display = 'off'
TolFun = 1e-6 * f0, where f0 is the initial value of objective function.
MutationFcn = @mutationadaptfeasible
pso

Requires Global Optimization Toolbox.

sbioparamestim uses the following default options for the pso (particle swarm optimization) algorithm, except for:
Display = 'off'
TolFun = 1e-6 * f0, where f0 is the initial value of objective function.

See the Function Descriptions table from the sbioparamestim reference page for more details about this algorithm.

References

[1] Yi, T-M., Kitano, H., and Simon, M. (2003). A quantitative characterization of the yeast heterotrimeric G protein cycle. PNAS. 100, 10764–10769.

See Also

| | | | | | | | | | | |

Was this topic helpful?