templateGAM

Generalized additive model (GAM) learner template

Since R2023b

    Description

    t = templateGAM returns a generalized additive learner template suitable for training a classification or regression model.

    t = templateGAM(Name=Value) returns a template with additional options specified by one or more name-value arguments. For example, you can specify the number of trees per linear term or the number of trees per interaction term.

    If you specify the type of model by using the Type name-value argument, then the display of t in the Command Window shows all options as empty ([]), except those that you specify using name-value arguments. If you do not specify the type of model, then the display suppresses the empty options. During training, the software uses default values for empty options.

    Examples

    Create a template for a GAM classifier.

    t = templateGAM(Type="classification")
    t = 
    Fit template for classification GAM.
    
                               NumPrint: []
                              MaxPValue: []
          InitialLearnRateForPredictors: []
        InitialLearnRateForInteractions: []
                   NumTreesPerPredictor: []
                 NumTreesPerInteraction: []
               MaxNumSplitsPerPredictor: []
             MaxNumSplitsPerInteraction: []
                         VerbosityLevel: []
                           Interactions: []
                                Version: 1
                                 Method: 'GAM'
                                   Type: 'classification'
    
    

    t is a template object for a GAM learner. All properties of the template object are empty except Method and Type. When you pass t to a training function, the software sets the empty properties to their respective default values.
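
    A minimal sketch of passing GAM templates to a training function. It uses testckfold, which the Type argument description names as a function that accepts t. The ionosphere sample data set, the tree counts, and the variable names t1, t2, and h are illustrative assumptions, not part of this page.

```matlab
% Compare two GAM classifiers that differ only in the number of trees per predictor.
rng("default")                 % for reproducible cross-validation partitions
load ionosphere                % sample data: X (numeric predictors), Y (binary class labels)

% Type is left unset; per the Type description, testckfold sets it to "classification".
t1 = templateGAM(NumTreesPerPredictor=50);    % illustrative value
t2 = templateGAM(NumTreesPerPredictor=100);   % illustrative value

% Run the paired cross-validation test with the testckfold defaults.
% h = 1 rejects the null hypothesis that the two models have equal accuracy.
h = testckfold(t1,t2,X,X,Y);
```

    Both templates receive the same predictor data here, so the test isolates the effect of the template options.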

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Example: t=templateGAM(Type="regression") creates a GAM learner template for regression.

    GAM Classification and Regression Options

    InitialLearnRateForInteractions

    Initial learning rate of gradient boosting for interaction terms, specified as a numeric scalar in the interval (0,1].

    For each boosting iteration for interaction trees, the software starts fitting with the initial learning rate. The function halves the learning rate until it finds a rate that improves the model fit.

    Training a model using a small learning rate requires more learning iterations, but often achieves better accuracy.

    For more details about gradient boosting, see Gradient Boosting Algorithm.

    Example: InitialLearnRateForInteractions=0.1

    Data Types: single | double
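
    The halving rule described above can be sketched in plain MATLAB. This is a toy illustration, not the software's implementation: the loss function, the update vector, the minimum rate, and the reading of "improves the model fit" as "strictly lowers the loss" are all assumptions made for the example.

```matlab
% Toy illustration: start at the initial learning rate and halve it
% until the candidate update improves the fit.
target  = [1; 2; 3];                      % hypothetical responses
lossFcn = @(f) sum((f - target).^2);      % hypothetical stand-in for the deviance
f       = zeros(3,1);                     % current model predictions
update  = [4; 8; 12];                     % hypothetical tree output (overshoots the target)

learnRate   = 1;                          % plays the role of the initial learning rate
minRate     = 1e-6;                       % hypothetical safeguard against endless halving
currentLoss = lossFcn(f);

% Halve while the candidate step does not lower the loss.
while learnRate > minRate && lossFcn(f + learnRate*update) >= currentLoss
    learnRate = learnRate/2;
end

f = f + learnRate*update;                 % accept the step at the rate that improved the fit
```

    In this toy problem, rates 1 and 0.5 do not lower the loss, so the loop settles on 0.25.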

    InitialLearnRateForPredictors

    Initial learning rate of gradient boosting for linear terms, specified as a numeric scalar in the interval (0,1].

    For each boosting iteration for predictor trees, the software starts fitting with the initial learning rate. The function halves the learning rate until it finds a rate that improves the model fit.

    Training a model using a small learning rate requires more learning iterations, but often achieves better accuracy.

    For more details about gradient boosting, see Gradient Boosting Algorithm.

    Example: InitialLearnRateForPredictors=0.1

    Data Types: single | double

    Interactions

    Number or list of interaction terms to include in the candidate set S, specified as a nonnegative integer scalar, a logical matrix, or "all".

    • Number of interaction terms, specified as a nonnegative integer — S includes the specified number of important interaction terms, selected based on the p-values of the terms.

    • List of interaction terms, specified as a logical matrix — S includes the terms specified by a t-by-p logical matrix, where t is the number of interaction terms, and p is the number of predictors used to train the model. For example, logical([1 1 0; 0 1 1]) represents two pairs of interaction terms: a pair of the first and second predictors, and a pair of the second and third predictors.

      If the software uses a subset of input variables as predictors, then the function indexes the predictors using only the subset. That is, the column indexes of the logical matrix do not count the response and observation weight variables. The indexes also do not count any variables not used by the function.

    • "all" — S includes all possible pairs of interaction terms, which is p*(p – 1)/2 terms in total.

    Among the interaction terms in S, the software identifies those whose p-values are not greater than the MaxPValue value and uses them to build a set of interaction trees. Use the default value (MaxPValue=1) to build interaction trees using all terms in S.

    Example: Interactions="all"

    Data Types: single | double | logical | char | string
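
    A short sketch of building the logical-matrix form of Interactions from a list of predictor index pairs. The number of predictors, the chosen pairs, and the variable names are hypothetical.

```matlab
% Build a t-by-p logical matrix in which each row marks one pair of predictors.
p     = 3;                 % hypothetical number of predictors
pairs = [1 2; 2 3];        % hypothetical pairs: (x1,x2) and (x2,x3)

interactions = false(size(pairs,1), p);
for i = 1:size(pairs,1)
    interactions(i, pairs(i,:)) = true;   % mark both predictors of pair i
end

% Number of terms that Interactions="all" would place in S: p*(p - 1)/2.
numAllPairs = nchoosek(p,2);

% Pass the matrix to the template.
t = templateGAM(Interactions=interactions);
```

    The resulting matrix matches the logical([1 1 0; 0 1 1]) example given in the list above.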

    MaxNumSplitsPerInteraction

    Maximum number of decision splits (or branch nodes) per interaction tree (boosted tree for an interaction term), specified as a positive integer scalar.

    Example: MaxNumSplitsPerInteraction=5

    Data Types: single | double

    MaxNumSplitsPerPredictor

    Maximum number of decision splits (or branch nodes) per predictor tree (boosted tree for a linear term), specified as a positive integer scalar. By default, the software uses a tree stump for a predictor tree.

    Example: MaxNumSplitsPerPredictor=5

    Data Types: single | double

    MaxPValue

    Maximum p-value for detecting interaction terms, specified as a numeric scalar in the interval [0,1].

    The software first finds the candidate set S of interaction terms from Interactions. Then the function identifies the interaction terms whose p-values are not greater than the MaxPValue value and uses them to build a set of interaction trees.

    The default value (MaxPValue=1) builds interaction trees for all interaction terms in the candidate set S.

    For more details about detecting interaction terms, see Interaction Term Detection.

    Example: MaxPValue=0.05

    Data Types: single | double
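
    The selection rule can be illustrated with made-up numbers. In the sketch below, the candidate pairs and their p-values are hypothetical; during training the software computes the p-values itself.

```matlab
% Keep only the candidate interaction terms whose p-values do not exceed MaxPValue.
candidatePairs = [1 2; 1 3; 2 3];     % hypothetical candidate set S (predictor index pairs)
pValues        = [0.01; 0.20; 0.04];  % hypothetical p-values, one per candidate pair
maxPValue      = 0.05;                % plays the role of MaxPValue

keep          = pValues <= maxPValue; % "not greater than" the threshold
selectedPairs = candidatePairs(keep,:);
```

    With maxPValue set to 1, every row of candidatePairs would be kept, which corresponds to the default behavior.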

    NumTreesPerInteraction

    Number of trees per interaction term, specified as a positive integer scalar.

    The NumTreesPerInteraction value is equivalent to the number of gradient boosting iterations for the interaction terms for predictors. For each iteration, the software adds a set of interaction trees to the model, one tree for each interaction term. To learn about the gradient boosting algorithm, see Gradient Boosting Algorithm.

    Example: NumTreesPerInteraction=500

    Data Types: single | double

    NumTreesPerPredictor

    Number of trees per linear term, specified as a positive integer scalar.

    The NumTreesPerPredictor value is equivalent to the number of gradient boosting iterations for the linear terms for predictors. For each iteration, the software adds a set of predictor trees to the model, one tree for each predictor. To learn about the gradient boosting algorithm, see Gradient Boosting Algorithm.

    Example: NumTreesPerPredictor=500

    Data Types: single | double
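
    Because each boosting iteration adds one tree per term, the tree counts scale with the number of terms. The sketch below works through the arithmetic for hypothetical sizes. The totals are treated as upper bounds, on the assumption that training might stop before the requested number of iterations.

```matlab
% Upper bounds on the number of trees implied by the per-term settings.
numPredictors          = 4;     % hypothetical number of predictors
numInteractionTerms    = 2;     % hypothetical number of interaction terms in the model
numTreesPerPredictor   = 500;   % illustrative value
numTreesPerInteraction = 100;   % illustrative value

maxPredictorTrees   = numPredictors*numTreesPerPredictor;          % one tree per predictor per iteration
maxInteractionTrees = numInteractionTerms*numTreesPerInteraction;  % one tree per interaction term per iteration

% Template that requests these settings.
t = templateGAM(NumTreesPerPredictor=numTreesPerPredictor, ...
    NumTreesPerInteraction=numTreesPerInteraction, ...
    Interactions=numInteractionTerms);
```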

    Type

    GAM model type, specified as "classification" or "regression".

    Value               Description
    "classification"    Create a classification GAM learner template. If you do not specify Type as "classification", the fitting function testckfold sets this value when you pass t to the function.
    "regression"        Create a regression GAM learner template. If you do not specify Type as "regression", the fitting function directforecaster sets this value when you pass t to the function.

    Example: Type="classification"

    Data Types: char | string

    Other Classification and Regression Options

    NumPrint

    Number of iterations between diagnostic message printouts, specified as a nonnegative integer scalar. This argument is valid only when you specify Verbose as 1.

    If you specify Verbose=1 and NumPrint=numPrint, then the software displays diagnostic messages every numPrint iterations in the Command Window.

    Example: NumPrint=500

    Data Types: single | double

    Verbose

    Verbosity level, specified as 0, 1, or 2. The Verbose value controls the amount of diagnostic information that the software displays in the Command Window.

    Value    Description
    0        The software displays no information.
    1        The software displays diagnostic messages every numPrint iterations, where numPrint is the NumPrint value.
    2        The software displays diagnostic messages at every iteration.

    Each line of the diagnostic messages shows the information about each boosting iteration and includes the following columns:

    • Type — Type of trained trees, 1D (predictor trees, or boosted trees for linear terms for predictors) or 2D (interaction trees, or boosted trees for interaction terms for predictors)

    • NumTrees — Number of trees per linear term or interaction term that the training function has added to the model so far

    • Deviance — Deviance of the model

    • RelTol — Relative change of model predictions: $(\hat{y}_k - \hat{y}_{k-1})'(\hat{y}_k - \hat{y}_{k-1}) / (\hat{y}_k'\hat{y}_k)$, where $\hat{y}_k$ is a column vector of model predictions at iteration k

    • LearnRate — Learning rate used for the current iteration

    Example: Verbose=1

    Data Types: single | double
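
    As a worked example of the RelTol column, the sketch below reads the expression as the squared norm of the change in predictions divided by the squared norm of the current predictions. That reading and the prediction vectors are assumptions made for illustration.

```matlab
% Relative change of model predictions between two consecutive iterations.
yPrev = [1; 2; 2];          % hypothetical predictions at iteration k-1
yCurr = [1; 2; 3];          % hypothetical predictions at iteration k

d      = yCurr - yPrev;                 % change in predictions
relTol = (d'*d)/(yCurr'*yCurr);         % squared change relative to squared current predictions
```

    Here only the third prediction changes, by 1, so the numerator is 1 and the denominator is 1 + 4 + 9 = 14.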

    Output Arguments

    t

    GAM learner template suitable for training GAM classification or regression models, returned as a template object. During training, the software uses default values for empty options.

    More About

    Algorithms

    Version History

    Introduced in R2023b