ecmmvnrmle
Multivariate normal regression with missing data
Syntax
Description
Examples
Compute Multivariate Normal Regression With Missing Data
This example shows how to estimate a multivariate normal regression model with missing data.
First, load dates, total returns, and ticker symbols for the twelve stocks from the MAT-file.
load CAPMuniverse
whos Assets Data Dates
  Name        Size             Bytes  Class     Attributes

  Assets      1x14              1792  cell
  Data        1471x14         164752  double
  Dates       1471x1           11768  double
Dates = datetime(Dates,'ConvertFrom','datenum');
The assets in the model have the following symbols, where the last two series are proxies for the market and the riskless asset.
Assets(1:14)
ans = 1x14 cell
{'AAPL'} {'AMZN'} {'CSCO'} {'DELL'} {'EBAY'} {'GOOG'} {'HPQ'} {'IBM'} {'INTC'} {'MSFT'} {'ORCL'} {'YHOO'} {'MARKET'} {'CASH'}
The data covers the period from January 1, 2000 to November 7, 2005 with daily total returns. Two stocks in this universe have missing values that are represented by NaNs. One of the two stocks had an IPO during this period and, consequently, has significantly less data than the other stocks.
Compute separate regressions for each stock, where the stocks with missing data have estimates that reflect their reduced observability.
[NumSamples, NumSeries] = size(Data);
NumAssets = NumSeries - 2;

StartDate = Dates(1);
EndDate = Dates(end);

% Preallocate one slot per asset.
Alpha = NaN(1, NumAssets);
Beta = NaN(1, NumAssets);
Sigma = NaN(1, NumAssets);
StdAlpha = NaN(1, NumAssets);
StdBeta = NaN(1, NumAssets);
StdSigma = NaN(1, NumAssets);

for i = 1:NumAssets
    % Set up separate asset data and design matrices
    TestData = zeros(NumSamples,1);
    TestDesign = zeros(NumSamples,2);

    TestData(:) = Data(:,i) - Data(:,14);
    TestDesign(:,1) = 1.0;
    TestDesign(:,2) = Data(:,13) - Data(:,14);

    % Estimate the multivariate normal regression for each asset separately.
    % No trailing semicolon, so Param and Covar are displayed for each asset.
    [Param, Covar] = ecmmvnrmle(TestData, TestDesign)
end
Param = 2×1
0.0012
1.2294
Covar = 0.0010
Param = 2×1
0.0006
1.3661
Covar = 0.0020
Param = 2×1
-0.0002
1.5653
Covar = 8.8911e-04
Param = 2×1
-0.0000
1.2594
Covar = 6.4996e-04
Param = 2×1
0.0014
1.3441
Covar = 0.0014
Param = 2×1
0.0046
0.3742
Covar = 6.3272e-04
Param = 2×1
0.0001
1.3745
Covar = 6.5040e-04
Param = 2×1
-0.0000
1.0807
Covar = 2.8562e-04
Param = 2×1
0.0001
1.6002
Covar = 6.9146e-04
Param = 2×1
-0.0002
1.1765
Covar = 3.7138e-04
Param = 2×1
0.0000
1.5010
Covar = 0.0010
Param = 2×1
0.0001
1.6543
Covar = 0.0015
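The loop in this example displays each estimate but does not store it. The following is a minimal sketch of collecting the estimates per asset, assuming the CAPMuniverse variables are available as shown above. Treating Sigma as the residual standard deviation (the square root of Covar) is an assumption made for illustration; the example text does not define it.

```matlab
load CAPMuniverse                        % provides Assets, Data, Dates
[NumSamples, NumSeries] = size(Data);
NumAssets = NumSeries - 2;               % last two columns are MARKET and CASH

Alpha = NaN(1, NumAssets);
Beta  = NaN(1, NumAssets);
Sigma = NaN(1, NumAssets);

% The design is the same for every asset: intercept plus excess market return.
Design = [ones(NumSamples, 1), Data(:,13) - Data(:,14)];

for i = 1:NumAssets
    ExcessReturn = Data(:,i) - Data(:,14);        % asset return minus cash return
    [Param, Covar] = ecmmvnrmle(ExcessReturn, Design);
    Alpha(i) = Param(1);
    Beta(i)  = Param(2);
    Sigma(i) = sqrt(Covar);                       % assumed meaning: residual standard deviation
end

% Print one line per asset.
for i = 1:NumAssets
    fprintf('%-6s alpha = %8.4f  beta = %7.4f  sigma = %7.4f\n', ...
        Assets{i}, Alpha(i), Beta(i), Sigma(i));
end
```

Storing the results this way keeps the estimates for all twelve stocks available after the loop finishes, instead of only the values for the last asset.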
Input Arguments
Data
— Data
matrix
Data, specified as a NUMSAMPLES-by-NUMSERIES matrix with NUMSAMPLES samples of a NUMSERIES-dimensional random vector. Missing values are indicated by NaNs. Only samples that are entirely NaNs are ignored. (To ignore samples with at least one NaN, use mvnrmle.)
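The difference between the two row-handling rules can be seen with a pair of logical masks. This sketch uses a small made-up matrix and does not call either estimation function; the variable names are illustrative.

```matlab
% Four samples of a two-dimensional series; NaN marks a missing value.
Data = [0.01 0.02;
        NaN  0.03;
        NaN  NaN;
        0.04 NaN];

% Rows that are entirely NaN: the samples ecmmvnrmle ignores.
AllMissing = all(isnan(Data), 2);

% Rows with at least one NaN: the samples mvnrmle ignores.
AnyMissing = any(isnan(Data), 2);

fprintf('Samples usable with partial data: %d\n', sum(~AllMissing));
fprintf('Samples with complete data only:  %d\n', sum(~AnyMissing));
```

Here three of the four samples still contribute when partially observed rows are kept, but only one sample is complete.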
Data Types: double
Design
— Design model
matrix | cell array
Design model, specified as a matrix or a cell array that handles two model structures:
- If NUMSERIES = 1, Design is a NUMSAMPLES-by-NUMPARAMS matrix with known values. This structure is the standard form for regression on a single series.
- If NUMSERIES ≥ 1, Design is a cell array. The cell array contains either one or NUMSAMPLES cells. Each cell contains a NUMSERIES-by-NUMPARAMS matrix of known values.
If Design has a single cell, the same Design matrix is assumed for every sample. If Design has more than one cell, each cell contains the Design matrix for the corresponding sample.
Data Types: double | cell
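The two Design structures can be sketched as follows. The regressor x, the choice of two series, and the per-series intercept-and-slope layout are all assumptions chosen for illustration; only the required shapes come from the description above.

```matlab
rng(0);
NumSamples = 5;
x = randn(NumSamples, 1);                 % hypothetical regressor

% Single series: a NUMSAMPLES-by-NUMPARAMS matrix (here NUMPARAMS = 2).
DesignMatrix = [ones(NumSamples, 1), x];

% Two series, each with its own intercept and slope (NUMPARAMS = 4):
% one NUMSERIES-by-NUMPARAMS matrix per sample.
NumSeries = 2;
DesignCell = cell(NumSamples, 1);
for t = 1:NumSamples
    DesignCell{t} = kron(eye(NumSeries), [1, x(t)]);   % [1 x 0 0; 0 0 1 x]
end

% Same design for every sample: a single cell is enough.
% With an identity matrix, the parameters are one mean per series.
DesignShared = {eye(NumSeries)};
```

The block-diagonal layout keeps the parameters of each series separate; a design that shares parameters across series would place nonzero entries in the same column for more than one row.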
MaxIterations
— Maximum number of iterations for the estimation algorithm
100 (default) | numeric
(Optional) Maximum number of iterations for the estimation algorithm, specified as a numeric value.
Data Types: double
TolParam
— Convergence tolerance for estimation algorithm based on changes in model parameter estimates
1.0e-8 (default) | numeric
(Optional) Convergence tolerance for the estimation algorithm based on changes in model parameter estimates, specified as a numeric value. The convergence test for changes in model parameters is
$$\| \mathrm{Param}_k - \mathrm{Param}_{k-1} \| < \mathrm{TolParam} \times \left( 1 + \| \mathrm{Param}_k \| \right)$$
where Param_k is the estimate of the output Param at iteration k = 2, 3, ... . Convergence is assumed when both the TolParam and TolObj conditions are satisfied. If both TolParam ≤ 0 and TolObj ≤ 0, the algorithm runs for the maximum number of iterations (MaxIterations), regardless of the results of the convergence tests.
Data Types: double
TolObj
— Convergence tolerance for estimation algorithm based on changes in objective function
1.0e-12 (default) | numeric
(Optional) Convergence tolerance for the estimation algorithm based on changes in the objective function, specified as a numeric value. The convergence test for changes in the objective function is
$$| \mathrm{Obj}_k - \mathrm{Obj}_{k-1} | < \mathrm{TolObj} \times \left( 1 + | \mathrm{Obj}_k | \right)$$
for iteration k = 2, 3, ... . Convergence is assumed when both the TolParam and TolObj conditions are satisfied. If both TolParam ≤ 0 and TolObj ≤ 0, the algorithm runs for the maximum number of iterations (MaxIterations), regardless of the results of the convergence tests.
Data Types: double
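The way the two tolerances combine can be sketched with a pair of anonymous functions. The relative-change form of each test used here (change smaller than the tolerance times one plus the current magnitude) is an assumption about how the function applies the tolerances, and the numeric values are made up.

```matlab
% Assumed form of the two convergence tests; not taken from the function's source.
ParamTest = @(Pk, Pprev, Tol) norm(Pk - Pprev) < Tol * (1 + norm(Pk));
ObjTest   = @(Ok, Oprev, Tol) abs(Ok - Oprev)  < Tol * (1 + abs(Ok));

TolParam = 1.0e-8;                     % documented default
TolObj   = 1.0e-12;                    % documented default

Pprev = [0.0012; 1.2294];
Pk    = Pprev + 1e-10;                 % parameters have almost stopped moving
Oprev = -1234.5;
Ok    = Oprev + 1e-6;                  % objective is still changing

ParamOK = ParamTest(Pk, Pprev, TolParam);
ObjOK   = ObjTest(Ok, Oprev, TolObj);

% Convergence requires both conditions at the same iteration.
Converged = ParamOK && ObjOK;
```

In this made-up iteration the parameter test passes but the objective test does not, so the algorithm would continue. Under the assumed form, a tolerance of zero or less can never be satisfied, which is consistent with the algorithm running for MaxIterations iterations when both tolerances are nonpositive.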
Param0
— Estimate for the parameters of regression model
[] (default) | vector
(Optional) Estimate for the parameters of the regression model, specified as a NUMPARAMS-by-1 column vector.
Data Types: double
Covar0
— Estimate for the covariance matrix of regression residuals
[] (default) | matrix
(Optional) Estimate for the covariance matrix of the regression residuals, specified as a NUMSERIES-by-NUMSERIES matrix.
Data Types: double
CovarFormat
— Format for the covariance matrix
'full' (default) | character vector
(Optional) Format for the covariance matrix, specified as a character vector. The choices are:
- 'full': Compute the full covariance matrix.
- 'diagonal': Force the covariance matrix to be a diagonal matrix.
Data Types: char
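A call that supplies every optional argument by position might look like the sketch below. The data are synthetic, the argument order follows the order in which the inputs are listed on this page (an assumption, since the syntax line is not shown), and the empty arrays for Param0 and Covar0 are their documented defaults.

```matlab
rng(1);
NumSamples = 200;
Data = randn(NumSamples, 2);             % synthetic two-series data
Data(1:20, 2) = NaN;                     % second series is missing early on

Design = {eye(2)};                       % shared design: one mean per series

MaxIterations = 100;                     % documented default
TolParam = 1.0e-8;                       % documented default
TolObj   = 1.0e-12;                      % documented default

% Positional order assumed from the input argument list on this page.
[Param, Covar] = ecmmvnrmle(Data, Design, MaxIterations, TolParam, TolObj, ...
    [], [], 'diagonal');

disp(Param)
disp(Covar)
```

With 'diagonal', the off-diagonal elements of Covar are expected to be zero, so the two series are treated as uncorrelated in the fitted model.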
Output Arguments
Param
— Estimates for parameters of the regression model
vector
Estimates for the parameters of the regression model, returned as a NUMPARAMS-by-1 column vector.
Covar
— Estimates for the covariance of regression model's residuals
matrix
Estimates for the covariance of the regression model's residuals, returned as a NUMSERIES-by-NUMSERIES matrix.
Resid
— Residuals from regression
matrix
Residuals from the regression, returned as a NUMSAMPLES-by-NUMSERIES matrix.
For any missing values in Data, the corresponding residual is the difference between the conditionally imputed value for Data and the model, that is, the imputed residual.
Note
The covariance estimate Covar cannot be derived from the residuals.
Info
— Additional information from regression
structure
Additional information from the regression, returned as a structure. The structure has these fields:
- Info.Obj: A variable-extent column vector, with no more than MaxIterations elements, that contains the value of the objective function at each iteration of the estimation algorithm. The last value in this vector, Obj(end), is the terminal estimate of the objective function. If you do maximum likelihood estimation, the objective function is the log-likelihood function.
- Info.PrevParameters: NUMPARAMS-by-1 column vector of estimates for the model parameters from the iteration just prior to the terminal iteration.
- Info.PrevCovariance: NUMSERIES-by-NUMSERIES matrix of estimates for the covariance parameters from the iteration just prior to the terminal iteration.
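The Info fields can be used to check how the estimation ended. This is a minimal sketch on synthetic data; the step-size variables are illustrative names, not part of the function's output.

```matlab
rng(2);
Data = randn(100, 2);                    % synthetic two-series data
Data(1:10, 1) = NaN;                     % first series is missing early on

[Param, Covar, Resid, Info] = ecmmvnrmle(Data, {eye(2)});

NumIter  = numel(Info.Obj);              % iterations actually performed
FinalObj = Info.Obj(end);                % terminal objective value

% Change between the last two iterations, from the Prev* fields.
ParamStep = norm(Param - Info.PrevParameters);
CovarStep = norm(Covar - Info.PrevCovariance, 'fro');

fprintf('Iterations: %d, final objective: %.6f\n', NumIter, FinalObj);
fprintf('Last parameter step: %.3g, last covariance step: %.3g\n', ...
    ParamStep, CovarStep);
```

If NumIter equals MaxIterations and the last steps are not small, the estimates may not have converged, and a larger MaxIterations or looser tolerances may be worth trying.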
Version History
Introduced in R2006a