
boxcox

Box-Cox transformation

boxcox has been partially removed and will no longer accept a fints object (tsobj).

Replace any fints object (tsobj) input with an array: use fts2timetable to convert the fints object to a timetable, then use timetable2table and table2array to extract the data as an array.
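The conversion chain can be sketched as follows. This is a minimal illustration, not the documented migration procedure: the timetable tt is a hypothetical stand-in for the output of fts2timetable(tsobj), and the series names and values are invented. The 'ConvertRowTimes' option is set to false on the assumption that the row times should be dropped so that table2array receives only numeric variables.

```matlab
% Hypothetical stand-in for the result of fts2timetable(tsobj).
dates = datetime(2024, 1, 1) + caldays(0:4)';
Open  = [10.2; 10.5; 10.1; 10.8; 11.0];
Close = [10.4; 10.3; 10.6; 10.9; 11.2];
tt = timetable(dates, Open, Close);

% Drop the row times so that only the numeric variables remain.
tbl = timetable2table(tt, 'ConvertRowTimes', false);
A = table2array(tbl);      % numeric array, one column per series

% boxcox expects one positive column vector, so pick a single series.
closeData = A(:, 2);
```

The column closeData can then be passed to boxcox in place of the fints object, for example [transdat,lambda] = boxcox(closeData).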

Syntax

[transdat,lambda] = boxcox(data)
[transfts,lambda] = boxcox(tsobj)
transdat = boxcox(lambda,data)
transfts = boxcox(lambda,tsobj)

Arguments

 data    Data vector. Must be positive and specified as a column data vector.
 tsobj   Financial time series object.

Description

boxcox transforms nonnormally distributed data to a set of data that has an approximately normal distribution. The Box-Cox transformation is a family of power transformations.

If λ ≠ 0, then

$data\left(\lambda \right)=\frac{{data}^{\lambda }-1}{\lambda }$

If λ = 0, then

$data\left(\lambda \right)=\mathrm{log}\left(data\right)$

The logarithm is the natural logarithm (log base e). The algorithm calls for finding the λ value that maximizes the Log-Likelihood Function (LLF). The search is conducted using fminsearch.
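The estimation step can be illustrated with a minimal sketch. It assumes the standard Box-Cox profile log-likelihood, $LLF\left(\lambda \right)=-\frac{n}{2}\mathrm{log}\left({\hat{\sigma }}^{2}\left(\lambda \right)\right)+\left(\lambda -1\right)\sum \mathrm{log}\left(data\right)$, where ${\hat{\sigma }}^{2}\left(\lambda \right)$ is the variance of the transformed data normalized by n. This page does not state the exact form that boxcox uses internally, so treat the code as an illustration rather than the toolbox implementation. The sample x and the helper names safe, bc, and negLLF are hypothetical.

```matlab
% Hypothetical positive sample; not taken from whirlpool.dat.
x = [0.8; 1.1; 1.5; 2.3; 3.9; 7.2; 12.6; 21.4];
n = numel(x);
logx = log(x);

% Nudge an exact zero to eps so the power form never evaluates 0/0.
% expm1(lam*log(x))/lam equals (x.^lam - 1)/lam, and at lam = eps it
% agrees with log(x) to within floating-point precision.
safe = @(lam) lam + (lam == 0) * eps;
bc = @(lam) expm1(safe(lam) .* logx) ./ safe(lam);

% Negative profile log-likelihood (assumed form, see the text above).
% var(..., 1) normalizes by n rather than n - 1.
negLLF = @(lam) (n/2) * log(var(bc(lam), 1)) - (lam - 1) * sum(logx);

% Minimizing the negative LLF maximizes the LLF; start from lambda = 1,
% which corresponds to a shift of the data with no change in shape.
lambdaHat = fminsearch(negLLF, 1);
transdat = bc(lambdaHat);
```

Because fminsearch minimizes, the sketch negates the log-likelihood. The starting point of 1 is an arbitrary choice for this example.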

[transdat,lambda] = boxcox(data) transforms the data vector data using the Box-Cox transformation method into transdat. It also estimates the transformation parameter λ.

[transfts,lambda] = boxcox(tsobj) transforms the financial time series object tsobj using the Box-Cox transformation method into transfts. It also estimates the transformation parameter λ.

If the input data is a vector, lambda is a scalar. If the input is a financial time series object, lambda is a structure with fields similar to the components of the object; for example, if the object contains series names Open and Close, lambda has fields lambda.Open and lambda.Close.
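Because boxcox no longer accepts a fints object, a per-series structure of λ values can be rebuilt by looping over the variables of a timetable. The following is a sketch under stated assumptions: Financial Toolbox is installed, the timetable tt and its values are hypothetical, and the names transtt, transdat, and lam are chosen only for illustration.

```matlab
% Hypothetical timetable standing in for converted fints data.
dates = datetime(2024, 1, 1) + caldays(0:5)';
Open  = [10.2; 10.5; 10.1; 10.8; 11.0; 11.6];
Close = [10.4; 10.3; 10.6; 10.9; 11.2; 11.5];
tt = timetable(dates, Open, Close);

names = tt.Properties.VariableNames;
lambda = struct();
transtt = tt;                          % same row times, values replaced below
for k = 1:numel(names)
    % boxcox (Financial Toolbox) takes one positive column vector at a time.
    [transdat, lam] = boxcox(tt.(names{k}));
    transtt.(names{k}) = transdat;     % transformed series
    lambda.(names{k}) = lam;           % estimated lambda for this series
end
```

After the loop, lambda.Open and lambda.Close hold one scalar each, mirroring the structure described for fints input.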

transdat = boxcox(lambda, data) and transfts = boxcox(lambda, tsobj) transform the data using the specified λ for the Box-Cox transformation. This syntax does not find the optimum λ that maximizes the LLF.
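As a brief illustration of the fixed-λ syntax, the sketch below applies a user-chosen λ and compares the result with the formulas in the Description. It assumes Financial Toolbox is installed. The data values and the names expectedPower and expectedLog are hypothetical.

```matlab
data = [0.5; 1; 2; 4; 8];              % hypothetical positive column vector

% Nonzero lambda: power form of the transformation.
lambda = 0.5;                          % chosen by the user, not estimated
transdat = boxcox(lambda, data);       % requires Financial Toolbox
expectedPower = (data.^lambda - 1) ./ lambda;

% Zero lambda: natural logarithm.
transdatLog = boxcox(0, data);
expectedLog = log(data);
```

If the function follows the formulas in the Description, transdat should agree with expectedPower and transdatLog with expectedLog up to floating-point rounding.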

Examples


Use boxcox to transform the data series contained in a financial time series object into another set of data series with relatively normal distributions.

Create a financial time series object from the supplied whirlpool.dat data file.

whrl = ascii2fts('whirlpool.dat', 1, 2, []);
Warning: FINTS will be removed in a future release. Use TIMETABLE instead. For more information, see Convert Financial Time Series Objects (fints) to Timetables.

Fill any missing values, denoted by NaN, in whrl with values calculated using the linear method.

f_whrl = fillts(whrl);

Transform the nonnormally distributed filled data series f_whrl into an approximately normally distributed one using the Box-Cox transformation.

bc_whrl = boxcox(f_whrl);

Compare the result of the Close data series with a normal (Gaussian) probability distribution function and the nonnormally distributed f_whrl.

subplot(2, 1, 1);
hist(f_whrl.Close);
grid; title('Nonnormally Distributed Data');
subplot(2, 1, 2);
hist(bc_whrl.Close);
grid; title('Box-Cox Transformed Data');

The histogram on the top shows the distribution of the filled data series, f_whrl, which is the original data series whrl with the missing values interpolated using the linear method. The distribution is skewed toward the left (not normally distributed). The histogram on the bottom is less skewed to the left. Compared with a Gaussian probability distribution function (PDF) that has a similar mean and standard deviation, the distribution of the transformed data is very close to normal (Gaussian). When you examine the contents of the resulting object bc_whrl, you find that it has the same structure as the original object whrl, but its contents are the transformed data series.