Distribution GUIs

Introduction

This section describes Statistics Toolbox™ GUIs that provide convenient, interactive access to the distribution functions described in Distribution Functions.

Distribution Function Tool

To interactively see the influence of parameter changes on the shapes of the pdfs and cdfs of supported Statistics Toolbox distributions, use the Probability Distribution Function Tool.

Run the tool by typing disttool at the command line. You can also run it from the Demos tab in the Help browser.

Start by selecting a distribution. Then choose the function type: probability density function (pdf) or cumulative distribution function (cdf).
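A rough command-line counterpart of what the tool displays can be sketched with the toolbox functions normpdf and normcdf. This is an illustration only, not the tool's own code; the grid x and the parameter values below are assumptions chosen for the example.

```matlab
% Grid of x values (an arbitrary choice for this sketch).
x = linspace(-5, 5, 201);

% pdf of a normal distribution for two standard deviations.
pdfStd  = normpdf(x, 0, 1);
pdfWide = normpdf(x, 0, 2);

% cdf for the same two parameter settings.
cdfStd  = normcdf(x, 0, 1);
cdfWide = normcdf(x, 0, 2);

% Upper panel: pdfs. Lower panel: cdfs.
subplot(2, 1, 1);
plot(x, pdfStd, '-', x, pdfWide, '--');
legend('sigma = 1', 'sigma = 2');
title('Normal pdf');

subplot(2, 1, 2);
plot(x, cdfStd, '-', x, cdfWide, '--');
legend('sigma = 1', 'sigma = 2');
title('Normal cdf');
```

Increasing the standard deviation lowers and widens the pdf and flattens the cdf, which is the kind of effect the tool lets you explore interactively.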

Once the plot displays, you can

Distribution Fitting Tool

The Distribution Fitting Tool is a GUI for fitting univariate distributions to data. This section describes how to use the Distribution Fitting Tool.

Main Window of the Distribution Fitting Tool

To open the Distribution Fitting Tool, enter the command

dfittool

The following figure shows the main window of the Distribution Fitting Tool.

Plot Buttons.   Buttons at the top of the tool allow you to adjust the plot displayed in the main window:

Display Type.   The Display Type field specifies the type of plot displayed in the main window. Each type corresponds to a probability function, for example, a probability density function. The following display types are available:

Task Buttons.   The task buttons enable you to perform the tasks necessary to fit distributions to data. Each button opens a new window in which you perform the task. The buttons include

Display Pane.   The display pane displays plots of the data sets and fits you create. Whenever you make changes in one of the task windows, the results are updated in the display pane.

Menu Options.   The Distribution Fitting Tool menus contain items that enable you to do the following:

Example: Fitting a Distribution

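This section presents an example that illustrates how to use the Distribution Fitting Tool. The example involves the following steps:

  1. Create random data for the example.

  2. Import the data into the Distribution Fitting Tool.

  3. Create a new fit.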

Create Random Data for the Example.   To try the example, first generate some random data to which you will fit a distribution. The following command generates a vector, data, of length 100, whose entries are random numbers drawn from a normal distribution with mean 0.36 and standard deviation 1.4.

data = normrnd(.36, 1.4, 100, 1);

Import Data into the Distribution Fitting Tool.   To import the vector data into the Distribution Fitting Tool, click the Data button in the main window. This opens the window shown in the following figure.

The Data field displays all numeric arrays in the MATLAB® workspace. Select data from the drop-down list, as shown in the following figure.

This displays a histogram of the data in the Data preview pane.

In the Data set name field, type a name for the data set, such as My data, and click Create Data Set to create the data set. The main window of the Distribution Fitting Tool now displays a larger version of the histogram shown in the Data preview pane, as illustrated in the following figure.

Create a New Fit.   To fit a distribution to the data, click New Fit in the main window of the Distribution Fitting Tool. This opens the window shown in the following figure.

Normal is the default entry in the Distribution field. To fit a normal distribution to My data:

  1. Enter a name for the fit, such as My fit, in the Fit name field.

  2. Select My data from the drop-down list in the Data field.

  3. Click Apply.

The Results pane displays the mean and standard deviation of the normal distribution that best fits My data, as shown in the following figure.

The main window of the Distribution Fitting Tool displays a plot of the normal distribution with this mean and standard deviation, as shown in the following figure.
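The same kind of estimate can be obtained at the command line. The following is a minimal sketch that uses the toolbox function normfit on freshly generated data; because the data are random, the numbers will differ from run to run and from those shown in the tool.

```matlab
% Simulated sample, generated as in the example above.
data = normrnd(0.36, 1.4, 100, 1);

% Estimated mean and standard deviation, with 95% confidence intervals
% (normfit uses alpha = 0.05 by default).
[muhat, sigmahat, muci, sigmaci] = normfit(data);

fprintf('mean: %.3f  [%.3f, %.3f]\n', muhat, muci(1), muci(2));
fprintf('std:  %.3f  [%.3f, %.3f]\n', sigmahat, sigmaci(1), sigmaci(2));
```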

Creating and Managing Data Sets

This section describes how to create and manage data sets.

To begin, click the Data button in the main window of the Distribution Fitting Tool to open the Data window shown in the following figure.

Importing Data.   The Import workspace vectors pane enables you to create a data set by importing a vector from the MATLAB workspace. The following sections describe the fields of the Import workspace vectors pane and give appropriate values for vectors imported from the MATLAB workspace.

Data

The drop-down list in the Data field contains the names of all matrices and vectors in the MATLAB workspace, other than 1-by-1 matrices (scalars). Select the array containing the data you want to fit. The actual data you import must be a vector. If you select a matrix in the Data field, the first column of the matrix is imported by default. To select a different column or row of the matrix, click Select Column or Row. This displays the matrix in the Variable Editor, where you can select a row or column by highlighting it with the mouse.

Alternatively, you can enter any valid MATLAB expression in the Data field.

When you select a vector in the Data field, a histogram of the data is displayed in the Data preview pane.

Censoring

If some of the points in the data set are censored, enter a Boolean vector, of the same size as the data vector, specifying the censored entries of the data. A 1 in the censoring vector specifies that the corresponding entry of the data vector is censored, while a 0 specifies that the entry is not censored. If you enter a matrix, you can select a column or row by clicking Select Column or Row. If you do not want to censor any data, leave the Censoring field blank.

Frequency

Enter a vector of positive integers of the same size as the data vector to specify the frequency of the corresponding entries of the data vector. For example, a value of 7 in the 15th entry of the frequency vector specifies that there are 7 data points corresponding to the value in the 15th entry of the data vector. If all entries of the data vector have frequency 1, leave the Frequency field blank.

Data name

Enter a name for the data set you import from the workspace, such as My data.

As an example, if you create the vector data described in Example: Fitting a Distribution, and select it in the Data field, the upper half of the Data window appears as in the following figure.

After you have entered the information in the preceding fields, click Create Data Set to create the data set My data.
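Censoring and frequency vectors of the kind described above can also be passed to the toolbox function mle. The sketch below is an illustration under stated assumptions: the values, the frequencies, and the censoring limit are all made up for the example, and the censoring is treated as right-censoring, which is how mle interprets its censoring argument.

```matlab
% Frequency example: five distinct values and how often each occurs.
values = [1.5; 2.0; 2.5; 3.0; 3.5];
freq   = [2; 5; 7; 4; 1];      % positive integers, same size as values

% Normal fit that counts each value freq(i) times.
phatFreq = mle(values, 'distribution', 'normal', 'frequency', freq);

% The estimated mean equals the frequency-weighted mean of the values.
weightedMean = sum(values .* freq) / sum(freq);

% Censoring example: observations above a hypothetical limit are only
% known to exceed it, so they are recorded at the limit and flagged.
t     = normrnd(0.36, 1.4, 100, 1);
limit = 2;                     % assumed limit, for illustration
cens  = t > limit;             % 1 = censored, 0 = not censored
t(cens) = limit;

phatCens = mle(t, 'distribution', 'normal', 'censoring', cens);
```

Here phatFreq and phatCens each hold the estimated mean and standard deviation, in that order.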

Managing Data Sets.   The Manage data sets pane enables you to view and manage the data sets you create. When you create a data set, its name appears in the Data sets list. The following figure shows the Manage data sets pane after creating the data set My data.

For each data set in the Data sets list, you can

The Distribution Fitting Tool cannot display confidence bounds on density (PDF), quantile (inverse CDF), or probability plots. Clearing the Bounds check box removes the confidence bounds from the plot in the main window.

When you select a data set from the list, the following buttons are enabled:

Setting Bin Rules.   To set bin rules for the histogram of a data set, click Set Bin Rules. This opens the dialog box shown in the following figure.

You can select from the following rules:

The Set Bin Width Rules dialog box also provides the following options:

Creating a New Fit

This section describes how to create a new fit. To begin, click the New Fit button at the top of the main window to open a New Fit window. If you created the data set My data, as described in Example: Fitting a Distribution, My data appears in the Data field, as shown in the following figure.

Fit Name.   Enter a name for the fit in the Fit Name field.

Data.   The Data field contains a drop-down list of the data sets you have created. Select the data set to which you want to fit a distribution.

Distribution.   Select the type of distribution you want to fit from the Distribution drop-down list. See Available Distributions for a list of distributions supported by the Distribution Fitting Tool.

You can specify either a parametric or a nonparametric distribution. When you select a parametric distribution from the drop-down list, a description of its parameters is displayed in the pane below the Exclusion rule field. The Distribution Fitting Tool estimates these parameters to fit the distribution to the data set. When you select Nonparametric fit, options for the fit appear in the pane, as described in Options for Nonparametric Fits.
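The difference between the two kinds of fit can be illustrated at the command line. The sketch below compares a parametric normal fit (normfit) with a nonparametric kernel density estimate (ksdensity, using its default settings). It is not a description of how the tool computes its fits internally, and the simulated data are an assumption made for the example.

```matlab
% Simulated sample (an assumption for this sketch).
data = normrnd(0.36, 1.4, 100, 1);

% Parametric fit: two estimated parameters describe the whole density.
[muhat, sigmahat] = normfit(data);

% Nonparametric fit: density estimated directly from the data,
% evaluated at the points xi chosen by ksdensity.
[fKernel, xi] = ksdensity(data);

% Evaluate the fitted normal density at the same points and compare.
fNormal = normpdf(xi, muhat, sigmahat);
plot(xi, fNormal, '-', xi, fKernel, '--');
legend('Normal fit', 'Kernel density estimate');
```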

Exclusion Rule.   You can specify a rule to exclude some of the data in the Exclusion rule field. You can create an exclusion rule by clicking Exclude in the main window of the Distribution Fitting Tool. For more information, see Excluding Data.

Apply the New Fit.   Click Apply to fit the distribution. For a parametric fit, the Results pane displays the values of the estimated parameters. For a nonparametric fit, the Results pane displays information about the fit.

When you click Apply, the main window of the Distribution Fitting Tool displays a plot of the distribution, along with the corresponding data.

Available Distributions.   This section lists the distributions available in the Distribution Fitting Tool.

Most, but not all, of the distributions available in the Distribution Fitting Tool are supported elsewhere in Statistics Toolbox software (see Supported Distributions), and have dedicated distribution fitting functions. These functions are used to compute the majority of the fits in the Distribution Fitting Tool, and are referenced in the list below.

Other fits are computed using functions internal to the Distribution Fitting Tool. Distributions that do not have corresponding Statistics Toolbox fitting functions are described in Additional Distributions Available in the Distribution Fitting Tool.

Not all of the distributions listed below are available for all data sets. The Distribution Fitting Tool determines the extent of the data (nonnegative, unit interval, etc.) and displays appropriate distributions in the Distribution drop-down list. Distribution data ranges are given parenthetically in the list below.

Options for Nonparametric Fits.   When you select Non-parametric in the Distribution field, a set of options appears in the pane below Exclusion rule, as shown in the following figure.

The options for nonparametric distributions are

Displaying Results

This section explains the different ways to display results in the main window of the Distribution Fitting Tool. The main window displays plots of

Display Type.   The Display Type field in the main window specifies the type of plot displayed. Each type corresponds to a probability function, for example, a probability density function. The following display types are available:

Confidence Bounds.   You can display confidence bounds for data sets and fits, provided that you set Display Type to Cumulative probability (CDF), Survivor function, or Cumulative hazard. For fits only, you can also display confidence bounds when Display Type is set to Quantile.

To set the confidence level for the bounds, select Confidence Level from the View menu in the main window and choose from the options.
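For comparison, confidence bounds on an empirical cdf can be computed at the command line with the toolbox function ecdf. This is a sketch only: the bounds ecdf returns are not necessarily computed the same way as those drawn by the tool, and the data and the 95% level are assumptions for the example.

```matlab
% Simulated sample (an assumption for this sketch).
data = normrnd(0.36, 1.4, 100, 1);

% Empirical cdf with lower and upper 95% confidence bounds.
[F, x, Flo, Fup] = ecdf(data, 'alpha', 0.05);

% Plot the cdf as a step function with dotted bounds.
stairs(x, F, '-');
hold on
stairs(x, Flo, ':');
stairs(x, Fup, ':');
hold off
legend('Empirical cdf', 'Lower bound', 'Upper bound');
```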

Managing Fits

This section describes how to manage fits that you have created. To begin, click the Manage Fits button in the main window of the Distribution Fitting Tool. This opens the Fit Manager window as shown in the following figure.

The Table of fits displays a list of the fits you create.

Plot.   Select Plot to display a plot of the fit in the main window of the Distribution Fitting Tool. When you create a new fit, Plot is selected by default. Clearing the Plot check box removes the fit from the plot in the main window.

Bounds.   If Plot is selected, you can also select Bounds to display confidence bounds in the plot. The bounds are displayed when you set Display Type in the main window to Cumulative probability (CDF), Survivor function, Cumulative hazard, or Quantile.

The Distribution Fitting Tool cannot display confidence bounds on density (PDF) or probability plots. In addition, bounds are not supported for nonparametric fits and some parametric fits.

Clearing the Bounds check box removes the confidence intervals from the plot in the main window.

When you select a fit in the Table of fits, the following buttons are enabled below the table:

Evaluating Fits

The Evaluate window enables you to evaluate any fit at whatever points you choose. To open the window, click the Evaluate button in the main window of the Distribution Fitting Tool. The following figure shows the Evaluate window.

The Evaluate window contains the following items:

Click Apply to apply these settings to the selected fit. The following figure shows the results of evaluating the cumulative distribution function for the fit My fit, created in Example: Fitting a Distribution, at the points in the vector -3:0.5:3.

The window displays the following values in the columns of the table to the right of the Fit pane:

To save the data displayed in the Evaluate window, click Export to Workspace. This saves the values in the table to a matrix in the MATLAB workspace.
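A comparable table can be built at the command line by evaluating the fitted cdf directly. The sketch below assumes a normal fit obtained with normfit on simulated data; the variable name results is hypothetical and does not correspond to anything the tool exports.

```matlab
% Simulated sample and normal fit (assumptions for this sketch).
data = normrnd(0.36, 1.4, 100, 1);
[muhat, sigmahat] = normfit(data);

% Points at which to evaluate the fitted cdf.
x = (-3:0.5:3)';

% Fitted cumulative distribution function at those points.
F = normcdf(x, muhat, sigmahat);

% Two-column matrix: evaluation points and cdf values.
results = [x, F];
disp(results)
```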

Excluding Data

To exclude values from a fit, click the Exclude button in the main window of the Distribution Fitting Tool. This opens the Exclude window, in which you can create rules for excluding specified values. You can use these rules to exclude data when you create a new fit in the New Fit window. The following figure shows the Exclude window.

The following sections describe how to create an exclusion rule.

Exclusion Rule Name.   Enter a name for the exclusion rule in the Exclusion rule name field.

Exclude Sections.   In the Exclude sections pane, you can specify bounds for the excluded data:

Exclude Graphically.   The Exclude Graphically button enables you to define the exclusion rule by displaying a plot of the values in a data set and selecting the bounds for the excluded data with the mouse. For example, if you created the data set My data, described in Creating and Managing Data Sets, select it from the drop-down list next to Exclude graphically and then click the Exclude graphically button. This displays the values in My data in a new window as shown in the following figure.

To set a lower limit for the boundary of the excluded region, click Add Lower Limit. This displays a vertical line on the left side of the plot window. Move the line with the mouse to the point where you want the lower limit, as shown in the following figure.

Moving the vertical line changes the value displayed in the Lower limit: exclude data field in the Exclude window, as shown in the following figure.

The value displayed corresponds to the x-coordinate of the vertical line.

Similarly, you can set the upper limit for the boundary of the excluded region by clicking Add Upper Limit and moving the vertical line that appears at the right side of the plot window. After setting the lower and upper limits, click Close and return to the Exclude window.

Create Exclusion Rule.   Once you have set the lower and upper limits for the boundary of the excluded data, click Create Exclusion Rule to create the new rule. The name of the new rule now appears in the Existing exclusion rules pane.

When you select an exclusion rule in the Existing exclusion rules pane, the following buttons are enabled:

Once you define an exclusion rule, you can use it when you fit a distribution to your data. The rule does not exclude points from the display of the data set.
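The effect of an exclusion rule on a fit can be mimicked at the command line with logical indexing. In the sketch below, the limits are hypothetical values, and treating points exactly on a limit as kept is an assumption rather than a statement about how the tool handles the boundary.

```matlab
% Simulated sample (an assumption for this sketch).
data = normrnd(0.36, 1.4, 100, 1);

% Hypothetical exclusion limits: drop points below lowerLimit
% and above upperLimit.
lowerLimit = -2;
upperLimit = 3;

% Logical mask of the points that remain in the fit.
keep = (data >= lowerLimit) & (data <= upperLimit);

% Fit using only the retained points. The full data vector is
% unchanged, just as the rule does not alter the displayed data set.
[muKept, sigmaKept] = normfit(data(keep));
fprintf('%d of %d points used in the fit\n', sum(keep), numel(data));
```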

Saving and Loading Sessions

This section explains how to save your work in the current Distribution Fitting Tool session and then load it in a subsequent session, so that you can continue working where you left off.

Saving a Session.   To save the current session, select Save Session from the File menu in the main window. This opens a dialog box that prompts you to enter a filename, such as my_session.dfit, for the session. Clicking Save saves the following items created in the current session:

Loading a Session.   To load a previously saved session, select Load Session from the File menu in the main window and enter the name of a previously saved session. Clicking Open restores the information from the saved session to the current session of the Distribution Fitting Tool.

Generating an M-File to Fit and Plot Distributions

The Generate M-file option in the File menu enables you to create an M-file that

After you end the current session, you can use the M-file to create plots in a standard MATLAB figure window, without having to reopen the Distribution Fitting Tool.

As an example, assuming you created the fit described in Creating a New Fit, do the following steps:

  1. Select Generate M-file from the File menu.

  2. Save the M-file as normal_fit.m in a directory on the MATLAB path.

You can then apply the function normal_fit to any vector of data in the MATLAB workspace. For example, the following commands

new_data = normrnd(4.1, 12.5, 100, 1);
normal_fit(new_data)
legend('New Data', 'My fit')

fit a normal distribution to a data set and generate a plot of the data and the fit.

Using Custom Distributions

This section explains how to use custom distributions with the Distribution Fitting Tool.

Defining Custom Distributions.   To define a custom distribution, select Define Custom Distribution from the File menu. This opens an M-file template in the MATLAB editor. You then edit this M-file so that it computes the distribution you want.

The template includes example code that computes the Laplace distribution, beginning at the lines

% -----------------------------------------------------------
% ---- Remove the following return statement to define the
% ---- Laplace distribution
% -----------------------------------------------------------
return

To use this example, simply delete the command return and save the M-file. If you save the template in a directory on the MATLAB path, under its default name dfittooldists.m, the Distribution Fitting Tool reads it in automatically when you start the tool. You can also save the template under a different name, such as laplace.m, and then import the custom distribution as described in the following section.

Importing Custom Distributions.   To import a custom distribution, select Import Custom Distributions from the File menu. This opens a dialog box in which you can select the M-file that defines the distribution. For example, if you created the file laplace.m, as described in the preceding section, you can enter laplace.m and select Open in the dialog box. The Distribution field of the New Fit window now contains the option Laplace.
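For orientation, the Laplace density itself can be written in a few lines. The sketch below is independent of the template file: it uses the common location-scale form with parameters mu and b, which may differ from the parameterization used in the template, and the helper name laplacePdf is hypothetical.

```matlab
% Laplace pdf with location mu and scale b (assumed parameterization).
laplacePdf = @(x, mu, b) exp(-abs(x - mu) ./ b) ./ (2 * b);

% Simulate Laplace data with mu = 1 and b = 2 by inverting the cdf.
u    = rand(500, 1) - 0.5;
data = 1 - 2 * sign(u) .* log(1 - 2 * abs(u));

% Closed-form maximum likelihood estimates for this parameterization:
% the sample median, and the mean absolute deviation from it.
muhat = median(data);
bhat  = mean(abs(data - muhat));

% Overlay the fitted density on a grid spanning the data.
xg = linspace(min(data), max(data), 200);
plot(xg, laplacePdf(xg, muhat, bhat));
```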

Additional Distributions Available in the Distribution Fitting Tool

The following distributions are available in the Distribution Fitting Tool, but do not have dedicated distribution functions as described in Distribution Functions. The distributions can be used with the functions pdf, cdf, icdf, and mle in a limited capacity. See the reference pages for these functions for details on the limitations.

For a complete list of the distributions available for use with the Distribution Fitting Tool, see Supported Distributions. Distributions listing dfittool in the fit column of the tables in that section can be used with the Distribution Fitting Tool.

Random Number Generation Tool

The Random Number Generation Tool is a graphical user interface that generates random samples from specified probability distributions and displays the samples as histograms. Use the tool to explore the effects of changing parameters and sample size on the distributions.

Run the tool by typing randtool at the command line. You can also run it from the Demos tab in the Help browser.

Start by selecting a distribution, then enter the desired sample size.
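What the tool displays can be approximated at the command line by drawing a sample and binning it. The sketch below assumes a standard normal distribution, a sample size of 200, and 20 bins, all chosen only for illustration.

```matlab
% Sample size and distribution parameters (assumed values).
n      = 200;
sample = normrnd(0, 1, n, 1);

% Bin the sample into 20 equally spaced bins and plot the counts.
[counts, centers] = hist(sample, 20);
bar(centers, counts, 1);
title(sprintf('Normal sample, n = %d', n));
```

Rerunning the code with a larger n gives a histogram that follows the shape of the underlying density more closely.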

You can also

  

