MATLAB Examples

Feature Extraction Workflow

This example shows a complete workflow for feature extraction from image data.

Obtain Data

This example uses the MNIST image data, which consists of images of handwritten digits. The images are 28-by-28 pixels in grayscale. Each image has an associated label from 0 through 9, which is the digit that the image represents.

Begin by obtaining image and label data from

http://yann.lecun.com/exdb/mnist/

Unzip the files. To reduce the run time of this long example, use the test data as training data and the training data as test data.
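
If you prefer to script the download and extraction, the following is a minimal sketch using websave and gunzip. It assumes the files are published as gzip archives with the names shown; the archive names and the names of the extracted files can differ from the file names used below, so adjust them to match what you actually download.

baseURL = 'http://yann.lecun.com/exdb/mnist/';
gzFiles = {'t10k-images-idx3-ubyte.gz','t10k-labels-idx1-ubyte.gz', ...
    'train-images-idx3-ubyte.gz','train-labels-idx1-ubyte.gz'};
for k = 1:numel(gzFiles)
    websave(gzFiles{k},[baseURL gzFiles{k}]); % Download one archive
    gunzip(gzFiles{k});                       % Extract it in the current folder
end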

imageFileName = 't10k-images.idx3-ubyte';
labelFileName = 't10k-labels.idx1-ubyte';

Process the files to load them into the workspace. The code for this processing function appears at the end of this example. To execute the code, add the directory containing the function to the search path.

addpath(fullfile(matlabroot,'examples','stats'));
[Xtrain,LabelTrain] = processMNISTdata(imageFileName,labelFileName);
Read MNIST image data...
Number of images in the dataset:  10000 ...
Each image is of 28 by 28 pixels...
The image data is read to a matrix of dimensions:  10000 by  784...
End of reading image data.

Read MNIST label data...
Number of labels in the dataset:  10000 ...
The label data is read to a matrix of dimensions:  10000 by  1...
End of reading label data.

View a few of the images.

rng('default') % For reproducibility
numrows = size(Xtrain,1);
ims = randi(numrows,4,1);               % Indices of four random images
imgs = Xtrain(ims,:);
for i = 1:4
    pp{i} = reshape(imgs(i,:),28,28);   % Restore each row vector to a 28-by-28 image
end
ppf = [pp{1},pp{2};pp{3},pp{4}];        % Tile the four images in a 2-by-2 grid
imshow(ppf);

Choose New Feature Dimensions

There are several considerations in choosing the number of features to extract:

  • More features use more memory and computational time.
  • Fewer features can produce a poor classifier.

For this example, choose 100 features.

q = 100;

Extract Features

There are two feature extraction functions, sparsefilt and rica. Begin with the sparsefilt function. Set the number of iterations to 10 so that the extraction does not take too long.

Typically, you get good results by running the sparsefilt algorithm for a few iterations to a few hundred iterations. Running the algorithm for too many iterations can lead to decreased classification accuracy, a type of overfitting problem.
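
To see this effect on your own data, you can compare several iteration limits. The following is a hypothetical sketch, not part of the original example, that scores each limit by 5-fold cross-validation loss on the training data; it takes a substantial amount of time to run.

% Compare a few sparse filtering iteration limits by cross-validated
% classification loss (hypothetical sketch; slow to run).
iterLimits = [5 10 50 200];
cvLoss = zeros(size(iterLimits));
for k = 1:numel(iterLimits)
    M = sparsefilt(Xtrain,q,'IterationLimit',iterLimits(k));
    Z = transform(M,Xtrain);
    CV = fitcecoc(Z,LabelTrain,'Learners',templateLinear('Solver','lbfgs'), ...
        'KFold',5);
    cvLoss(k) = kfoldLoss(CV);
end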

Use sparsefilt to obtain the sparse filtering model while using 10 iterations.

Mdl = sparsefilt(Xtrain,q,'IterationLimit',10);
Warning: Solver LBFGS was not able to converge to a solution. 

sparsefilt warns that the internal LBFGS optimizer did not converge. The optimizer did not converge because you set the iteration limit to 10. Nevertheless, you can use the result to train a classifier.

Create Classifier

Transform the original data into the new feature representation.

NewX = transform(Mdl,Xtrain);

Train a linear classifier based on the transformed data and the correct classification labels in LabelTrain. The accuracy of the learned model is sensitive to the fitcecoc regularization parameter Lambda. Try to find the best value for Lambda by using the OptimizeHyperparameters name-value pair argument. Be aware that this optimization takes time. If you have a Parallel Computing Toolbox™ license, use parallel computing for faster execution. If you do not have a parallel license, set the UseParallel field of the options structure to false before running this script.

t = templateLinear('Solver','lbfgs');
options = struct('UseParallel',true);
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Starting parallel pool (parpool) using the 'local' profile ...
connected to 6 workers.
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       6 | Best   |      0.5777 |      7.4007 |      0.5777 |      0.5777 |      0.20606 |
|    2 |       5 | Accept |      0.8865 |       8.203 |      0.2041 |     0.27206 |       8.8234 |
|    3 |       5 | Best   |      0.2041 |      8.7958 |      0.2041 |     0.27206 |     0.026804 |
|    4 |       6 | Best   |      0.1077 |      14.377 |      0.1077 |     0.10773 |   1.7309e-09 |
|    5 |       6 | Best   |      0.0962 |      14.766 |      0.0962 |    0.096203 |    0.0002442 |
|    6 |       6 | Accept |         0.2 |      6.4864 |      0.0962 |    0.096221 |     0.024847 |
|    7 |       6 | Accept |      0.2077 |      6.5173 |      0.0962 |    0.096222 |     0.029081 |
|    8 |       6 | Accept |      0.0977 |      22.173 |      0.0962 |    0.096215 |   8.0495e-06 |
|    9 |       6 | Accept |      0.1238 |      8.6482 |      0.0962 |    0.096199 |     0.002979 |
|   10 |       6 | Accept |      0.1082 |      12.894 |      0.0962 |    0.096198 |   4.3382e-09 |
|   11 |       6 | Accept |      0.1085 |      10.829 |      0.0962 |    0.096211 |   0.00085219 |
|   12 |       6 | Accept |       0.105 |      15.564 |      0.0962 |     0.09621 |   1.4124e-07 |
|   13 |       6 | Accept |      0.1058 |      12.765 |      0.0962 |    0.096187 |    2.079e-08 |
|   14 |       6 | Best   |      0.0936 |      16.633 |      0.0936 |    0.093558 |   6.2283e-05 |
|   15 |       6 | Accept |      0.1026 |      18.202 |      0.0936 |    0.093551 |   1.4023e-06 |
|   16 |       6 | Accept |      0.0961 |      19.011 |      0.0936 |    0.093666 |   2.2152e-05 |
|   17 |       6 | Accept |      0.1084 |      12.674 |      0.0936 |    0.093663 |   1.0003e-09 |
|   18 |       6 | Accept |      0.1031 |      16.265 |      0.0936 |    0.093671 |   4.4527e-07 |
|   19 |       6 | Accept |      0.0937 |      14.016 |      0.0936 |    0.093696 |   0.00013015 |
|   20 |       6 | Best   |      0.0914 |      15.769 |      0.0914 |    0.092549 |   8.6862e-05 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Accept |      0.8865 |      5.3391 |      0.0914 |    0.092543 |       1.6582 |
|   22 |       6 | Accept |      0.0942 |      17.642 |      0.0914 |     0.09256 |   4.7242e-05 |
|   23 |       6 | Accept |       0.093 |      15.341 |      0.0914 |    0.092615 |     0.000111 |
|   24 |       6 | Accept |      0.0921 |       16.01 |      0.0914 |    0.092512 |   9.8477e-05 |
|   25 |       6 | Accept |      0.1473 |      8.1332 |      0.0914 |     0.09247 |    0.0069786 |
|   26 |       6 | Accept |      0.1064 |       15.85 |      0.0914 |    0.092473 |   5.1858e-08 |
|   27 |       6 | Accept |      0.1009 |      12.464 |      0.0914 |    0.092506 |   0.00047376 |
|   28 |       6 | Accept |      0.1072 |      13.731 |      0.0914 |    0.092507 |   9.5565e-09 |
|   29 |       6 | Accept |      0.1003 |      21.702 |      0.0914 |    0.092508 |   3.3081e-06 |
|   30 |       6 | Accept |      0.1156 |      10.224 |      0.0914 |     0.09251 |    0.0016146 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 81.5517 seconds.
Total objective function evaluation time: 398.4276

Best observed feasible point:
      Lambda  
    __________

    8.6862e-05

Observed objective function value = 0.0914
Estimated objective function value = 0.09251
Function evaluation time = 15.7692

Best estimated feasible point (according to models):
      Lambda  
    __________

    9.8477e-05

Estimated objective function value = 0.09251
Estimated function evaluation time = 15.7884

Evaluate Classifier

Check the error of the classifier when applied to test data. First, load the test data.

imageFileName = 'train-images.idx3-ubyte';
labelFileName = 'train-labels.idx1-ubyte';
[Xtest,LabelTest] = processMNISTdata(imageFileName,labelFileName);
Read MNIST image data...
Number of images in the dataset:  60000 ...
Each image is of 28 by 28 pixels...
The image data is read to a matrix of dimensions:  60000 by  784...
End of reading image data.

Read MNIST label data...
Number of labels in the dataset:  60000 ...
The label data is read to a matrix of dimensions:  60000 by  1...
End of reading label data.

Calculate the classification loss when applying the classifier to the test data.

TestX = transform(Mdl,Xtest);
Loss = loss(Cmdl,TestX,LabelTest)
Loss =

    0.1007

Did this transformation result in a better classifier than one trained on the original data? Create a classifier based on the original training data and evaluate its loss.

Omdl = fitcecoc(Xtrain,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Losso = loss(Omdl,Xtest,LabelTest)
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       4 | Best   |      0.0779 |      46.584 |      0.0779 |      0.0779 |   1.3269e-06 |
|    2 |       4 | Accept |      0.0779 |      46.572 |      0.0779 |      0.0779 |   3.8643e-09 |
|    3 |       4 | Accept |      0.0779 |      46.606 |      0.0779 |      0.0779 |   5.7933e-08 |
|    4 |       6 | Accept |      0.0787 |       135.7 |      0.0779 |      0.0779 |     0.011605 |
|    5 |       6 | Best   |      0.0775 |      139.22 |      0.0775 |     0.07798 |   0.00020291 |
|    6 |       6 | Accept |      0.0775 |       91.03 |      0.0775 |      0.0779 |   6.2603e-05 |
|    7 |       6 | Accept |      0.0779 |       45.04 |      0.0775 |      0.0779 |   1.0076e-09 |
|    8 |       6 | Best   |      0.0774 |       143.5 |      0.0774 |    0.077458 |   0.00021639 |
|    9 |       6 | Accept |      0.0785 |      149.75 |      0.0774 |    0.077911 |    0.0025558 |
|   10 |       6 | Best   |      0.0744 |      243.08 |      0.0744 |    0.074408 |        6.805 |
|   11 |       6 | Accept |       0.078 |      48.495 |      0.0744 |    0.074407 |   9.1498e-06 |
|   12 |       6 | Accept |      0.0782 |      110.37 |      0.0744 |    0.074552 |   8.8714e-05 |
|   13 |       6 | Accept |      0.0782 |      137.54 |      0.0744 |    0.074594 |   0.00020355 |
|   14 |       6 | Accept |       0.078 |      124.02 |      0.0744 |    0.074578 |   0.00012798 |
|   15 |       6 | Accept |      0.0789 |      135.76 |      0.0744 |    0.074558 |    0.0087103 |
|   16 |       6 | Accept |      0.0779 |       47.27 |      0.0744 |    0.074543 |   2.7051e-07 |
|   17 |       6 | Accept |      0.0767 |      131.38 |      0.0744 |    0.074525 |      0.23863 |
|   18 |       6 | Accept |      0.0761 |       214.2 |      0.0744 |    0.075167 |        3.197 |
|   19 |       6 | Best   |      0.0729 |      249.91 |      0.0729 |    0.074075 |       9.9885 |
|   20 |       6 | Best   |      0.0719 |      264.56 |      0.0719 |    0.073435 |       9.9979 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Accept |      0.0779 |      45.861 |      0.0719 |    0.073414 |   1.4821e-08 |
|   22 |       6 | Accept |      0.0757 |      162.35 |      0.0719 |    0.073399 |      0.98421 |
|   23 |       6 | Accept |      0.0734 |      258.31 |      0.0719 |    0.073386 |       9.9794 |
|   24 |       6 | Accept |      0.0779 |      45.468 |      0.0719 |    0.073374 |   3.4303e-06 |
|   25 |       6 | Accept |      0.0779 |      45.322 |      0.0719 |    0.073363 |   1.7972e-09 |
|   26 |       6 | Accept |      0.0784 |      54.892 |      0.0719 |    0.073362 |    2.262e-05 |
|   27 |       6 | Accept |      0.0787 |      119.62 |      0.0719 |    0.073333 |     0.061795 |
|   28 |       6 | Accept |      0.0779 |      45.514 |      0.0719 |    0.073324 |    1.256e-07 |
|   29 |       6 | Accept |      0.0779 |      44.712 |      0.0719 |    0.073316 |   6.0925e-07 |
|   30 |       6 | Accept |      0.0779 |      46.147 |      0.0719 |    0.073308 |   2.8315e-08 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 685.0741 seconds.
Total objective function evaluation time: 3418.7781

Best observed feasible point:
    Lambda
    ______

    9.9979

Observed objective function value = 0.0719
Estimated objective function value = 0.073308
Function evaluation time = 264.5614

Best estimated feasible point (according to models):
    Lambda
    ______

    9.9979

Estimated objective function value = 0.073308
Estimated function evaluation time = 257.1588


Losso =

    0.0860

The classifier based on sparse filtering has a somewhat higher loss than the classifier based on the original data. However, the classifier uses only 100 features rather than the 784 features in the original data, and is much faster to create. Try to make a better sparse filtering classifier by increasing q from 100 to 200, which is still far fewer than 784.

q = 200;
Mdl2 = sparsefilt(Xtrain,q,'IterationLimit',10);
NewX = transform(Mdl2,Xtrain);
TestX = transform(Mdl2,Xtest);
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Loss2 = loss(Cmdl,TestX,LabelTest)
Warning: Solver LBFGS was not able to converge to a solution. 
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       6 | Best   |       0.752 |       7.076 |       0.752 |       0.752 |      0.51767 |
|    2 |       5 | Accept |      0.8848 |      7.1285 |       0.752 |     0.80577 |       1.3049 |
|    3 |       5 | Accept |      0.7805 |      7.1223 |       0.752 |     0.80577 |      0.65215 |
|    4 |       6 | Best   |      0.0648 |      9.0767 |      0.0648 |    0.064832 |   1.2719e-08 |
|    5 |       6 | Accept |      0.0981 |      12.339 |      0.0648 |      0.0648 |    0.0031356 |
|    6 |       6 | Accept |      0.8865 |      7.2604 |      0.0648 |    0.064956 |       3.8424 |
|    7 |       6 | Accept |       0.067 |      9.6393 |      0.0648 |    0.064907 |   1.0119e-09 |
|    8 |       6 | Best   |       0.063 |      14.945 |       0.063 |    0.064454 |     1.08e-07 |
|    9 |       6 | Accept |      0.0654 |      9.8684 |       0.063 |    0.063178 |   4.9037e-09 |
|   10 |       6 | Accept |      0.0794 |      16.553 |       0.063 |    0.063254 |   0.00099236 |
|   11 |       6 | Best   |      0.0589 |      28.416 |      0.0589 |    0.064721 |   6.1292e-06 |
|   12 |       6 | Accept |       0.066 |      9.5237 |      0.0589 |    0.059021 |   2.0984e-09 |
|   13 |       6 | Accept |      0.0639 |      10.891 |      0.0589 |    0.058997 |   3.3824e-08 |
|   14 |       6 | Accept |      0.0649 |      11.456 |      0.0589 |    0.058955 |   4.3995e-08 |
|   15 |       6 | Best   |      0.0579 |      28.103 |      0.0579 |    0.058044 |   6.7158e-05 |
|   16 |       6 | Accept |      0.0626 |      18.265 |      0.0579 |    0.058045 |   2.8966e-07 |
|   17 |       6 | Accept |      0.0621 |      21.254 |      0.0579 |    0.058051 |   8.4821e-07 |
|   18 |       6 | Accept |      0.0615 |      25.528 |      0.0579 |     0.05803 |   2.2042e-06 |
|   19 |       6 | Accept |      0.0625 |      23.378 |      0.0579 |     0.05767 |   0.00026368 |
|   20 |       6 | Accept |      0.0579 |      34.905 |      0.0579 |    0.057878 |   1.6686e-05 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Best   |      0.0563 |      35.465 |      0.0563 |    0.056888 |    3.384e-05 |
|   22 |       6 | Accept |      0.0576 |      34.846 |      0.0563 |    0.057376 |   1.1864e-05 |
|   23 |       6 | Best   |      0.0561 |      34.431 |      0.0561 |    0.056574 |   3.6608e-05 |
|   24 |       6 | Accept |      0.0596 |      29.375 |      0.0561 |    0.056559 |   8.4588e-05 |
|   25 |       6 | Accept |      0.0589 |      26.658 |      0.0561 |    0.056567 |   0.00011552 |
|   26 |       6 | Accept |      0.1722 |      9.9144 |      0.0561 |    0.056611 |     0.028611 |
|   27 |       6 | Accept |      0.1302 |        11.6 |      0.0561 |     0.05662 |    0.0096846 |
|   28 |       6 | Accept |      0.8865 |      8.5102 |      0.0561 |    0.056652 |       9.9923 |
|   29 |       6 | Accept |      0.2314 |      8.0709 |      0.0561 |    0.056553 |     0.080409 |
|   30 |       6 | Accept |      0.0582 |      30.686 |      0.0561 |     0.05663 |   5.8332e-05 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 105.4513 seconds.
Total objective function evaluation time: 542.2851

Best observed feasible point:
      Lambda  
    __________

    3.6608e-05

Observed objective function value = 0.0561
Estimated objective function value = 0.05663
Function evaluation time = 34.4305

Best estimated feasible point (according to models):
      Lambda  
    __________

    3.6608e-05

Estimated objective function value = 0.05663
Estimated function evaluation time = 33.6933


Loss2 =

    0.0686

This time the classification loss is lower than that of the original data classifier.

Try RICA

Try the other feature extraction function, rica. Extract 200 features, create a classifier, and examine its loss on the test data. Use more iterations for the rica function, because rica generally benefits from more iterations than sparsefilt requires.

A common preprocessing step before feature extraction is to "prewhiten" the input data. Prewhitening consists of two transforms, decorrelation and standardization, which give the predictors zero mean and identity covariance. rica supports only the standardization transform: use the Standardize name-value pair argument to give the predictors zero mean and unit variance. Alternatively, you can contrast-normalize each image individually by applying the zscore transformation before calling sparsefilt or rica, as in the sketch below.
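
As a sketch of that alternative (not used in the rest of this example), you can standardize each image individually with zscore. Each row of the data matrix is one image, so normalize along the second dimension.

% Hypothetical per-image contrast normalization: zero mean and unit
% variance within each row (each row is one image).
XtrainNorm = zscore(Xtrain,0,2);
XtestNorm  = zscore(Xtest,0,2);
% You would then pass XtrainNorm and XtestNorm to sparsefilt, rica, and
% transform in place of Xtrain and Xtest.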

Mdl3 = rica(Xtrain,q,'IterationLimit',400,'Standardize',true);
NewX = transform(Mdl3,Xtrain);
TestX = transform(Mdl3,Xtest);
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Loss3 = loss(Cmdl,TestX,LabelTest)
Warning: Solver LBFGS was not able to converge to a solution. 
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       6 | Best   |      0.1161 |      12.033 |      0.1161 |      0.1161 |       7.6382 |
|    2 |       6 | Best   |      0.0801 |       13.23 |      0.0801 |    0.081824 |   1.0841e-09 |
|    3 |       6 | Best   |      0.0797 |        13.7 |      0.0797 |    0.079717 |   2.1663e-09 |
|    4 |       6 | Best   |      0.0788 |      20.336 |      0.0788 |    0.078804 |   3.5739e-05 |
|    5 |       6 | Best   |      0.0765 |      21.716 |      0.0765 |     0.07651 |   9.6928e-05 |
|    6 |       6 | Best   |      0.0644 |      26.986 |      0.0644 |    0.064403 |     0.055148 |
|    7 |       6 | Accept |      0.0808 |       13.69 |      0.0644 |    0.064403 |   1.6951e-08 |
|    8 |       6 | Accept |      0.0797 |      13.349 |      0.0644 |    0.064404 |   5.7503e-09 |
|    9 |       6 | Accept |       0.079 |      20.276 |      0.0644 |    0.064403 |   2.9526e-06 |
|   10 |       6 | Accept |      0.0801 |       13.71 |      0.0644 |    0.064403 |   1.9659e-07 |
|   11 |       6 | Accept |       0.066 |        24.9 |      0.0644 |    0.064394 |      0.22353 |
|   12 |       6 | Accept |      0.0767 |      31.617 |      0.0644 |    0.064397 |   0.00059309 |
|   13 |       6 | Accept |      0.0668 |      27.014 |      0.0644 |    0.064389 |     0.011664 |
|   14 |       6 | Accept |      0.0719 |      30.169 |      0.0644 |    0.064387 |    0.0031989 |
|   15 |       6 | Accept |      0.0654 |      26.338 |      0.0644 |    0.064399 |     0.026655 |
|   16 |       6 | Accept |      0.0733 |      31.345 |      0.0644 |    0.064218 |    0.0016414 |
|   17 |       6 | Best   |      0.0643 |      25.532 |      0.0643 |    0.064239 |      0.11145 |
|   18 |       5 | Accept |      0.0648 |      26.839 |      0.0624 |    0.063728 |     0.098656 |
|   19 |       5 | Best   |      0.0624 |      26.951 |      0.0624 |    0.063728 |     0.077975 |
|   20 |       6 | Accept |      0.0859 |      16.207 |      0.0624 |    0.063798 |       1.6125 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Accept |      0.0628 |      26.359 |      0.0624 |    0.063609 |     0.072374 |
|   22 |       6 | Accept |      0.0648 |      25.385 |      0.0624 |    0.063715 |      0.10981 |
|   23 |       6 | Accept |      0.0798 |       14.33 |      0.0624 |    0.063713 |   5.9972e-08 |
|   24 |       6 | Accept |      0.0802 |      20.475 |      0.0624 |    0.063712 |    7.095e-07 |
|   25 |       5 | Accept |      0.0702 |       20.18 |      0.0624 |    0.063739 |      0.47577 |
|   26 |       5 | Accept |      0.0782 |      19.979 |      0.0624 |    0.063739 |   9.7848e-06 |
|   27 |       6 | Best   |      0.0619 |      26.638 |      0.0619 |    0.063479 |     0.074666 |
|   28 |       6 | Accept |      0.0624 |      26.768 |      0.0619 |     0.06335 |     0.073585 |
|   29 |       6 | Accept |      0.0992 |      13.079 |      0.0619 |    0.063361 |       3.8651 |
|   30 |       6 | Accept |      0.0807 |      13.742 |      0.0619 |    0.063362 |   3.2195e-08 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 119.988 seconds.
Total objective function evaluation time: 642.8708

Best observed feasible point:
     Lambda 
    ________

    0.074666

Observed objective function value = 0.0619
Estimated objective function value = 0.063362
Function evaluation time = 26.6381

Best estimated feasible point (according to models):
     Lambda 
    ________

    0.072374

Estimated objective function value = 0.063362
Estimated function evaluation time = 26.6488


Loss3 =

    0.0746

The rica-based classifier has somewhat higher test loss compared to the sparse filtering classifier.

Try More Features

The feature extraction functions have few tuning parameters. One parameter that can affect results is the number of requested features. See how well classifiers work when based on 1000 features, rather than the 200 features previously tried, or the 784 features in the original data. Using more features than appear in the original data is called "overcomplete" learning. Conversely, using fewer features is called "undercomplete" learning. Overcomplete learning can lead to increased classification accuracy, while undercomplete learning can save memory and time.

q = 1000;
Mdl4 = sparsefilt(Xtrain,q,'IterationLimit',10);
NewX = transform(Mdl4,Xtrain);
TestX = transform(Mdl4,Xtest);
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Loss4 = loss(Cmdl,TestX,LabelTest)
Warning: Solver LBFGS was not able to converge to a solution. 
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       6 | Best   |      0.8865 |      45.942 |      0.8865 |      0.8865 |       2.3989 |
|    2 |       6 | Accept |      0.8865 |      47.902 |      0.8865 |      0.8865 |       5.1401 |
|    3 |       6 | Best   |      0.0432 |      51.794 |      0.0432 |    0.043283 |   2.4208e-08 |
|    4 |       6 | Accept |      0.0436 |      54.696 |      0.0432 |    0.042893 |   5.5091e-08 |
|    5 |       6 | Accept |      0.4011 |        40.1 |      0.0432 |    0.043226 |      0.14741 |
|    6 |       6 | Accept |      0.0958 |      89.542 |      0.0432 |    0.043225 |    0.0050411 |
|    7 |       5 | Accept |      0.1911 |      55.344 |      0.0432 |    0.043146 |     0.043771 |
|    8 |       5 | Accept |      0.0438 |      51.203 |      0.0432 |    0.043146 |   1.0001e-09 |
|    9 |       6 | Best   |       0.043 |      51.389 |       0.043 |     0.04308 |   2.0493e-08 |
|   10 |       6 | Accept |      0.0434 |      53.469 |       0.043 |    0.043015 |    3.457e-09 |
|   11 |       6 | Best   |      0.0427 |      53.882 |      0.0427 |     0.04302 |   8.2165e-09 |
|   12 |       6 | Best   |      0.0425 |      87.783 |      0.0425 |    0.043036 |   1.5128e-07 |
|   13 |       6 | Accept |      0.0439 |      53.166 |      0.0425 |    0.043043 |   1.7564e-09 |
|   14 |       6 | Accept |      0.0429 |       53.46 |      0.0425 |    0.043086 |   1.2209e-08 |
|   15 |       6 | Accept |      0.0425 |       157.2 |      0.0425 |    0.043067 |   6.2143e-07 |
|   16 |       6 | Accept |      0.0515 |      163.53 |      0.0425 |    0.043077 |    0.0004333 |
|   17 |       6 | Accept |      0.0428 |      53.968 |      0.0425 |    0.042737 |   5.8696e-09 |
|   18 |       6 | Accept |      0.0426 |      67.358 |      0.0425 |    0.042733 |   8.9837e-08 |
|   19 |       6 | Best   |      0.0399 |      285.58 |      0.0399 |    0.039883 |   1.8692e-05 |
|   20 |       6 | Accept |      0.0424 |      128.03 |      0.0399 |    0.039881 |   2.8446e-07 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Best   |      0.0398 |       216.8 |      0.0398 |    0.039864 |   2.5713e-06 |
|   22 |       6 | Accept |      0.0444 |       208.6 |      0.0398 |    0.039835 |   0.00019319 |
|   23 |       6 | Accept |      0.0404 |      256.65 |      0.0398 |    0.039847 |    9.324e-05 |
|   24 |       6 | Best   |      0.0397 |       276.9 |      0.0397 |    0.039688 |   7.0356e-06 |
|   25 |       6 | Accept |       0.041 |      189.17 |      0.0397 |    0.039809 |   1.2113e-06 |
|   26 |       6 | Best   |      0.0393 |      299.66 |      0.0393 |    0.039338 |   4.1625e-05 |
|   27 |       6 | Best   |      0.0391 |      250.98 |      0.0391 |    0.039321 |   4.0724e-06 |
|   28 |       6 | Accept |      0.0391 |      287.08 |      0.0391 |     0.03932 |   1.0931e-05 |
|   29 |       6 | Accept |      0.0396 |      239.51 |      0.0391 |    0.039462 |   3.5146e-06 |
|   30 |       6 | Accept |      0.0405 |      206.25 |      0.0391 |     0.03952 |   2.1295e-06 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 794.8129 seconds.
Total objective function evaluation time: 4076.9347

Best observed feasible point:
      Lambda  
    __________

    4.0724e-06

Observed objective function value = 0.0391
Estimated objective function value = 0.03952
Function evaluation time = 250.9821

Best estimated feasible point (according to models):
      Lambda  
    __________

    3.5146e-06

Estimated objective function value = 0.03952
Estimated function evaluation time = 238.7933


Loss4 =

    0.0458

The classifier based on overcomplete sparse filtering with 1000 extracted features has the lowest test loss of any classifier yet tested. For comparison, extract 1000 features by using rica as well, and compute the test loss of the resulting classifier.

Mdl5 = rica(Xtrain,q,'IterationLimit',400,'Standardize',true);
NewX = transform(Mdl5,Xtrain);
TestX = transform(Mdl5,Xtest);
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t, ...
    'OptimizeHyperparameters',{'Lambda'}, ...
    'HyperparameterOptimizationOptions',options);
Loss5 = loss(Cmdl,TestX,LabelTest)
Warning: Solver LBFGS was not able to converge to a solution. 
Copying objective function to workers...
Done copying objective function to workers.
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|    1 |       6 | Best   |      0.0793 |      45.449 |      0.0793 |      0.0793 |    8.935e-09 |
|    2 |       6 | Best   |      0.0787 |      146.84 |      0.0787 |       0.079 |   1.4834e-05 |
|    3 |       6 | Best   |      0.0779 |      149.38 |      0.0779 |    0.078633 |   4.1948e-06 |
|    4 |       6 | Best   |      0.0777 |      153.15 |      0.0777 |      0.0777 |   2.6573e-06 |
|    5 |       6 | Best   |      0.0735 |      171.35 |      0.0735 |      0.0735 |      0.62958 |
|    6 |       6 | Accept |      0.1263 |      95.035 |      0.0735 |    0.073503 |       9.9845 |
|    7 |       6 | Accept |      0.0786 |      141.65 |      0.0735 |    0.073504 |   7.8708e-06 |
|    8 |       6 | Accept |       0.078 |      147.28 |      0.0735 |    0.073504 |   3.4442e-05 |
|    9 |       6 | Accept |      0.0788 |      151.31 |      0.0735 |    0.073504 |   7.0238e-07 |
|   10 |       6 | Accept |      0.0772 |      260.81 |      0.0735 |    0.073504 |   0.00033302 |
|   11 |       6 | Best   |      0.0692 |      310.13 |      0.0692 |    0.069203 |    0.0017533 |
|   12 |       6 | Accept |       0.079 |      45.189 |      0.0692 |    0.069203 |   1.0004e-09 |
|   13 |       6 | Accept |      0.0791 |      43.562 |      0.0692 |    0.069203 |   6.8221e-08 |
|   14 |       6 | Accept |      0.0791 |      44.776 |      0.0692 |    0.069204 |   2.6499e-09 |
|   15 |       6 | Accept |        0.09 |      129.34 |      0.0692 |    0.069206 |       2.1641 |
|   16 |       6 | Accept |      0.0786 |      49.025 |      0.0692 |    0.069206 |   2.0818e-07 |
|   17 |       6 | Best   |      0.0675 |      211.67 |      0.0675 |    0.067494 |      0.21351 |
|   18 |       6 | Best   |      0.0651 |      233.43 |      0.0651 |    0.065096 |     0.056995 |
|   19 |       6 | Accept |      0.0692 |      252.72 |      0.0651 |    0.065116 |     0.010482 |
|   20 |       6 | Accept |      0.0653 |      246.18 |      0.0651 |    0.065116 |     0.099991 |
|================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |       Lambda |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |
|================================================================================================|
|   21 |       6 | Accept |      0.0692 |      285.39 |      0.0651 |    0.065109 |    0.0042157 |
|   22 |       6 | Accept |      0.0791 |      192.95 |      0.0651 |     0.06511 |   0.00011073 |
|   23 |       6 | Accept |      0.0672 |      221.33 |      0.0651 |    0.065121 |     0.024284 |
|   24 |       6 | Accept |      0.0796 |      46.336 |      0.0651 |    0.065122 |   2.4318e-08 |
|   25 |       6 | Accept |      0.0686 |       265.9 |      0.0651 |    0.065224 |    0.0072737 |
|   26 |       6 | Accept |       0.066 |      245.17 |      0.0651 |    0.065462 |     0.069576 |
|   27 |       6 | Best   |       0.065 |      233.62 |       0.065 |    0.065338 |     0.070537 |
|   28 |       6 | Accept |      0.0652 |      236.22 |       0.065 |    0.065304 |     0.069455 |
|   29 |       6 | Accept |       0.079 |      46.072 |       0.065 |    0.065301 |   1.5309e-09 |
|   30 |       6 | Accept |      0.0662 |      221.53 |       0.065 |    0.065311 |     0.038733 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 985.7708 seconds.
Total objective function evaluation time: 5022.8042

Best observed feasible point:
     Lambda 
    ________

    0.070537

Observed objective function value = 0.065
Estimated objective function value = 0.065311
Function evaluation time = 233.6229

Best estimated feasible point (according to models):
     Lambda 
    ________

    0.070537

Estimated objective function value = 0.065311
Estimated function evaluation time = 239.1365


Loss5 =

    0.0741

The classifier based on RICA with 1000 extracted features has a similar test loss to the RICA classifier based on 200 extracted features.

Optimize Hyperparameters by Using bayesopt

Feature extraction functions have these tuning parameters:

  • Iteration limit
  • Function, either rica or sparsefilt
  • Parameter Lambda
  • Number of learned features q

The fitcecoc regularization parameter also affects the accuracy of the learned classifier. Include that parameter in the list of hyperparameters as well.

To search among the available parameters effectively, try bayesopt. Use the following objective function, which includes parameters passed from the workspace.

function objective = filterica(x,Xtrain,Xtest,LabelTrain,LabelTest,winit)
% Objective for bayesopt: extract features with rica or sparsefilt, then
% train a linear ECOC classifier and return its loss on the test data.

% Use the same fixed initial weights for every evaluation, trimmed to the
% current data dimension and number of features q.
initW = winit(1:size(Xtrain,2),1:x.q);

if char(x.solver) == 'r'   % 'r' means rica, 's' means sparsefilt
    Mdl = rica(Xtrain,x.q,'Lambda',x.lambda,'IterationLimit',x.iterlim, ...
        'InitialTransformWeights',initW,'Standardize',true);
else
    Mdl = sparsefilt(Xtrain,x.q,'Lambda',x.lambda,'IterationLimit',x.iterlim, ...
        'InitialTransformWeights',initW);
end

% Transform both data sets into the learned feature space, train the
% classifier, and report the test loss as the objective value.
NewX = transform(Mdl,Xtrain);
TestX = transform(Mdl,Xtest);
t = templateLinear('Lambda',x.lambdareg,'Solver','lbfgs');
Cmdl = fitcecoc(NewX,LabelTrain,'Learners',t);
objective = loss(Cmdl,TestX,LabelTest);

To remove sources of variation, fix an initial transform weight matrix.

W = randn(1e4,1e3);

Create hyperparameters for the objective function.

iterlim = optimizableVariable('iterlim',[5,500],'Type','integer');
lambda = optimizableVariable('lambda',[0,10]);
solver = optimizableVariable('solver',{'r','s'},'Type','categorical');
qvar = optimizableVariable('q',[10,1000],'Type','integer');
lambdareg = optimizableVariable('lambdareg',[1e-6,1],'Transform','log');
vars = [iterlim,lambda,solver,qvar,lambdareg];
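
Optionally, you can spot-check the objective function on a single hand-picked point before launching the full search. bayesopt passes each point to the objective as a one-row table whose variable names match the optimizableVariable names, so a hypothetical check (not part of the original example) looks like this:

% Hypothetical single evaluation of the objective function; requires W
% (defined above) and the filterica function on the search path.
xcheck = table(50,1.0,categorical({'s'}),200,1e-4, ...
    'VariableNames',{'iterlim','lambda','solver','q','lambdareg'});
filterica(xcheck,Xtrain,Xtest,LabelTrain,LabelTest,W)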

Run the optimization with the warning suppressed that occurs when the internal optimizations do not run to completion. Run for 60 objective function evaluations instead of the default 30 to give the optimization a better chance of locating a good value.

warning('off','stats:classreg:learning:fsutils:Solver:LBFGSUnableToConverge');
results = bayesopt(@(x) filterica(x,Xtrain,Xtest,LabelTrain,LabelTest,W),vars, ...
    'UseParallel',true,'MaxObjectiveEvaluations',60);
warning('on','stats:classreg:learning:fsutils:Solver:LBFGSUnableToConverge');
Copying objective function to workers...
Done copying objective function to workers.
|============================================================================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |      iterlim |       lambda |       solver |            q |    lambdareg |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |              |              |              |              |
|============================================================================================================================================================|
|    1 |       6 | Best   |     0.53748 |      25.011 |     0.53748 |     0.53748 |           25 |       6.0656 |            s |          289 |      0.23908 |
|    2 |       6 | Best   |    0.090079 |      117.88 |    0.090079 |     0.12404 |           44 |       3.1617 |            r |          805 |   1.1979e-06 |
|    3 |       6 | Accept |    0.094795 |       128.8 |    0.090079 |    0.090738 |          166 |       7.5437 |            s |          364 |   3.4071e-05 |
|    4 |       6 | Accept |    0.091667 |      140.48 |    0.090079 |    0.090124 |          367 |       7.6065 |            r |          147 |   1.6194e-06 |
|    5 |       6 | Accept |    0.092124 |      122.07 |    0.090079 |    0.090351 |          203 |      0.43563 |            r |          216 |   6.6615e-05 |
|    6 |       6 | Best   |    0.079954 |      55.372 |    0.079954 |    0.079961 |           59 |       7.9869 |            r |          287 |    0.0082234 |
|    7 |       6 | Accept |    0.087518 |      107.37 |    0.079954 |    0.080013 |          196 |       4.2311 |            r |          204 |      0.99846 |
|    8 |       6 | Accept |      0.1044 |       262.5 |    0.079954 |    0.079955 |          219 |       5.3425 |            s |          582 |   0.00013924 |
|    9 |       6 | Accept |     0.08908 |      236.96 |    0.079954 |    0.079954 |          118 |       4.1448 |            r |          736 |   1.0006e-06 |
|   10 |       6 | Best   |    0.073399 |      357.09 |    0.073399 |    0.073454 |          453 |       7.1495 |            r |          312 |      0.11334 |
|   11 |       6 | Accept |     0.27723 |      700.19 |    0.073399 |    0.073682 |          492 |       3.3605 |            s |          784 |     0.026698 |
|   12 |       6 | Accept |    0.098615 |      495.45 |    0.073399 |    0.073412 |          280 |       2.3508 |            s |          873 |   1.0023e-06 |
|   13 |       6 | Accept |     0.26445 |      16.279 |    0.073399 |    0.073415 |          178 |       4.9932 |            s |           25 |   1.0482e-06 |
|   14 |       6 | Accept |    0.086335 |      463.99 |    0.073399 |    0.073415 |          436 |       8.1715 |            r |          416 |    0.0010325 |
|   15 |       6 | Accept |    0.073897 |      165.19 |    0.073399 |    0.073453 |          156 |        7.901 |            r |          390 |     0.072626 |
|   16 |       6 | Best   |      0.0542 |      63.842 |      0.0542 |    0.054204 |           11 |       8.0353 |            s |          646 |   8.8572e-06 |
|   17 |       6 | Accept |     0.13236 |      9.7164 |      0.0542 |    0.054172 |          100 |       2.1079 |            r |           14 |     0.060744 |
|   18 |       6 | Accept |    0.089937 |      121.57 |      0.0542 |    0.054184 |           38 |         3.14 |            r |         1000 |   1.0338e-06 |
|   19 |       6 | Accept |    0.095321 |      26.666 |      0.0542 |    0.054188 |           17 |      0.42316 |            r |          364 |   1.0253e-06 |
|   20 |       6 | Accept |    0.075511 |      663.85 |      0.0542 |    0.054177 |          263 |       5.7366 |            r |          964 |     0.030268 |
|============================================================================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |      iterlim |       lambda |       solver |            q |    lambdareg |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |              |              |              |              |
|============================================================================================================================================================|
|   21 |       6 | Accept |     0.11003 |      924.74 |      0.0542 |    0.054227 |          495 |      0.46688 |            s |          986 |   1.0927e-05 |
|   22 |       6 | Best   |    0.052359 |      73.155 |    0.052359 |    0.052321 |           36 |      0.60555 |            s |          539 |   4.5956e-06 |
|   23 |       6 | Accept |     0.12494 |      358.25 |    0.052359 |    0.052541 |          347 |      0.56663 |            s |          509 |   1.0309e-06 |
|   24 |       6 | Best   |    0.049896 |      78.397 |    0.049896 |    0.049868 |           27 |       1.2167 |            s |          657 |   1.0109e-05 |
|   25 |       6 | Accept |    0.082211 |      88.012 |    0.049896 |    0.049877 |           40 |       9.5342 |            r |          684 |      0.64687 |
|   26 |       6 | Accept |     0.13766 |      4.1461 |    0.049896 |    0.049929 |           13 |       7.7307 |            r |           15 |   5.0522e-06 |
|   27 |       6 | Accept |    0.091505 |      76.834 |    0.049896 |     0.04993 |            9 |       2.2992 |            r |          998 |   0.00017959 |
|   28 |       6 | Accept |    0.083367 |      377.72 |    0.049896 |    0.049922 |          185 |       8.1762 |            r |          745 |    0.0014168 |
|   29 |       6 | Accept |    0.050204 |      36.144 |    0.049896 |    0.049954 |            7 |       0.2251 |            s |          508 |   1.1083e-05 |
|   30 |       6 | Accept |    0.050111 |       94.69 |    0.049896 |    0.049853 |           13 |      0.53243 |            s |          944 |   5.0925e-06 |
|   31 |       6 | Accept |    0.049967 |      80.205 |    0.049896 |    0.049843 |           27 |       2.2348 |            s |          607 |   7.0124e-06 |
|   32 |       6 | Accept |    0.051483 |       73.96 |    0.049896 |    0.049913 |            8 |      0.69257 |            s |          764 |   3.1935e-06 |
|   33 |       6 | Accept |    0.055065 |      59.315 |    0.049896 |    0.049209 |            6 |       1.8426 |            s |          609 |   7.2764e-06 |
|   34 |       6 | Accept |    0.085603 |      646.71 |    0.049896 |    0.049199 |          259 |       8.6047 |            r |          997 |      0.91997 |
|   35 |       6 | Accept |    0.055191 |      64.876 |    0.049896 |    0.050181 |            8 |       1.1415 |            s |          657 |   7.3292e-06 |
|   36 |       6 | Accept |    0.053131 |       78.86 |    0.049896 |    0.050183 |            9 |       2.6527 |            s |          804 |   6.1887e-06 |
|   37 |       6 | Accept |    0.086441 |      972.66 |    0.049896 |    0.050138 |          496 |       4.6939 |            r |          770 |      0.98435 |
|   38 |       6 | Accept |     0.11435 |      559.73 |    0.049896 |    0.050126 |          477 |       5.0497 |            s |          584 |   5.5909e-06 |
|   39 |       6 | Accept |    0.061942 |      56.196 |    0.049896 |     0.05003 |            6 |       7.0536 |            s |          537 |   1.2679e-05 |
|   40 |       6 | Accept |    0.050695 |      146.19 |    0.049896 |    0.049991 |           38 |       0.4489 |            s |          973 |   3.1555e-05 |
|============================================================================================================================================================|
| Iter | Active  | Eval   | Objective   | Objective   | BestSoFar   | BestSoFar   |      iterlim |       lambda |       solver |            q |    lambdareg |
|      | workers | result |             | runtime     | (observed)  | (estim.)    |              |              |              |              |              |
|============================================================================================================================================================|
|   41 |       6 | Accept |    0.063993 |      67.757 |    0.049896 |     0.04959 |            6 |       7.4843 |            s |          557 |   7.2367e-06 |
|   42 |       6 | Accept |    0.052216 |      67.695 |    0.049896 |    0.050079 |            9 |      0.22962 |            s |          590 |     1.13e-05 |
|   43 |       6 | Accept |    0.050459 |      106.06 |    0.049896 |     0.05057 |            8 |       9.7543 |            s |          992 |    3.926e-06 |
|   44 |       6 | Accept |    0.080488 |      73.663 |    0.049896 |    0.050185 |            8 |       2.7259 |            s |          992 |   5.2108e-05 |
|   45 |       6 | Accept |     0.33271 |      3.0724 |    0.049896 |    0.049921 |           10 |       9.3682 |            s |           12 |    0.0053374 |
|   46 |       6 | Accept |    0.050062 |      117.69 |    0.049896 |    0.049974 |           14 |       5.7044 |            s |          963 |   4.3362e-06 |
|   47 |       6 | Accept |    0.052068 |      84.789 |    0.049896 |    0.050235 |            9 |       1.1243 |            s |          775 |   4.4925e-06 |
|   48 |       6 | Accept |      0.2018 |      3.8272 |    0.049896 |    0.049958 |           10 |       1.3355 |            s |           18 |   0.00048666 |
|   49 |       6 | Accept |     0.16043 |      4.6847 |    0.049896 |    0.050023 |           11 |        1.695 |            r |           13 |   0.00031978 |
|   50 |       6 | Best   |    0.049641 |      114.53 |    0.049641 |    0.049857 |           18 |       1.7006 |            s |          890 |   1.7471e-05 |
|   51 |       6 | Accept |      0.7381 |      4.4077 |    0.049641 |    0.049978 |           20 |       1.1795 |            s |           16 |      0.55346 |
|   52 |       6 | Best   |    0.049122 |      126.93 |    0.049122 |    0.046892 |           11 |       4.4034 |            s |          997 |   1.2552e-05 |
|   53 |       6 | Accept |     0.11509 |      5.5331 |    0.049122 |    0.046928 |           10 |       6.1473 |            r |           32 |      0.66147 |
|   54 |       6 | Accept |     0.13896 |      5.3982 |    0.049122 |    0.046931 |           16 |       9.9908 |            r |           14 |   1.0093e-06 |
|   55 |       6 | Accept |    0.089654 |        1184 |    0.049122 |    0.047014 |          461 |       9.1226 |            r |          998 |   9.5657e-05 |
|   56 |       6 | Accept |     0.13288 |      6.7787 |    0.049122 |     0.04717 |           24 |       5.1783 |            s |           34 |   1.5146e-05 |
|   57 |       6 | Accept |    0.088281 |       7.071 |    0.049122 |    0.049622 |           12 |       9.9207 |            r |           46 |    0.0070647 |
|   58 |       6 | Accept |     0.14073 |      4.7475 |    0.049122 |     0.05038 |            6 |      0.88866 |            s |           27 |   1.0616e-06 |
|   59 |       6 | Accept |     0.10047 |      6.3818 |    0.049122 |    0.050379 |           14 |       9.9281 |            r |           30 |   4.9647e-05 |
|   60 |       6 | Accept |    0.049743 |      123.64 |    0.049122 |    0.049364 |           13 |       2.1134 |            s |          985 |   2.6744e-06 |

__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 60 reached.
Total function evaluations: 60
Total elapsed time: 1929.959 seconds.
Total objective function evaluation time: 11249.722

Best observed feasible point:
    iterlim    lambda    solver     q     lambdareg 
    _______    ______    ______    ___    __________

      11       4.4034      s       997    1.2552e-05

Observed objective function value = 0.049122
Estimated objective function value = 0.049364
Function evaluation time = 126.9278

Best estimated feasible point (according to models):
    iterlim    lambda    solver     q     lambdareg 
    _______    ______    ______    ___    __________

      27       1.2167      s       657    1.0109e-05

Estimated objective function value = 0.049364
Estimated function evaluation time = 81.6216

The resulting classifier does not have better (lower) loss than the classifier using sparsefilt for 1000 features, trained for 10 iterations.
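
The minimum observed point is not the only notion of "best." If you instead want the point that the fitted surrogate model estimates to be best, a brief sketch uses the bestPoint function:

% Query the best point according to the model rather than the minimum
% observed objective value.
[xbest,estObjective] = bestPoint(results)

The code that follows uses results.XAtMinObjective, the point with the minimum observed objective.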

View the filter coefficients for the best hyperparameters that bayesopt found. The resulting images show the shapes of the extracted features. These shapes are recognizable as portions of handwritten digits.

Xtbl = results.XAtMinObjective;
Q = Xtbl.q;
initW = W(1:size(Xtrain,2),1:Q);
if char(Xtbl.solver) == 'r'
    Mdl = rica(Xtrain,Q,'Lambda',Xtbl.lambda,'IterationLimit',Xtbl.iterlim, ...
        'InitialTransformWeights',initW,'Standardize',true);
else
    Mdl = sparsefilt(Xtrain,Q,'Lambda',Xtbl.lambda,'IterationLimit',Xtbl.iterlim, ...
        'InitialTransformWeights',initW);
end
Wts = Mdl.TransformWeights;
Wts = reshape(Wts,[28,28,Q]);
[dx,dy,~,~] = size(Wts);
% Rescale each filter to the range [0,1] for display.
for f = 1:Q
    Wvec = Wts(:,:,f);
    Wvec = Wvec(:);
    Wvec = (Wvec - min(Wvec))/(max(Wvec) - min(Wvec));
    Wts(:,:,f) = reshape(Wvec,dx,dy);
end
% Tile the rescaled filters into an approximately square montage.
m   = ceil(sqrt(Q));
n   = m;
img = zeros(m*dx,n*dy);
f   = 1;
for i = 1:m
    for j = 1:n
        if (f <= Q)
            img((i-1)*dx+1:i*dx,(j-1)*dy+1:j*dy,:) = Wts(:,:,f);
            f = f+1;
        end
    end
end
imshow(img);
Warning: Solver LBFGS was not able to converge to a solution. 

Remove the directory of the functions processMNISTdata and filterica from the search path.

rmpath(fullfile(matlabroot,'examples','stats'));

Code for Reading MNIST Data

The code of the function that reads the data into the workspace is:

function [X,L] = processMNISTdata(imageFileName,labelFileName)

[fileID,errmsg] = fopen(imageFileName,'r','b');
if fileID < 0
    error(errmsg);
end
%%
% First read the magic number. This number is 2051 for image data, and
% 2049 for label data
magicNum = fread(fileID,1,'int32',0,'b');
if magicNum == 2051
    fprintf('\nRead MNIST image data...\n')
end
%%
% Then read the number of images, number of rows, and number of columns
numImages = fread(fileID,1,'int32',0,'b');
fprintf('Number of images in the dataset: %6d ...\n',numImages);
numRows = fread(fileID,1,'int32',0,'b');
numCols = fread(fileID,1,'int32',0,'b');
fprintf('Each image is of %2d by %2d pixels...\n',numRows,numCols);
%%
% Read the image data
X = fread(fileID,inf,'unsigned char');
%%
% Reshape the data to array X
X = reshape(X,numCols,numRows,numImages);
X = permute(X,[2 1 3]);
%%
% Then flatten each image data into a 1 by (numRows*numCols) vector, and 
% store all the image data into a numImages by (numRows*numCols) array.
X = reshape(X,numRows*numCols,numImages)';
fprintf(['The image data is read to a matrix of dimensions: %6d by %4d...\n',...
    'End of reading image data.\n'],size(X,1),size(X,2));
%%
% Close the file
fclose(fileID);
%%
% Similarly, read the label data.
[fileID,errmsg] = fopen(labelFileName,'r','b');
if fileID < 0
    error(errmsg);
end
magicNum = fread(fileID,1,'int32',0,'b');
if magicNum == 2049
    fprintf('\nRead MNIST label data...\n')
end
numItems = fread(fileID,1,'int32',0,'b');
fprintf('Number of labels in the dataset: %6d ...\n',numItems);

L = fread(fileID,inf,'unsigned char');
fprintf(['The label data is read to a matrix of dimensions: %6d by %2d...\n',...
    'End of reading label data.\n'],size(L,1),size(L,2));
fclose(fileID);