Introduction to sltoolbox
sltoolbox (Statistical Learning Toolbox) organizes a comprehensive set of MATLAB code for statistical learning, pattern recognition and computer vision. It includes 256 m-files in 24 categories, ranging from low-level computational routines to high-level frameworks and algorithms. The toolbox has the following main features:
(1) it covers many active research topics in learning and vision, including classification, regression, statistical modeling, finite mixture models, graph theory-based learning, subspace learning, kernel learning, manifold learning, tensor algebra, vector quantization and vocabulary learning.
(2) it offers many useful utilities to facilitate your experiments in MATLAB, including a set of kits to manipulate data, text and files. In addition, it offers a MATLAB-based script system called the experiment description language, together with an XML-based experiment control system, to help you run a large batch of experiments with ease.
(3) it is highly optimized. Much effort has been devoted to improving the run-time efficiency of the code. This is achieved in three ways: deriving equivalent mathematical forms for fast computation, grouping operations into matrix-based computations as far as possible, and writing C++ MEX (cpp-mex) code for the operations that cannot be organized into matrix computations.
(4) it is flexible and extensible. For most of the functions, you can control many properties to adapt their behaviour to your needs. For many algorithms, the implementations support weighted samples, so that you can easily incorporate the algorithm into a setting that uses sample weights. In addition, in some of the algorithms, you can change the function's behaviour by supplying your own callback function. For example, in K-means, you can specify your own function to measure distances or compute means; in spectral learning, you can specify your own function to calculate the graph edge weights.
(5) it is well organized. The whole toolbox is organized according to software engineering principles. It is not a simple collection of algorithms, but a carefully designed system, so that the code can be maximally reused and the components cooperate well.
(6) it is easy to use. Detailed help information is given for each m-file. I have tried to design user-friendly interfaces. For most of the functions, you can use a small number of arguments to invoke them with default settings. When you would like more control over their behaviour, you can pass your specification as property name/value pairs, such as
f(x1, x2, 'propertyname1', propertyvalue1, 'propertyname2', propertyvalue2, ...)
(7) it is robust. Attention has been paid to the numerical stability of the computations, and some steps have been taken to enhance it. In addition, many error-checking statements are used to check the consistency of the input arguments. I have tried to strike a good balance between robustness and efficiency, and to increase robustness without notably compromising run-time speed.
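As a rough illustration of features (3) and (4), the sketch below computes pairwise squared Euclidean distances in matrix form (no loop over sample pairs) and then swaps in a user-supplied distance callback with the same calling contract. It is only a sketch: the names sqeuclid and cityblock, the column-per-sample data layout, and the (A, B) -> n-by-k contract are assumptions made for this example and are not sltoolbox functions or conventions.

```matlab
% Illustrative sketch only; none of these names are sltoolbox functions.
X = [0 1 5 6; 0 1 5 6];   % d x n: four 2-D samples stored as columns (assumed layout)
C = [0 5; 0 5];           % d x k: two centers stored as columns

% Matrix form of the squared Euclidean distance:
%   ||x - c||^2 = ||x||^2 + ||c||^2 - 2 * x' * c
% max(..., 0) clips tiny negative values caused by round-off.
sqeuclid = @(A, B) max(bsxfun(@plus, sum(A.^2, 1)', sum(B.^2, 1)) - 2 * (A' * B), 0);

% A hypothetical user-supplied callback with the same contract:
% city-block (L1) distance between every column of A and every column of B.
cityblock = @(A, B) sum(abs(bsxfun(@minus, permute(A, [2 3 1]), permute(B, [3 2 1]))), 3);

% Assignment step of a K-means-like procedure under each distance.
D2 = sqeuclid(X, C);                 % n x k matrix of squared L2 distances
[~, labels_l2] = min(D2, [], 2);     % index of the nearest center per sample

D1 = cityblock(X, C);                % n x k matrix of L1 distances
[~, labels_l1] = min(D1, [], 2);
```

Because both handles take the same arguments and return an n-by-k matrix, the assignment step does not need to know which distance is in use; this is the kind of substitution a callback-based interface makes possible.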
The following is a brief list of the functions offered in sltoolbox.
It contains the following categories:
core: Core computational routines. Efficient implementations of a set of common computations.
smallmat: Fast functions to compute on a set of small matrices
utils: A set of useful toolkits to manipulate data.
utils_ex: Other useful kits
fileio: Facilities to manage files
text: Kits to parse and manipulate strings and texts.
perfeval: Classification performance evaluation
imgproc: Functions for image-based learning and batch image processing
visualize: Visualization of data and models
xmlkits: Small kits to extract information from XML elements
ann: Approximate nearest neighbors by KD-tree
cluster: Data clustering
discrete: Vector quantization, vocabulary building and histogram-based computation
graph: Graph (in the graph-theory sense) construction
interp: Interpolation kernels
kernel: Kernel learning and kernelization
learn: Some basic learning architectures
regression: Linear and Logistic regression
stat: Statistical modeling and Finite mixture model (such as GMM)
subspace: Representative subspace learning algorithms
subspace_ex: Subspace learning algorithms for very high-dimensional data
manifold: Manifold embedding learning
tensor: Tensor algebra
expdl: Experiment description language