The Statistical Reference Datasets (StRD) project, maintained by staff of the Statistical Engineering Division within the Information Technology Laboratory of the National Institute of Standards and Technology (NIST), is a publicly available collection of datasets intended as benchmarks for testing statistical software.
The datasets are maintained by NIST, a US federal government agency, and NIST has confirmed to me that this places the data itself entirely in the public domain.
With this in mind, for convenience and as a service to the wider MATLAB community, I have cast all the nonlinear regression datasets into an easy-to-use MAT file containing one "struct" object per dataset, each of which comprises:
* the independent (predictor) variable, x
* the dependent (response) variable, y, holding the observed or simulated values
* a function handle describing the model function f(b,x)
* b0 and b1, the two starting points given with each dataset
* the certified parameter values, breal
* the standard deviations given for those parameter values, bsd
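As a rough illustration of how a struct laid out like this might be used, the sketch below builds a stand-in struct by hand and fits it with fminsearch from base MATLAB. Everything in it is an assumption made for illustration: the field name f for the function handle is hypothetical (the list above does not name that field), the data are synthetic and noise-free rather than NIST values, and the model merely borrows the functional form of the Misra1a problem. With a real dataset from the MAT file, the struct would be loaded instead of constructed.

```matlab
% Hypothetical stand-in for one dataset struct. The field name 'f' and all
% numbers below are illustrative assumptions, NOT NIST data.
ds.f     = @(b, x) b(1) * (1 - exp(-b(2) * x));  % model function f(b,x)
ds.x     = linspace(100, 5000, 25)';             % predictor values
ds.breal = [240; 5.5e-4];                        % "true" parameters of this toy problem
ds.y     = ds.f(ds.breal, ds.x);                 % synthetic, noise-free response
ds.b0    = [500; 1e-4];                          % first starting point
ds.b1    = [250; 5e-4];                          % second starting point

% Residual sum of squares as a function of the parameter vector b
rss = @(b) sum((ds.y - ds.f(b, ds.x)).^2);

% Minimise from the second starting point with Nelder-Mead (fminsearch)
opts = optimset('TolX', 1e-12, 'TolFun', 1e-14, ...
                'MaxFunEvals', 5000, 'MaxIter', 5000);
bhat = fminsearch(rss, ds.b1, opts);

% Compare the estimate against the reference values, element by element
relerr = abs(bhat - ds.breal) ./ abs(ds.breal);
fprintf('relative error in b(%d): %g\n', [1:numel(relerr); relerr']);
```

Repeating the fit from ds.b0 and comparing the two results is one way to probe how sensitive a solver is to the starting point. fminsearch is used here only because it needs no toolbox; a dedicated least-squares solver would normally be the thing under test.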
The following quote from the NIST group's website motivates this project:
"...most evaluations of nonlinear least squares software should also include a measure of the reliability of the code, that is, whether the code correctly recognizes when it has (or has not) found a solution. The datasets provided here are particularly well suited for such testing of robustness and reliability. We have included both generated and 'real-world' nonlinear least squares problems of varying levels of difficulty. The generated datasets are designed to challenge specific computations. Real-world data include challenging datasets such as the Thurber problem, and more benign datasets such as Misra1a. The certified values are 'best-available' solutions, obtained using 128-bit precision and confirmed by at least two different algorithms and software packages using analytic derivatives."
I hope this collection is useful to anyone working with nonlinear regression. Let me know if it has been of use!
Information on the entire StRD suite:
Information on nonlinear regression data:
The NLR data itself:
IMPORTANT COPYRIGHT NOTE:
This data is PUBLIC DOMAIN data by virtue of being published by a US federal government agency. I hold NO copyright on it; my contribution has been to faithfully cast the data in a convenient MAT file for use in MATLAB.
If you find this useful, please cite the NIST group in the first instance.