[1] Atkinson, A. C., and A. N. Donev.
*Optimum Experimental Designs*. New
York: Oxford University Press, 1992.

[2] Bates, D. M., and D. G. Watts.
*Nonlinear Regression Analysis and Its
Applications*. Hoboken, NJ: John Wiley &
Sons, Inc., 1988.

[3] Belsley, D. A., E. Kuh, and R. E. Welsch.
*Regression Diagnostics*. Hoboken, NJ:
John Wiley & Sons, Inc., 1980.

[4] Berry, M. W., et al. “Algorithms and
Applications for Approximate Nonnegative Matrix
Factorization.” *Computational Statistics and Data
Analysis*. Vol. 52, No. 1, 2007, pp.
155–173.

[5] Bookstein, Fred L. *Morphometric
Tools for Landmark Data*. Cambridge, UK:
Cambridge University Press, 1991.

[6] Bouye, E., V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli. “Copulas for Finance: A Reading Guide and Some Applications.” Working Paper. Groupe de Recherche Operationnelle, Credit Lyonnais, 2000.

[7] Bowman, A. W., and A. Azzalini.
*Applied Smoothing Techniques for Data
Analysis*. New York: Oxford University Press,
1997.

[8] Box, G. E. P., and N. R. Draper.
*Empirical Model-Building and Response
Surfaces*. Hoboken, NJ: John Wiley & Sons,
Inc., 1987.

[9] Box, G. E. P., W. G. Hunter, and J. S.
Hunter. *Statistics for Experimenters*.
Hoboken, NJ: Wiley-Interscience, 1978.

[10] Bratley, P., and B. L. Fox.
“ALGORITHM 659: Implementing Sobol's Quasirandom Sequence
Generator.” *ACM Transactions on Mathematical
Software*. Vol. 14, No. 1, 1988, pp.
88–100.

[11] Breiman, L. “Random Forests.”
*Machine Learning*. Vol. 45, 2001, pp.
5–32.

[12] Breiman, L., J. Friedman, R. Olshen, and C.
Stone. *Classification and Regression Trees*.
Boca Raton, FL: CRC Press, 1984.

[13] Bulmer, M. G. *Principles of
Statistics*. Mineola, NY: Dover Publications,
Inc., 1979.

[14] Bury, K. *Statistical Distributions
in Engineering*. Cambridge, UK: Cambridge
University Press, 1999.

[15] Chatterjee, S., and A. S. Hadi.
“Influential Observations, High Leverage Points, and Outliers
in Linear Regression.” *Statistical
Science*. Vol. 1, 1986, pp.
379–416.

[16] Collett, D. *Modeling Binary
Data*. New York: Chapman & Hall,
2002.

[17] Conover, W. J. *Practical
Nonparametric Statistics*. Hoboken, NJ: John
Wiley & Sons, Inc., 1980.

[18] Cook, R. D., and S. Weisberg.
*Residuals and Influence in
Regression*. New York: Chapman & Hall/CRC Press,
1983.

[19] Cox, D. R., and D. Oakes. *Analysis
of Survival Data*. London: Chapman & Hall,
1984.

[20] Davidian, M., and D. M. Giltinan.
*Nonlinear Models for Repeated Measurements
Data*. New York: Chapman & Hall,
1995.

[21] Deb, P., and M. Sefton. “The
Distribution of a Lagrange Multiplier Test of Normality.”
*Economics Letters*. Vol. 51, 1996, pp.
123–130.

[22] de Jong, S. “SIMPLS: An Alternative
Approach to Partial Least Squares Regression.”
*Chemometrics and Intelligent Laboratory
Systems*. Vol. 18, 1993, pp.
251–263.

[23] Demidenko, E. *Mixed Models:
Theory and Applications*. Hoboken, NJ: John Wiley
& Sons, Inc., 2004.

[24] Delyon, B., M. Lavielle, and E. Moulines.
“Convergence of a stochastic approximation version
of the EM algorithm.” *Annals of Statistics*. Vol. 27,
1999, pp. 94–128.

[25] Dempster, A. P., N. M. Laird, and D. B. Rubin.
“Maximum Likelihood from Incomplete Data via the EM
Algorithm.” *Journal of the Royal Statistical
Society*. Series B, Vol. 39, No. 1, 1977, pp.
1–38.

[26] Devroye, L. *Non-Uniform Random
Variate Generation*. New York: Springer-Verlag,
1986.

[27] Dobson, A. J. *An Introduction to
Generalized Linear Models*. New York: Chapman
& Hall, 1990.

[28] Dunn, O. J., and V. A. Clark. *Applied Statistics:
Analysis of Variance and Regression*. New York:
Wiley, 1974.

[29] Draper, N. R., and H. Smith. *Applied
Regression Analysis*. Hoboken, NJ:
Wiley-Interscience, 1998.

[30] Drezner, Z. “Computation of the
Trivariate Normal Integral.” *Mathematics of
Computation*. Vol. 63, 1994, pp.
289–294.

[31] Drezner, Z., and G. O. Wesolowsky.
“On the Computation of the Bivariate Normal Integral.”
*Journal of Statistical Computation and
Simulation*. Vol. 35, 1989, pp.
101–107.

[32] DuMouchel, W. H., and F. L. O'Brien.
“Integrating a Robust Option into a Multiple Regression
Computing Environment.” *Computer Science and
Statistics: Proceedings of the 21st
Symposium on the Interface*. Alexandria, VA:
American Statistical Association, 1989.

[33] Durbin, R., S. Eddy, A. Krogh, and G.
Mitchison. *Biological Sequence Analysis*.
Cambridge, UK: Cambridge University Press, 1998.

[34] Efron, B., and R. J. Tibshirani. *An
Introduction to the Bootstrap*. New York: Chapman
& Hall, 1993.

[35] Embrechts, P., C. Klüppelberg, and T.
Mikosch. *Modelling Extremal Events for Insurance and
Finance*. New York: Springer, 1997.

[36] Evans, M., N. Hastings, and B. Peacock.
*Statistical Distributions*. 2nd ed.,
Hoboken, NJ: John Wiley & Sons, Inc., 1993, pp. 50–52, 73–74,
102–105, 147, 148.

[37] Friedman, J. H. “Greedy function approximation: a
gradient boosting machine.” *The Annals of
Statistics*. Vol. 29, No. 5, 2001, pp.
1189–1232.

[38] Genz, A. “Numerical Computation of
Rectangular Bivariate and Trivariate Normal and t
Probabilities.” *Statistics and
Computing*. Vol. 14, No. 3, 2004, pp.
251–260.

[39] Genz, A., and F. Bretz. “Comparison
of Methods for the Computation of Multivariate t
Probabilities.” *Journal of Computational and
Graphical Statistics*. Vol. 11, No. 4, 2002, pp.
950–971.

[40] Genz, A., and F. Bretz. “Numerical
Computation of Multivariate t Probabilities with Application to
Power Calculation of Multiple Contrasts.” *Journal
of Statistical Computation and Simulation*. Vol.
63, 1999, pp. 361–378.

[41] Gibbons, J. D. *Nonparametric
Statistical Inference*. New York: Marcel Dekker,
1985.

[42] Goldstein, A., A. Kapelner, J. Bleich, and E. Pitkin.
“Peeking inside the black box: Visualizing statistical
learning with plots of individual conditional expectation.”
*Journal of Computational and Graphical
Statistics*. Vol. 24, No. 1, 2015, pp.
44–65.

[43] Goodall, C. R. “Computation Using the
QR Decomposition.” *Handbook in
Statistics*. Vol. 9, Amsterdam:
Elsevier/North-Holland, 1993.

[44] Goodnight, J. H., and F. M. Speed. *Computing Expected
Mean Squares*. Cary, NC: SAS Institute,
1978.

[45] Hahn, Gerald J., and S. S. Shapiro.
*Statistical Models in Engineering*.
Hoboken, NJ: John Wiley & Sons, Inc., 1994, p. 95.

[46] Hald, A. *Statistical Theory with
Engineering Applications*. Hoboken, NJ: John
Wiley & Sons, Inc., 1960.

[47] Harman, H. H. *Modern Factor
Analysis*. 3rd Ed. Chicago: University of Chicago
Press, 1976.

[48] Hastie, T., R. Tibshirani, and J. H. Friedman.
*The Elements of Statistical Learning*.
New York: Springer, 2001.

[49] Hill, P. D. “Kernel estimation of a distribution
function.” *Communications in Statistics – Theory
and Methods*. Vol. 14, Issue 3, 1985, pp.
605–620.

[50] Hochberg, Y., and A. C. Tamhane.
*Multiple Comparison Procedures*.
Hoboken, NJ: John Wiley & Sons, 1987.

[51] Hoerl, A. E., and R. W. Kennard.
“Ridge Regression: Applications to Nonorthogonal
Problems.” *Technometrics*. Vol. 12, No.
1, 1970, pp. 69–82.

[52] Hoerl, A. E., and R. W. Kennard.
“Ridge Regression: Biased Estimation for Nonorthogonal
Problems.” *Technometrics*. Vol. 12, No.
1, 1970, pp. 55–67.

[53] Hogg, R. V., and J. Ledolter.
*Engineering Statistics*. New York:
MacMillan, 1987.

[54] Holland, P. W., and R. E. Welsch.
“Robust Regression Using Iteratively Reweighted
Least-Squares.” *Communications in Statistics:
Theory and Methods*. Vol. A6,
1977, pp. 813–827.

[55] Hollander, M., and D. A. Wolfe.
*Nonparametric Statistical Methods*.
Hoboken, NJ: John Wiley & Sons, Inc., 1999.

[56] Hong, H. S., and F. J. Hickernell.
“ALGORITHM 823: Implementing Scrambled Digital
Sequences.” *ACM Transactions on Mathematical
Software*. Vol. 29, No. 2, 2003, pp.
95–109.

[57] Huang, P. S., H. Avron, T. N. Sainath, V. Sindhwani, and B.
Ramabhadran. “Kernel methods match Deep Neural Networks on
TIMIT.” *2014 IEEE International Conference on
Acoustics, Speech and Signal Processing*. 2014,
pp. 205–209.

[58] Huber, P. J. *Robust
Statistics*. Hoboken, NJ: John Wiley & Sons,
Inc., 1981.

[59] Jackson, J. E.
*A User's Guide to Principal
Components*. Hoboken, NJ: John Wiley and Sons,
1991.

[60] Jain, A., and R. Dubes.
*Algorithms for Clustering Data*. Upper
Saddle River, NJ: Prentice-Hall, 1988.

[61] Jarque, C. M., and A. K. Bera. “A test
for normality of observations and regression residuals.”
*International Statistical Review*.
Vol. 55, No. 2, 1987, pp. 163–172.

[62] Joe, S., and F. Y. Kuo. “Remark on
Algorithm 659: Implementing Sobol's Quasirandom Sequence
Generator.” *ACM Transactions on Mathematical
Software*. Vol. 29, No. 1, 2003, pp.
49–57.

[63] Johnson, N., and S. Kotz.
*Distributions in Statistics: Continuous
Univariate Distributions-2*. Hoboken, NJ: John
Wiley & Sons, Inc., 1970, pp. 130–148, 189–200, 201–219.

[64] Johnson, N. L., N. Balakrishnan, and S.
Kotz. *Continuous Multivariate Distributions*.
Vol. 1. Hoboken, NJ: Wiley-Interscience, 2000.

[65] Johnson, N. L., S. Kotz, and N. Balakrishnan.
*Continuous Univariate Distributions*.
Vol. 1, Hoboken, NJ: Wiley-Interscience, 1993.

[66] Johnson, N. L., S. Kotz, and N. Balakrishnan.
*Continuous Univariate Distributions*.
Vol. 2, Hoboken, NJ: Wiley-Interscience, 1994.

[67] Johnson, N. L., S. Kotz, and N. Balakrishnan.
*Discrete Multivariate Distributions*.
Hoboken, NJ: Wiley-Interscience, 1997.

[68] Johnson, N. L., S. Kotz, and A. W. Kemp.
*Univariate Discrete Distributions*.
Hoboken, NJ: Wiley-Interscience, 1993.

[69] Jolliffe, I. T. *Principal
Component Analysis*. 2nd ed., New York:
Springer-Verlag, 2002.

[70] Jones, M. C. “Simple boundary correction for kernel
density estimation.” *Statistics and
Computing*. Vol. 3, Issue 3, 1993, pp.
135–146.

[71] Jöreskog, K. G. “Some
Contributions to Maximum Likelihood Factor Analysis.”
*Psychometrika*. Vol. 32, 1967, pp.
443–482.

[72] Kaufman, L., and P. J. Rousseeuw.
*Finding Groups in Data: An Introduction to
Cluster Analysis*. Hoboken, NJ: John Wiley &
Sons, Inc., 1990.

[73] Kendall, David G. “A Survey of the
Statistical Theory of Shape.” *Statistical
Science*. Vol. 4, No. 2, 1989, pp.
87–99.

[74] Klein, J. P., and M. L. Moeschberger. *Survival
Analysis*. Statistics for Biology and Health. 2nd
edition. Springer, 2003.

[75] Kleinbaum, D. G., and M. Klein. *Survival
Analysis*. Statistics for Biology and Health. 2nd
edition. Springer, 2005.

[76] Kocis, L., and W. J. Whiten.
“Computational Investigations of Low-Discrepancy
Sequences.” *ACM Transactions on Mathematical
Software*. Vol. 23, No. 2, 1997, pp.
266–294.

[77] Kotz, S., and S. Nadarajah.
*Extreme Value Distributions: Theory and
Applications*. London: Imperial College Press,
2000.

[78] Krzanowski, W. J. *Principles of
Multivariate Analysis: A User's Perspective*. New
York: Oxford University Press, 1988.

[79] Lawless, J. F. *Statistical Models
and Methods for Lifetime Data*. Hoboken, NJ:
Wiley-Interscience, 2002.

[80] Lawley, D. N., and A. E. Maxwell.
*Factor Analysis as a Statistical
Method*. 2nd ed. New York: American Elsevier
Publishing, 1971.

[81] Le, Q., T. Sarlós, and A. Smola. “Fastfood —
Approximating Kernel Expansions in Loglinear Time.”
*Proceedings of the 30th International Conference
on Machine Learning*. Vol. 28, No. 3, 2013, pp.
244–252.

[82] Lilliefors, H. W. “On the
Kolmogorov-Smirnov test for normality with mean and variance
unknown.” *Journal of the American Statistical
Association*. Vol. 62, 1967, pp.
399–402.

[83] Lilliefors, H. W. “On the
Kolmogorov-Smirnov test for the exponential distribution with mean
unknown.” *Journal of the American Statistical
Association*. Vol. 64, 1969, pp.
387–389.

[84] Lindstrom, M. J., and D. M. Bates.
“Nonlinear mixed-effects models for repeated measures
data.” *Biometrics*. Vol. 46, 1990, pp.
673–687.

[85] Little, Roderick J. A., and Donald B. Rubin.
*Statistical Analysis with Missing
Data*. 2nd ed., Hoboken, NJ: John Wiley &
Sons, Inc., 2002.

[86] Mardia, K. V., J. T. Kent, and J. M. Bibby.
*Multivariate Analysis*. Burlington,
MA: Academic Press, 1980.

[87] Marquardt, D. W. “Generalized
Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear
Estimation.” *Technometrics*. Vol. 12,
No. 3, 1970, pp. 591–612.

[88] Marquardt, D. W., and R. D. Snee.
“Ridge Regression in Practice.” *The
American Statistician*. Vol. 29, No. 1, 1975, pp.
3–20.

[89] Marsaglia, G., and W. W. Tsang. “A
Simple Method for Generating Gamma Variables.” *ACM
Transactions on Mathematical Software*. Vol. 26,
2000, pp. 363–372.

[90] Marsaglia, G., W. Tsang, and J. Wang.
“Evaluating Kolmogorov’s Distribution.”
*Journal of Statistical Software*. Vol.
8, Issue 18, 2003.

[91] Martinez, W. L., and A. R. Martinez.
*Computational Statistics with MATLAB®*. New York: Chapman & Hall/CRC
Press, 2002.

[92] Massey, F. J. “The Kolmogorov-Smirnov
Test for Goodness of Fit.” *Journal of the American
Statistical Association*. Vol. 46, No. 253, 1951,
pp. 68–78.

[93] Matousek, J. “On the L2-Discrepancy
for Anchored Boxes.” *Journal of
Complexity*. Vol. 14, No. 4, 1998, pp.
527–556.

[94] McLachlan, G., and D. Peel.
*Finite Mixture Models*. Hoboken, NJ:
John Wiley & Sons, Inc., 2000.

[95] McCullagh, P., and J. A. Nelder.
*Generalized Linear Models*. New York:
Chapman & Hall, 1990.

[96] McGill, R., J. W. Tukey, and W. A. Larsen.
“Variations of Boxplots.” *The American
Statistician*. Vol. 32, No. 1, 1978, pp.
12–16.

[97] Meeker, W. Q., and L. A. Escobar.
*Statistical Methods for Reliability
Data*. Hoboken, NJ: John Wiley & Sons, Inc.,
1998.

[98] Meng, Xiao-Li, and Donald B. Rubin.
“Maximum Likelihood Estimation via the ECM Algorithm.”
*Biometrika*. Vol. 80, No. 2, 1993, pp.
267–278.

[99] Myers, R. H., and D. C. Montgomery.
*Response Surface Methodology: Process and Product
Optimization Using Designed Experiments*.
Hoboken, NJ: John Wiley & Sons, Inc., 1995.

[100] Miller, L. H. “Table of Percentage
Points of Kolmogorov Statistics.” *Journal of the
American Statistical Association*. Vol. 51, No.
273, 1956, pp. 111–121.

[101] Milliken, G. A., and D. E. Johnson.
*Analysis of Messy Data, Volume 1: Designed
Experiments*. Boca Raton, FL: Chapman &
Hall/CRC Press, 1992.

[102] Montgomery, D. *Introduction to
Statistical Quality Control*. Hoboken, NJ: John
Wiley & Sons, 1991, pp. 369–374.

[103] Montgomery, D. C. *Design and
Analysis of Experiments*. Hoboken, NJ: John Wiley
& Sons, Inc., 2001.

[104] Mood, A. M., F. A. Graybill, and D. C. Boes.
*Introduction to the Theory of
Statistics*. 3rd ed., New York: McGraw-Hill,
1974, pp. 540–541.

[105] Moore, J. *Total Biochemical
Oxygen Demand of Dairy Manures*. Ph.D. thesis.
University of Minnesota, Department of Agricultural Engineering,
1975.

[106] Mosteller, F., and J. Tukey. *Data
Analysis and Regression*. Upper Saddle River, NJ:
Addison-Wesley, 1977.

[107] Nelson, L. S. “Evaluating Overlapping
Confidence Intervals.” *Journal of Quality
Technology*. Vol. 21, 1989, pp.
140–141.

[108] Patel, J. K., C. H. Kapadia, and D. B. Owen.
*Handbook of Statistical
Distributions*. New York: Marcel Dekker, 1976.

[109] Pinheiro, J. C., and D. M. Bates.
“Approximations to the log-likelihood function in the
nonlinear mixed-effects model.” *Journal of
Computational and Graphical Statistics*. Vol. 4,
1995, pp. 12–35.

[110] Rahimi, A., and B. Recht. “Random Features for
Large-Scale Kernel Machines.” *Advances in Neural
Information Processing Systems*. Vol. 20, 2008,
pp. 1177–1184.

[111] Rice, J. A. *Mathematical Statistics
and Data Analysis*. Pacific Grove, CA: Duxbury
Press, 1994.

[112] Rosipal, R., and N. Kramer. “Overview
and Recent Advances in Partial Least Squares.”
*Subspace, Latent Structure and Feature Selection:
Statistical and Optimization Perspectives Workshop (SLSFS
2005), Revised Selected Papers (Lecture Notes in Computer
Science 3940)*. Berlin, Germany: Springer-Verlag,
2006, pp. 34–51.

[113] Sachs, L. *Applied Statistics: A
Handbook of Techniques*. New York:
Springer-Verlag, 1984, p. 253.

[114] Scott, D. W. *Multivariate Density Estimation: Theory,
Practice, and Visualization*. John Wiley &
Sons, 2015.

[115] Searle, S. R., F. M. Speed, and G. A.
Milliken. “Population marginal means in the linear model: an
alternative to least-squares means.” *American
Statistician*. 1980, pp. 216–221.

[116] Seber, G. A. F., and A. J. Lee.
*Linear Regression Analysis*. 2nd ed.
Hoboken, NJ: Wiley-Interscience, 2003.

[117] Seber, G. A. F. *Multivariate
Observations*. Hoboken, NJ: John Wiley &
Sons, Inc., 1984.

[118] Seber, G. A. F., and C. J. Wild.
*Nonlinear Regression*. Hoboken, NJ:
Wiley-Interscience, 2003.

[119] Sexton, Joe, and A. R. Swensen. “ECM
Algorithms that Converge at the Rate of EM.”
*Biometrika*. Vol. 87, No. 3, 2000, pp.
651–662.

[120] Silverman, B. W. *Density Estimation for Statistics
and Data Analysis*. Chapman & Hall/CRC,
1986.

[121] Snedecor, G. W., and W. G. Cochran.
*Statistical Methods*. Ames, IA: Iowa
State Press, 1989.

[122] Spath, H. *Cluster Dissection and
Analysis: Theory, FORTRAN Programs, Examples*.
Translated by J. Goldschmidt. New York: Halsted Press,
1985.

[123] Stein, M. “Large sample properties of
simulations using latin hypercube sampling.”
*Technometrics*. Vol. 29, No. 2, 1987,
pp. 143–151. Correction, Vol. 32, p. 367.

[124] Stephens, M. A. “Use of the
Kolmogorov-Smirnov, Cramer-Von Mises and Related Statistics Without
Extensive Tables.” *Journal of the Royal
Statistical Society*. Series B, Vol. 32, No. 1,
1970, pp. 115–122.

[125] Street, J. O., R. J. Carroll, and D.
Ruppert. “A Note on Computing Robust Regression Estimates via
Iteratively Reweighted Least Squares.” *The
American Statistician*. Vol. 42, 1988, pp.
152–154.

[126] Student. “On the Probable Error of
the Mean.” *Biometrika*. Vol. 6, No. 1,
1908, pp. 1–25.

[127] Velleman, P. F., and D. C. Hoaglin.
*Applications, Basics, and Computing of Exploratory
Data Analysis*. Pacific Grove, CA: Duxbury Press,
1981.

[128] Weibull, W. “A Statistical Theory of
the Strength of Materials.” *Ingeniors Vetenskaps
Akademiens Handlingar*. Stockholm: Royal Swedish
Institute for Engineering Research, No. 151, 1939.

[129] Zahn, C. T. “Graph-theoretical
methods for detecting and describing Gestalt clusters.”
*IEEE Transactions on Computers*. Vol.
C-20, Issue 1, 1971, pp. 68–86.