Atkinson, A. C., and A. N. Donev. Optimum Experimental Designs. New York: Oxford University Press, 1992.
 Bates, D. M., and D. G. Watts. Nonlinear Regression Analysis and Its Applications. Hoboken, NJ: John Wiley & Sons, Inc., 1988.
 Belsley, D. A., E. Kuh, and R. E. Welsch. Regression Diagnostics. Hoboken, NJ: John Wiley & Sons, Inc., 1980.
 Berry, M. W., et al. “Algorithms and Applications for Approximate Nonnegative Matrix Factorization.” Computational Statistics and Data Analysis. Vol. 52, No. 1, 2007, pp. 155–173.
 Bookstein, Fred L. Morphometric Tools for Landmark Data. Cambridge, UK: Cambridge University Press, 1991.
 Bouye, E., V. Durrleman, A. Nikeghbali, G. Riboulet, and T. Roncalli. “Copulas for Finance: A Reading Guide and Some Applications.” Working Paper. Groupe de Recherche Operationnelle, Credit Lyonnais, 2000.
 Bowman, A. W., and A. Azzalini. Applied Smoothing Techniques for Data Analysis. New York: Oxford University Press, 1997.
 Box, G. E. P., and N. R. Draper. Empirical Model-Building and Response Surfaces. Hoboken, NJ: John Wiley & Sons, Inc., 1987.
 Box, G. E. P., W. G. Hunter, and J. S. Hunter. Statistics for Experimenters. Hoboken, NJ: Wiley-Interscience, 1978.
 Bratley, P., and B. L. Fox. “ALGORITHM 659: Implementing Sobol's Quasirandom Sequence Generator.” ACM Transactions on Mathematical Software. Vol. 14, No. 1, 1988, pp. 88–100.
 Breiman, L. “Random Forests.” Machine Learning. Vol. 45, 2001, pp. 5–32.
 Breiman, L., J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Boca Raton, FL: CRC Press, 1984.
 Bulmer, M. G. Principles of Statistics. Mineola, NY: Dover Publications, Inc., 1979.
 Bury, K. Statistical Distributions in Engineering. Cambridge, UK: Cambridge University Press, 1999.
 Chatterjee, S., and A. S. Hadi. “Influential Observations, High Leverage Points, and Outliers in Linear Regression.” Statistical Science. Vol. 1, 1986, pp. 379–416.
 Collett, D. Modeling Binary Data. New York: Chapman & Hall, 2002.
 Conover, W. J. Practical Nonparametric Statistics. Hoboken, NJ: John Wiley & Sons, Inc., 1980.
 Cook, R. D., and S. Weisberg. Residuals and Influence in Regression. New York: Chapman & Hall/CRC Press, 1983.
 Cox, D. R., and D. Oakes. Analysis of Survival Data. London: Chapman & Hall, 1984.
 Davidian, M., and D. M. Giltinan. Nonlinear Models for Repeated Measurements Data. New York: Chapman & Hall, 1995.
 Deb, P., and M. Sefton. “The Distribution of a Lagrange Multiplier Test of Normality.” Economics Letters. Vol. 51, 1996, pp. 123–130.
 de Jong, S. “SIMPLS: An Alternative Approach to Partial Least Squares Regression.” Chemometrics and Intelligent Laboratory Systems. Vol. 18, 1993, pp. 251–263.
 Demidenko, E. Mixed Models: Theory and Applications. Hoboken, NJ: John Wiley & Sons, Inc., 2004.
 Delyon, B., M. Lavielle, and E. Moulines. “Convergence of a stochastic approximation version of the EM algorithm.” Annals of Statistics. Vol. 27, 1999, pp. 94–128.
 Dempster, A. P., N. M. Laird, and D. B. Rubin. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society. Series B, Vol. 39, No. 1, 1977, pp. 1–37.
 Devroye, L. Non-Uniform Random Variate Generation. New York: Springer-Verlag, 1986.
 Dobson, A. J. An Introduction to Generalized Linear Models. New York: Chapman & Hall, 1990.
 Dunn, O. J., and V. A. Clark. Applied Statistics: Analysis of Variance and Regression. New York: Wiley, 1974.
 Draper, N. R., and H. Smith. Applied Regression Analysis. Hoboken, NJ: Wiley-Interscience, 1998.
 Drezner, Z. “Computation of the Trivariate Normal Integral.” Mathematics of Computation. Vol. 63, 1994, pp. 289–294.
 Drezner, Z., and G. O. Wesolowsky. “On the Computation of the Bivariate Normal Integral.” Journal of Statistical Computation and Simulation. Vol. 35, 1989, pp. 101–107.
 DuMouchel, W. H., and F. L. O'Brien. “Integrating a Robust Option into a Multiple Regression Computing Environment.” Computer Science and Statistics: Proceedings of the 21st Symposium on the Interface. Alexandria, VA: American Statistical Association, 1989.
 Durbin, R., S. Eddy, A. Krogh, and G. Mitchison. Biological Sequence Analysis. Cambridge, UK: Cambridge University Press, 1998.
 Efron, B., and R. J. Tibshirani. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
 Embrechts, P., C. Klüppelberg, and T. Mikosch. Modelling Extremal Events for Insurance and Finance. New York: Springer, 1997.
 Evans, M., N. Hastings, and B. Peacock. Statistical Distributions. 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 1993, pp. 50–52, 73–74, 102–105, 147, 148.
 Friedman, J. H. “Greedy function approximation: a gradient boosting machine.” The Annals of Statistics. Vol. 29, No. 5, 2001, pp. 1189–1232.
 Genz, A. “Numerical Computation of Rectangular Bivariate and Trivariate Normal and t Probabilities.” Statistics and Computing. Vol. 14, No. 3, 2004, pp. 251–260.
 Genz, A., and F. Bretz. “Comparison of Methods for the Computation of Multivariate t Probabilities.” Journal of Computational and Graphical Statistics. Vol. 11, No. 4, 2002, pp. 950–971.
 Genz, A., and F. Bretz. “Numerical Computation of Multivariate t Probabilities with Application to Power Calculation of Multiple Contrasts.” Journal of Statistical Computation and Simulation. Vol. 63, 1999, pp. 361–378.
 Gibbons, J. D. Nonparametric Statistical Inference. New York: Marcel Dekker, 1985.
 Goldstein, A., A. Kapelner, J. Bleich, and E. Pitkin. “Peeking inside the black box: Visualizing statistical learning with plots of individual conditional expectation.” Journal of Computational and Graphical Statistics. Vol. 24, No. 1, 2015, pp. 44–65.
 Goodall, C. R. “Computation Using the QR Decomposition.” Handbook in Statistics. Vol. 9, Amsterdam: Elsevier/North-Holland, 1993.
 Goodnight, J. H., and F. M. Speed. Computing Expected Mean Squares. Cary, NC: SAS Institute, 1978.
 Hahn, Gerald J., and S. S. Shapiro. Statistical Models in Engineering. Hoboken, NJ: John Wiley & Sons, Inc., 1994, p. 95.
 Hald, A. Statistical Theory with Engineering Applications. Hoboken, NJ: John Wiley & Sons, Inc., 1960.
 Harman, H. H. Modern Factor Analysis. 3rd Ed. Chicago: University of Chicago Press, 1976.
 Hastie, T., R. Tibshirani, and J. H. Friedman. The Elements of Statistical Learning. New York: Springer, 2001.
 Hill, P. D. “Kernel estimation of a distribution function.” Communications in Statistics – Theory and Methods. Vol. 14, Issue 3, 1985, pp. 605–620.
 Hochberg, Y., and A. C. Tamhane. Multiple Comparison Procedures. Hoboken, NJ: John Wiley & Sons, 1987.
 Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Applications to Nonorthogonal Problems.” Technometrics. Vol. 12, No. 1, 1970, pp. 69–82.
 Hoerl, A. E., and R. W. Kennard. “Ridge Regression: Biased Estimation for Nonorthogonal Problems.” Technometrics. Vol. 12, No. 1, 1970, pp. 55–67.
 Hogg, R. V., and J. Ledolter. Engineering Statistics. New York: MacMillan, 1987.
 Holland, P. W., and R. E. Welsch. “Robust Regression Using Iteratively Reweighted Least-Squares.” Communications in Statistics: Theory and Methods, A6, 1977, pp. 813–827.
 Hollander, M., and D. A. Wolfe. Nonparametric Statistical Methods. Hoboken, NJ: John Wiley & Sons, Inc., 1999.
 Hong, H. S., and F. J. Hickernell. “ALGORITHM 823: Implementing Scrambled Digital Sequences.” ACM Transactions on Mathematical Software. Vol. 29, No. 2, 2003, pp. 95–109.
 Huang, P. S., H. Avron, T. N. Sainath, V. Sindhwani, and B. Ramabhadran. “Kernel methods match Deep Neural Networks on TIMIT.” 2014 IEEE International Conference on Acoustics, Speech and Signal Processing. 2014, pp. 205–209.
 Huber, P. J. Robust Statistics. Hoboken, NJ: John Wiley & Sons, Inc., 1981.
 Jackson, J. E. A User's Guide to Principal Components. Hoboken, NJ: John Wiley & Sons, Inc., 1991.
 Jain, A., and R. Dubes. Algorithms for Clustering Data. Upper Saddle River, NJ: Prentice-Hall, 1988.
 Jarque, C. M., and A. K. Bera. “A test for normality of observations and regression residuals.” International Statistical Review. Vol. 55, No. 2, 1987, pp. 163–172.
 Joe, S., and F. Y. Kuo. “Remark on Algorithm 659: Implementing Sobol's Quasirandom Sequence Generator.” ACM Transactions on Mathematical Software. Vol. 29, No. 1, 2003, pp. 49–57.
 Johnson, N., and S. Kotz. Distributions in Statistics: Continuous Univariate Distributions-2. Hoboken, NJ: John Wiley & Sons, Inc., 1970, pp. 130–148, 189–200, 201–219.
 Johnson, N. L., N. Balakrishnan, and S. Kotz. Continuous Multivariate Distributions. Vol. 1. Hoboken, NJ: Wiley-Interscience, 2000.
 Johnson, N. L., S. Kotz, and N. Balakrishnan. Continuous Univariate Distributions. Vol. 1, Hoboken, NJ: Wiley-Interscience, 1993.
 Johnson, N. L., S. Kotz, and N. Balakrishnan. Continuous Univariate Distributions. Vol. 2, Hoboken, NJ: Wiley-Interscience, 1994.
 Johnson, N. L., S. Kotz, and N. Balakrishnan. Discrete Multivariate Distributions. Hoboken, NJ: Wiley-Interscience, 1997.
 Johnson, N. L., S. Kotz, and A. W. Kemp. Univariate Discrete Distributions. Hoboken, NJ: Wiley-Interscience, 1993.
 Jolliffe, I. T. Principal Component Analysis. 2nd ed., New York: Springer-Verlag, 2002.
 Jones, M. C. “Simple boundary correction for kernel density estimation.” Statistics and Computing. Vol. 3, Issue 3, 1993, pp. 135–146.
 Jöreskog, K. G. “Some Contributions to Maximum Likelihood Factor Analysis.” Psychometrika. Vol. 32, 1967, pp. 443–482.
 Kaufman, L., and P. J. Rousseeuw. Finding Groups in Data: An Introduction to Cluster Analysis. Hoboken, NJ: John Wiley & Sons, Inc., 1990.
 Kempka, Michał, Wojciech Kotłowski, and Manfred K. Warmuth. “Adaptive Scale-Invariant Online Algorithms for Learning Linear Models.” Preprint, submitted February 10, 2019. https://arxiv.org/abs/1902.07528.
 Kendall, David G. “A Survey of the Statistical Theory of Shape.” Statistical Science. Vol. 4, No. 2, 1989, pp. 87–99.
 Klein, J. P., and M. L. Moeschberger. Survival Analysis. Statistics for Biology and Health. 2nd edition. Springer, 2003.
 Kleinbaum, D. G., and M. Klein. Survival Analysis. Statistics for Biology and Health. 2nd edition. Springer, 2005.
 Kocis, L., and W. J. Whiten. “Computational Investigations of Low-Discrepancy Sequences.” ACM Transactions on Mathematical Software. Vol. 23, No. 2, 1997, pp. 266–294.
 Kotz, S., and S. Nadarajah. Extreme Value Distributions: Theory and Applications. London: Imperial College Press, 2000.
 Krzanowski, W. J. Principles of Multivariate Analysis: A User's Perspective. New York: Oxford University Press, 1988.
 Lawless, J. F. Statistical Models and Methods for Lifetime Data. Hoboken, NJ: Wiley-Interscience, 2002.
 Lawley, D. N., and A. E. Maxwell. Factor Analysis as a Statistical Method. 2nd ed. New York: American Elsevier Publishing, 1971.
 Le, Q., T. Sarlós, and A. Smola. “Fastfood — Approximating Kernel Expansions in Loglinear Time.” Proceedings of the 30th International Conference on Machine Learning. Vol. 28, No. 3, 2013, pp. 244–252.
 Lilliefors, H. W. “On the Kolmogorov-Smirnov test for normality with mean and variance unknown.” Journal of the American Statistical Association. Vol. 62, 1967, pp. 399–402.
 Lilliefors, H. W. “On the Kolmogorov-Smirnov test for the exponential distribution with mean unknown.” Journal of the American Statistical Association. Vol. 64, 1969, pp. 387–389.
 Lindstrom, M. J., and D. M. Bates. “Nonlinear mixed-effects models for repeated measures data.” Biometrics. Vol. 46, 1990, pp. 673–687.
 Liu, F. T., K. M. Ting, and Z. Zhou. “Isolation Forest.” 2008 Eighth IEEE International Conference on Data Mining. Pisa, Italy, 2008, pp. 413–422.
 Little, Roderick J. A., and Donald B. Rubin. Statistical Analysis with Missing Data. 2nd ed., Hoboken, NJ: John Wiley & Sons, Inc., 2002.
 Mardia, K. V., J. T. Kent, and J. M. Bibby. Multivariate Analysis. Burlington, MA: Academic Press, 1980.
 Marquardt, D. W. “Generalized Inverses, Ridge Regression, Biased Linear Estimation, and Nonlinear Estimation.” Technometrics. Vol. 12, No. 3, 1970, pp. 591–612.
 Marquardt, D. W., and R. D. Snee. “Ridge Regression in Practice.” The American Statistician. Vol. 29, No. 1, 1975, pp. 3–20.
 Marsaglia, G., and W. W. Tsang. “A Simple Method for Generating Gamma Variables.” ACM Transactions on Mathematical Software. Vol. 26, 2000, pp. 363–372.
 Marsaglia, G., W. Tsang, and J. Wang. “Evaluating Kolmogorov’s Distribution.” Journal of Statistical Software. Vol. 8, Issue 18, 2003.
 Martinez, W. L., and A. R. Martinez. Computational Statistics with MATLAB®. New York: Chapman & Hall/CRC Press, 2002.
 Massey, F. J. “The Kolmogorov-Smirnov Test for Goodness of Fit.” Journal of the American Statistical Association. Vol. 46, No. 253, 1951, pp. 68–78.
 Matousek, J. “On the L2-Discrepancy for Anchored Boxes.” Journal of Complexity. Vol. 14, No. 4, 1998, pp. 527–556.
 McLachlan, G., and D. Peel. Finite Mixture Models. Hoboken, NJ: John Wiley & Sons, Inc., 2000.
 McCullagh, P., and J. A. Nelder. Generalized Linear Models. New York: Chapman & Hall, 1990.
 McGill, R., J. W. Tukey, and W. A. Larsen. “Variations of Boxplots.” The American Statistician. Vol. 32, No. 1, 1978, pp. 12–16.
 Meeker, W. Q., and L. A. Escobar. Statistical Methods for Reliability Data. Hoboken, NJ: John Wiley & Sons, Inc., 1998.
 Meng, Xiao-Li, and Donald B. Rubin. “Maximum Likelihood Estimation via the ECM Algorithm.” Biometrika. Vol. 80, No. 2, 1993, pp. 267–278.
 Myers, R. H., and D. C. Montgomery. Response Surface Methodology: Process and Product Optimization Using Designed Experiments. Hoboken, NJ: John Wiley & Sons, Inc., 1995.
 Miller, L. H. “Table of Percentage Points of Kolmogorov Statistics.” Journal of the American Statistical Association. Vol. 51, No. 273, 1956, pp. 111–121.
 Milliken, G. A., and D. E. Johnson. Analysis of Messy Data, Volume 1: Designed Experiments. Boca Raton, FL: Chapman & Hall/CRC Press, 1992.
 Montgomery, D. Introduction to Statistical Quality Control. Hoboken, NJ: John Wiley & Sons, 1991, pp. 369–374.
 Montgomery, D. C. Design and Analysis of Experiments. Hoboken, NJ: John Wiley & Sons, Inc., 2001.
 Mood, A. M., F. A. Graybill, and D. C. Boes. Introduction to the Theory of Statistics. 3rd ed., New York: McGraw-Hill, 1974, pp. 540–541.
 Moore, J. Total Biochemical Oxygen Demand of Dairy Manures. Ph.D. thesis. University of Minnesota, Department of Agricultural Engineering, 1975.
 Mosteller, F., and J. Tukey. Data Analysis and Regression. Upper Saddle River, NJ: Addison-Wesley, 1977.
 Nelson, L. S. “Evaluating Overlapping Confidence Intervals.” Journal of Quality Technology. Vol. 21, 1989, pp. 140–141.
 Patel, J. K., C. H. Kapadia, and D. B. Owen. Handbook of Statistical Distributions. New York: Marcel Dekker, 1976.
 Pinheiro, J. C., and D. M. Bates. “Approximations to the log-likelihood function in the nonlinear mixed-effects model.” Journal of Computational and Graphical Statistics. Vol. 4, 1995, pp. 12–35.
 Rahimi, A., and B. Recht. “Random Features for Large-Scale Kernel Machines.” Advances in Neural Information Processing Systems. Vol. 20, 2008, pp. 1177–1184.
 Rice, J. A. Mathematical Statistics and Data Analysis. Pacific Grove, CA: Duxbury Press, 1994.
 Rosipal, R., and N. Kramer. “Overview and Recent Advances in Partial Least Squares.” Subspace, Latent Structure and Feature Selection: Statistical and Optimization Perspectives Workshop (SLSFS 2005), Revised Selected Papers (Lecture Notes in Computer Science 3940). Berlin, Germany: Springer-Verlag, 2006, pp. 34–51.
 Sachs, L. Applied Statistics: A Handbook of Techniques. New York: Springer-Verlag, 1984, p. 253.
 Scott, D. W. Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, 2015.
 Searle, S. R., F. M. Speed, and G. A. Milliken. “Population marginal means in the linear model: an alternative to least-squares means.” American Statistician. 1980, pp. 216–221.
 Seber, G. A. F., and A. J. Lee. Linear Regression Analysis. 2nd ed. Hoboken, NJ: Wiley-Interscience, 2003.
 Seber, G. A. F. Multivariate Observations. Hoboken, NJ: John Wiley & Sons, Inc., 1984.
 Seber, G. A. F., and C. J. Wild. Nonlinear Regression. Hoboken, NJ: Wiley-Interscience, 2003.
 Sexton, Joe, and A. R. Swensen. “ECM Algorithms that Converge at the Rate of EM.” Biometrika. Vol. 87, No. 3, 2000, pp. 651–662.
 Silverman, B. W. Density Estimation for Statistics and Data Analysis. Chapman & Hall/CRC, 1986.
 Snedecor, G. W., and W. G. Cochran. Statistical Methods. Ames, IA: Iowa State Press, 1989.
 Spath, H. Cluster Dissection and Analysis: Theory, FORTRAN Programs, Examples. Translated by J. Goldschmidt. New York: Halsted Press, 1985.
 Stein, M. “Large sample properties of simulations using latin hypercube sampling.” Technometrics. Vol. 29, No. 2, 1987, pp. 143–151. Correction, Vol. 32, p. 367.
 Stephens, M. A. “Use of the Kolmogorov-Smirnov, Cramer-Von Mises and Related Statistics Without Extensive Tables.” Journal of the Royal Statistical Society. Series B, Vol. 32, No. 1, 1970, pp. 115–122.
 Street, J. O., R. J. Carroll, and D. Ruppert. “A Note on Computing Robust Regression Estimates via Iteratively Reweighted Least Squares.” The American Statistician. Vol. 42, 1988, pp. 152–154.
 Student. “On the Probable Error of the Mean.” Biometrika. Vol. 6, No. 1, 1908, pp. 1–25.
 Velleman, P. F., and D. C. Hoaglin. Applications, Basics, and Computing of Exploratory Data Analysis. Pacific Grove, CA: Duxbury Press, 1981.
 Weibull, W. “A Statistical Theory of the Strength of Materials.” Ingeniors Vetenskaps Akademiens Handlingar. Stockholm: Royal Swedish Institute for Engineering Research, No. 151, 1939.
 Zahn, C. T. “Graph-theoretical methods for detecting and describing Gestalt clusters.” IEEE Transactions on Computers. Vol. C-20, Issue 1, 1971, pp. 68–86.