[1] Agresti, A. Categorical Data Analysis, 2nd Ed. John Wiley & Sons, Inc.: Hoboken, NJ, 2002.

[2] Allwein, E., R. Schapire, and Y. Singer. "Reducing multiclass to binary: A unifying approach for margin classifiers." Journal of Machine Learning Research. Vol. 1, 2000, pp. 113–141.

[3] Alpaydin, E. "Combined 5 x 2 CV F Test for Comparing Supervised Classification Learning Algorithms." Neural Computation, Vol. 11, No. 8, pp. 1885–1892, 1999.

[4] Blackard, J. A. and D. J. Dean. Comparative accuracies of artificial neural networks and discriminant analysis in predicting forest cover types from cartographic variables. Computers and Electronics in Agriculture 24, pp. 131–151, 1999.

[5] Bottou, L., and Chih-Jen Lin. Support Vector Machine Solvers. Available at

[6] Bouckaert, R. "Choosing Between Two Learning Algorithms Based on Calibrated Tests." International Conference on Machine Learning, pp. 51–58, 2003.

[7] Bouckaert, R. and E. Frank. "Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms." In Advances in Knowledge Discovery and Data Mining, 8th Pacific-Asia Conference, pp. 3–12, 2004.

[8] Breiman, L. Bagging Predictors. Machine Learning 26, pp. 123–140, 1996.

[9] Breiman, L. Random Forests. Machine Learning 45, pp. 5–32, 2001.

[11] Breiman, L., et al. Classification and Regression Trees. Chapman & Hall, Boca Raton, 1993.

[12] Cristianini, N., and J. Shawe-Taylor. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, Cambridge, UK, 2000.

[13] Dietterich, T. "Approximate statistical tests for comparing supervised classification learning algorithms." Neural Computation, Vol. 10, No. 7, pp. 1895–1923, 1998.

[14] Dietterich, T., and G. Bakiri. "Solving Multiclass Learning Problems Via Error-Correcting Output Codes." Journal of Artificial Intelligence Research. Vol. 2, 1995, pp. 263–286.

[15] Escalera, S., O. Pujol, and P. Radeva. "On the decoding process in ternary error-correcting output codes." IEEE Transactions on Pattern Analysis and Machine Intelligence. Vol. 32, Issue 1, 2010, pp. 120–134.

[16] Escalera, S., O. Pujol, and P. Radeva. "Separability of ternary codes for sparse designs of error-correcting output codes." Pattern Recognition Letters. Vol. 30, Issue 3, 2009, pp. 285–297.

[17] Fan, R.-E., P.-H. Chen, and C.-J. Lin. "Working set selection using second order information for training support vector machines." Journal of Machine Learning Research, Vol. 6, 2005, pp. 1889–1918.

[18] Fagerland, M. W., S. Lydersen, and P. Laake. "The McNemar Test for Binary Matched-Pairs Data: Mid-p and Asymptotic Are Better Than Exact Conditional." BMC Medical Research Methodology. Vol. 13, 2013, pp. 1–8.

[19] Freund, Y. A more robust boosting algorithm. arXiv:0905.2138v1, 2009.

[20] Freund, Y. and R. E. Schapire. A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting. J. of Computer and System Sciences, Vol. 55, pp. 119–139, 1997.

[21] Friedman, J. Greedy function approximation: A gradient boosting machine. Annals of Statistics, Vol. 29, No. 5, pp. 1189–1232, 2001.

[22] Friedman, J., T. Hastie, and R. Tibshirani. Additive logistic regression: A statistical view of boosting. Annals of Statistics, Vol. 28, No. 2, pp. 337–407, 2000.

[23] Hastie, T., and R. Tibshirani. "Classification by Pairwise Coupling." Annals of Statistics. Vol. 26, Issue 2, 1998, pp. 451–471.

[24] Hastie, T., R. Tibshirani, and J. Friedman. The Elements of Statistical Learning, second edition. Springer, New York, 2008.

[25] Ho, C. H. and C. J. Lin. "Large-Scale Linear Support Vector Regression." Journal of Machine Learning Research, Vol. 13, 2012, pp. 3323–3348.

[26] Ho, T. K. The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 20, No. 8, pp. 832–844, 1998.

[27] Hsieh, C. J., K. W. Chang, C. J. Lin, S. S. Keerthi, and S. Sundararajan. "A Dual Coordinate Descent Method for Large-Scale Linear SVM." Proceedings of the 25th International Conference on Machine Learning, ICML '08, 2008, pp. 408–415.

[28] Hsu, Chih-Wei, Chih-Chung Chang, and Chih-Jen Lin. A Practical Guide to Support Vector Classification. Available at

[29] Hu, Q., X. Che, L. Zhang, and D. Yu. "Feature Evaluation and Selection Based on Neighborhood Soft Margin." Neurocomputing. Vol. 73, 2010, pp. 2114–2124.

[30] Kecman, V., T.-M. Huang, and M. Vogt. "Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance." In Support Vector Machines: Theory and Applications. Edited by Lipo Wang, 255–274. Berlin: Springer-Verlag, 2005.

[31] Kohavi, R. "Scaling Up the Accuracy of Naive-Bayes Classifiers: a Decision-Tree Hybrid." Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996.

[32] Lancaster, H.O. "Significance Tests in Discrete Distributions." JASA, Vol. 56, Number 294, 1961, pp. 223–234.

[33] Langford, J., L. Li, and T. Zhang. "Sparse Online Learning Via Truncated Gradient." J. Mach. Learn. Res., Vol. 10, 2009, pp. 777–801.

[34] Loh, W.Y. "Regression Trees with Unbiased Variable Selection and Interaction Detection." Statistica Sinica, Vol. 12, 2002, pp. 361–386.

[35] Loh, W.Y. and Y.S. Shih. "Split Selection Methods for Classification Trees." Statistica Sinica, Vol. 7, 1997, pp. 815–840.

[36] McNemar, Q. "Note on the Sampling Error of the Difference Between Correlated Proportions or Percentages." Psychometrika, Vol. 12, Number 2, 1947, pp. 153–157.

[37] Meinshausen, N. "Quantile Regression Forests." Journal of Machine Learning Research, Vol. 7, 2006, pp. 983–999.

[38] Mosteller, F. "Some Statistical Problems in Measuring the Subjective Response to Drugs." Biometrics, Vol. 8, Number 3, 1952, pp. 220–226.

[39] Nocedal, J. and S. J. Wright. Numerical Optimization, 2nd ed., New York: Springer, 2006.

[40] Schapire, R. E. et al. Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, Vol. 26, No. 5, pp. 1651–1686, 1998.

[41] Schapire, R., and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Machine Learning, Vol. 37, No. 3, pp. 297–336, 1999.

[42] Shalev-Shwartz, S., Y. Singer, and N. Srebro. "Pegasos: Primal Estimated Sub-Gradient Solver for SVM." Proceedings of the 24th International Conference on Machine Learning, ICML '07, 2007, pp. 807–814.

[43] Seiffert, C., T. Khoshgoftaar, J. Hulse, and A. Napolitano. RUSBoost: Improving classification performance when training data is skewed. 19th International Conference on Pattern Recognition, pp. 1–4, 2008.

[44] Warmuth, M., J. Liao, and G. Ratsch. Totally corrective boosting algorithms that maximize the margin. Proc. 23rd Int'l. Conf. on Machine Learning, ACM, New York, pp. 1001–1008, 2006.

[45] Wu, T. F., C. J. Lin, and R. Weng. "Probability Estimates for Multi-Class Classification by Pairwise Coupling." Journal of Machine Learning Research. Vol. 5, 2004, pp. 975–1005.

[46] Wright, S. J., R. D. Nowak, and M. A. T. Figueiredo. "Sparse Reconstruction by Separable Approximation." Trans. Sig. Proc., Vol. 57, No. 7, 2009, pp. 2479–2493.

[47] Xiao, Lin. "Dual Averaging Methods for Regularized Stochastic Learning and Online Optimization." J. Mach. Learn. Res., Vol. 11, 2010, pp. 2543–2596.

[48] Xu, Wei. "Towards Optimal One Pass Large Scale Learning with Averaged Stochastic Gradient Descent." CoRR, abs/1107.2490, 2011.

[49] Zadrozny, B. "Reducing Multiclass to Binary by Coupling Probability Estimates." NIPS 2001: Proceedings of Advances in Neural Information Processing Systems 14, 2001, pp. 1041–1048.

[50] Zadrozny, B., J. Langford, and N. Abe. Cost-Sensitive Learning by Cost-Proportionate Example Weighting. CiteSeerX. [Online] 2003.

[51] Zhou, Z.-H. and X.-Y. Liu. On Multi-Class Cost-Sensitive Learning. CiteSeerX. [Online] 2006.
