[AcockStavig1979] Alan C. Acock and Gordon R. Stavig, A Measure of Association for Nonparametric Statistics, Social Forces, Oxford University Press, Volume 57, Number 4, June, 1979, 1381–1386.
[AgrawalKSX2002] Rakesh Agrawal and Jerry Kiernan and Ramakrishnan Srikant and Yirong Xu, Hippocratic Databases, Proceedings of the 28th International Conference on Very Large Data Bases (VLDB 2002), Hong Kong, China, August 20–23, 2002, 143–154.
[Agresti2002] Alan Agresti, Categorical Data Analysis, Second Edition, Wiley Series in Probability and Statistics, Wiley-Interscience, 2002, 710.
[AloiseDHP2009] Daniel Aloise and Amit Deshpande and Pierre Hansen and Preyas Popat, NP-hardness of Euclidean Sum-of-squares Clustering, Machine Learning, Kluwer Academic Publishers, Volume 75, Number 2, May, 2009, 245–248.
[ArthurVassilvitskii2007] David Arthur and Sergei Vassilvitskii, k-means++: The Advantages of Careful Seeding, Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), January 7–9, 2007, New Orleans, LA, USA, 1027–1035.
[Breiman2001] L. Breiman. Random forests. Machine Learning, 45(1):5–32, 2001.
[BreimanFOS1984] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth, 1984.
[Chapelle2007] Olivier Chapelle, Training a Support Vector Machine in the Primal, Neural Computation, 2007.
[Cochran1954] William G. Cochran, Some Methods for Strengthening the Common $\chi^2$ Tests, Biometrics, Volume 10, Number 4, December 1954, 417–451.
[Collett2003] D. Collett. Modelling Survival Data in Medical Research, Second Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2003.
[Gill2000] Jeff Gill, Generalized Linear Models: A Unified Approach, Sage University Papers Series on Quantitative Applications in the Social Sciences, Number 07-134, 2000, Sage Publications, 101.
[Hartigan1975] John A. Hartigan, Clustering Algorithms, John Wiley & Sons Inc., Probability and Mathematical Statistics, April 1975, 365.
[Hsieh2008] C-J Hsieh, K-W Chang, C-J Lin, S. S. Keerthi and S. Sundararajan, A Dual Coordinate Descent Method for Large-scale Linear SVM, International Conference on Machine Learning (ICML), 2008.
[Lin2008] Chih-Jen Lin and Ruby C. Weng and S. Sathiya Keerthi, Trust Region Newton Method for Large-Scale Logistic Regression, Journal of Machine Learning Research, April, 2008, Volume 9, 627–650.
[McCallum1998] A. McCallum and K. Nigam, A comparison of event models for naive Bayes text classification, AAAI-98 workshop on learning for text categorization, 1998.
[McCullagh1989] Peter McCullagh and John Ashworth Nelder, Generalized Linear Models, Second Edition, Monographs on Statistics and Applied Probability, Number 37, 1989, Chapman & Hall/CRC, 532.
[Nelder1972] John Ashworth Nelder and Robert William Maclagan Wedderburn, Generalized Linear Models, Journal of the Royal Statistical Society, Series A (General), 1972, Volume 135, Number 3, 370–384.
[Nocedal1999] J. Nocedal and S. J. Wright, Numerical Optimization, Springer-Verlag, 1999.
[Nocedal2006] Jorge Nocedal and Stephen Wright, Numerical Optimization, Second Edition, Springer Series in Operations Research and Financial Engineering, Springer, 2006, 664.
[PandaHBB2009] B. Panda, J. Herbach, S. Basu, and R. J. Bayardo. PLANET: massively parallel learning of tree ensembles with mapreduce. PVLDB, 2(2):1426–1437, 2009.
[Russell2009] S. Russell and P. Norvig, Artificial Intelligence: A Modern Approach, Prentice Hall, 2009.
[Scholkopf1995] B. Schölkopf, C. Burges and V. Vapnik, Extracting Support Data for a Given Task, International Conference on Knowledge Discovery and Data Mining (KDD), 1995.
[Stevens1946] Stanley Smith Stevens, On the Theory of Scales of Measurement, Science, June 7, 1946, Volume 103, Number 2684, 677–680.
[Vetterling1992] W. T. Vetterling and B. P. Flannery, Multidimensions in Numerical Recipes in C - The Art of Scientific Computing, W. H. Press and S. A. Teukolsky (eds.), Cambridge University Press, 1992.
[ZhouWSP08] Y. Zhou, D. M. Wilkinson, R. Schreiber, and R. Pan. Large-scale parallel collaborative filtering for the Netflix prize. In Algorithmic Aspects in Information and Management, 4th International Conference, AAIM 2008, Shanghai, China, June 23–25, 2008. Proceedings, pages 337–348, 2008.