Convexity, Classification, and Risk Bounds

Bartlett, Peter L., Jordan, Michael I., & McAuliffe, Jon D. (2006) Convexity, Classification, and Risk Bounds. Journal of the American Statistical Association, 101(473), pp. 138-156.

Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0–1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0–1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function—that it satisfies a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise, and show that in this case, strictly convex loss functions lead to faster rates of convergence of the risk than would be implied by standard uniform convergence arguments. Finally, we present applications of our results to the estimation of convergence rates in function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.
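
As an illustration of the variational transformation mentioned in the abstract, the following Python sketch numerically evaluates a transform ψ for a few convex surrogate losses. It assumes the form commonly stated for this result, which is not spelled out in the abstract itself: for a convex, classification-calibrated loss φ, ψ(θ) = φ(0) − H((1+θ)/2), where H(η) is the infimum over α of ηφ(α) + (1−η)φ(−α), and the excess 0–1 risk then satisfies ψ(R(f) − R*) ≤ R_φ(f) − R_φ*. The function names, the search interval, and the ternary-search minimizer are illustrative choices, not part of the paper.

```python
import math


# Three convex surrogate losses, written as functions of the margin a = y * f(x).
def hinge(a):
    return max(0.0, 1.0 - a)


def exponential(a):
    return math.exp(-a)


def truncated_quadratic(a):
    return max(0.0, 1.0 - a) ** 2


def optimal_conditional_risk(phi, eta, lo=-20.0, hi=20.0, iters=200):
    """Approximate H(eta) = inf_a [eta * phi(a) + (1 - eta) * phi(-a)].

    Assumption: phi is convex, so the conditional risk is convex in a and
    ternary search over the bounded interval [lo, hi] is adequate.
    """

    def conditional_risk(a):
        return eta * phi(a) + (1.0 - eta) * phi(-a)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if conditional_risk(m1) < conditional_risk(m2):
            hi = m2
        else:
            lo = m1
    return conditional_risk((lo + hi) / 2.0)


def psi(phi, theta):
    """psi(theta) = phi(0) - H((1 + theta) / 2), assumed valid for convex,
    classification-calibrated phi and theta in [0, 1]."""
    return phi(0.0) - optimal_conditional_risk(phi, (1.0 + theta) / 2.0)


if __name__ == "__main__":
    for name, phi in [("hinge", hinge),
                      ("exponential", exponential),
                      ("truncated quadratic", truncated_quadratic)]:
        print(name, [round(psi(phi, t), 4) for t in (0.0, 0.25, 0.5, 0.75)])
```

Under these assumptions, psi(hinge, 0.5) comes out close to 0.5 and psi(truncated_quadratic, 0.5) close to 0.25, consistent with the closed forms ψ(θ) = |θ| and ψ(θ) = θ² for those two losses. A larger ψ at a given θ means the surrogate excess risk controls the 0–1 excess risk more tightly.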

Impact and interest:

334 citations in Scopus
243 citations in Web of Science®


ID Code: 43928
Item Type: Journal Article
Refereed: Yes
Keywords: Boosting; Convex optimization; Empirical process theory; Machine learning; Rademacher complexity; Support vector machine
DOI: 10.1198/016214505000000907
ISSN: 0162-1459
Subjects: Australian and New Zealand Standard Research Classification > MATHEMATICAL SCIENCES (010000) > STATISTICS (010400)
Australian and New Zealand Standard Research Classification > ECONOMICS (140000) > ECONOMETRICS (140300)
Divisions: Past > QUT Faculties & Divisions > Faculty of Science and Technology
Past > Schools > Mathematical Sciences
Copyright Owner: American Statistical Association
Deposited On: 12 Aug 2011 00:48
Last Modified: 29 Feb 2012 14:34
