Shifting: One-inclusion mistake bounds and sample compression
Description
We present new expected risk bounds for binary and multiclass prediction, and resolve several recent conjectures on sample compressibility due to Kuzmin and Warmuth. By exploiting the combinatorial structure of concept class F, Haussler et al. achieved a VC(F)/n bound for the natural one-inclusion prediction strategy. The key step in their proof is a d = VC(F) bound on the graph density of a subgraph of the hypercube—oneinclusion graph. The first main result of this paper is a density bound of n [n−1 <=d-1]/[n <=d] < d, which positively resolves a conjecture of Kuzmin and Warmuth relating to their unlabeled Peeling compression scheme and also leads to an improved one-inclusion mistake bound. The proof uses a new form of VC-invariant shifting and a group-theoretic symmetrization. Our second main result is an algebraic topological property of maximum classes of VC-dimension d as being d contractible simplicial complexes, extending the well-known characterization that d = 1 maximum classes are trees. We negatively resolve a minimum degree conjecture of Kuzmin and Warmuth—the second part to a conjectured proof of correctness for Peeling—that every class has one-inclusion minimum degree at most its VCdimension. Our final main result is a k-class analogue of the d/n mistake bound, replacing the VC-dimension by the Pollard pseudo-dimension and the one-inclusion strategy by its natural hypergraph generalization. This result improves on known PAC-based expected risk bounds by a factor of O(logn) and is shown to be optimal up to an O(logk) factor. The combinatorial technique of shifting takes a central role in understanding the one-inclusion (hyper)graph and is a running theme throughout.
Impact and interest:
Citation counts are sourced monthly from Scopus and Web of Science® citation databases.
These databases contain citations from different subsets of available publications and different time periods and thus the citation count from each is usually different. Some works are not in either database and no count is displayed. Scopus includes citations from articles published in 1996 onwards, and Web of Science® generally from 1980 onwards.
Citations counts from the Google Scholar™ indexing service can be viewed at the linked Google Scholar™ search.
ID Code: | 43977 |
---|---|
Item Type: | Contribution to Journal (Journal Article) |
Refereed: | Yes |
Measurements or Duration: | 23 pages |
Keywords: | Multiclass prediction, One-inclusion mistake bounds, Sample compression, Shifting, Worst-case expected risk |
DOI: | 10.1016/j.jcss.2008.07.005 |
ISSN: | 0022-0000 |
Pure ID: | 31967829 |
Divisions: | Past > QUT Faculties & Divisions > Faculty of Science and Technology |
Copyright Owner: | Consult author(s) regarding copyright matters |
Copyright Statement: | This work is covered by copyright. Unless the document is being made available under a Creative Commons Licence, you must assume that re-use is limited to personal use and that permission from the copyright owner must be obtained for all other uses. If the document is available under a Creative Commons License (or other specified license) then refer to the Licence for details of permitted re-use. It is a condition of access that users recognise and abide by the legal requirements associated with these rights. If you believe that this work infringes copyright please provide details by email to qut.copyright@qut.edu.au |
Deposited On: | 18 Aug 2011 01:01 |
Last Modified: | 03 Mar 2024 14:09 |
Export: EndNote | Dublin Core | BibTeX
Repository Staff Only: item control page