Information-theoretic Bounds on the Training and Testing Error of Boosting

Dartmouth Technical Report TR2002-428
Sebastien M. Lahaie
Date: May 2002
URL (compressed postscript): (116KB)
URL (PDF): (252KB)

Abstract: Boosting is a means of using weak learners as subroutines to produce a strong learner with markedly better accuracy. Recent results showing the connection between logistic regression and boosting provide the foundation for an information-theoretic analysis of boosting. We describe the analogy between boosting and gambling, which allows us to derive a new upper bound on training error. This upper bound explicitly describes the effect of noisy data on training error. We also use information-theoretic techniques to derive an alternative upper bound on testing error that is independent of the size of the weak-learner space.