AdaBoost is a well-known, effective technique for increasing the accuracy of learning algorithms. However, because its objective is to minimize error on the training set, it can overfit. We show that introducing a scoring function and randomly selecting training data makes it possible to build a smaller set of feature vectors. Selecting this subset of weak classifiers helps boosting reduce the generalization error and avoid overfitting on both synthetic and real data.
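The abstract does not spell out the selection procedure, so the following is a minimal sketch of one plausible reading: score each feature by how well a decision stump trained on that feature alone classifies random subsamples of the training data, keep the top-scoring features, and run AdaBoost on the reduced feature set. All names and parameters here (feature_scores, n_rounds, sample_frac, the number of retained features) are illustrative assumptions, not details from the paper.

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import AdaBoostClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(0)

    # Synthetic data: a few informative features among many noisy ones.
    X, y = make_classification(n_samples=600, n_features=40, n_informative=5,
                               n_redundant=0, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5,
                                                        random_state=0)

    def feature_scores(X, y, n_rounds=30, sample_frac=0.5):
        # Hypothetical scoring function: average accuracy of a one-feature
        # decision stump over random subsamples of the training data.
        n, d = X.shape
        scores = np.zeros(d)
        for _ in range(n_rounds):
            idx = rng.choice(n, size=int(sample_frac * n), replace=False)
            for j in range(d):
                stump = DecisionTreeClassifier(max_depth=1)
                stump.fit(X[idx, j:j + 1], y[idx])
                scores[j] += stump.score(X[idx, j:j + 1], y[idx])
        return scores / n_rounds

    scores = feature_scores(X_train, y_train)
    keep = np.argsort(scores)[-8:]  # retain the 8 highest-scoring features

    # Boost on all features vs. on the selected subset.
    full = AdaBoostClassifier(n_estimators=200, random_state=0)
    full.fit(X_train, y_train)
    subset = AdaBoostClassifier(n_estimators=200, random_state=0)
    subset.fit(X_train[:, keep], y_train)

    print("test accuracy, all features:   ", full.score(X_test, y_test))
    print("test accuracy, selected subset:", subset.score(X_test[:, keep], y_test))

On data like this, where only a few features carry signal, the model boosted on the selected subset typically generalizes at least as well as the one trained on all features, which is the qualitative behavior the abstract claims; the specific scoring function and sampling scheme of the paper may differ.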
Index Terms: AdaBoost, Classifier, Overfitting, Feature selection.
Luigi Rosa, “A Fast Scheme for Feature Subset Selection to Avoid Overfitting in AdaBoost”, February 2011.