Efficient determination of the number of weak learners in AdaBoost

Kyaw Kyaw Htike, “Efficient determination of the number of weak learners in AdaBoost”, Journal of Experimental and Theoretical Artificial Intelligence, vol. 29, no. 5, pp. 967–982, Taylor & Francis, 2017. DOI: 10.1080/0952813X.2016.1266038. [ISI and Scopus-indexed journal; impact factor = 1.703]

AdaBoost is a successful machine learning algorithm that is widely used across many fields. However, its performance is sensitive to the number of weak learners in the ensemble: too few weak learners underfit the training data-set and too many overfit it, and both cases result in poor generalisation of the classifier on test data. The standard way to determine the number of weak learners that is optimal for a particular data-set is cross-validation; however, cross-validation is highly computationally expensive. In this paper, we propose an efficient method that determines the number of weak learners for AdaBoost without requiring cross-validation or a separate validation set. The method is evaluated on eight publicly available data-sets to demonstrate its efficacy.
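For context, below is a minimal sketch of the standard cross-validation baseline that the paper aims to replace, not the paper's proposed method. It uses scikit-learn's AdaBoostClassifier; the data-set, fold count and maximum ensemble size are illustrative assumptions. The ensemble is fitted once per fold at the maximum size, and staged_predict then scores every intermediate number of weak learners on the held-out fold, which is cheaper than refitting for each candidate size but still requires one full fit per fold:

```python
# Sketch of cross-validated selection of the number of AdaBoost weak learners.
# Data-set, fold count and T_max are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import StratifiedKFold

X, y = load_breast_cancer(return_X_y=True)
T_max = 200                                   # largest ensemble size considered
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_acc = np.zeros((cv.get_n_splits(), T_max))

for k, (tr, va) in enumerate(cv.split(X, y)):
    clf = AdaBoostClassifier(n_estimators=T_max, random_state=0)
    clf.fit(X[tr], y[tr])
    # staged_predict yields predictions after 1, 2, ..., T_max weak learners,
    # so a single fit per fold scores every candidate ensemble size.
    for t, y_pred in enumerate(clf.staged_predict(X[va])):
        fold_acc[k, t] = np.mean(y_pred == y[va])

# Pick the ensemble size with the best mean held-out accuracy across folds.
best_T = int(np.argmax(fold_acc.mean(axis=0))) + 1
print(f"cross-validated choice of weak learners: T = {best_T}")
```

Even with the staged-prediction shortcut, this baseline trains k full ensembles of size T_max; the paper's contribution is a method that avoids both the cross-validation loop and the need for a held-out validation set.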