Minimally overfitted learners: A general framework for ensemble learning

October 10, 2022

Article

Green

Published in: Knowledge-Based Systems, vol. 254, article 109669, 2022-10-27. DOI: 10.1016/j.knosys.2022.109669

Authors: Acena, Victor; Martin de Diego, Isaac; Fernandez, Ruben R; Moguerza, Javier M

Affiliations

MADOX VIAJES, Calle Cantabria 10, Arroyomolinos 28939, Spain - Author

Rey Juan Carlos Univ, Data Sci Lab, C Tulipan S-N, Mostoles 28933, Spain - Author

Rey Juan Carlos University, Data Science Laboratory, c/ Tulipán, s/n, 28933, Móstoles, Spain - Author

Rey Juan Carlos University, Data Science Laboratory, c/ Tulipán, s/n, 28933, Móstoles, Spain, MADOX VIAJES, Calle de Cantabria, 10, Arroyomolinos, 28939, Spain - Author

Abstract

The combination of Machine Learning (ML) algorithms is a solution for constructing stronger predictors than a single one. However, some approximations suggest that combining unstable algorithms provides better results than combining stable algorithms. For instance, Generative ensembles, based on re-sampling techniques, have demonstrated high performance by fusing the information of unstable base learners. Random Forest and Gradient Boosting are two well-known examples, both combining Decision Trees and providing better predictions than those obtained using a single tree. However, such successful results have not been achieved by assembling stable algorithms. This paper introduces the notion of limited learner and a new ensemble general framework called Minimally Overfitted Ensemble (MOE), a re-sampling-based ensemble approach that constructs slightly overfitted-based learners. The proposed framework works well with stable and unstable base algorithms, thanks to a Weighted RAndom Bootstrap (WRAB) sampling that provides the necessary diversity for the stable base algorithms. A hyperparameter analysis of the proposal is carried out on artificial data. Besides, its performance is evaluated on real datasets against well-known ML methods. The results confirm that the MOE framework works successfully using stable and unstable base algorithms, improving in most cases the predictive ability of single ML models and other ensemble methods. © 2022 The Author(s)
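The abstract's central idea, that a weighted random bootstrap can give a stable base learner enough diversity to benefit from ensembling, can be illustrated with a small sketch. This is not the authors' MOE or WRAB implementation: the abstract does not specify the weighting scheme or how the slight overfitting is controlled, so the sketch assumes uniformly random per-sample weights, uses a nearest-centroid classifier as a stand-in stable learner, and combines predictions by majority vote. All function names are hypothetical.

```python
import random
from collections import Counter


def weighted_bootstrap(n, rng):
    """Draw n indices with replacement using random per-sample weights.

    Assumption: weights are redrawn uniformly at random for every learner;
    the paper's actual WRAB weighting is not given in the abstract.
    """
    weights = [rng.random() for _ in range(n)]
    return rng.choices(range(n), weights=weights, k=n)


def fit_centroids(X, y):
    """Stand-in stable base learner: one mean vector per class."""
    sums, counts = {}, Counter()
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
        counts[yi] += 1
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict_centroids(model, x):
    """Return the class whose centroid is closest to x."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(model[c], x))
    return min(model, key=dist2)


def fit_ensemble(X, y, n_learners=25, seed=0):
    """Fit one base learner per weighted bootstrap sample."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_learners):
        idx = weighted_bootstrap(len(X), rng)
        models.append(fit_centroids([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_ensemble(models, x):
    """Combine base learners by majority vote."""
    votes = Counter(predict_centroids(m, x) for m in models)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters (illustrative only).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [5, 5], [5, 6], [6, 5], [6, 6]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
models = fit_ensemble(X, y)
print(predict_ensemble(models, [0.5, 0.5]), predict_ensemble(models, [5.5, 5.5]))
```

Because the weights change for every learner, each bootstrap sample emphasizes a different part of the training set, so even a low-variance learner such as nearest-centroid produces slightly different models. The sketch covers only this resampling step; it does not attempt the "minimally overfitted" training of base learners that the framework is named for.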

Keywords

Approximation algorithms; Bagging; Decision trees; Ensemble; Ensemble learning; Forestry; Generative ensemble; Generative ensembles; Machine learning; Machine learning algorithms; Performance; Random forest; Random forests; Re-sampling; Resampling; Sampling technique; Stable algorithms

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in Knowledge-Based Systems, a journal that has grown in impact in recent years and, according to WoS (JCR), has become a reference in its field. In 2022, the year of publication, the journal ranked 19/145 in the category Computer Science, Artificial Intelligence, placing it in Q1 (first quartile).

In relative terms, the normalized impact indicator calculated from the world citation data provided by WoS (ESI, Clarivate) gives a ratio of citations to the expected citation rate of 1.45. This means the work is cited above average compared with works in the same discipline and the same year of publication (source consulted: ESI, Nov 14, 2024).

This information is reinforced by other indicators of the same type, which, although dynamic over time and dependent on the set of average global citations at the time of their calculation, consistently position the work at some point among the top 50% most cited in its field:

Weighted Average of Normalized Impact by the Scopus agency: 1.51 (source consulted: FECYT Feb 2024)
Field Citation Ratio (FCR) from Dimensions: 9.82 (source consulted: Dimensions Jul 2025)

Specifically, according to different indexing agencies, this work had accumulated the following citation counts as of 2025-07-16:

WoS: 19
Scopus: 21

Impact and social visibility

Leadership analysis of institutional authors

There is a significant leadership presence, as authors from the institution appear as first and last author: First Author (Aceña Gil, Víctor) and Last Author (Martínez Moguerza, Javier).

The corresponding author is Aceña Gil, Víctor.
