Aceña V (Corresponding Author); Martín De Diego I (Author); R Fernández R (Author); M Moguerza J (Author)

October 10, 2022
Green

Minimally overfitted learners: A general framework for ensemble learning

Published in: Knowledge-Based Systems, vol. 254, article 109669, 2022-10-27. DOI: 10.1016/j.knosys.2022.109669

Authors: Acena, Victor; Martin de Diego, Isaac; Fernandez, Ruben R; Moguerza, Javier M

Affiliations

MADOX VIAJES, Calle Cantabria 10, Arroyomolinos 28939, Spain - Author
Rey Juan Carlos Univ, Data Sci Lab, C Tulipan S-N, Mostoles 28933, Spain - Author
Rey Juan Carlos University, Data Science Laboratory, c/ Tulipán, s/n, 28933, Móstoles, Spain - Author
Rey Juan Carlos University, Data Science Laboratory, c/ Tulipán, s/n, 28933, Móstoles, Spain, MADOX VIAJES, Calle de Cantabria, 10, Arroyomolinos, 28939, Spain - Author

Abstract

The combination of Machine Learning (ML) algorithms is a solution for constructing stronger predictors than a single one. However, some approximations suggest that combining unstable algorithms provides better results than combining stable algorithms. For instance, Generative ensembles, based on re-sampling techniques, have demonstrated high performance by fusing the information of unstable base learners. Random Forest and Gradient Boosting are two well-known examples, both combining Decision Trees and providing better predictions than those obtained using a single tree. However, such successful results have not been achieved by assembling stable algorithms. This paper introduces the notion of limited learner and a new ensemble general framework called Minimally Overfitted Ensemble (MOE), a re-sampling-based ensemble approach that constructs slightly overfitted base learners. The proposed framework works well with stable and unstable base algorithms, thanks to a Weighted RAndom Bootstrap (WRAB) sampling that provides the necessary diversity for the stable base algorithms. A hyperparameter analysis of the proposal is carried out on artificial data. Besides, its performance is evaluated on real datasets against well-known ML methods. The results confirm that the MOE framework works successfully using stable and unstable base algorithms, improving in most cases the predictive ability of single ML models and other ensemble methods. © 2022 The Author(s)
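
As a rough illustration of the general idea in the abstract (a re-sampling ensemble whose diversity comes from a weighted bootstrap), the following Python sketch trains a stable base learner on weighted bootstrap samples and combines predictions by majority vote. It is not the paper's MOE or WRAB algorithm: the Dirichlet weighting, the nearest-centroid base learner, and all function names are assumptions chosen for illustration, and the control of overfitting that MOE describes is omitted entirely.

```python
import numpy as np


def weighted_bootstrap(n, rng):
    """Draw n indices with replacement using random per-sample probabilities.

    Assumption: the probabilities come from a flat Dirichlet distribution.
    The abstract does not specify how WRAB assigns its weights.
    """
    p = rng.dirichlet(np.ones(n))
    return rng.choice(n, size=n, replace=True, p=p)


class NearestCentroid:
    """A simple, stable base learner: predict the class of the closest centroid."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack(
            [X[y == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, X):
        # Distance from every sample to every class centroid.
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[d.argmin(axis=1)]


def fit_ensemble(X, y, n_learners=25, seed=0):
    """Fit one base learner per weighted bootstrap sample."""
    rng = np.random.default_rng(seed)
    learners = []
    for _ in range(n_learners):
        idx = weighted_bootstrap(len(X), rng)
        learners.append(NearestCentroid().fit(X[idx], y[idx]))
    return learners


def predict_ensemble(learners, X):
    """Majority vote across learners (assumes non-negative integer labels)."""
    votes = np.stack([m.predict(X) for m in learners])  # (n_learners, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

Because each sample's probability of being drawn differs from one bootstrap to the next, the centroids shift between learners even though the base algorithm itself is stable; that shift is the kind of diversity the abstract attributes to weighted re-sampling.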

Keywords

Approximation algorithms; Bagging; Decision trees; Ensemble; Ensemble learning; Forestry; Generative ensemble; Generative ensembles; Machine learning; Machine learning algorithms; Performance; Random forest; Random forests; Re-sampling; Resampling; Sampling technique; Stable algorithms

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Knowledge-Based Systems, which, owing to its progression and the impact it has achieved in recent years according to WoS (JCR), has become a reference in its field. In 2022, the year the work was published, the journal ranked 19/145 in the category Computer Science, Artificial Intelligence, placing it in Q1 (first quartile).

From a relative perspective, the normalized impact indicator calculated from world citations provided by WoS (ESI, Clarivate) gives a ratio of citations received to the expected citation rate of 1.45. This indicates that the work is cited above the average for works in the same discipline and the same year of publication (source consulted: ESI, Nov 14, 2024).

This is reinforced by other indicators of the same type which, although they change over time and depend on the global citation averages at the time of calculation, consistently place the work among the top 50% most cited in its field:

  • Weighted Average of Normalized Impact by the Scopus agency: 1.51 (source consulted: FECYT Feb 2024)
  • Field Citation Ratio (FCR) from Dimensions: 9.82 (source consulted: Dimensions Jul 2025)

According to different indexing agencies, the work had accumulated the following citation counts as of 2025-07-16:

  • WoS: 19
  • Scopus: 21

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-07-16:

  • Academic use, as evidenced by the Altmetric indicator that counts additions to the personal bibliographic manager Mendeley: 30.
  • Use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work. This may be a notable indicator of future, more formal academic citations. The claim is supported by the "Capture" indicator, which gives a total of 29 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

  • The Total Score from Altmetric: 7.6.
  • The number of mentions on the social network X (formerly Twitter): 9 (Altmetric).

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.

Leadership analysis of institutional authors

There is a significant leadership presence as some of the institution’s authors appear as the first or last signer, detailed as follows: First Author (Aceña Gil, Víctor) and Last Author (Martínez Moguerza, Javier).

The author responsible for correspondence was Aceña Gil, Víctor.