Analysis of institutional authors

Reyes Salgado, Gerardo (Author)

June 14, 2024

Neural Architecture Comparison for Bibliographic Reference Segmentation: An Empirical Study

Published in: Data, 9(5): 71, 2024-05-01. DOI: 10.3390/data9050071

Authors: Hidalgo, RC; Elias, RP; Torres-Moreno, JM; Villegas, OOV; Salgado, GR; Salazar, AM

Affiliations

Biblioteca Daniel Cosio Villegas, Colegio Mexico, Carretera Picacho Ajusco 20, Mexico City 14110, Mexico - Author
Tecnol Nacl Mexico CENIDET, Cuernavaca 62490, Mexico - Author
Univ Autonoma Ciudad Juarez, Ind & Mfg Engn Dept, Ciudad Juarez 32310, Mexico - Author
Univ Avignon, Lab Informat Avignon, 339 Chemin Meinajaries, F-84911 Avignon 9, France - Author
Univ Rey Juan Carlos, Dept Informat & Estadist, Ave Alcalde de Mostoles, Madrid 28933, Spain - Author

Abstract

In the realm of digital libraries, efficiently managing and accessing scientific publications necessitates automated bibliographic reference segmentation. This study addresses the challenge of accurately segmenting bibliographic references, a task complicated by the varied formats and styles of references. Focusing on the empirical evaluation of Conditional Random Fields (CRF), Bidirectional Long Short-Term Memory with CRF (BiLSTM + CRF), and Transformer Encoder with CRF (Transformer + CRF) architectures, this research employs Byte Pair Encoding and Character Embeddings for vector representation. The models underwent training on the extensive Giant corpus and subsequent evaluation on the Cora Corpus to ensure a balanced and rigorous comparison, maintaining uniformity across embedding layers, normalization techniques, and Dropout strategies. Results indicate that the BiLSTM + CRF architecture outperforms its counterparts by adeptly handling the syntactic structures prevalent in bibliographic data, achieving an F1-Score of 0.96. This outcome highlights the necessity of aligning model architecture with the specific syntactic demands of bibliographic reference segmentation tasks. Consequently, the study establishes the BiLSTM + CRF model as a superior approach within the current state-of-the-art, offering a robust solution for the challenges faced in digital library management and scholarly communication.

Keywords

BiLSTM; Byte-pair encoding; Conditional random field; Conditional random fields; Reference mining; Transformers

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in the journal Data. According to WoS (JCR), the journal's progression and impact in recent years have made it a reference in its field. No indicators have yet been calculated for 2024, the year of publication, but in 2023 the journal ranked 59/135, placing it in Q2 (second quartile) of the Multidisciplinary Sciences category. The journal is also ranked in Q2 by Scopus (SJR) in the Information Systems and Management category.

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-10-03:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work. This may be an early indicator of future, more formal academic citations. The "Capture" indicator supports this, with a total of 5 (PlumX).

With a more dissemination-oriented intent and targeting more general audiences, we can observe other more global scores such as:

It is essential to present evidence supporting full alignment with institutional principles and guidelines on Open Science and the Conservation and Dissemination of Intellectual Heritage. A clear example of this is:

  • The work has been submitted to a journal whose editorial policy allows Open Access publication.

Leadership analysis of institutional authors

This work has been carried out with international collaboration, specifically with researchers from: France; Mexico.