Grant support

This work has been funded by the Spanish Ministry of Economy and Competitiveness under project number TIN2014-57458-R.

Analysis of institutional authors

  • Sanchez, A (Corresponding Author)
  • Velez, Jf (Author)
  • Sanchez, J (Corresponding Author)
  • Moreno, Ab (Author)

Proceedings Paper

Automatic Anonymization of Printed-Text Document Images

Published in: Lecture Notes in Computer Science, vol. 10884, pp. 145-152, 2018-01-01. DOI: 10.1007/978-3-319-94211-7_17

Authors: Sanchez, Angel; Velez, Jose F; Sanchez, Javier; Belen Moreno, A

Affiliations

Rey Juan Carlos Univ, Madrid 28933, Spain - Author

Abstract

Nowadays, the storage and transmission of some types of documents require the removal of personal information about the users involved. Automatic text anonymization, or de-identification, is a solution for hiding all sensitive information contained in the documents. Although the problem has been mainly studied for plain printed-text documents, there are no works in which the de-identification task also produces anonymized document images with the same text fonts as those in the original documents. This data augmentation process could be applied to train a system for document image classification. In this paper, we describe an implementation of a modular automated anonymization system for printed-text document images written in Spanish. System evaluation performed on a dataset of invoice images shows the viability of our proposal.

Keywords

Anonymization; Convolutional neural network; Digital storage; Document image analysis; Font classification; Image classification; Information retrieval systems; Neural networks; Printed-text anonymization; Regular expression; Regular expressions; Text processing

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in Lecture Notes in Computer Science, a series that, according to Scopus (SJR), has progressed and achieved good impact in recent years and has become a reference in its field. In 2018, the year of publication, it was ranked in the second quartile (Q2) of the category Computer Science (Miscellaneous).

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2025-06-12 is:

  • WoS: 2
  • Scopus: 2
  • OpenCitations: 2

Impact and social visibility

From the perspective of influence or social adoption, and based on metrics associated with mentions and interactions provided by agencies specializing in calculating the so-called "Alternative or Social Metrics," we can highlight as of 2025-06-12:

  • The use of this contribution in bookmarks, code forks, additions to favorite lists for recurrent reading, and general views suggests that readers are using the publication as a basis for their current work. This may be an early indicator of future formal academic citations. The claim is supported by the "Capture" indicator, which yields a total of 9 (PlumX).

Leadership analysis of institutional authors

There is a significant leadership presence as some of the institution’s authors appear as the first or last signer, detailed as follows: First Author (Sánchez Calle, Ángel) and Last Author (Moreno Díaz, Ana Belén).

The corresponding authors were Sánchez Calle, Ángel and Sánchez Alfonso, Javier.