Automatic Anonymization of Printed-Text Document Images

Indexed in

License and use

Citations

Cited 2 times in Scopus

Cited 2 times in Web of Science

Altmetrics

Grant support

This work has been funded by the Spanish Ministry of Economy and Competitiveness under project number TIN2014-57458-R.

Analysis of institutional authors

Sanchez, A (Corresponding Author); Velez, JF (Author); Sanchez, J (Corresponding Author); Moreno, AB (Author)

Publications

Proceedings Paper

Automatic Anonymization of Printed-Text Document Images

Published in: Lecture Notes in Computer Science, vol. 10884, pp. 145-152, 2018-01-01. DOI: 10.1007/978-3-319-94211-7_17

Authors: Sanchez, Angel; Velez, Jose F; Sanchez, Javier; Belen Moreno, A

Affiliations

Rey Juan Carlos Univ, Madrid 28933, Spain - Author

Abstract

Nowadays, the storage and transmission of some types of documents require the removal of personal information about the users involved. Automatic text anonymization, or de-identification, is a solution for hiding all sensitive information contained in the documents. Although the problem has been studied mainly for plain printed-text documents, there are no works in which the de-identification task also produces anonymized document images with the same text fonts as those in the original documents. This data augmentation process could be applied to train a system for document image classification. In this paper, we describe an implementation of an automated, modular anonymization system for printed-text document images written in Spanish. System evaluation performed on a dataset of invoice images shows the viability of our proposal.
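As a rough illustration of the text de-identification step mentioned in the abstract, the sketch below masks a few kinds of personal data in text that has already been extracted from a document image. It is not the authors' implementation: the record lists regular expressions among the keywords, but the patterns, the `PATTERNS` table, and the `anonymize` function are hypothetical names chosen for this example. The sketch does not cover the image side of the system, such as font classification or rendering the anonymized text back into an image.

```python
import re

# Hypothetical patterns for illustration only; they are not taken from the paper.
PATTERNS = {
    # Simple e-mail address shape: local part, "@", dotted domain.
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    # Spanish national ID shape: eight digits followed by one letter.
    "DNI": re.compile(r"\b\d{8}[A-Za-z]\b"),
    # Nine-digit Spanish phone number starting with 6, 7, 8 or 9.
    "PHONE": re.compile(r"\b[6789]\d{8}\b"),
}


def anonymize(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Cliente: 12345678Z, tel. 612345678, correo ana@example.es"
    print(anonymize(sample))
```

Placeholder labels are used here only to make the replacements visible. A system that regenerates document images would more plausibly substitute synthetic values of the same type and length, which is a design choice left open in this sketch.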

Keywords

Anonymization; Convolutional neural network; Digital storage; Document image analysis; Font classification; Image classification; Information retrieval systems; Neural networks; Printed-text anonymization; Regular expressions; Text processing

Quality index

Bibliometric impact. Analysis of the contribution and dissemination channel

The work was published in Lecture Notes in Computer Science, which, according to Scopus (SJR), has become a reference in its field owing to its progression and the impact it has achieved in recent years. In 2018, the year of publication, it was ranked in the second quartile (Q2) of the category Computer Science (Miscellaneous).

Regardless of the expected impact determined by the dissemination channel, it is important to highlight the actual observed impact of the contribution itself.

According to the different indexing agencies, the number of citations accumulated by this publication as of 2025-06-12 is:

WoS: 2
Scopus: 2
OpenCitations: 2

Impact and social visibility

Leadership analysis of institutional authors

There is a significant leadership presence, as authors from the institution appear as first and last author: First Author (Sánchez Calle, Ángel) and Last Author (Moreno Díaz, Ana Belén).

The corresponding authors are Sánchez Calle, Ángel and Sánchez Alfonso, Javier.
