METHOD FOR RECOGNITION OF TEXT IN IMAGES OF DOCUMENTS Russian patent published in 2022 - IPC G06K9/62 G06T7/10 G06N3/02 

Abstract RU 2768544 C1

FIELD: computing technology.

SUBSTANCE: a computer-implemented method for automatic recognition of text in an image of a document. The method is executed on a computing apparatus containing a processor and a memory storing instructions executed by the processor, and comprises the stages of: obtaining an image of the document; isolating the area of the document in the image; classifying the type of the document, based on the isolated area, using a convolutional neural network; binarising the image of the document of the determined type in order to separate the text of the document from the background; segmenting the image of the document of the determined type in order to find text fields by determining the coordinates of their bounding rectangles; determining the type of each found text field using structural document templates; recognising the text in the identified text fields; and validating the recognised text in the identified text fields against a set of rules relating to the data contained in the fields and the expected presentation formats of that data.
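
The final stage, validating recognised text against rules tied to each field's data and expected format, can be illustrated with a minimal Python sketch. The abstract does not disclose the actual rules, so the field types (`issue_date`, `document_number`, `surname`), their formats, and the function `validate_fields` are all hypothetical and chosen only to show the idea of a per-field rule table.

```python
import re
from datetime import datetime
from typing import Callable, Dict


def _is_date(value: str) -> bool:
    """Accept only DD.MM.YYYY strings that name a real calendar date."""
    if re.fullmatch(r"\d{2}\.\d{2}\.\d{4}", value) is None:
        return False
    try:
        datetime.strptime(value, "%d.%m.%Y")
        return True
    except ValueError:
        # Well-formed but impossible date, e.g. 31.02.2021.
        return False


# Hypothetical rule table: field type -> predicate on the recognised text.
# The patent abstract does not list concrete field types or formats.
RULES: Dict[str, Callable[[str], bool]] = {
    "issue_date": _is_date,
    "document_number": lambda v: re.fullmatch(r"\d{6}", v) is not None,
    "surname": lambda v: re.fullmatch(r"[A-Za-z][A-Za-z' -]*", v) is not None,
}


def validate_fields(recognised: Dict[str, str]) -> Dict[str, bool]:
    """Return, for each recognised field, whether it passes its rule.

    Fields with no registered rule are reported as failing, so they can be
    sent for manual review instead of being silently accepted.
    """
    results = {}
    for field_type, text in recognised.items():
        rule = RULES.get(field_type)
        results[field_type] = bool(rule and rule(text.strip()))
    return results
```

For instance, calling `validate_fields` with a `document_number` of `"12345O"` (a letter O misread in place of a zero) marks that field as failing, while a well-formed `issue_date` such as `"16.07.2021"` passes. In this sketch a failed rule only flags the field; how a real system reacts (re-recognition, correction, or manual review) is left open.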

EFFECT: increase in the accuracy of extracting information from images of documents.

16 cl, 2 dwg

Similar patents RU2768544C1

Title Year Author Number
METHOD OF DETECTING FORGERY 2023
  • Kunina Irina Andreevna
  • Bursikov Aleksej Dmitrievich
  • Gajer Aleksandr Vyacheslavovich
RU2825085C1
DETECTING TEXT FIELDS USING NEURAL NETWORKS 2018
  • Zuev, Konstantin Alekseevich
  • Senkevich, Oleg Evgenyevich
  • Golubev, Sergei Vladimirovich
RU2699687C1
IMAGE RECOGNITION SYSTEM: BEORG SMART VISION 2020
  • Zuev Georgij Alekseevich
  • Kolosov Anton Aleksandrovich
RU2777354C2
TRAINING NEURAL NETWORKS FOR IMAGE PROCESSING USING SYNTHETIC PHOTOREALISTIC IMAGES CONTAINING SIGNS 2018
  • Zagajnov Ivan Germanovich
  • Borin Pavel Valerevich
RU2709661C1
RECONSTRUCTION OF THE DOCUMENT FROM DOCUMENT IMAGE SERIES 2017
  • Loginov Vasilij Vasilevich
  • Zagajnov Ivan Germanovich
  • Karatsapova Irina Aleksandrovna
RU2659745C1
EXTRACTION OF MULTIPLE DOCUMENTS FROM A SINGLE IMAGE 2020
  • Ivan Zagaynov
  • Aleksandra Stepina
RU2764705C1
HANDWRITING RECOGNITION USING NEURAL NETWORKS 2020
  • Andrey Upshinskiy
RU2757713C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF DOCUMENTS CONTAINING PERSONAL DATA 2019
  • Populyakh Vadim Valerevich
  • Skugarev Aleksej Vladimirovich
  • Sidorov Vladimir Mikhajlovich
RU2793607C1
OPTICAL CHARACTER RECOGNITION BY MEANS OF COMBINATION OF NEURAL NETWORK MODELS 2020
  • Konstantin Anisimovich
  • Alexey Zhuravlev
RU2768211C1
METHODS AND SYSTEMS OF DOCUMENT SEGMENTATION 2018
  • Zuev Konstantin Alekseevich
  • Deryagin Dmitrij Georgievich
  • Atroshchenko Mikhail Yurevich
RU2697649C1

RU 2 768 544 C1

Authors

Gordeev Dmitrij Vladimirovich

Kondratev Kirill Andreevich

Ostrovskij Konstantin Igorevich

Dates

2022-03-24 Published

2021-07-16 Filed