METHOD AND SYSTEM FOR EXTRACTING DATA FROM IMAGES OF SEMISTRUCTURED DOCUMENTS
Russian patent published in 2017 - IPC G06K9/00

Abstract RU 2613846 C2

FIELD: physics.

SUBSTANCE: in the process of extracting data from fields in a document image, a text representation of the document image is obtained. A graph is constructed to store the attributes of the document's text fragments and the links between them. Cascade classification is performed to compute the attributes of the text fragments and of the links between them. A set of hypotheses is formed about which fields on the document image the text fragments belong to. A combination of hypotheses is selected, and data is extracted from the fields on the document image based on the selected combination.
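
The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the classifiers, the attributes, the scoring, or the selection procedure, so everything named below (`Fragment`, `build_graph`, `classify_cascade`, `FIELD_RULES`, the regular expressions, the scores 1.0 and 0.3, and the brute-force search) is a hypothetical stand-in chosen only to make the flow concrete. The cascade here simply means that the second, link-based stage runs only on fragments that passed the cheap first stage, which is one plausible way such a scheme could save computation.

```python
import re
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Fragment:
    """A text fragment from the recognized document (hypothetical structure)."""
    text: str
    x: int
    y: int
    attrs: dict = field(default_factory=dict)


def build_graph(fragments):
    """Link each fragment to its nearest left neighbour on the same line."""
    links = {}
    for i, f in enumerate(fragments):
        left = [j for j, g in enumerate(fragments) if g.y == f.y and g.x < f.x]
        if left:
            links[i] = max(left, key=lambda j: fragments[j].x)
    return links


def classify_cascade(fragments, links):
    """Two-stage cascade: cheap text attributes first, link attributes second."""
    # Stage 1: inexpensive per-fragment classifiers (assumed regex checks).
    for f in fragments:
        f.attrs["is_date"] = bool(re.fullmatch(r"\d{4}-\d{2}-\d{2}", f.text))
        f.attrs["is_amount"] = bool(re.fullmatch(r"\d+\.\d{2}", f.text))
    # Stage 2: runs only for fragments that passed stage 1.
    for i, f in enumerate(fragments):
        if not (f.attrs["is_date"] or f.attrs["is_amount"]):
            continue
        j = links.get(i)
        f.attrs["label"] = (
            fragments[j].text.lower().rstrip(":") if j is not None else ""
        )


# Hypothetical field definitions: required attribute and expected label.
FIELD_RULES = {
    "invoice_date": ("is_date", "date"),
    "total": ("is_amount", "total"),
}


def form_hypotheses(fragments):
    """For each field, list (fragment index, score) candidates."""
    hyps = {name: [] for name in FIELD_RULES}
    for i, f in enumerate(fragments):
        for name, (attr, keyword) in FIELD_RULES.items():
            if f.attrs.get(attr):
                score = 1.0 if f.attrs.get("label") == keyword else 0.3
                hyps[name].append((i, score))
    return hyps


def select_combination(hyps):
    """Pick one hypothesis per field, maximizing total score (brute force)."""
    names = list(hyps)
    best, best_score = None, float("-inf")
    for combo in product(*(hyps[n] for n in names)):
        idxs = [i for i, _ in combo]
        if len(set(idxs)) != len(idxs):
            continue  # a fragment may fill only one field
        score = sum(s for _, s in combo)
        if score > best_score:
            best, best_score = dict(zip(names, idxs)), score
    return best


def extract(fragments):
    """Run the full pipeline and return field name -> extracted text."""
    links = build_graph(fragments)
    classify_cascade(fragments, links)
    best = select_combination(form_hypotheses(fragments))
    return {name: fragments[i].text for name, i in (best or {}).items()}
```

Given fragments such as "Date:" followed by "2015-09-07" on one line and "Total:" followed by "120.00" on another, `extract` would return those two values under the assumed field names. A real system would replace the exhaustive search with something that scales to many fields and hypotheses.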

EFFECT: saving computing resources.

15 claims, 8 drawings

Similar patents RU2613846C2

Title Year Author Number
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT 2021
  • Stanislav Semenov
RU2774653C1
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT 2020
  • Semenov Stanislav Vladimirovich
  • Lanin Mikhail Olegovich
RU2760471C1
COMPARING DOCUMENTS USING RELIABLE SOURCE 2014
  • Khintsitskij Ivan Petrovich
  • Isaev Andrej Anatolevich
RU2597163C2
DEVICES AND METHODS USING A HIERARCHIALLY ORDERED DATA STRUCTURE CONTAINING UNPARAMETRIC SYMBOLS FOR CONVERTING DOCUMENT IMAGES TO ELECTRONIC DOCUMENTS 2013
  • Chulinin Yurij Georgievich
RU2643465C2
DEVICES AND METHODS, WHICH BUILD THE HIERARCHIALLY ORDINARY DATA STRUCTURE, CONTAINING NONPARAMETERIZED SYMBOLS FOR DOCUMENTS IMAGES CONVERSION TO ELECTRONIC DOCUMENTS 2013
  • Chulinin Yurij Georgievich
RU2625533C1
METHODS AND DEVICES THAT CONVERT IMAGES OF DOCUMENTS TO ELECTRONIC DOCUMENTS USING TRIE-DATA STRUCTURES CONTAINING UNPARAMETERIZED SYMBOLS FOR DEFINITION OF WORD AND MORPHEMES ON DOCUMENT IMAGE 2013
  • Chulinin Yurij Georgievich
RU2631168C2
METHOD FOR TEXTUAL INFORMATION RECOGNITION AND ITS INTEGRITY EVALUATION IN INTERNET ELECTRONIC DOCUMENTS 2013
  • Molchanov Artem Nikolaevich
  • Skurnovich Aleksej Valentinovich
  • Stel'Makh Ehduard Petrovich
  • Molchanov Il'Ja Nikolaevich
RU2550543C1
METHODS AND SYSTEMS FOR PROCESSING IMAGES OF MATHEMATICAL EXPRESSIONS 2014
  • Isupov Dmitry Sergeevich
  • Masalovitch Anton Andreevich
RU2596600C2
CONTENT-BASED DOCUMENT IMAGE CLASSIFICATION 2014
  • Smirnov Anatoly Anatolyevich
  • Panferov Vasily Vladimirovich
  • Isaev Andrey Anatolyevich
RU2571545C1
DEVICES AND METHODS, WHICH PREPARE PARAMETERED SYMBOLS FOR TRANSFORMING IMAGES OF DOCUMENTS INTO ELECTRONIC DOCUMENTS 2013
  • Chulinin Yurij Georgievich
RU2625020C1

RU 2 613 846 C2

Authors

Kostyukov Mikhail Valerievich

Dates

2017-03-21 Published

2015-09-07 Filed