STRUCTURE OPTIMIZATION AND USE OF CODEBOOKS FOR DOCUMENT ANALYSIS
Russian patent published in 2022 - IPC G06V30/20 G06V10/26 G06F40/16 G06F40/279

Abstract RU 2787138 C1

FIELD: computer technology.

SUBSTANCE: the group of inventions relates to computer systems for document analysis, and more specifically to technologies for building and optimizing codebooks used to detect fields in a document. A method for optimizing a codebook is proposed. According to the method, a data processing device receives a first set of document images. A set of key areas is then extracted from each document image of the first set. Local descriptors are calculated for each of the extracted key areas. The local descriptors are then clustered so that the center of each local descriptor cluster corresponds to a visual word, and a codebook containing the set of visual words is built.

EFFECT: increase in the accuracy of information extraction from images due to the use of optimized codebooks.

20 cl, 12 dwg
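
The clustering step in the abstract (local descriptors are grouped, and each cluster center becomes a visual word) can be illustrated with a minimal sketch. The abstract names neither the clustering algorithm nor the descriptor type, so everything here is an assumption made for illustration: plain k-means with a deterministic farthest-point initialization stands in for the clustering, toy 2-D vectors stand in for real local descriptors, and the names `build_codebook`, `nearest_word` and `initial_centers` are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_word(descriptor: Vector, codebook: Sequence[Vector]) -> int:
    """Index of the visual word (cluster center) closest to a descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def initial_centers(descriptors: Sequence[Vector], num_words: int) -> List[List[float]]:
    """Assumed initialization: start from the first descriptor, then
    repeatedly add the descriptor farthest from all chosen centers."""
    centers = [list(descriptors[0])]
    while len(centers) < num_words:
        farthest = max(
            descriptors,
            key=lambda d: min(squared_distance(d, c) for c in centers))
        centers.append(list(farthest))
    return centers


def build_codebook(descriptors: Sequence[Vector], num_words: int,
                   iterations: int = 10) -> List[List[float]]:
    """Cluster descriptors with k-means (an assumed choice); the returned
    cluster centers play the role of the visual words in the codebook."""
    centers = initial_centers(descriptors, num_words)
    for _ in range(iterations):
        # Assign every descriptor to its nearest current center.
        clusters: List[List[Vector]] = [[] for _ in centers]
        for d in descriptors:
            clusters[nearest_word(d, centers)].append(d)
        # Move each non-empty cluster's center to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Toy descriptors: two well-separated groups standing in for the local
# descriptors computed on key areas of several document images.
descriptors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
               (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
codebook = build_codebook(descriptors, num_words=2)
print(codebook)  # [[0.5, 0.5], [10.5, 10.5]]
```

Once such a codebook exists, a descriptor from a new document image can be mapped to a visual word with `nearest_word(descriptor, codebook)`. How the patented method optimizes the codebook and uses it for field detection is not stated in the abstract and is not modeled here.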

Similar patents RU2787138C1

Title Year Author Number
GENERATION OF MARKING OF DOCUMENT IMAGES FOR TRAINING SAMPLE 2017
  • Zagajnov Ivan Germanovich
  • Borin Pavel Valerevich
RU2668717C1
METHOD AND DEVICE FOR TRACKING AND RECOGNISING OBJECTS USING ROTATION-INVARIANT DESCRIPTORS 2010
  • Takacs Gabriel
  • Grzeszczuk Radek
  • Chandrasekhar Vijay
  • Girod Bernd
RU2542946C2
DETECTING "FUZZY" IMAGE DUPLICATES USING TRIPLES OF ADJACENT RELATED FEATURES 2015
  • Fedorov Sergey Mikhailovich
  • Kacher Olga Arnoldovna
RU2613848C2
METHOD OF IMAGE DESCRIPTOR CONVERSION BASING ON HISTOGRAM OF GRADIENTS AND CORRESPONDING IMAGE PROCESSING DEVICE 2013
  • Paschalakis Stavros
  • Bober Miroslav
RU2661795C2
AUTOMATED METHODS AND SYSTEMS OF IDENTIFYING IMAGE FRAGMENTS IN DOCUMENT-CONTAINING IMAGES TO FACILITATE EXTRACTION OF INFORMATION FROM IDENTIFICATED DOCUMENT-CONTAINING IMAGE FRAGMENTS 2016
  • Zagaynov Ivan Germanovich
  • Borin Pavel Valerievich
RU2647670C1
SYSTEM AND METHOD FOR SELECTING RELEVANT PAGE ITEMS WITH IMPLICITLY SPECIFYING COORDINATES FOR IDENTIFYING AND VIEWING RELEVANT INFORMATION 2015
  • Tsyplyaev Maksim Viktorovich
  • Vinokurov Nikita Alekseevich
RU2708790C2
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT 2020
  • Semenov Stanislav Vladimirovich
  • Lanin Mikhail Olegovich
RU2760471C1
METHOD AND DEVICE FOR CLASSIFICATION OF IMAGES OF PRINTED COPIES OF DOCUMENTS AND SORTING SYSTEM OF PRINTED COPIES OF DOCUMENTS 2016
  • Zavalishin Sergej Stanislavovich
  • But Andrej Alekseevich
  • Kurilin Ilya Vasilevich
  • Rychagov Mikhail Nikolaevich
RU2630743C1
DEVICE OF SEARCHING IMAGE DUPLICATES 2013
  • Marchuk Vladimir Ivanovich
  • Voronin Vjacheslav Vladimirovich
  • Pis'Menskova Marina Mikhajlovna
  • Morozova Tat'Jana Vladimirovna
RU2538319C1
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT 2021
  • Stanislav Semenov
RU2774653C1

RU 2 787 138 C1

Authors

Vasily Loginov

Ivan Zagaynov

Stanislav Semenov

Dates

2022-12-29 Published

2021-07-21 Filed