FIELD: imaging technology.
SUBSTANCE: the invention relates to systems and methods for detecting fields in a document. The disclosed systems and methods obtain a training data set containing a plurality of document images, each of which is associated with corresponding metadata defining a document field that contains variable text; generate, by processing the plurality of document images, a first heat map represented by a data structure containing a plurality of heat map elements corresponding to a plurality of document image pixels, where each heat map element stores a counter of the number of document images in which the document field contains the document image pixel associated with that heat map element; receive an input document image; and identify, within the input document image, a candidate region containing the document field, where the candidate region comprises a plurality of input document image pixels corresponding to heat map elements that satisfy a threshold condition.
EFFECT: more accurate identification of the fields in the document.
15 cl, 9 dwg
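The heat map construction and thresholding described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It assumes that all training images share one pixel grid, that the metadata for each image reduces to a single rectangular field box, that the threshold condition is "counter is at least a given value", and that the candidate region is the bounding box of the qualifying pixels. The names `build_heat_map`, `candidate_area`, and `Box` are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical field box: (left, top, right, bottom), right/bottom exclusive.
Box = Tuple[int, int, int, int]


def build_heat_map(field_boxes: List[Box], width: int, height: int) -> List[List[int]]:
    """Count, per pixel, how many training images have the field covering it."""
    heat = [[0] * width for _ in range(height)]
    for left, top, right, bottom in field_boxes:
        # Clip each box to the image so out-of-range metadata cannot fail.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                heat[y][x] += 1
    return heat


def candidate_area(heat: List[List[int]], threshold: int) -> Optional[Box]:
    """Bounding box of all heat map elements whose counter is >= threshold.

    The bounding-box step is an assumption; the abstract only requires that
    the candidate region contain pixels satisfying the threshold condition.
    """
    hits = [
        (x, y)
        for y, row in enumerate(heat)
        for x, count in enumerate(row)
        if count >= threshold
    ]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Three training images, one field box each (made-up illustrative values).
boxes = [(2, 1, 6, 3), (3, 1, 7, 3), (2, 2, 6, 4)]
heat = build_heat_map(boxes, width=8, height=5)
area = candidate_area(heat, threshold=2)
print(area)
```

In this toy example the region returned for `threshold=2` covers the pixels where the field appeared in at least two of the three training images. Raising the threshold narrows the region, and a threshold above the number of training images yields no candidate at all.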
Title | Year | Author | Number |
---|---|---|---|
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT | 2021 | | RU2774653C1 |
GENERATION OF MARKING OF DOCUMENT IMAGES FOR TRAINING SAMPLE | 2017 | | RU2668717C1 |
TRAINING NEURAL NETWORKS USING LOSS FUNCTIONS REFLECTING RELATIONSHIPS BETWEEN NEIGHBOURING TOKENS | 2018 | | RU2721190C1 |
RETRIEVING FIELDS USING NEURAL NETWORKS WITHOUT USING TEMPLATES | 2019 | | RU2737720C1 |
USE OF DEPTH SEMANTIC ANALYSIS OF TEXTS ON NATURAL LANGUAGE FOR CREATION OF TRAINING SAMPLES IN METHODS OF MACHINE TRAINING | 2016 | | RU2636098C1 |
TEXT SEGMENTATION | 2017 | | RU2666277C1 |
DEVICE AND METHOD FOR ANALYSIS OF MEDICAL IMAGES | 2022 | | RU2806982C1 |
DETECTING TEXT FIELDS USING NEURAL NETWORKS | 2018 | | RU2699687C1 |
MULTISTAGE TRAINING OF MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS | 2021 | | RU2824338C2 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
Authors
Dates
2021-11-25—Published
2020-12-17—Filed