FIELD: computer technology.
SUBSTANCE: the invention relates to computer technology and can be used to detect fields in document images. The method involves: obtaining a training dataset containing a plurality of documents, each associated with a plurality of user-marked fields; for a given user-marked field in a given document, determining whether a particular combination of positions of one or more other user-marked fields relative to the given field is repeated in one or more additional documents of the dataset; indicating that the given field is marked up incorrectly if the particular combination is not repeated in any additional document; if the particular combination is repeated in one or more additional documents, determining whether a different combination of positions of one or more other user-marked fields relative to the given field is present in two or more other documents; indicating that the given field is marked up correctly if no such different combination is present in two or more other documents; and indicating that the given field is marked up inconsistently if such a different combination is present in two or more other documents.
EFFECT: determining the accuracy of user markup of fields in documents.
1 cl, 9 dwg
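
The decision logic described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation. The document representation (a dict mapping field names to `(x, y)` positions), the coarse direction encoding, the pixel tolerance `tol`, and the names `direction`, `combination`, and `check_field` are all assumptions made for illustration. Only the two thresholds (no repetition in any other document, and a different combination in two or more other documents) come from the abstract.

```python
from collections import Counter


def direction(anchor, other, tol=5):
    """Coarse position of `other` relative to `anchor` (y grows downward).

    `tol` is a hypothetical pixel tolerance for treating offsets as aligned.
    """
    dx, dy = other[0] - anchor[0], other[1] - anchor[1]
    horiz = "same" if abs(dx) <= tol else ("right" if dx > 0 else "left")
    vert = "same" if abs(dy) <= tol else ("below" if dy > 0 else "above")
    return (horiz, vert)


def combination(doc, field):
    """Combination of relative positions of all other marked fields w.r.t. `field`."""
    anchor = doc[field]
    return frozenset(
        (name, direction(anchor, pos)) for name, pos in doc.items() if name != field
    )


def check_field(docs, doc_index, field):
    """Classify the markup of `field` in docs[doc_index]."""
    target = combination(docs[doc_index], field)
    # Count each combination seen for this field in the other documents.
    others = Counter(
        combination(doc, field)
        for i, doc in enumerate(docs)
        if i != doc_index and field in doc
    )
    if others[target] == 0:
        return "incorrect"  # combination not repeated anywhere else
    if any(n >= 2 for combo, n in others.items() if combo != target):
        return "inconsistent"  # a different combination occurs in 2+ other documents
    return "correct"


# Hypothetical training set: two documents with "total" below the other fields,
# two with "total" above them, and one outlier.
docs = [
    {"total": (100, 200), "date": (100, 50), "vendor": (10, 50)},
    {"total": (110, 210), "date": (110, 40), "vendor": (12, 45)},
    {"total": (100, 20), "date": (100, 50), "vendor": (10, 50)},
    {"total": (100, 25), "date": (100, 55), "vendor": (10, 52)},
    {"total": (5, 5), "date": (100, 50), "vendor": (10, 50)},
]
print(check_field(docs, 0, "total"))  # inconsistent
print(check_field(docs, 4, "total"))  # incorrect
```

In this toy data, the first document's layout is repeated once, but a competing layout appears in two other documents, so its "total" field is flagged as inconsistent. The last document's layout appears nowhere else, so its "total" field is flagged as incorrect.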
Title | Year | Author | Number |
---|---|---|---|
METHODS AND SYSTEMS FOR IDENTIFYING FIELDS IN A DOCUMENT | 2020 | | RU2760471C1 |
FUZZY SEARCH USING WORD FORMS FOR WORKING WITH BIG DATA | 2021 | | RU2768233C1 |
RETRIEVING FIELDS USING NEURAL NETWORKS WITHOUT USING TEMPLATES | 2019 | | RU2737720C1 |
METHOD AND SERVER FOR PROCESSING TEXT SEQUENCE IN MACHINE PROCESSING TASK | 2020 | | RU2775820C2 |
DETECTING TEXT FIELDS USING NEURAL NETWORKS | 2018 | | RU2699687C1 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
USE OF DEPTH SEMANTIC ANALYSIS OF TEXTS ON NATURAL LANGUAGE FOR CREATION OF TRAINING SAMPLES IN METHODS OF MACHINE TRAINING | 2016 | | RU2636098C1 |
MULTISTAGE TRAINING OF MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS | 2021 | | RU2824338C2 |
TRAINING NEURAL NETWORKS USING LOSS FUNCTIONS REFLECTING RELATIONSHIPS BETWEEN NEIGHBOURING TOKENS | 2018 | | RU2721190C1 |
EXTRACTION OF MULTIPLE DOCUMENTS FROM A SINGLE IMAGE | 2020 | | RU2764705C1 |
Authors
Dates
2022-06-21—Published
2021-09-16—Filed