FUZZY SEARCH USING WORD FORMS FOR WORKING WITH BIG DATA
Russian patent published in 2022 - IPC G06F40/279

Abstract RU 2768233 C1

FIELD: computer technology.

SUBSTANCE: the invention relates to computer technology for processing text data. The result is achieved by: defining a mapping scheme in which words are represented by word forms, where the same word form can correspond to different words; defining a database containing a plurality of records and a plurality of sets of word forms; forming a set of hypotheses, including a first hypothesis that tentatively associates with a target record (i) a first set of words in the document and (ii) the corresponding first set of word forms, and a second hypothesis that tentatively associates with the target record (i) a second set of words in the document and (ii) the corresponding second set of word forms; excluding the second hypothesis based on a mismatch between the second set of word forms and each of the sets of word forms in the database; and identifying the first set of words in the document as the target record by confirming the first hypothesis.
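The sequence of steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how a word form is computed or how a hypothesis is confirmed, so everything here is an assumption made for illustration only: the character-class mapping in `word_form`, the helper names `form_key`, `build_index` and `find_target_records`, and confirmation by exact comparison with the database records.

```python
def word_form(word):
    """Hypothetical mapping scheme: collapse each character to a coarse class.

    Different words can share one form, e.g. "New" and "Los" both map to "Xxx".
    """
    classes = []
    for ch in word:
        if ch.isdigit():
            classes.append("9")
        elif ch.isupper():
            classes.append("X")
        elif ch.islower():
            classes.append("x")
        else:
            classes.append(ch)  # keep punctuation and other symbols as-is
    return "".join(classes)


def form_key(words):
    """Set of word forms for a sequence of words, as a hashable tuple."""
    return tuple(word_form(w) for w in words)


def build_index(records):
    """Group database records by their set of word forms."""
    index = {}
    for record in records:
        index.setdefault(form_key(record), set()).add(tuple(record))
    return index


def find_target_records(document_words, index, max_len=3):
    """Form hypotheses over word windows, exclude mismatches, confirm the rest."""
    confirmed = []
    for start in range(len(document_words)):
        for length in range(1, max_len + 1):
            candidate = tuple(document_words[start:start + length])
            if len(candidate) < length:
                break  # window ran past the end of the document
            bucket = index.get(form_key(candidate))
            if bucket is None:
                continue  # hypothesis excluded: no database form set matches
            if candidate in bucket:
                confirmed.append(candidate)  # hypothesis confirmed (assumed exact check)
    return confirmed


records = [("New", "York"), ("Los", "Angeles")]
index = build_index(records)
document = "Shipped to New York on 12 May".split()
matches = find_target_records(document, index)
```

In this sketch most windows are discarded by the cheap form lookup alone, and the more expensive comparison with actual records runs only for windows whose forms occur in the database. That is one plausible reading of why excluding hypotheses by word form suits large databases.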

EFFECT: increased accuracy in detecting text fields and their values in digital documents by searching with word forms.

20 cl, 10 dwg

Similar patents RU2768233C1

Title Year Author Number
HANDWRITING RECOGNITION USING NEURAL NETWORKS 2020
  • Andrey Upshinskiy
RU2757713C1
TRACELESS IMAGE CAPTURE, USING MOBILE DEVICE 2020
  • Lobastov Stepan Yurevich
  • Katkov Yurij Evgenevich
  • Shakhov Vasilij Sergeevich
  • Titova Olga Yurevna
  • Khintsitskij Ivan Petrovich
RU2787136C2
IDENTIFICATION OF FIELDS ON AN IMAGE USING ARTIFICIAL INTELLIGENCE 2018
  • Kalenkov Maksim Petrovich
RU2695489C1
IDENTIFICATION OF BLOCKS OF RELATED WORDS IN DOCUMENTS OF COMPLEX STRUCTURE 2019
  • Stanislav Semenov
RU2765884C2
METHOD AND DEVICE FOR DETERMINING TYPE OF DIGITAL DOCUMENT 2016
  • Filimonova Irina Zosimovna
RU2635259C1
CHARACTER RECOGNITION USING A HIERARCHICAL CLASSIFICATION 2018
  • Aleksey Alekseevich Zhuravlev
RU2693916C1
DETECTING TEXT FIELDS USING NEURAL NETWORKS 2018
  • Zuev, Konstantin Alekseevich
  • Senkevich, Oleg Evgenyevich
  • Golubev, Sergei Vladimirovich
RU2699687C1
NEURAL NETWORK TRAINING BY MEANS OF SPECIALIZED LOSS FUNCTIONS 2018
  • Aleksey Alekseevich Zhuravlev
RU2707147C1
IDENTIFICATION OF FIELDS AND TABLES IN DOCUMENTS USING NEURAL NETWORKS USING GLOBAL DOCUMENT CONTEXT 2019
  • Stanislav Semenov
RU2723293C1
DETECTING SECTIONS OF TABLES IN DOCUMENTS BY NEURAL NETWORKS USING GLOBAL DOCUMENT CONTEXT 2019
  • Stanislav Semenov
RU2721189C1

Authors

Stanislav Semenov

Dates

2022-03-23 Published

2021-04-15 Filed