FIELD: physics.
SUBSTANCE: a method for analysing the results of recognition of a series of images is proposed. The method comprises obtaining a current image from a series of images of an original document, the current image at least partially overlapping the previous image in the series. Optical character recognition (OCR) is performed on the current image to obtain recognized text and its corresponding text markup. Using the recognized text and the corresponding text markup, a plurality of text artifacts is determined for each of the current image and the previous image, each text artifact being represented by a symbolic sequence whose frequency of occurrence in the recognized text is below a threshold frequency.
EFFECT: improved quality of optical character recognition, achieved by determining the order of clusters of symbolic sequences from the median permutations of those clusters.
21 cl, 11 dwg
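The notion of a text artifact described in the abstract (a symbolic sequence that occurs in the recognized text less often than a threshold) can be illustrated with a minimal sketch. This is not the patented implementation: the function names `find_text_artifacts` and `shared_artifacts`, the word-level tokenization, and the use of a raw occurrence count as the "frequency" are all assumptions made for illustration only.

```python
import re
from collections import Counter


def find_text_artifacts(recognized_text: str, threshold: int = 2) -> list[str]:
    """Return the words that occur fewer than `threshold` times in the text.

    Assumption: a "symbolic sequence" is approximated by a lowercase word,
    and "frequency" by a raw occurrence count.
    """
    words = re.findall(r"\w+", recognized_text.lower())
    counts = Counter(words)
    # Counter keeps first-appearance order, so the result is deterministic.
    return [word for word in counts if counts[word] < threshold]


def shared_artifacts(current_text: str, previous_text: str,
                     threshold: int = 2) -> list[str]:
    """Return artifacts of the current image that also occur as artifacts
    of the previous image (hypothetical helper for overlapping images)."""
    previous = set(find_text_artifacts(previous_text, threshold))
    return [word for word in find_text_artifacts(current_text, threshold)
            if word in previous]


if __name__ == "__main__":
    previous_ocr = "the quick brown fox jumps over the lazy dog the end"
    current_ocr = "the lazy dog sleeps while the fox runs the end"
    print(shared_artifacts(current_ocr, previous_ocr))
```

Rare sequences are useful here because they are unlikely to repeat within a single image, so a rare sequence found in both the current and the previous image is a plausible anchor point in the region where the two images overlap.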
Title | Year | Author | Number |
---|---|---|---|
OPTICAL CHARACTER RECOGNITION OF IMAGE SERIES | 2016 | | RU2613849C1 |
DATA INPUT FROM SERIES OF IMAGES APPLICABLE TO TEMPLATE DOCUMENT | 2016 | | RU2634192C1 |
METHODS AND SYSTEMS OF OPTICAL IDENTIFICATION SYMBOLS OF IMAGE SERIES | 2017 | | RU2673016C1 |
METHODS AND SYSTEMS OF OPTICAL RECOGNITION OF IMAGE SERIES CHARACTERS | 2017 | | RU2673015C1 |
OPTICAL CHARACTER RECOGNITION OF DOCUMENTS WITH NON-PLANAR REGIONS | 2019 | | RU2721186C1 |
MULTIPLE CHAMBER USING FOR IMPLEMENTATION OF OPTICAL CHARACTER RECOGNITION | 2017 | | RU2661760C1 |
VERIFICATION OF OPTICAL CHARACTER RECOGNITION RESULTS | 2016 | | RU2634194C1 |
METHOD AND SYSTEM OF PREPARING TEXT-CONTAINING IMAGES TO OPTICAL RECOGNITION OF SYMBOLS | 2016 | | RU2628266C1 |
COMPARING DOCUMENTS USING RELIABLE SOURCE | 2014 | | RU2597163C2 |
TEXT RECOGNITION USING ARTIFICIAL INTELLIGENCE | 2017 | | RU2691214C1 |
Authors
Dates
2017-05-17—Published
2016-05-13—Filed