IDENTIFICATION OF CHINESE, JAPANESE AND KOREAN SCRIPT Russian patent published in 2017 - IPC G06K9/68 G06F17/27 

Abstract RU 2613847 C2

FIELD: physics.

SUBSTANCE: the method determines whether a text contains Chinese, Japanese or Korean characters. A document image is received and binarized. Connected components are found in the binarized image. Based on the connected components found, a set of fragments is detected and the document orientation is determined. A hypothesis about the language is formulated for each fragment in the set, and a probability estimate is calculated for each hypothesis. A subset of the fragments with the highest probability estimates is selected. The language hypothesis is verified for each fragment in the subset. The decision on whether Chinese, Japanese or Korean characters are present is based, at least, on the results of verifying the language hypotheses for the fragments of the selected subset.
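The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a grayscale image given as nested lists, a fixed binarization threshold, 4-connectivity, and one fragment per connected component. The names `binarize`, `connected_components`, `squareness` and `contains_cjk` are hypothetical, `squareness` is only a toy stand-in for the probability estimate, and orientation detection is omitted.

```python
from collections import deque
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def binarize(gray: Sequence[Sequence[int]], threshold: int = 128) -> List[List[int]]:
    """Map a grayscale image to 0/1, where 1 marks dark (ink) pixels."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def connected_components(binary: Sequence[Sequence[int]]) -> List[Box]:
    """Return bounding boxes of 4-connected groups of ink pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def squareness(box: Box) -> float:
    """Toy score (assumption): boxes closer to a square score nearer 1.0."""
    top, left, bottom, right = box
    h, w = bottom - top + 1, right - left + 1
    return min(h, w) / max(h, w)


def contains_cjk(
    fragments: Sequence[Box],
    score: Callable[[Box], float],
    verify: Callable[[Box], bool],
    top_k: int = 3,
    min_confirmed: int = 1,
) -> bool:
    """Keep the top_k highest-scoring fragments, verify each, then decide."""
    subset = sorted(fragments, key=score, reverse=True)[:top_k]
    confirmed = sum(1 for fragment in subset if verify(fragment))
    return confirmed >= min_confirmed


gray = [
    [255, 0, 0, 255, 255],
    [255, 0, 0, 255, 0],
    [255, 255, 255, 255, 0],
]
fragments = connected_components(binarize(gray))
print(contains_cjk(fragments, squareness, lambda b: squareness(b) > 0.8, top_k=1))
```

Passing `score` and `verify` as callables keeps the selection logic separate from whatever estimator and verifier are actually used; in a real system `verify` would typically be a more expensive check, such as running a recognizer on the fragment.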

EFFECT: increased accuracy in determining whether a text contains Chinese, Japanese or Korean characters.

20 cl, 7 dwg

Similar patents RU2613847C2

Title | Year | Author | Number
METHOD OF DETECTING NECESSITY OF STANDARD LEARNING FOR VERIFICATION OF RECOGNIZED TEXT | 2014 | Krivosheev Mikhail Viktorovich; Kolodkina Natalya Aleksandrovna; Makushev Aleksandr Sergeevich | RU2641225C2
DEVICES AND METHODS USING A HIERARCHIALLY ORDERED DATA STRUCTURE CONTAINING UNPARAMETRIC SYMBOLS FOR CONVERTING DOCUMENT IMAGES TO ELECTRONIC DOCUMENTS | 2013 | Chulinin Yurij Georgievich | RU2643465C2
DEVICES AND METHODS, WHICH PREPARE PARAMETERED SYMBOLS FOR TRANSFORMING IMAGES OF DOCUMENTS INTO ELECTRONIC DOCUMENTS | 2013 | Chulinin Yurij Georgievich | RU2625020C1
HANDWRITING RECOGNITION USING NEURAL NETWORKS | 2020 | Andrey Upshinskiy | RU2757713C1
SYMBOLS RECOGNITION WITH THE USE OF ARTIFICIAL INTELLIGENCE | 2017 | Chulinin Yurij Georgievich | RU2661750C1
METHODS AND DEVICES THAT CONVERT IMAGES OF DOCUMENTS TO ELECTRONIC DOCUMENTS USING TRIE-DATA STRUCTURES CONTAINING UNPARAMETERIZED SYMBOLS FOR DEFINITION OF WORD AND MORPHEMES ON DOCUMENT IMAGE | 2013 | Chulinin Yurij Georgievich | RU2631168C2
DEVICES AND METHODS, WHICH BUILD THE HIERARCHIALLY ORDINARY DATA STRUCTURE, CONTAINING NONPARAMETERIZED SYMBOLS FOR DOCUMENTS IMAGES CONVERSION TO ELECTRONIC DOCUMENTS | 2013 | Chulinin Yurij Georgievich | RU2625533C1
METHODS AND SYSTEMS FOR PROCESSING IMAGES OF MATHEMATICAL EXPRESSIONS | 2014 | Isupov Dmitry Sergeevich; Masalovitch Anton Andreevich | RU2596600C2
METHODS AND SYSTEMS FOR EFFECTIVE AUTOMATIC RECOGNITION OF SYMBOLS USING FOREST SOLUTIONS | 2014 | Chulinin Yuri Georgievich; Senkevich Oleg Evgenievich | RU2582064C1
METHODS AND SYSTEMS FOR AUTOMATIC RECOGNITION OF CHARACTERS USING FOREST SOLUTIONS | 2015 | Chulinin Yuri Georgievich; Vatlin Yury Aleksandrovich | RU2598300C2

RU 2 613 847 C2

Authors

Atroshchenko Mikhail Yurievich

Deryagin Dmitry Georgievich

Chulinin Yuri Georgievich

Dates

2017-03-21 Published

2013-12-20 Filed