PREDICTING THE PROBABILITY OF OCCURRENCE OF A STRING USING A SEQUENCE OF VECTORS
Russian patent published in 2020 - IPC G06F16/36 G06F17/22

Abstract RU 2712101 C2

FIELD: physics.

SUBSTANCE: the group of inventions relates to computer systems and can be used to construct and process a natural language model. The method comprises the following steps: obtaining a plurality of strings, where each string comprises a plurality of symbols; for each string, generating, by a processing device, a first sequence of vectors based on at least a maximum word length for each symbol in the string; transmitting the first sequence of vectors for each string to a machine learning module; obtaining from the machine learning module a probability of occurrence of each string; adding a string to the natural language model based on the probability of occurrence obtained from the machine learning module; and using the resulting model in natural language processing tasks.
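
The steps of the method can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the maximum word length is computed or what the machine learning module looks like. The sketch assumes that the maximum word length for a symbol is the length of the longest dictionary word starting at that symbol, and it replaces the trained module with a toy coverage score. All names (max_word_lengths, to_vectors, toy_probability, build_model), the extra space flag in each vector, and the 0.8 threshold are hypothetical.

```python
def max_word_lengths(text, dictionary):
    """For each position, length of the longest dictionary word starting there (0 if none).

    Assumption: this is one possible reading of "maximum word length for each symbol".
    """
    longest = max((len(word) for word in dictionary), default=0)
    lengths = []
    for i in range(len(text)):
        best = 0
        for n in range(1, min(longest, len(text) - i) + 1):
            if text[i:i + n] in dictionary:
                best = n
        lengths.append(best)
    return lengths


def to_vectors(text, dictionary):
    """Build the sequence of vectors: one [max_word_length, is_space] vector per symbol."""
    lengths = max_word_lengths(text, dictionary)
    return [[float(n), 1.0 if ch == " " else 0.0] for ch, n in zip(text, lengths)]


def toy_probability(vectors):
    """Stand-in for the machine learning module (hypothetical, not a trained model).

    Returns the fraction of non-space symbols covered by some dictionary word.
    """
    covered = [False] * len(vectors)
    for i, (length, _) in enumerate(vectors):
        for j in range(i, i + int(length)):
            covered[j] = True
    letters = [i for i, (_, is_space) in enumerate(vectors) if is_space == 0.0]
    if not letters:
        return 0.0
    return sum(covered[i] for i in letters) / len(letters)


def build_model(strings, dictionary, threshold=0.8):
    """Add a string to the model when its estimated probability reaches the threshold."""
    model = {}
    for s in strings:
        p = toy_probability(to_vectors(s, dictionary))
        if p >= threshold:
            model[s] = p
    return model
```

With the dictionary {"the", "cat", "sat"}, calling build_model(["the cat sat", "thx qzt sat"], ...) keeps only the first string, because every letter of it belongs to a dictionary word, while only the last word of the second string does.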

EFFECT: improved prediction of the probability of occurrence of a linguistic unit.

20 cl, 5 dwg

Similar patents RU2712101C2

Title Year Author Number
TEXT RECOGNITION USING ARTIFICIAL INTELLIGENCE 2017
  • Orlov Nikita Konstantinovich
  • Rybkin Vladimir Yurevich
  • Anisimovich Konstantin Vladimirovich
  • Davletshin Azat Ajdarovich
RU2691214C1
OPTICAL CHARACTER RECOGNITION BY MEANS OF COMBINATION OF NEURAL NETWORK MODELS 2020
  • Konstantin Anisimovich
  • Alexey Zhuravlev
RU2768211C1
HANDWRITING RECOGNITION USING NEURAL NETWORKS 2020
  • Andrey Upshinskiy
RU2757713C1
IDENTIFICATION OF BLOCKS OF RELATED WORDS IN DOCUMENTS OF COMPLEX STRUCTURE 2019
  • Stanislav Semenov
RU2765884C2
RETRIEVING FIELDS USING NEURAL NETWORKS WITHOUT USING TEMPLATES 2019
  • Stanislav Semenov
RU2737720C1
AUTOMATIC DETERMINATION OF SET OF CATEGORIES FOR DOCUMENT CLASSIFICATION 2018
  • Nikita Orlov
  • Konstantin Anisimovich
RU2701995C2
DETECTING TEXT FIELDS USING NEURAL NETWORKS 2018
  • Zuev, Konstantin Alekseevich
  • Senkevich, Oleg Evgenyevich
  • Golubev, Sergei Vladimirovich
RU2699687C1
TEACHING LANGUAGE MODELS USING TEXT CORPUSES CONTAINING REALISTIC ERRORS OF OPTICAL CHARACTER RECOGNITION (OCR) 2019
  • Ivan Germanovich Zagaynov
RU2721187C1
METHOD FOR CONTROLLING A DIALOGUE AND NATURAL LANGUAGE RECOGNITION SYSTEM IN A PLATFORM OF VIRTUAL ASSISTANTS 2020
  • Ashmanov Stanislav Igorevich
  • Sukhachev Pavel Sergeevich
  • Zorkij Fedor Kirillovich
RU2759090C1
TEXT CLASSIFICATION METHOD AND SYSTEM 2022
  • Konodyuk Nikita Evgenevich
  • Tikhonova Mariya Ivanovna
RU2818693C2

RU 2 712 101 C2

Authors

Indenbom Evgenij Mikhajlovich

Anastasev Daniil Garrievich

Dates

2020-01-24 Published

2018-06-27 Filed