AUTOMATIC EXTRACTION OF NAMED ENTITIES FROM TEXT Russian patent published in 2018 - IPC G06F17/27

Abstract RU 2665239 C2

FIELD: means of recognition.

SUBSTANCE: the invention relates to means for recognizing named entities in an untagged text corpus. A training set of natural-language texts is selected. A processor extracts a corresponding set of features for each category of named entities. The processor trains a classification model using the set of texts and the feature sets for each named-entity category. The processor extracts tokens from the untagged text. The processor generates a set of attributes for each token of the untagged text based at least on deep semantic-syntactic analysis. This analysis identifies possible syntactic links in at least one sentence of the untagged text, including obtaining a set of syntactic attributes, and forms a language-independent semantic structure, including determining the semantic links and corresponding semantic attributes of each token. The processor classifies each token into at least one of the categories based on the classification model and the token's attribute set. The processor generates a tagged representation of at least a portion of the text based on at least one of the categorized tokens.

EFFECT: the technical result consists in increasing the efficiency of recognizing and tagging named entities in texts.

13 cl, 12 dwg
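
The sequence of steps in the abstract (train a classification model, extract tokens, build an attribute set per token, classify each token, emit a tagged representation) can be illustrated with a minimal sketch. It is not the patented implementation. The deep semantic-syntactic analysis is replaced here by a few simple surface attributes, the classifier is a naive Bayes-style counter chosen only for brevity, and the toy data and all names (TRAIN, tokenize, attributes, train, classify, tag, render) are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical toy training set: one list of (token, category) pairs per
# sentence. "O" marks tokens that are not part of any named entity.
TRAIN = [
    [("Anna", "PER"), ("works", "O"), ("at", "O"), ("Acme", "ORG"), (".", "O")],
    [("Ivan", "PER"), ("visited", "O"), ("Moscow", "LOC"), (".", "O")],
    [("Acme", "ORG"), ("opened", "O"), ("in", "O"), ("Moscow", "LOC"), (".", "O")],
]

def tokenize(text):
    """Split text into word and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)

def attributes(tokens, i):
    """Surface attributes of token i.

    Assumption: these stand in for the syntactic and semantic attributes
    that a real semantic-syntactic parser would supply.
    """
    tok = tokens[i]
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return {
        "lower=" + tok.lower(),
        "cap=" + str(tok[0].isupper()),
        "prev=" + prev,
        "first=" + str(i == 0),
    }

def train(sentences):
    """Count how often each attribute occurs with each category."""
    cat_counts = Counter()
    attr_counts = defaultdict(Counter)
    for sent in sentences:
        tokens = [tok for tok, _ in sent]
        for i, (_, cat) in enumerate(sent):
            cat_counts[cat] += 1
            for attr in attributes(tokens, i):
                attr_counts[cat][attr] += 1
    return cat_counts, attr_counts

def classify(model, attrs):
    """Return the category with the highest smoothed log-score."""
    cat_counts, attr_counts = model
    total = sum(cat_counts.values())
    best_cat, best_score = None, float("-inf")
    for cat in sorted(cat_counts):
        n = cat_counts[cat]
        score = math.log(n / total)
        for attr in attrs:
            # Add-one smoothing so unseen attributes do not zero the score.
            score += math.log((attr_counts[cat][attr] + 1) / (n + 2))
        if score > best_score:
            best_cat, best_score = cat, score
    return best_cat

def tag(model, text):
    """Classify every token of an untagged text."""
    tokens = tokenize(text)
    return [(tok, classify(model, attributes(tokens, i)))
            for i, tok in enumerate(tokens)]

def render(tagged):
    """Build a tagged representation; tokens of category "O" stay bare."""
    return " ".join(tok if cat == "O" else f"<{cat}>{tok}</{cat}>"
                    for tok, cat in tagged)

model = train(TRAIN)
print(render(tag(model, "Anna visited Moscow.")))
# <PER>Anna</PER> visited <LOC>Moscow</LOC> .
```

In this sketch the attribute function is the only place where the analysis depth matters, so richer syntactic or semantic attributes could be substituted there without changing the training or classification code.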

Similar patents RU2665239C2

Title Year Author Number
USE OF DEEP SEMANTIC ANALYSIS OF NATURAL-LANGUAGE TEXTS TO CREATE TRAINING SAMPLES IN MACHINE LEARNING METHODS 2016
  • Anisimovich Konstantin Vladimirovich
  • Selegej Vladimir Pavlovich
  • Garashchuk Ruslan Vladimirovich
RU2636098C1
METHOD FOR ATTRIBUTION OF PARTIALLY STRUCTURED TEXTS FOR FORMATION OF NORMATIVE-REFERENCE INFORMATION 2020
  • Fedosin Sergei Alekseevich
  • Plotnikova Natalia Pavlovna
  • Martynov Vladislav Aleksandrovich
  • Ryskin Konstantin Eduardovich
  • Kuznetsov Dmitrii Aleksandrovich
  • Deniskin Aleksandr Vladimirovich
  • Vechkanova Iuliia Sergeevna
  • Fediushkin Nikolai Alekseevich
  • Tsilikov Nikita Sergeevich
RU2750852C1
MULTI-STAGE RECOGNITION OF NAMED ENTITIES IN NATURAL-LANGUAGE TEXTS BASED ON MORPHOLOGICAL AND SEMANTIC FEATURES 2016
  • Anisimovich Konstantin Vladimirovich
  • Indenbom Evgeny Mihaylovich
  • Novitskiy Valery Igorevich
RU2619193C1
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL FEATURES 2018
  • Indenbom Evgenij Mikhajlovich
RU2686000C1
METHOD OF EXTRACTING FACTS FROM NATURAL-LANGUAGE TEXTS 2016
  • Starostin Anatolij Sergeevich
  • Smurov Ivan Mikhajlovich
  • Dzhumaev Stanislav Sergeevich
RU2637992C1
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS 2017
  • Bulgakov Ilya Aleksandrovich
  • Indenbom Evgenij Mikhajlovich
RU2665261C1
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS 2016
  • Matskevich Stepan Evgenevich
RU2646386C1
USING USER-VERIFIED DATA TO TRAIN CONFIDENCE MODELS 2016
  • Matskevich Stepan Evgenevich
  • Belov Andrej Aleksandrovich
RU2646380C1
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES 2018
  • Anisimovich Konstantin Vladimirovich
  • Garashchuk Ruslan Vladimirovich
  • Matskevich Stepan Evgenevich
RU2697647C1
EXTRACTING INFORMATION OBJECTS WITH THE HELP OF A CLASSIFIER COMBINATION 2017
  • Matskevich Stepan Evgenevich
  • Starostin Anatolij Sergeevich
  • Sukhodolov Dmitrij Andreevich
RU2679988C1

RU 2 665 239 C2

Authors

Nekhaj Ilya Vladimirovich

Dates

2018-08-28 Published

2014-01-15 Filed