EXTRACTION OF INFORMATION FROM SEMANTIC BLOCKS OF DOCUMENTS USING MICROMODELS ON BASIS OF ONTOLOGY Russian patent published in 2018 - IPC G06F17/27

Abstract RU 2662688 C1

FIELD: data processing.

SUBSTANCE: the invention relates generally to natural language text processing, and in particular to extracting information from semantic blocks of documents using ontology-based micromodels. In the method for extracting information from documents containing natural language text, a semantic block belonging to a given category is identified in the text. A lexical analysis of the words of the semantic block is performed to construct a set of lexical structures that represent the semantic block and contain information about the lexical meanings of the words and their corresponding semantic classes. A micromodel for extracting information related to the given category is identified, the micromodel including a plurality of production rules associated with an ontology. The production rules of the micromodel are applied to extract information objects that relate to the corresponding semantic class and correspond to a concept of the ontology.
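
The four steps above (identify the block, analyze its words, select the micromodel for the block's category, apply its production rules) can be illustrated with a toy Python sketch. This is not the patented implementation: every name here (`find_semantic_block`, `analyze`, `MICROMODELS`, `LEXICON`, the "signatures" category, and the concept names) is a hypothetical chosen for illustration, block detection is reduced to matching a heading, and the lexical analysis is reduced to a dictionary lookup plus a date pattern.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LexicalStructure:
    word: str
    semantic_class: str  # hypothetical labels: "PERSON", "DATE", "OTHER"


@dataclass
class InformationObject:
    concept: str  # ontology concept the object corresponds to
    value: str


@dataclass
class ProductionRule:
    # IF a word has this semantic class THEN emit an object of this concept.
    semantic_class: str
    concept: str


# One micromodel (a small rule set) per block category; contents are assumed.
MICROMODELS: Dict[str, List[ProductionRule]] = {
    "signatures": [
        ProductionRule("PERSON", "Signatory"),
        ProductionRule("DATE", "SigningDate"),
    ],
}

# Stand-in for a real lexical analyzer: a tiny lexicon and a date pattern.
LEXICON = {"ivanov": "PERSON", "petrov": "PERSON"}
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def find_semantic_block(document: str, category: str) -> Optional[str]:
    """Return the body of the paragraph whose heading names the category."""
    for paragraph in document.split("\n\n"):
        heading, _, body = paragraph.partition(":")
        if heading.strip().lower() == category:
            return body.strip()
    return None


def analyze(block: str) -> List[LexicalStructure]:
    """Assign a semantic class to every token of the block."""
    structures = []
    for token in re.findall(r"[\w-]+", block):
        if DATE_RE.fullmatch(token):
            semantic_class = "DATE"
        else:
            semantic_class = LEXICON.get(token.lower(), "OTHER")
        structures.append(LexicalStructure(token, semantic_class))
    return structures


def extract(document: str, category: str) -> List[InformationObject]:
    """Apply only the micromodel of the given category to its block."""
    block = find_semantic_block(document, category)
    if block is None:
        return []
    rules = MICROMODELS.get(category, [])
    objects = []
    for structure in analyze(block):
        for rule in rules:
            if rule.semantic_class == structure.semantic_class:
                objects.append(InformationObject(rule.concept, structure.word))
    return objects


DOC = (
    "Parties:\nThe agreement is between Ivanov and Petrov.\n\n"
    "Signatures:\nSigned by Ivanov on 2017-03-16."
)

for obj in extract(DOC, "signatures"):
    print(obj.concept, obj.value)
```

In this sketch only the "Signatures" block is analyzed, so the names in the "Parties" block are never passed through the signature rules. That mirrors the stated idea of applying a small, category-specific rule set to one part of a document instead of a full rule set to the whole text.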

EFFECT: the technical result is an increase in the speed and quality of information extraction through the use of ontology-based micromodels for individual parts of the document.

22 cl, 13 dwg

Similar patents RU2662688C1

Title Year Author Number
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE 2016
  • Starostin Anatolij Sergeevich
  • Smurov Ivan Mikhajlovich
  • Dzhumaev Stanislav Sergeevich
RU2637992C1
EXTRACTING INFORMATION FROM STRUCTURED DOCUMENTS CONTAINING TEXT IN NATURAL LANGUAGE 2015
  • Danielyan Tatiana Vladimirovna
  • Bulgakov Ilya Aleksandrovich
RU2607976C1
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES 2018
  • Anisimovich Konstantin Vladimirovich
  • Garashchuk Ruslan Vladimirovich
  • Matskevich Stepan Evgenevich
RU2697647C1
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS 2016
  • Matskevich Stepan Evgenevich
RU2646386C1
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS 2017
  • Bulgakov Ilya Aleksandrovich
  • Indenbom Evgenij Mikhajlovich
RU2665261C1
DEFINITION OF CONFIDENCE DEGREES RELATED TO ATTRIBUTE VALUES OF INFORMATION OBJECTS 2016
  • Belov Andrej Aleksandrovich
  • Matskevich Stepan Evgenevich
RU2640297C2
USING VERIFIED BY USER DATA FOR TRAINING MODELS OF CONFIDENCE 2016
  • Matskevich Stepan Evgenevich
  • Belov Andrej Aleksandrovich
RU2646380C1
METHOD FOR AUTOMATED EXTRACTION OF SEMANTIC COMPONENTS FROM COMPOUND SENTENCES OF NATURAL LANGUAGE TEXTS IN MACHINE TRANSLATION SYSTEMS AND DEVICE FOR ITS IMPLEMENTATION 2021
  • Karpov Anton Gennadevich
  • Khachukaev Eduard Magomedovich
  • Khachukaeva Elina Eduardovna
RU2766060C1
EXTRACTION OF ENTITIES FROM TEXTS IN NATURAL LANGUAGE 2015
  • Starostin Anatolij Sergeevich
  • Danielyan Tatyana Vladimirovna
  • Smurov Ivan Mikhajlovich
RU2626555C2
TRAINING CLASSIFIERS USED TO EXTRACT INFORMATION FROM NATURAL LANGUAGE TEXTS 2018
  • Matskevich Stepan Evgenevich
  • Bulgakov Ilya Aleksandrovich
RU2691855C1

RU 2 662 688 C1

Authors

Danielyan Tatyana Vladimirovna

Mikhajlov Maksim Borisovich

Dates

2018-07-26 Published

2017-03-16 Filed