FIELD: data processing.
SUBSTANCE: the invention relates generally to natural language text processing, and in particular to extracting information from semantic blocks of documents using ontology-based micromodels. In the method for extracting information from documents containing natural language text, a semantic block belonging to a given category is identified in the text. A lexical analysis of the words of the semantic block is performed to construct a set of lexical structures that represent the semantic block and contain information about the lexical meanings of the words and their corresponding semantic classes. A micromodel for extracting information related to the given category is identified, the micromodel including a plurality of production rules associated with an ontology. The production rules of the micromodel are applied to extract information objects that relate to the corresponding semantic class and correspond to a concept of the ontology.
EFFECT: the technical result is increased speed and quality of information extraction through the use of ontology micromodels for individual parts of the document.
22 cl, 13 dwg
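The four steps named in the abstract (identify a semantic block, build lexical structures, select a micromodel, apply its production rules) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: blocks are recognized by a `Category:` heading, the lexicon is a small hard-coded dictionary, and a production rule simply maps one semantic class to one ontology concept. The names `LEXICON`, `MICROMODELS`, `ProductionRule`, and `extract` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class LexicalStructure:
    word: str
    lexical_meaning: str
    semantic_class: str


@dataclass
class InformationObject:
    concept: str  # ontology concept the object corresponds to
    value: str    # word that triggered the rule


@dataclass
class ProductionRule:
    """Hypothetical rule: a semantic class produces an object of a concept."""
    semantic_class: str
    concept: str

    def apply(self, s: LexicalStructure) -> Optional[InformationObject]:
        if s.semantic_class == self.semantic_class:
            return InformationObject(self.concept, s.word)
        return None


# Toy lexicon (assumption): word -> (lexical meaning, semantic class).
LEXICON: Dict[str, Tuple[str, str]] = {
    "acme": ("ACME_CORP", "ORGANIZATION"),
    "seller": ("SELLER_ROLE", "ROLE"),
    "london": ("LONDON_CITY", "LOCATION"),
}

# One small micromodel per block category (assumption).
MICROMODELS: Dict[str, List[ProductionRule]] = {
    "parties": [ProductionRule("ORGANIZATION", "Organization")],
    "address": [ProductionRule("LOCATION", "Location")],
}


def find_semantic_block(text: str, category: str) -> Optional[str]:
    """Return the body of the first block whose heading equals the category."""
    for block in text.split("\n\n"):
        heading, _, body = block.partition(":")
        if heading.strip().lower() == category:
            return body.strip()
    return None


def lexical_analysis(block: str) -> List[LexicalStructure]:
    """Build a lexical structure for every word found in the toy lexicon."""
    structures = []
    for token in re.findall(r"\w+", block.lower()):
        if token in LEXICON:
            meaning, semantic_class = LEXICON[token]
            structures.append(LexicalStructure(token, meaning, semantic_class))
    return structures


def extract(text: str, category: str) -> List[InformationObject]:
    """Run only the category's micromodel, and only on the matching block."""
    block = find_semantic_block(text, category)
    if block is None:
        return []
    rules = MICROMODELS.get(category, [])
    objects = []
    for structure in lexical_analysis(block):
        for rule in rules:
            obj = rule.apply(structure)
            if obj is not None:
                objects.append(obj)
    return objects


doc = "Parties: Acme acts as seller.\n\nAddress: Registered office in London."
parties = extract(doc, "parties")
```

In this sketch each category's rules run only against the text of its own block, which is one plausible reading of why per-block micromodels could cost less than applying a full rule set to the whole document.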
Title | Year | Author | Number |
---|---|---|---|
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE | 2016 | | RU2637992C1 |
EXTRACTING INFORMATION FROM STRUCTURED DOCUMENTS CONTAINING TEXT IN NATURAL LANGUAGE | 2015 | | RU2607976C1 |
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES | 2018 | | RU2697647C1 |
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS | 2016 | | RU2646386C1 |
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
DEFINITION OF CONFIDENCE DEGREES RELATED TO ATTRIBUTE VALUES OF INFORMATION OBJECTS | 2016 | | RU2640297C2 |
USING VERIFIED BY USER DATA FOR TRAINING MODELS OF CONFIDENCE | 2016 | | RU2646380C1 |
METHOD FOR AUTOMATED EXTRACTION OF SEMANTIC COMPONENTS FROM COMPOUND SENTENCES OF NATURAL LANGUAGE TEXTS IN MACHINE TRANSLATION SYSTEMS AND DEVICE FOR ITS IMPLEMENTATION | 2021 | | RU2766060C1 |
EXTRACTION OF ENTITIES FROM TEXTS IN NATURAL LANGUAGE | 2015 | | RU2626555C2 |
TRAINING CLASSIFIERS USED TO EXTRACT INFORMATION FROM NATURAL LANGUAGE TEXTS | 2018 | | RU2691855C1 |
Dates
2018-07-26—Published
2017-03-16—Filed