FIELD: physics.
SUBSTANCE: one example of the method includes: performing lexical-morphological analysis of a natural language text containing a set of tokens, where each token contains at least one natural language word; determining, based on the lexical-morphological analysis, one or more lexical meanings and grammatical values associated with each token in the set of tokens; calculating, for each token in the set of tokens, one or more classifier functions using the lexical meanings and grammatical values associated with the token, where the value of each classifier function indicates an estimate of the degree of association of the token with a category of named entities; performing syntactic-semantic analysis of at least a part of the natural language text to obtain a set of semantic structures that represent that part of the text; and interpreting the semantic structures using a set of production rules to determine, for one or more tokens in that part of the text, an estimate of the degree of association of the token with a category of named entities.
EFFECT: high accuracy or completeness of named entity recognition in natural language texts, combined with an acceptable recognition speed, achieved through a two-stage application of text analysis methods of differing depth, where each later stage depends on the result of the previous stage.
20 cl, 16 dwg
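The two-stage flow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `Token`, `shallow_stage`, `deep_stage` and `recognize`, the `low`/`high` thresholds, the rule that the deep stage runs only when some shallow score is ambiguous, and the toy parser and production rule are all hypothetical stand-ins for components the abstract does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Token:
    text: str
    lexical_meanings: List[str] = field(default_factory=list)
    grammatical_values: Dict[str, str] = field(default_factory=dict)


Scores = Dict[str, float]                    # category -> degree of association
Classifier = Callable[[Token], float]        # hypothetical classifier function
Structure = Dict[str, int]                   # stand-in for a semantic structure
Rule = Callable[[Structure], List[Tuple[int, str, float]]]


def shallow_stage(tokens: List[Token],
                  classifiers: Dict[str, Classifier]) -> List[Scores]:
    """Stage 1: evaluate every classifier function on every token."""
    return [{cat: fn(tok) for cat, fn in classifiers.items()} for tok in tokens]


def deep_stage(tokens: List[Token],
               parse: Callable[[List[Token]], List[Structure]],
               rules: List[Rule]) -> List[Scores]:
    """Stage 2: interpret semantic structures with production rules."""
    scores: List[Scores] = [{} for _ in tokens]
    for structure in parse(tokens):
        for rule in rules:
            for index, category, score in rule(structure):
                previous = scores[index].get(category, 0.0)
                scores[index][category] = max(previous, score)
    return scores


def recognize(tokens, classifiers, parse, rules, low=0.2, high=0.8):
    """Run the cheap stage first; run the deep stage only if needed."""
    scores = shallow_stage(tokens, classifiers)
    # Assumption: a score strictly between the thresholds counts as ambiguous.
    ambiguous = any(low < s < high for ts in scores for s in ts.values())
    if not ambiguous:
        return scores
    for token_scores, deep_scores in zip(scores, deep_stage(tokens, parse, rules)):
        token_scores.update(deep_scores)     # deep estimates override shallow ones
    return scores


# Toy data: "Washington" is ambiguous between a person and a location.
tokens = [
    Token("Washington", ["PERSON_NAME", "CITY"], {"pos": "noun"}),
    Token("signed", ["TO_SIGN"], {"pos": "verb"}),
]
classifiers = {
    "PERSON": lambda t: 0.5 if "PERSON_NAME" in t.lexical_meanings else 0.0,
    "LOCATION": lambda t: 0.5 if "CITY" in t.lexical_meanings else 0.0,
}


def toy_parse(toks):
    # Pretend a parser found that token 0 is the agent of the predicate at token 1.
    return [{"agent": 0, "predicate": 1}]


def agent_is_person(structure):
    # Hypothetical production rule: the agent of an action is likely a person.
    return [(structure["agent"], "PERSON", 0.9)]


result = recognize(tokens, classifiers, toy_parse, [agent_is_person])
```

In this sketch the expensive analysis is skipped entirely when the shallow classifiers are already decisive, which is one way to read the trade-off between recognition quality and speed stated in the EFFECT paragraph.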
Title | Year | Author | Number |
---|---|---|---|
USE OF DEPTH SEMANTIC ANALYSIS OF TEXTS ON NATURAL LANGUAGE FOR CREATION OF TRAINING SAMPLES IN METHODS OF MACHINE TRAINING | 2016 | | RU2636098C1 |
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE | 2016 | | RU2637992C1 |
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES | 2018 | | RU2697647C1 |
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS | 2018 | | RU2686000C1 |
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS | 2016 | | RU2646386C1 |
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY | 2019 | | RU2732850C1 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
EXTRACTION OF INFORMATION FROM SANITARY BLOCKS OF DOCUMENTS USING MICROMODELS ON BASIS OF ONTOLOGY | 2017 | | RU2662688C1 |
SENTIMENT ANALYSIS AT THE LEVEL OF ASPECTS USING METHODS OF MACHINE LEARNING | 2016 | | RU2657173C2 |
Dates
2017-05-12—Published
2016-06-17—Filed