FIELD: physics.
SUBSTANCE: the method includes: performing, by a computer system, a lexico-morphological analysis of a natural-language text; performing a syntactic-semantic analysis of the text to obtain a plurality of semantic structures; selecting a set of output attributes from the lexical, grammatical, syntactic, and semantic attributes of the semantic structures; and generating an output text and an index that includes symbolic identifiers of one or more attribute values from the output attribute set, where each attribute is associated with a corresponding part of the natural-language text and the one or more attribute values are accompanied by a probability value.
EFFECT: automated production of highly accurate marked-up texts of any volume and content, in accordance with the selected markup method, and their use for machine learning in natural language processing tasks.
20 cl, 14 dwg
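The final step of the method (an index of attribute values tied to text fragments, each with a probability) can be illustrated with a minimal sketch. The abstract does not specify any data format, so everything here is an assumption made for illustration: the `Annotation` class, the `build_index` function, the attribute names `pos` and `semantic_class`, and the hard-coded probabilities standing in for real analyzer output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One attribute value attached to a fragment of the text (hypothetical layout)."""
    start: int          # character offset where the fragment begins
    end: int            # character offset just past the fragment
    attribute: str      # attribute name, e.g. "pos" (assumed name)
    value: str          # symbolic identifier of the value, e.g. "NOUN" (assumed)
    probability: float  # probability accompanying this value


def build_index(text, annotations, output_attributes):
    """Keep only annotations whose attribute belongs to the selected output set."""
    index = []
    for ann in sorted(annotations, key=lambda a: (a.start, a.end)):
        if ann.attribute in output_attributes:
            index.append({
                "fragment": text[ann.start:ann.end],
                "span": (ann.start, ann.end),
                "attribute": ann.attribute,
                "value": ann.value,
                "probability": ann.probability,
            })
    return index


text = "Anna reads books"

# Stand-in for the analyzer output; a real system would derive these
# values from the semantic structures rather than hard-code them.
annotations = [
    Annotation(0, 4, "pos", "NOUN", 0.97),
    Annotation(0, 4, "semantic_class", "PERSON", 0.88),
    Annotation(5, 10, "pos", "VERB", 0.99),
    Annotation(11, 16, "pos", "NOUN", 0.95),
]

# Select a single output attribute; the index then carries only "pos" values.
index = build_index(text, annotations, output_attributes={"pos"})
for entry in index:
    print(entry)
```

Keeping the selected attribute set as a parameter mirrors the selection step in the abstract: the same analysis output can yield different marked-up corpora depending on which attributes are requested.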
Title | Year | Author | Number |
---|---|---|---|
MULTI STAGE RECOGNITION OF THE REPRESENT ESSENTIALS IN TEXTS ON THE NATURAL LANGUAGE ON THE BASIS OF MORPHOLOGICAL AND SEMANTIC SIGNS | 2016 | | RU2619193C1 |
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES | 2018 | | RU2697647C1 |
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE | 2016 | | RU2637992C1 |
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS | 2016 | | RU2646386C1 |
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS | 2018 | | RU2686000C1 |
USING VERIFIED BY USER DATA FOR TRAINING MODELS OF CONFIDENCE | 2016 | | RU2646380C1 |
DEFINITION OF CONFIDENCE DEGREES RELATED TO ATTRIBUTE VALUES OF INFORMATION OBJECTS | 2016 | | RU2640297C2 |
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY | 2019 | | RU2732850C1 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
Dates
2017-11-20—Published
2016-10-26—Filed