FIELD: means of recognition.
SUBSTANCE: the invention relates to means for recognizing named entities in an unannotated text corpus. A training set of natural-language texts is selected. A processor retrieves a corresponding set of features for each category of named entities and trains a classification model using the training set of texts and the feature sets for each category. The processor retrieves tokens from the unannotated text and generates a set of attributes for each token of the unannotated text based at least on deep semantic-syntactic analysis. This analysis includes identifying possible syntactic links in at least one sentence of the unannotated text, which includes obtaining a set of syntactic attributes, and forming a language-independent semantic structure, which includes determining the semantic links and corresponding semantic attributes of each token. The processor classifies each token into at least one of the categories based on the classification model and the token's set of attributes, and generates an annotated representation of at least a portion of the text based on at least one of the categorized tokens.
EFFECT: the technical result is increased efficiency of recognizing and annotating named entities in texts.
13 cl, 12 dwg
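The sequence of steps in the SUBSTANCE paragraph (train a classifier on per-category features, derive attributes for each token, classify each token, emit an annotated representation) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the toy training data, the category labels, and above all the `attributes` function, which uses simple surface cues (lowercased form, capitalization, previous token) as a stand-in for the deep semantic-syntactic analysis the abstract describes. The classifier is a plain attribute co-occurrence count, not the model used in the patent.

```python
from collections import Counter, defaultdict

PUNCT = ".,;:!?"

def tokenize(text):
    # Whitespace tokenizer with surrounding punctuation stripped (stand-in only).
    return [t.strip(PUNCT) for t in text.split() if t.strip(PUNCT)]

def attributes(tokens, i):
    # Hypothetical surface attributes standing in for the syntactic and
    # semantic attributes that deep semantic-syntactic analysis would supply.
    tok = tokens[i]
    prev = tokens[i - 1].lower() if i > 0 else "<s>"
    return {
        "lower=" + tok.lower(),
        "capitalized=" + str(tok[0].isupper()),
        "prev=" + prev,
    }

def train(labelled_sentences):
    # Count how often each attribute co-occurs with each category.
    counts = defaultdict(Counter)
    for sentence in labelled_sentences:
        tokens = [tok for tok, _ in sentence]
        for i, (_, category) in enumerate(sentence):
            for attr in attributes(tokens, i):
                counts[category][attr] += 1
    return counts

def classify(model, attrs):
    # Choose the category whose training counts best match the attributes.
    scores = {cat: sum(c[a] for a in attrs) for cat, c in model.items()}
    return max(scores, key=scores.get)

def annotate(model, text):
    # Produce the annotated representation as (token, category) pairs.
    tokens = tokenize(text)
    return [(tok, classify(model, attributes(tokens, i)))
            for i, tok in enumerate(tokens)]

# Toy annotated training set (assumed data); "O" marks tokens outside any entity.
TRAINING = [
    [("Anna", "PERSON"), ("works", "O"), ("in", "O"), ("Moscow", "LOCATION")],
    [("Ivan", "PERSON"), ("lives", "O"), ("in", "O"), ("Kazan", "LOCATION")],
]

model = train(TRAINING)
result = annotate(model, "Olga lives in Moscow.")
print(result)
```

In this sketch "Olga" never appears in the training data and is still assigned a category from its context attributes alone, which is the role the richer syntactic and semantic attributes play in the described method. Raw counts are not normalized by category frequency, so a realistic implementation would swap in a properly trained statistical classifier.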
Title | Year | Author | Number |
---|---|---|---|
USE OF DEPTH SEMANTIC ANALYSIS OF TEXTS ON NATURAL LANGUAGE FOR CREATION OF TRAINING SAMPLES IN METHODS OF MACHINE TRAINING | 2016 | | RU2636098C1 |
METHOD FOR ATTRIBUTION OF PARTIALLY STRUCTURED TEXTS FOR FORMATION OF NORMATIVE-REFERENCE INFORMATION | 2020 | | RU2750852C1 |
MULTI STAGE RECOGNITION OF THE REPRESENT ESSENTIALS IN TEXTS ON THE NATURAL LANGUAGE ON THE BASIS OF MORPHOLOGICAL AND SEMANTIC SIGNS | 2016 | | RU2619193C1 |
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS | 2018 | | RU2686000C1 |
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE | 2016 | | RU2637992C1 |
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
EXTRACTION OF INFORMATION USING ALTERNATIVE VARIANTS OF SEMANTIC-SYNTACTIC ANALYSIS | 2016 | | RU2646386C1 |
USING VERIFIED BY USER DATA FOR TRAINING MODELS OF CONFIDENCE | 2016 | | RU2646380C1 |
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES | 2018 | | RU2697647C1 |
EXTRACTING INFORMATION OBJECTS WITH THE HELP OF A CLASSIFIER COMBINATION | 2017 | | RU2679988C1 |
Dates
2018-08-28—Published
2014-01-15—Filed