FIELD: computer equipment.
SUBSTANCE: invention relates to computer systems and methods of natural language processing. The technical result is achieved by a computer system extracting, from a natural-language text, a plurality of features associated with each text segment of a plurality of text segments; associating one or more tags with each text segment by processing the extracted features with a first-stage classifier; extracting, from the local context of a candidate token of a text segment, a plurality of local features associated with the candidate token; and processing, with a second-stage classifier, a combination of the local features and the tags associated with the text segments to determine the degree of association of the information object referenced by the candidate token with a category of information objects.
EFFECT: higher efficiency and quality of information extraction.
20 cl, 14 dwg
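The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the keyword sets, the feature names, the weights, the "Person" category, and every function name below are hypothetical stand-ins for trained classifiers, chosen only to show how segment-level tags from the first stage are combined with local features of a candidate token in the second stage.

```python
import math
import re

# Hypothetical stage-one "model": keywords that trigger each segment tag.
# A real system would use a trained classifier here.
SEGMENT_TAG_KEYWORDS = {
    "employment": {"hired", "joined", "works", "appointed"},
    "finance": {"revenue", "profit", "shares"},
}

# Hypothetical stage-two weights for the category "Person".
PERSON_WEIGHTS = {
    "is_capitalized": 1.0,
    "near=hired": 1.5,
    "near=mr": 2.0,
    "segment_tag=employment": 1.0,
}
PERSON_BIAS = -2.0


def segment_features(segment):
    """Segment-level features: the set of lowercased words in the segment."""
    return set(re.findall(r"[a-z]+", segment.lower()))


def first_stage(segment):
    """Stage one: associate zero or more tags with a text segment."""
    feats = segment_features(segment)
    return {tag for tag, kws in SEGMENT_TAG_KEYWORDS.items() if feats & kws}


def local_features(tokens, i, window=2):
    """Features taken from the local context of the candidate token at index i."""
    feats = set()
    if tokens[i][0].isupper():
        feats.add("is_capitalized")
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if j != i:
            feats.add("near=" + tokens[j].lower())
    return feats


def second_stage(local_feats, segment_tags):
    """Stage two: score the combination of local features and segment tags.

    Returns a value in (0, 1) read as the degree of association of the
    candidate token's information object with the "Person" category.
    """
    combined = set(local_feats) | {"segment_tag=" + t for t in segment_tags}
    score = PERSON_BIAS + sum(PERSON_WEIGHTS.get(f, 0.0) for f in combined)
    return 1.0 / (1.0 + math.exp(-score))


def person_degree(segment, token_index):
    """Run both stages for one candidate token of one segment."""
    tokens = re.findall(r"\w+", segment)
    tags = first_stage(segment)
    return second_stage(local_features(tokens, token_index), tags)


segment = "The company hired Alice Smith last year"
alice = person_degree(segment, 3)    # candidate token "Alice"
company = person_degree(segment, 1)  # candidate token "company"
```

With these illustrative weights, the candidate token "Alice" receives a higher degree of association with the "Person" category than "company", because it is capitalized in addition to sharing the "employment" segment tag and the nearby word "hired".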
Title | Year | Author | Number |
---|---|---|---|
RECOVERY OF TEXT ANNOTATIONS RELATED TO INFORMATION OBJECTS | 2017 | | RU2665261C1 |
MULTI STAGE RECOGNITION OF THE REPRESENT ESSENTIALS IN TEXTS ON THE NATURAL LANGUAGE ON THE BASIS OF MORPHOLOGICAL AND SEMANTIC SIGNS | 2016 | | RU2619193C1 |
EXTRACTING INFORMATION OBJECTS WITH THE HELP OF A CLASSIFIER COMBINATION | 2017 | | RU2679988C1 |
USE OF DEPTH SEMANTIC ANALYSIS OF TEXTS ON NATURAL LANGUAGE FOR CREATION OF TRAINING SAMPLES IN METHODS OF MACHINE TRAINING | 2016 | | RU2636098C1 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE | 2016 | | RU2637992C1 |
TRAINING CLASSIFIERS USED TO EXTRACT INFORMATION FROM NATURAL LANGUAGE TEXTS | 2018 | | RU2691855C1 |
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY | 2019 | | RU2732850C1 |
CLASSIFIER TRAINING USED FOR EXTRACTING INFORMATION FROM TEXTS IN NATURAL LANGUAGE | 2018 | | RU2681356C1 |
SYSTEM AND METHOD FOR AUTOMATIC CREATION OF TEMPLATES | 2018 | | RU2697647C1 |
Dates
2019-04-23—Published
2018-06-20—Filed