FIELD: physics, computer engineering.
SUBSTANCE: the invention relates to methods of filling electronic glossaries, i.e. lists of terms with tags. The method of filling a glossary from a training set of electronic documents using a computer (personal computer, server, etc.) includes forming a training subset made up of those electronic documents whose text contains glossary terms. Characteristic selection criteria are applied to the words found in the training subset. Words selected by the criteria are assigned tags and, optionally, weights. The selected words are added to the glossary with their corresponding tags (and weights).
EFFECT: higher efficiency of using electronic glossaries in text analysis tasks, achieved by enabling assignment of intelligent weights to terms and automatic filling of glossaries from a training set of texts.
16 cl, 13 dwg
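The steps named in the abstract (forming the training subset, selecting words, assigning tags and weights, adding them to the glossary) can be illustrated with a minimal sketch. It is not the claimed method: the abstract does not specify the selection criteria, the tagging rule, or the weighting scheme. Here the criterion is assumed to be a minimum document frequency within the subset, the tag is assumed to be the most common tag among co-occurring glossary terms, and the weight is assumed to be the share of subset documents containing the word. The names `tokenize`, `fill_glossary`, and `min_doc_freq` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fill_glossary(glossary, documents, min_doc_freq=2):
    """Return a copy of `glossary` extended with words learned from `documents`.

    `glossary` maps term -> {"tag": str, "weight": float}.
    All selection, tagging, and weighting rules below are assumptions.
    """
    seed_terms = set(glossary)

    # Step 1: training subset = documents whose text contains a glossary term.
    subset = []
    for text in documents:
        words = set(tokenize(text))
        if words & seed_terms:
            subset.append(words)

    result = {term: dict(entry) for term, entry in glossary.items()}
    if not subset:
        return result

    # Step 2: selection criterion (assumed): document frequency in the subset.
    doc_freq = Counter(word for words in subset for word in words)

    for word, freq in doc_freq.items():
        if word in seed_terms or freq < min_doc_freq:
            continue

        # Step 3: tag (assumed): majority tag of co-occurring glossary terms.
        tag_votes = Counter(
            glossary[term]["tag"]
            for words in subset
            if word in words
            for term in words & seed_terms
        )
        tag = tag_votes.most_common(1)[0][0]

        # Optional weight (assumed): share of subset documents with the word.
        weight = freq / len(subset)

        # Step 4: add the selected word with its tag and weight.
        result[word] = {"tag": tag, "weight": weight}

    return result


glossary = {
    "router": {"tag": "network", "weight": 1.0},
    "switch": {"tag": "network", "weight": 1.0},
}
documents = [
    "The router forwards packets",
    "A switch forwards packets too",
    "Cats sleep all day",
]
extended = fill_glossary(glossary, documents)
```

In this toy run the third document contains no glossary term and is left out of the training subset, so none of its words can enter the glossary. A practical criterion would also need to filter out function words, which a plain frequency threshold does not do.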
Title | Year | Author | Number |
---|---|---|---|
METHOD OF CONSTRUCTING AND DETECTION OF THEME HULL STRUCTURE | 2013 | | RU2583716C2 |
SENTIMENT ANALYSIS AT THE LEVEL OF ASPECTS USING METHODS OF MACHINE LEARNING | 2016 | | RU2657173C2 |
METHOD FOR AUTOMATIC CLASSIFICATION OF FORMALIZED TEXT DOCUMENTS AND AUTHORIZED USERS OF ELECTRONIC DOCUMENT MANAGEMENT SYSTEM | 2017 | | RU2692043C2 |
AUTOMATIC DETERMINATION OF SET OF CATEGORIES FOR DOCUMENT CLASSIFICATION | 2018 | | RU2701995C2 |
SENTIMENT ANALYSIS AT LEVEL OF ASPECTS AND CREATION OF REPORTS USING MACHINE LEARNING METHODS | 2016 | | RU2635257C1 |
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION | 2014 | | RU2665239C2 |
SYSTEM AND METHOD OF FORMING TRAINING SET FOR MACHINE LEARNING ALGORITHM | 2017 | | RU2711125C2 |
CLASSIFIER TRAINING USED FOR EXTRACTING INFORMATION FROM TEXTS IN NATURAL LANGUAGE | 2018 | | RU2681356C1 |
METHOD FOR AUTOMATIC CLASSIFICATION OF ELECTRONIC DOCUMENTS IN AN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM WITH AUTOMATIC GENERATION OF RESOLUTION PROPS OF A MANAGER | 2018 | | RU2692972C1 |
EXTRACTING INFORMATION OBJECTS WITH THE HELP OF A CLASSIFIER COMBINATION | 2017 | | RU2679988C1 |
Dates
2015-04-20—Published
2013-05-24—Filed