FIELD: document management.
SUBSTANCE: the invention relates to the creation of a document corpus. The technical result is achieved by using a classifier to classify every document in a second set of documents under one or more of a number of initial topics. The classification includes identifying an unclassified subset of the second set, that is, the documents not assigned to any initial topic; clustering that unclassified subset into new topics not included in the initial topics; and classifying every document of the unclassified subset under one or more of the new topics.
EFFECT: automated analysis of a document corpus to determine the topics it covers.
19 cl, 7 dwg
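The flow described in the abstract (classify against initial topics, collect the unclassified documents, cluster them into new topics, then classify them under those new topics) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the keyword-overlap classifier, the greedy Jaccard clustering, the threshold values, and all function and variable names are hypothetical stand-ins for whatever classifier and clustering method an actual implementation would use.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def classify(doc, topic_keywords, min_hits=1):
    """Stand-in classifier (assumption): a document belongs to every topic
    whose keyword set shares at least `min_hits` words with it."""
    words = set(tokenize(doc))
    return [t for t, kws in topic_keywords.items() if len(words & kws) >= min_hits]


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold=0.2):
    """Stand-in clustering (assumption): greedy single pass. Each document
    joins the first cluster whose vocabulary is similar enough, otherwise
    it starts a new cluster."""
    clusters = []  # list of (vocabulary set, member documents)
    for doc in docs:
        words = set(tokenize(doc))
        for vocab, members in clusters:
            if jaccard(words, vocab) >= threshold:
                members.append(doc)
                vocab |= words  # in-place update grows the cluster vocabulary
                break
        else:
            clusters.append((set(words), [doc]))
    return clusters


def discover_topics(docs, initial_topics, top_k=3):
    """Classify docs by initial topics, then derive new topics for the rest."""
    # Step 1: classify every document under the initial topics.
    labels = [classify(d, initial_topics) for d in docs]
    # Step 2: the unclassified subset is every document with no initial topic.
    unclassified = [i for i, found in enumerate(labels) if not found]
    # Step 3: cluster the unclassified subset; each cluster becomes a new topic
    # described by its `top_k` most frequent words.
    new_topics = {}
    for n, (_, members) in enumerate(cluster([docs[i] for i in unclassified])):
        counts = Counter(w for m in members for w in tokenize(m))
        new_topics[f"new_topic_{n}"] = {w for w, _ in counts.most_common(top_k)}
    # Step 4: classify each unclassified document under the new topics.
    for i in unclassified:
        labels[i] = classify(docs[i], new_topics)
    return labels, new_topics


# Hypothetical example data.
initial_topics = {
    "finance": {"bank", "loan", "interest"},
    "sports": {"match", "goal", "team"},
}
docs = [
    "bank raises loan interest",
    "team scores late goal",
    "vaccine trial shows immunity",
    "new vaccine boosts immunity",
]
labels, new_topics = discover_topics(docs, initial_topics)
for doc, found in zip(docs, labels):
    print(f"{doc!r} -> {found}")
```

In this toy run the first two documents match initial topics, while the two vaccine documents fall into the unclassified subset, end up in one cluster, and are then labelled with the topic derived from that cluster. A real system would likely replace both stand-ins with a trained classifier and a proper clustering algorithm.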
Title | Year | Author | Number |
---|---|---|---|
AUTOMATIC DETERMINATION OF SET OF CATEGORIES FOR DOCUMENT CLASSIFICATION | 2018 | | RU2701995C2 |
METHOD FOR SEPARATING TEXTS AND ILLUSTRATIONS IN IMAGES OF DOCUMENTS USING A DESCRIPTOR OF DOCUMENT SPECTRUM AND TWO-LEVEL CLUSTERING | 2017 | | RU2656708C1 |
USE OF AUTOENCODERS FOR LEARNING TEXT CLASSIFIERS IN NATURAL LANGUAGE | 2017 | | RU2678716C1 |
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS | 2018 | | RU2686000C1 |
EXTRACTING INFORMATION OBJECTS WITH THE HELP OF A CLASSIFIER COMBINATION | 2017 | | RU2679988C1 |
METHOD AND SERVER FOR DETERMINING TRAINING SET FOR MACHINE LEARNING ALGORITHM (MLA) TRAINING | 2020 | | RU2817726C2 |
RECOGNITION OF EVENTS ON PHOTOGRAPHS WITH AUTOMATIC SELECTION OF ALBUMS | 2020 | | RU2742602C1 |
RETRIEVING FIELDS USING NEURAL NETWORKS WITHOUT USING TEMPLATES | 2019 | | RU2737720C1 |
METHOD AND SYSTEM FOR OBTAINING A VECTOR REPRESENTATION OF AN ELECTRONIC DOCUMENT | 2021 | | RU2775351C1 |
SYSTEM AND METHOD OF FORMING TRAINING SET FOR MACHINE LEARNING ALGORITHM | 2017 | | RU2711125C2 |
Authors
Dates
2016-05-10—Published
2013-12-18—Filed