METHOD OF CLASSIFYING DOCUMENTS BY CATEGORIES
Russian patent published in 2013 - IPC G06F17/27

Abstract RU 2491622 C1

FIELD: information technology.

SUBSTANCE: the method of classifying documents by categories includes constructing an ontology in the form of a set of categories. For each category, terms (sequences of words typical of texts in that category) are identified, and the weight of each identified term is determined, while reading electronic versions of the documents in a training collection. A profile is formed for each category in the form of a list of all terms in all ontology categories, indicating the weight of each term in that category. For each term, a list of possible combinations of word forms of that term is compiled. When the electronic version of a document to be classified is read, the identified terms are selected in it, considering only word forms from the compiled list. For each document to be classified, a profile is formed for each category based on the selected terms. The relevance of the document to each category is determined by comparing the document's profiles with the profiles of the categories in the ontology. A classification spectrum of the document is constructed in the form of a set of categories with the relevance found for each of them.

EFFECT: higher classification speed and reduced memory consumption.

7 claims
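
The steps summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not state how term weights are computed or how profiles are compared, so the sketch assumes relative term frequency as the weight and cosine similarity as the relevance measure, and it simplifies the per-category document profiles to a single term-count profile. The word-form table, the training texts, and all function names are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical compiled list: each admissible word-form sequence maps to its term.
WORD_FORMS = {
    ("neural", "network"): "neural network",
    ("neural", "networks"): "neural network",
    ("interest", "rate"): "interest rate",
    ("interest", "rates"): "interest rate",
}
MAX_LEN = max(len(forms) for forms in WORD_FORMS)


def select_terms(text):
    """Count known terms in a text by exact lookup of word-form sequences."""
    words = text.lower().split()
    counts = Counter()
    i = 0
    while i < len(words):
        # Prefer the longest listed sequence starting at position i.
        for n in range(min(MAX_LEN, len(words) - i), 0, -1):
            term = WORD_FORMS.get(tuple(words[i:i + n]))
            if term is not None:
                counts[term] += 1
                i += n
                break
        else:
            i += 1  # no listed word form starts here
    return counts


def build_category_profiles(training):
    """Map {category: [text, ...]} to {category: {term: weight}}."""
    all_terms = sorted(set(WORD_FORMS.values()))
    profiles = {}
    for category, texts in training.items():
        counts = Counter()
        for text in texts:
            counts.update(select_terms(text))
        total = sum(counts.values()) or 1
        # Assumed weighting: relative frequency of the term within the category.
        # Every profile lists all terms of all categories, as in the abstract.
        profiles[category] = {t: counts[t] / total for t in all_terms}
    return profiles


def cosine(doc_profile, category_profile):
    """Assumed relevance measure: cosine similarity of two term-weight maps."""
    dot = sum(doc_profile.get(t, 0.0) * w for t, w in category_profile.items())
    norm_d = sqrt(sum(v * v for v in doc_profile.values()))
    norm_c = sqrt(sum(v * v for v in category_profile.values()))
    return dot / (norm_d * norm_c) if norm_d and norm_c else 0.0


def classification_spectrum(text, profiles):
    """Return {category: relevance} for one document."""
    doc_profile = select_terms(text)
    return {c: cosine(doc_profile, p) for c, p in profiles.items()}


training = {
    "machine learning": ["neural networks learn", "a neural network model"],
    "finance": ["interest rates rose", "the interest rate fell"],
}
profiles = build_category_profiles(training)
spectrum = classification_spectrum("banks raised interest rates", profiles)
```

In this sketch, terms are found by dictionary lookup of precompiled word-form sequences only, with no morphological analysis at classification time. That is one plausible reading of how restricting matching to the compiled list could support the speed and memory effect the abstract claims.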

Similar patents RU2491622C1

Title Year Author Number
METHOD FOR AUTOMATED LANGUAGE DETECTION AND (OR) TEXT DOCUMENT CODING 2011
  • Lapshin Vladimir Anatol'Evich
  • Pshekhotskaja Ekaterina Aleksandrovna
  • Perov Dmitrij Vsevolodovich
RU2500024C2
METHOD FOR AUTOMATED CLASSIFICATION OF DOCUMENTS 2003
  • Agranovskij A.V.
  • Arutjunjan R.Eh.
  • Khadi R.A.
  • Telesnin B.A.
RU2254610C2
METHOD FOR AUTOMATIC CLASSIFICATION OF FORMALIZED TEXT DOCUMENTS AND AUTHORIZED USERS OF ELECTRONIC DOCUMENT MANAGEMENT SYSTEM 2017
  • Poddubnyj Maksim Igorevich
  • Korolev Igor Dmitrievich
  • Nosenko Sergej Vladimirovich
  • Mezentsev Aleksandr Sergeevich
RU2692043C2
METHOD OF AUTOMATED CLASSIFICATION OF FORMALISED DOCUMENTS IN ELECTRONIC DOCUMENT CIRCULATION SYSTEM 2013
  • Nosenko Sergej Vladimirovich
  • Korolev Igor' Dmitrievich
  • Poddubnyj Maksim Igorevich
RU2546555C1
METHOD FOR AUTOMATIC CLASSIFICATION OF FORMALIZED ELECTRONIC GRAPHIC AND TEXT DOCUMENTS IN THE ELECTRONIC DOCUMENT CIRCULATION SYSTEM WITH AUTOMATIC FORMATION OF ELECTRONIC CASES 2020
  • Korolev Igor Dmitrievich
  • Filippov Maksim Yurevich
  • Nazintsev Vadim Sergeevich
RU2759887C1
METHOD FOR STREAM PROCESSING OF TEXT MESSAGES 2003
  • Agranovskij A.V.
  • Arutjunjan R.Eh.
  • Khadi R.A.
  • Telesnin B.A.
RU2251148C1
METHOD OF AUTOMATIC CLASSIFICATION OF CONFIDENTIAL FORMALIZED DOCUMENTS IN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM 2015
  • Poddubnyj Maksim Igorevich
  • Korolev Igor Dmitrievich
  • Nosenko Sergej Vladimirovich
RU2647640C2
METHOD OF POSITIONING TEXT IN KNOWLEDGE SPACE BASED ON ONTOLOGY SET 2009
  • Anshukov Sergej Aleksandrovich
  • Bardin Valerij Vladimirovich
RU2476927C2
METHOD FOR AUTOMATIC TEXT PROCESSING IN NATURAL LANGUAGE THROUGH SEMANTIC INDEXATION, METHOD FOR AUTOMATIC PROCESSING COLLECTION OF TEXTS IN NATURAL LANGUAGE THROUGH SEMANTIC INDEXATION AND COMPUTER READABLE MEDIA 2008
  • Khoroshevskij Vladimir Fedorovich
  • Klintsov Viktor Petrovich
RU2399959C2
METHOD FOR AUTOMATIC CLASSIFICATION OF ELECTRONIC DOCUMENTS IN AN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM WITH AUTOMATIC GENERATION OF RESOLUTION PROPS OF A MANAGER 2018
  • Mezentsev Aleksandr Sergeevich
  • Korolev Igor Dmitrievich
  • Minaev Vladimir Aleksandrovich
  • Poddubnyj Maksim Igorevich
  • Volkov Igor Konstantinovich
  • Akinfiev Danil Viktorovich
  • Kislenko Ilya Anatolevich
RU2692972C1

RU 2 491 622 C1

Authors

Lapshin Vladimir Anatol'Evich

Pshekhotskaja Ekaterina Aleksandrovna

Perov Dmitrij Vsevolodovich

Dates

2013-08-27 Published

2012-01-25 Filed