FIELD: information technologies.
SUBSTANCE: the method of automatic classification of formalised documents in an electronic document circulation system identifies identical text sections (details) in a formalised document and analyses their characteristics. The informative part of the document is converted into natural-language text, the words are reduced to their basic word forms, insignificant words are deleted, and word weights are computed from their frequency of occurrence, forming the predicates used to identify text criteria. From a provided set of manually classified texts, a system of such predicates is generated and saved in a database, and the weights of significant word forms are added to it. Where a priori information on the dependences between information areas must be used, the algebra of finite predicates is applied, which allows operations on the logical expressions that describe the information areas.
EFFECT: reduced system operating time, achieved by classifying documents by form and identified metadata and by analysing only the informative part of the document.
1 dwg
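The text-processing steps named in the abstract (reduction to basic word forms, deletion of insignificant words, frequency-based weights, predicates built from manually classified texts, and logical operations over those predicates) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the stop-word list, the tiny lemma table, the `top_k` and `threshold` parameters, the scoring rule, and all function names. The combinators cover only Boolean AND, OR and NOT, which is a simplification of the algebra of finite predicates mentioned above.

```python
import re
from collections import Counter
from typing import Callable

# Hypothetical stop-word list and lemma table, for illustration only.
# A real system would use a proper lemmatizer for the document language.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}
LEMMAS = {"invoices": "invoice", "payments": "payment", "employees": "employee"}

Weights = dict[str, float]
Predicate = Callable[[Weights], bool]


def word_weights(text: str) -> Weights:
    """Reduce words to base forms, drop insignificant words,
    and weight the rest by relative frequency."""
    words = re.findall(r"[a-z]+", text.lower())
    base_forms = [LEMMAS.get(w, w) for w in words]
    significant = [w for w in base_forms if w not in STOP_WORDS]
    counts = Counter(significant)
    total = sum(counts.values())
    return {w: n / total for w, n in counts.items()} if total else {}


def train(labelled: list[tuple[str, str]], top_k: int = 3) -> dict[str, Weights]:
    """Accumulate word weights per information area from manually
    classified texts and keep the top_k heaviest word forms."""
    totals: dict[str, Counter] = {}
    for text, area in labelled:
        totals.setdefault(area, Counter()).update(word_weights(text))
    return {area: dict(c.most_common(top_k)) for area, c in totals.items()}


def area_predicate(area_weights: Weights, threshold: float = 0.1) -> Predicate:
    """Predicate that holds when a document's weighted overlap with the
    area's significant word forms reaches the (assumed) threshold."""
    def predicate(doc: Weights) -> bool:
        score = sum(doc.get(w, 0.0) * wt for w, wt in area_weights.items())
        return score >= threshold
    return predicate


# Boolean combinators over predicates (assumed simplification).
def p_and(a: Predicate, b: Predicate) -> Predicate:
    return lambda doc: a(doc) and b(doc)


def p_or(a: Predicate, b: Predicate) -> Predicate:
    return lambda doc: a(doc) or b(doc)


def p_not(a: Predicate) -> Predicate:
    return lambda doc: not a(doc)
```

In use, one would call `train` on the labelled texts, wrap each area's weights with `area_predicate`, and combine the results, for example `p_and(is_finance, p_not(is_hr))`, to express a dependence between two information areas.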
Title | Year | Author | Number |
---|---|---|---|
METHOD OF AUTOMATIC CLASSIFICATION OF CONFIDENTIAL FORMALIZED DOCUMENTS IN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM | 2015 | | RU2647640C2 |
METHOD FOR AUTOMATIC CLASSIFICATION OF FORMALIZED TEXT DOCUMENTS AND AUTHORIZED USERS OF ELECTRONIC DOCUMENT MANAGEMENT SYSTEM | 2017 | | RU2692043C2 |
METHOD FOR AUTOMATIC CLASSIFICATION OF ELECTRONIC DOCUMENTS IN AN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM WITH AUTOMATIC GENERATION OF RESOLUTION PROPS OF A MANAGER | 2018 | | RU2692972C1 |
METHOD FOR AUTOMATIC CLASSIFICATION OF ELECTRONIC DOCUMENTS IN AN ELECTRONIC DOCUMENT MANAGEMENT SYSTEM WITH AUTOMATIC GENERATION OF ELECTRONIC CASES | 2019 | | RU2726931C1 |
METHOD FOR AUTOMATIC CLASSIFICATION OF FORMALIZED ELECTRONIC GRAPHIC AND TEXT DOCUMENTS IN THE ELECTRONIC DOCUMENT CIRCULATION SYSTEM WITH AUTOMATIC FORMATION OF ELECTRONIC CASES | 2020 | | RU2759887C1 |
METHOD FOR AUTOMATED CLASSIFICATION OF DOCUMENTS | 2003 | | RU2254610C2 |
METHOD FOR STREAM PROCESSING OF TEXT MESSAGES | 2003 | | RU2251148C1 |
METHOD FOR ORDERING DATA SUBMITTED IN ALPHANUMERIC INFORMATION BLOCKS | 2000 | | RU2210809C2 |
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY | 2019 | | RU2732850C1 |
METHOD OF CLASSIFYING DOCUMENTS BY CATEGORIES | 2012 | | RU2491622C1 |
Authors
Dates
2015-04-10—Published
2013-12-11—Filed