FIELD: computer engineering.
SUBSTANCE: the method includes extracting the metadata and the informative part of a document, converting the document from its storage format into text, converting words into word forms, discarding non-significant words, calculating word weights, and generating a set of classification features. At the training stage, a system of predicates for identifying the confidentiality mark of a document is generated from a set of classified documents. At the document classification stage, a decision is made, based on the document's features, on the relevance of the document to each of the confidentiality marks. Also at the training stage, a system of predicates for identifying the confidentiality mark of authorized users is generated from a set of manually classified authorized users, wherein the set of classification features is formed from the confidentiality marks of incoming documents and the access rights of authorized users of the system to these documents.
EFFECT: automatic classification of formalized text documents and authorized users of electronic document management system according to confidentiality marks.
1 cl, 1 dwg, 1 tbl
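As an illustration only, the document-side steps named under SUBSTANCE (text normalization, discarding non-significant words, word weights, training predicates, per-mark decision) might be sketched in Python as follows. Everything specific here is an assumption and not taken from the patent: the `STOP_WORDS` list, lowercasing as a stand-in for conversion into word forms, relative term frequency as the word weight, and a "predicate" modeled as the set of terms seen only under one confidentiality mark. The function names `word_weights`, `train_predicates`, and `classify` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical list of non-significant words; the patent does not specify one.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}


def word_weights(text):
    """Return relative term frequencies of the significant words in `text`."""
    # Assumption: lowercasing stands in for conversion of words into word forms.
    tokens = re.findall(r"[a-z]+", text.lower())
    significant = [t for t in tokens if t not in STOP_WORDS]
    counts = Counter(significant)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def train_predicates(labeled_docs):
    """Training stage: build one predicate per mark from (text, mark) pairs.

    Assumption: a predicate is the set of terms that occur only in
    documents carrying that mark.
    """
    terms_by_mark = {}
    for text, mark in labeled_docs:
        terms_by_mark.setdefault(mark, set()).update(word_weights(text))
    predicates = {}
    for mark, terms in terms_by_mark.items():
        others = set().union(
            *(t for m, t in terms_by_mark.items() if m != mark)
        )
        predicates[mark] = terms - others
    return predicates


def classify(text, predicates):
    """Classification stage: decide, for each mark, whether the document is relevant."""
    weights = word_weights(text)
    return {
        mark: sum(weights.get(t, 0.0) for t in terms) > 0
        for mark, terms in predicates.items()
    }


training = [
    ("The secret budget of the project", "confidential"),
    ("Public holiday schedule for the office", "public"),
]
predicates = train_predicates(training)
decision = classify("Budget for the secret project", predicates)
```

With this toy training set, `decision` marks the new document as relevant to `"confidential"` and not to `"public"`. The user-side classification described in the abstract could plausibly reuse the same shape, with features built from the marks of the documents a user may access rather than from word weights, but that is not shown here.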
Authors
Dates
2019-06-19—Published
2017-12-18—Filed