FIELD: information technology.
SUBSTANCE: the invention relates to the prevention of information leaks, in particular leaks of electronic copies of personal and confidential documents. In the method of training a classifier intended to determine the category of a document, documents belonging to the category are received. For each received document, the objects it contains, which are graphic elements, are identified. For each received document, a set of characteristics is formed from the identified objects. These characteristics describe the presence of an object, the location of an object, the number of objects, the location of one object relative to another, the size of an object, and the angle of inclination of an object. A classifier is then built from the values of the characteristics formed for the received documents.
EFFECT: the technical result is a higher-quality determination of a document's category by the classifier.
13 cl, 8 dwg
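The training steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `GraphicObject` record, the object kinds `"seal"` and `"photo"`, the coordinate normalisation, and the centroid-plus-radius model are all assumptions chosen for illustration, since the abstract does not name a detector or a classifier type. The sketch only shows how the six listed characteristics (presence, location, number, relative location, size, inclination angle) might be turned into a feature vector and how a classifier might be built from documents of a single category.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass
class GraphicObject:
    """A detected graphic element (hypothetical record, not from the patent)."""
    kind: str      # assumed label produced by some detector, e.g. "seal"
    x: float       # centre x, assumed normalised to page width (0..1)
    y: float       # centre y, assumed normalised to page height (0..1)
    width: float   # normalised width
    height: float  # normalised height
    angle: float   # inclination in degrees


KINDS = ("seal", "photo")  # hypothetical object kinds


def build_features(objects, kinds=KINDS):
    """Form the characteristics named in the abstract as one flat vector."""
    features, firsts = [], []
    for kind in kinds:
        found = [o for o in objects if o.kind == kind]
        first = found[0] if found else None
        firsts.append(first)
        features.append(1.0 if found else 0.0)   # presence of an object
        features.append(float(len(found)))       # number of objects
        if first:
            features += [first.x, first.y,               # location
                         first.width * first.height,     # size (area)
                         first.angle / 180.0]            # inclination, scaled
        else:
            features += [0.0, 0.0, 0.0, 0.0]
    # Location of one object relative to another, for every pair of kinds.
    for a, b in combinations(firsts, 2):
        features += [b.x - a.x, b.y - a.y] if a and b else [0.0, 0.0]
    return features


def train(documents, kinds=KINDS):
    """Build a one-category model: centroid of the vectors plus a radius."""
    vectors = [build_features(doc, kinds) for doc in documents]
    centroid = [sum(col) / len(vectors) for col in zip(*vectors)]
    radius = max(dist(v, centroid) for v in vectors)
    return centroid, radius


def belongs(model, objects, kinds=KINDS, slack=1.25):
    """Return True if the document falls within the slackened radius."""
    centroid, radius = model
    return dist(build_features(objects, kinds), centroid) <= radius * slack


# Two training documents of the same (hypothetical) category.
doc1 = [GraphicObject("seal", 0.80, 0.90, 0.1, 0.1, 0.0),
        GraphicObject("photo", 0.20, 0.30, 0.2, 0.3, 0.0)]
doc2 = [GraphicObject("seal", 0.82, 0.88, 0.1, 0.1, 5.0),
        GraphicObject("photo", 0.22, 0.30, 0.2, 0.3, 0.0)]
model = train([doc1, doc2])
```

With this model, `belongs(model, doc1)` accepts a document laid out like the training examples, while a document with no detected objects is rejected. The `slack` factor is an arbitrary illustrative choice; with a single training document the radius is zero, so a practical system would need more examples or a different classifier.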
Title | Year | Author | Number |
---|---|---|---|
CONTENT-BASED DOCUMENT IMAGE CLASSIFICATION | 2014 | | RU2571545C1 |
METHOD FOR RECOGNITION OF TEXT IN IMAGES OF DOCUMENTS | 2021 | | RU2768544C1 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF DOCUMENTS CONTAINING PERSONAL DATA | 2019 | | RU2793607C1 |
SYSTEM AND METHOD OF DETECTING IMAGE CONTAINING IDENTIFICATION DOCUMENT | 2018 | | RU2715515C2 |
METHOD AND SYSTEM FOR EXTRACTING DATA FROM IMAGES OF SEMISTRUCTURED DOCUMENTS | 2015 | | RU2613846C2 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2804747C1 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2802549C1 |
METHOD OF IDENTIFYING PERSONAL DATA OF OPEN SOURCES OF UNSTRUCTURED INFORMATION | 2013 | | RU2549515C2 |
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY | 2019 | | RU2732850C1 |
METHOD AND SYSTEM FOR RECOGNITION OF THE EMOTIONAL STATE OF EMPLOYEES | 2021 | | RU2768545C1 |
Authors
Dates
2018-11-14—Published
2017-09-29—Filed