FIELD: physics.
SUBSTANCE: invention relates to computer engineering, specifically to the detection of electronic text information containing confidential data. In the disclosed method, the corpus of text information is first preprocessed and vectorized, and the resulting sample is then divided into training and test sets. Classification models are then trained sequentially. The trained models are tested, their accuracy values are calculated, and their test results are combined in order to correct the errors of the first and second kind (false positives and false negatives) that individual models make when classifying electronic text information for the presence of confidential data. Accuracy values are then calculated for each combination of models formed, and the combination with the highest accuracy is selected.
EFFECT: high accuracy of classifying electronic text information by degree of confidentiality.
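
The combination and selection step described under SUBSTANCE can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not state how test results are combined, so majority voting (with ties resolved towards "confidential") is assumed here, and the model names, labels and predictions are hypothetical values chosen only for illustration.

```python
from itertools import combinations


def accuracy(y_true, y_pred):
    """Share of test documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_vote(pred_lists):
    """Combine per-model predictions (1 = confidential, 0 = not confidential).

    Assumption: a tie is resolved as 1, i.e. towards "confidential".
    """
    n = len(pred_lists)
    return [1 if 2 * sum(votes) >= n else 0 for votes in zip(*pred_lists)]


def best_combination(y_true, model_preds):
    """Score every non-empty combination of models and return the most accurate.

    On equal accuracy the first combination encountered (the smaller one) is kept.
    """
    best_names, best_acc = None, -1.0
    names = sorted(model_preds)
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            combined = majority_vote([model_preds[m] for m in combo])
            acc = accuracy(y_true, combined)
            if acc > best_acc:
                best_names, best_acc = combo, acc
    return best_names, best_acc


# Hypothetical test-set labels and predictions of three already trained models.
# Each model makes one false positive and one false negative, on different documents.
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
model_preds = {
    "model_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "model_b": [1, 1, 1, 0, 1, 0, 0, 0],
    "model_c": [0, 0, 1, 1, 1, 0, 1, 0],
}

best_names, best_acc = best_combination(y_true, model_preds)
print(best_names, best_acc)
```

In this toy data each model alone reaches an accuracy of 0.75, while the three-model vote cancels the individual errors, which is the effect the abstract attributes to combining test results.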
1 cl, 5 dwg
Authors
Dates
2025-02-05—Published
2024-03-05—Filed