METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA

Russian patent published in 2023 - IPC G06F21/62, G06F40/166, G06F18/24

Abstract RU 2802549 C1

FIELD: data protection.

SUBSTANCE: the invention relates to the reversible depersonalization of confidential data in text documents while preserving the data structure. The method comprises the following steps: receiving a document containing text data; segmenting and tokenizing the text data; vectorizing the tokens; determining, using a machine learning model, whether each token belongs to the category of confidential data; depersonalizing the data corresponding to tokens that contain confidential data while preserving the data structure; forming a table of substitutions and a list of replacements; and replacing the original confidential data in the text document with the depersonalized data according to the list of replacements, wherein the depersonalized data is formatted according to the positions at which the formatting of parts of the text changes.
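The sequence of steps in the method can be illustrated with a minimal Python sketch. It is not the patented implementation. The regular expression `CONFIDENTIAL` is a hypothetical stand-in for the machine learning classifier, token vectorization and the handling of formatting positions are omitted, and all function names (`tokenize`, `find_confidential`, `mask`, `depersonalize`, `restore`) are assumptions introduced for illustration. Structure is preserved here only in a narrow sense: digits are replaced with digits, ASCII letters with letters of the same case, and every other character is kept.

```python
import random
import re
import string

# Hypothetical stand-in for the trained classifier: flags phone-like and
# e-mail-like substrings. The patent uses a machine learning model here.
CONFIDENTIAL = re.compile(r"\+?\d[\d\-]{6,}\d|[\w.]+@[\w.]+\.\w+")


def tokenize(text):
    """Split text into (start, end, token) triples on whitespace."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


def find_confidential(text):
    """Return (start, end) character spans judged confidential."""
    spans = []
    for start, _end, token in tokenize(text):
        m = CONFIDENTIAL.search(token)
        if m:
            spans.append((start + m.start(), start + m.end()))
    return spans


def mask(value, rng):
    """Build a surrogate of the same length and character classes.

    Non-ASCII letters and punctuation are kept unchanged in this sketch.
    """
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isascii() and ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            out.append(rng.choice(pool))
        else:
            out.append(ch)
    return "".join(out)


def depersonalize(text, seed=0):
    """Return (masked_text, substitution_table, replacement_list)."""
    rng = random.Random(seed)
    table = {}          # substitution table: original -> surrogate
    replacements = []   # list of replacements: (start, end, surrogate)
    for start, end in find_confidential(text):
        original = text[start:end]
        if original not in table:
            # Surrogate collisions are not handled in this sketch.
            table[original] = mask(original, rng)
        replacements.append((start, end, table[original]))
    masked = text
    for start, end, surrogate in replacements:
        # Surrogates have the same length, so offsets stay valid.
        masked = masked[:start] + surrogate + masked[end:]
    return masked, table, replacements


def restore(masked, replacements, table):
    """Reverse the depersonalization using the table and the list."""
    inverse = {surrogate: original for original, surrogate in table.items()}
    text = masked
    for start, end, surrogate in replacements:
        text = text[:start] + inverse[surrogate] + text[end:]
    return text


if __name__ == "__main__":
    sample = "Call +7-900-123-45-67 or write to ivan.petrov@example.com today"
    masked, table, replacements = depersonalize(sample, seed=1)
    print(masked)
    print(restore(masked, replacements, table) == sample)
```

Because each surrogate has the same length as the value it replaces, the offsets in the list of replacements remain valid in both directions, which is what makes the reversal straightforward in this sketch.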

EFFECT: makes it possible to preserve the stylistic, semantic, lexical and morphological structure of data in text documents during depersonalization, and increases the accuracy of depersonalization by identifying confidential data in text documents with a machine learning model.

3 cl, 6 dwg, 6 tbl

Similar patents RU2802549C1

Title Year Author Number
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2804747C1
METHOD AND SYSTEM FOR PARAPHRASING TEXT 2023
  • Fenogenova Alena Sergeevna
  • Tikhonova Mariya Ivanovna
RU2814808C1
METHOD AND SYSTEM FOR GENERATING TEXT 2023
  • Tikhonova Mariya Ivanovna
RU2817524C1
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION 2022
  • Tikhonova Mariya Ivanovna
RU2796208C1
TEXT CLASSIFICATION METHOD AND SYSTEM 2022
  • Konodyuk Nikita Evgenevich
  • Tikhonova Mariya Ivanovna
RU2818693C2
METHOD AND SYSTEM FOR CLASSIFYING DATA FOR IDENTIFYING CONFIDENTIAL INFORMATION IN THE TEXT 2019
  • Terenin Aleksej Alekseevich
  • Kotova Margarita Aleksandrovna
RU2755606C2
AUTOMATED LEGAL ADVICE SYSTEM CONTROL METHOD 2019
  • Prikhodko Olga Viktorovna
  • Khyurri Ruslan Vladimirovich
RU2718978C1
METHOD OF IDENTIFYING PERSONAL DATA OF OPEN SOURCES OF UNSTRUCTURED INFORMATION 2013
  • Khusnojarov Farit Faritovich
RU2549515C2
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY 2019
  • Zyuzin Andrej Andreevich
  • Uskova Olesya Vladimirovna
RU2732850C1
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS 2020
  • Shavrina Tatyana Olegovna
RU2758683C2

RU 2 802 549 C1

Authors

Babak Nikita Grigorevich

Belorybkin Leonid Yurevich

Terenin Aleksej Alekseevich

Shabrova Anastasiya Igorevna

Dates

2023-08-30 Published

2022-12-20 Filed