METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA

Russian patent published in 2023 - IPC G06F21/62, G06F40/166, G06F18/24

Abstract RU 2804747 C1

FIELD: data protection.

SUBSTANCE: the invention relates specifically to the depersonalization of confidential data in text documents while preserving the data structure. The method comprises the following steps: receiving a document with text data; segmenting and tokenizing the text data; vectorizing the tokens; determining, using a machine learning model, whether each token belongs to the category of confidential data; depersonalizing the data related to tokens containing confidential data while maintaining the data structure; forming a list of replacements; and replacing the original confidential data in the text document with the depersonalized data according to the list of replacements. During replacement, the depersonalized data is formatted in accordance with the positions at which the formatting of parts of the text changes.
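The sequence of steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. All names here (`Replacement`, `tokenize`, `is_confidential`, `mask_preserving_shape`, `build_replacements`, `apply_replacements`) are hypothetical. The vectorization step is omitted, and the machine learning model is replaced by a trivial stand-in rule that flags any token containing a digit. Keeping the character classes and length of each token is only one assumed reading of "maintaining the data structure".

```python
import random
import re
import string
from dataclasses import dataclass


@dataclass
class Replacement:
    start: int        # offset of the original span in the text
    end: int          # offset just past the original span
    original: str
    substitute: str


TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize(text):
    """Split text into (token, start, end) triples, keeping offsets."""
    return [(m.group(), m.start(), m.end()) for m in TOKEN_RE.finditer(text)]


def is_confidential(token):
    """Stand-in for the classifier (assumption): flag tokens with a digit."""
    return any(ch.isdigit() for ch in token)


def mask_preserving_shape(token, rng):
    """Swap each character for a random ASCII one of the same class."""
    out = []
    for ch in token:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # punctuation and other symbols stay as they are
    return "".join(out)


def build_replacements(text, seed=0):
    """Form the list of replacements for all flagged tokens."""
    rng = random.Random(seed)
    return [
        Replacement(start, end, tok, mask_preserving_shape(tok, rng))
        for tok, start, end in tokenize(text)
        if is_confidential(tok)
    ]


def apply_replacements(text, replacements):
    """Apply replacements right to left so earlier offsets stay valid."""
    for r in sorted(replacements, key=lambda r: r.start, reverse=True):
        text = text[:r.start] + r.substitute + text[r.end:]
    return text


if __name__ == "__main__":
    source = "Card 4276 1234 for Ivanov"
    replacements = build_replacements(source)
    print(apply_replacements(source, replacements))
```

Because each substitute in this sketch has the same length as its original, the character offsets at which formatting changes (for example, the start of a bold run) remain valid after replacement. That is one simple way to keep formatting aligned; the abstract does not say whether the patent relies on it.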

EFFECT: the stylistic, semantic, lexical and morphological structure of data in text documents can be preserved during depersonalization, and the accuracy of depersonalization is increased by identifying confidential data in the documents using a machine learning model.

10 cl, 6 dwg, 6 tbl

Similar patents RU2804747C1

Title Year Author Number
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2802549C1
METHOD AND SYSTEM FOR GENERATING TEXT 2023
  • Tikhonova Mariya Ivanovna
RU2817524C1
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION 2022
  • Tikhonova Mariya Ivanovna
RU2796208C1
METHOD AND SYSTEM FOR CLASSIFYING DATA FOR IDENTIFYING CONFIDENTIAL INFORMATION IN THE TEXT 2019
  • Terenin Aleksej Alekseevich
  • Kotova Margarita Aleksandrovna
RU2755606C2
METHOD AND SYSTEM FOR PARAPHRASING TEXT 2023
  • Fenogenova Alena Sergeevna
  • Tikhonova Mariya Ivanovna
RU2814808C1
TEXT CLASSIFICATION METHOD AND SYSTEM 2022
  • Konodyuk Nikita Evgenevich
  • Tikhonova Mariya Ivanovna
RU2818693C2
CLASSIFICATION OF DOCUMENTS BY LEVELS OF CONFIDENTIALITY 2019
  • Zyuzin Andrej Andreevich
  • Uskova Olesya Vladimirovna
RU2732850C1
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION 2014
  • Nekhaj Ilya Vladimirovich
RU2665239C2
METHOD OF IDENTIFYING PERSONAL DATA OF OPEN SOURCES OF UNSTRUCTURED INFORMATION 2013
  • Khusnojarov Farit Faritovich
RU2549515C2
METHOD AND SYSTEM FOR RETRIEVING NAMED ENTITIES 2020
  • Emelyanov Anton Aleksandrovich
RU2760637C1

RU 2 804 747 C1

Authors

Babak Nikita Grigorevich

Belorybkin Leonid Yurevich

Terenin Aleksej Alekseevich

Shabrova Anastasiya Igorevna

Dates

2023-10-04 Published

2022-12-09 Filed