TEXT CLASSIFICATION METHOD AND SYSTEM Russian patent published in 2024 - IPC G06F40/279 G06F16/35 G06N20/00 

Abstract RU 2818693 C2

FIELD: physics.

SUBSTANCE: group of inventions relates to computer engineering and can be used for additional training of a language model to solve a text classification task. The method of classifying text with a language model is carried out by at least one computing device and comprises the following steps: obtaining an input data set corresponding to the required classification task, in the format on the basis of which the language model is additionally trained; formatting the input data set by supplementing it with symbols, each of which corresponds to an abstract pseudoword; tokenizing and vectorizing the input data set, wherein the symbols corresponding to abstract pseudowords are replaced with trained vector representations of those symbols; processing the obtained data to obtain a vector of logits, which reflects the probability distribution over classes corresponding to the words of the language model's dictionary; selecting the target logit components corresponding to the tokens of the target classes of the classification task being solved; determining the logit component that reflects the highest probability of belonging to a target class; and generating a response in text form corresponding to the selected component.

EFFECT: enabling automatic generation of hints for additional training of the language model.

7 cl, 5 dwg
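
The steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the pseudoword symbols `<P0>` and `<P1>`, the function names, the toy vocabulary, the embedding values and the hand-written logits are all assumptions made for illustration, and the language model's forward pass is omitted (the logits are simply given).

```python
import math

# Hypothetical pseudoword symbols; the abstract does not name them.
PSEUDO = ["<P0>", "<P1>"]

def format_input(text, n_prefix=1):
    """Supplement the text with pseudoword symbols (assumed template)."""
    return PSEUDO[:n_prefix] + text.split() + PSEUDO[n_prefix:]

def vectorize(tokens, word_emb, pseudo_emb, unk):
    """Look up a vector per token; pseudowords use separately trained vectors."""
    return [pseudo_emb[t] if t in pseudo_emb else word_emb.get(t, unk)
            for t in tokens]

def classify_from_logits(logits, vocab, class_tokens):
    """Select the logits of the target-class tokens and pick the largest."""
    token_to_id = {tok: i for i, tok in enumerate(vocab)}
    # Keep only the logit components that belong to target-class tokens.
    target = {tok: logits[token_to_id[tok]] for tok in class_tokens}
    # Softmax restricted to the target classes (shifted by the max for stability).
    m = max(target.values())
    exp = {tok: math.exp(v - m) for tok, v in target.items()}
    z = sum(exp.values())
    probs = {tok: e / z for tok, e in exp.items()}
    # The text response is the class token with the highest probability.
    return max(probs, key=probs.get), probs

# Toy data (illustrative values only).
tokens = format_input("good film")
vectors = vectorize(
    tokens,
    word_emb={"good": [1.0, 0.0]},
    pseudo_emb={"<P0>": [0.5, 0.5], "<P1>": [0.1, 0.9]},
    unk=[0.0, 0.0],
)

# Stand-in for the model output: one logit per vocabulary word.
vocab = ["the", "positive", "cat", "negative", "neutral"]
logits = [5.0, 1.2, 0.3, 2.7, -0.5]
label, probs = classify_from_logits(
    logits, vocab, ["positive", "negative", "neutral"]
)
print(label)  # negative
```

In this toy example the word "the" has the largest logit overall, but it is ignored because only the components for the target-class tokens are compared.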

Similar patents RU2818693C2

Title Year Author Number
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION 2022
  • Tikhonova Mariya Ivanovna
RU2796208C1
METHOD AND SYSTEM FOR GENERATING TEXT 2023
  • Tikhonova Mariya Ivanovna
RU2817524C1
METHOD AND SYSTEM FOR PARAPHRASING TEXT 2023
  • Fenogenova Alena Sergeevna
  • Tikhonova Mariya Ivanovna
RU2814808C1
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS 2020
  • Shavrina Tatyana Olegovna
RU2758683C2
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2804747C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2802549C1
METHOD AND SYSTEM FOR RECOGNIZING INFORMATION CONSTITUTING TRADE SECRET 2024
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Garbuzov Georgij Valerevich
  • Denisov Vitalij Igorevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2841161C1
METHOD AND SYSTEM FOR OBTAINING VECTOR PRESENTATIONS OF DATA IN TABLE TAKING INTO ACCOUNT STRUCTURE OF TABLE AND ITS CONTENT 2024
  • Volkov Maksim Aleksandrovich
RU2839037C1
METHOD AND SYSTEM FOR GENERATING RESPONSE TO SEARCH QUERY 2024
  • Dremin Mikhail Vitalevich
  • Korolev Petr Alekseevich
  • Slavkina Tatyana Sergeevna
RU2834217C1
METHOD AND SYSTEM FOR RETRIEVING NAMED ENTITIES 2020
  • Emelyanov Anton Aleksandrovich
RU2760637C1

RU 2 818 693 C2

Authors

Konodyuk Nikita Evgenevich

Tikhonova Mariya Ivanovna

Dates

2024-05-03 Published

2022-05-27 Filed