FIELD: physics.
SUBSTANCE: the group of inventions relates to computer engineering and can be used for additional training of a language model to solve a text classification task. The method of classifying text with a language model is carried out by at least one computing device and comprises the following steps: obtaining an input data set corresponding to the required classification task, in the format on which the language model was additionally trained; formatting the input data set by supplementing it with symbols, each of which corresponds to an abstract pseudoword; tokenizing and vectorizing the input data set, wherein the symbols corresponding to abstract pseudowords are replaced with the trained vector representations of those symbols; processing the obtained data to obtain a vector of logits that reflects the probability distribution over classes corresponding to the words of the language model's dictionary; selecting the target components of the logits that correspond to the tokens of the target classes of the classification task being solved; determining the logit component that reflects the highest probability of belonging to a target class; and generating a response in text form corresponding to the selected component.
EFFECT: enabling automatic generation of hints for additional training of the language model.
7 cl, 5 dwg
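The classification steps listed under SUBSTANCE can be illustrated with a minimal sketch. It is not the patented implementation: the vocabulary, the pseudoword symbols `<P0>` and `<P1>`, their placement around the text, the random stand-in vectors, and the mean-pooling `toy_logits` function are all assumptions made for illustration. A real system would use a pretrained language model and pseudoword vectors learned during additional training.

```python
import math
import random

# Hypothetical toy dictionary; a real language model has a far larger one.
VOCAB = ["<unk>", "<P0>", "<P1>", "the", "movie", "was", "good", "bad",
         "positive", "negative"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}
PSEUDOWORDS = ["<P0>", "<P1>"]            # symbols for abstract pseudowords (assumed names)
TARGET_CLASSES = ["positive", "negative"]  # tokens of the target classes (assumed task)
DIM = 4

rng = random.Random(0)
# Stand-in for the model's frozen token embedding table.
EMBED = [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in VOCAB]
# Stand-in for the trained pseudoword vectors (random here, learned in practice).
PSEUDO_EMBED = {p: [rng.uniform(-1.0, 1.0) for _ in range(DIM)] for p in PSEUDOWORDS}


def format_input(text):
    """Supplement the input with pseudoword symbols (placement is an assumption)."""
    return ["<P0>"] + text.lower().split() + ["<P1>"]


def vectorize(tokens):
    """Map tokens to vectors, substituting trained vectors for pseudoword symbols."""
    vectors = []
    for tok in tokens:
        if tok in PSEUDO_EMBED:
            vectors.append(PSEUDO_EMBED[tok])
        else:
            vectors.append(EMBED[TOKEN_ID.get(tok, TOKEN_ID["<unk>"])])
    return vectors


def toy_logits(vectors):
    """Toy stand-in for the language model: one logit per dictionary word."""
    pooled = [sum(col) / len(vectors) for col in zip(*vectors)]
    return [sum(p * e for p, e in zip(pooled, row)) for row in EMBED]


def classify(text):
    logits = toy_logits(vectorize(format_input(text)))
    # Keep only the logit components that correspond to target-class tokens.
    target = {c: logits[TOKEN_ID[c]] for c in TARGET_CLASSES}
    # Softmax over the selected components, shifted by the maximum for stability.
    peak = max(target.values())
    exps = {c: math.exp(v - peak) for c, v in target.items()}
    total = sum(exps.values())
    probs = {c: e / total for c, e in exps.items()}
    # The component with the highest logit gives the text-form response.
    label = max(target, key=target.get)
    return label, probs


label, probs = classify("The movie was good")
print(label, probs)
```

Because the vectors here are random, the predicted label carries no meaning; the sketch only shows how the full-dictionary logit vector is narrowed to the target-class components before the highest one is chosen.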
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION | 2022 | | RU2796208C1 |
METHOD AND SYSTEM FOR GENERATING TEXT | 2023 | | RU2817524C1 |
METHOD AND SYSTEM FOR PARAPHRASING TEXT | 2023 | | RU2814808C1 |
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS | 2020 | | RU2758683C2 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2804747C1 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2802549C1 |
METHOD AND SYSTEM FOR RECOGNIZING INFORMATION CONSTITUTING TRADE SECRET | 2024 | | RU2841161C1 |
METHOD AND SYSTEM FOR OBTAINING VECTOR PRESENTATIONS OF DATA IN TABLE TAKING INTO ACCOUNT STRUCTURE OF TABLE AND ITS CONTENT | 2024 | | RU2839037C1 |
METHOD AND SYSTEM FOR GENERATING RESPONSE TO SEARCH QUERY | 2024 | | RU2834217C1 |
METHOD AND SYSTEM FOR RETRIEVING NAMED ENTITIES | 2020 | | RU2760637C1 |
Dates
2024-05-03—Published
2022-05-27—Filed