FIELD: physics.
SUBSTANCE: the group of inventions relates to computer engineering and can be used for additional training of a language model to solve text classification tasks. The method of classifying text with a language model is carried out by at least one computing device and comprises the following steps: obtaining an input data set corresponding to the required classification task, in the format on which the language model was additionally trained; formatting the input data set by supplementing it with symbols, each of which corresponds to an abstract pseudoword; tokenizing and vectorizing the input data set, wherein the symbols corresponding to abstract pseudowords are replaced with their trained vector representations; processing the resulting data with the language model to obtain a vector of logits that reflects the probability distribution of classes corresponding to the words of the language model's dictionary; selecting the target logit components, which correspond to the tokens of the target classes of the classification task being solved; determining the logit component that reflects the highest probability of belonging to a target class; and generating a response in text form corresponding to the selected component.
EFFECT: enables automatic generation of prompts (hints) for additional training of the language model.
7 cl, 5 dwg
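The classification steps listed in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the vocabulary, the two-dimensional embeddings, the pseudoword symbols `<P0>` and `<P1>`, their "trained" vectors, and the mean-pooling `toy_model_logits` function are all hypothetical stand-ins chosen so the example runs without a real language model.

```python
# Hypothetical toy vocabulary; a real language model has a far larger dictionary.
VOCAB = ["<P0>", "<P1>", "the", "film", "was", "great", "awful",
         "positive", "negative"]
TOKEN_ID = {tok: i for i, tok in enumerate(VOCAB)}

# Hypothetical 2-d embeddings, also reused as output weights (tied weights).
EMBED = {
    "<P0>": [0.0, 0.0], "<P1>": [0.0, 0.0],
    "the": [0.0, 0.0], "film": [0.0, 0.0], "was": [0.0, 0.0],
    "great": [1.5, 0.0], "awful": [0.0, 1.5],
    "positive": [1.0, 0.0], "negative": [0.0, 1.0],
}

# Assumed result of additional training: one vector per pseudoword symbol.
TRAINED_PSEUDO = {"<P0>": [0.1, 0.1], "<P1>": [0.1, 0.1]}


def format_input(text):
    """Supplement the input with symbols that stand for abstract pseudowords."""
    return f"<P0> {text} <P1>"


def vectorize(formatted):
    """Whitespace tokenization; pseudoword symbols get their trained vectors."""
    return [TRAINED_PSEUDO.get(tok, EMBED[tok]) for tok in formatted.split()]


def toy_model_logits(vectors):
    """Stand-in for the language model: mean-pool, then score every vocab word."""
    dim = len(vectors[0])
    pooled = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return [sum(p * w for p, w in zip(pooled, EMBED[tok])) for tok in VOCAB]


def classify(text, class_tokens):
    """Keep only the logit components of the class tokens and pick the largest."""
    logits = toy_model_logits(vectorize(format_input(text)))
    target = {label: logits[TOKEN_ID[label]] for label in class_tokens}
    # The largest logit is also the most probable class under softmax.
    return max(target, key=target.get)


labels = ["positive", "negative"]
print(classify("the film was great", labels))
print(classify("the film was awful", labels))
```

In this toy setup the largest logit over the whole vocabulary for the first input belongs to the word "great", which is not a class label. Restricting the comparison to the components of the target class tokens is what guarantees that the text response is one of the permitted classes.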
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION | 2022 | | RU2796208C1 |
METHOD AND SYSTEM FOR GENERATING TEXT | 2023 | | RU2817524C1 |
METHOD AND SYSTEM FOR PARAPHRASING TEXT | 2023 | | RU2814808C1 |
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS | 2020 | | RU2758683C2 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2804747C1 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2802549C1 |
METHOD AND SYSTEM FOR RETRIEVING NAMED ENTITIES | 2020 | | RU2760637C1 |
METHOD AND SYSTEM FOR EXTRACTING NAMED ENTITIES | 2021 | | RU2823914C2 |
AUTOMATED LEGAL ADVICE SYSTEM CONTROL METHOD | 2019 | | RU2718978C1 |
SYSTEM AND METHOD FOR AUTOMATED ASSESSMENT OF INTENTIONS AND EMOTIONS OF USERS OF DIALOGUE SYSTEM | 2020 | | RU2762702C2 |
Authors
Dates
2024-05-03—Published
2022-05-27—Filed