FIELD: computer technology.
SUBSTANCE: a machine learning algorithm (MLA) is trained on a first set of string pairs, which has a natural proportion of string pairs of each context. After this training, the MLA is biased towards generating a parallel string as a translation of the corresponding string as it occurs in the main context. The method includes determining a second set of string pairs containing a controlled proportion of string pairs of each context, the pairs in the second set being associated with labels indicating their respective contexts. The method further includes re-training the MLA using the second set of string pairs and the respective labels, so that the MLA learns to determine the context of a string and to generate the corresponding parallel string as a translation that takes that context into account.
EFFECT: reduced bias, acquired by the MLA during initial training, when translating a text string in a first language into a parallel text string in a second language.
20 cl, 6 dwg
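The second set of string pairs described above can be illustrated with a minimal sketch. The abstract does not state how the controlled proportion is obtained, so everything here is an assumption made for illustration: the function name `build_controlled_set`, the `proportions` mapping, the `total` size, the premise that each string pair already carries a context tag, and the choice to sample with replacement when a context has too few pairs.

```python
import random
from collections import defaultdict


def build_controlled_set(pairs, proportions, total, seed=0):
    """Sample a labelled re-training set with a chosen share per context.

    pairs       -- iterable of (source, target, context) tuples; the context
                   tag on each pair is an assumption of this sketch
    proportions -- dict mapping context -> desired fraction of the result
    total       -- desired number of pairs in the result
    Returns a shuffled list of (source, target, context_label) tuples.
    """
    rng = random.Random(seed)

    # Group the naturally distributed pairs by their context.
    by_context = defaultdict(list)
    for src, tgt, ctx in pairs:
        by_context[ctx].append((src, tgt, ctx))

    selected = []
    for ctx, fraction in proportions.items():
        pool = by_context.get(ctx, [])
        if not pool:
            raise ValueError(f"no string pairs available for context {ctx!r}")
        quota = round(total * fraction)
        if quota <= len(pool):
            # Enough pairs: sample without replacement.
            selected.extend(rng.sample(pool, quota))
        else:
            # Too few pairs: oversample with replacement (assumed policy).
            selected.extend(rng.choices(pool, k=quota))

    rng.shuffle(selected)
    return selected


# Hypothetical first set: 90 pairs from a main context, 10 from a rare one.
first_set = [(f"src{i}", f"tgt{i}", "main") for i in range(90)]
first_set += [(f"src{i}", f"tgt{i}", "rare") for i in range(90, 100)]

# Second set with an even split; each pair keeps its context label.
second_set = build_controlled_set(first_set, {"main": 0.5, "rare": 0.5}, total=20)
```

Each tuple in `second_set` keeps its context label, so a re-training step could feed that label to the model together with the string pair. How the label is actually supplied to the MLA is not specified in the abstract.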
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SERVER FOR PROCESSING TEXT SEQUENCE IN MACHINE PROCESSING TASK | 2020 | | RU2775820C2 |
METHOD AND SERVER FOR TEACHING A NEURAL NETWORK TO FORM A TEXT OUTPUT SEQUENCE | 2020 | | RU2798362C2 |
METHOD AND SYSTEM FOR GENERATING AN OBJECT CARD | 2018 | | RU2739554C1 |
METHOD AND SERVER FOR REPEATED TRAINING OF MACHINE LEARNING ALGORITHM | 2019 | | RU2743932C2 |
METHOD AND SERVER FOR DETERMINING TRAINING SET FOR MACHINE LEARNING ALGORITHM (MLA) TRAINING | 2020 | | RU2817726C2 |
METHOD AND SYSTEM FOR CLASSIFYING WORD AS OBSCENE WORD | 2020 | | RU2803576C2 |
METHOD AND SYSTEM FOR TRAINING MACHINE LEARNING ALGORITHM TO PREDICT VISIBILITY ASSESSMENT | 2022 | | RU2814079C1 |
METHOD AND SYSTEM FOR DETERMINATION OF RANKED POSITIONS OF ELEMENTS BY RANKING SYSTEM | 2020 | | RU2781621C2 |
METHOD AND SERVER FOR GENERATING EXTENDED REQUEST | 2021 | | RU2813582C2 |
SYSTEM AND METHOD FOR FORMATION OF TRAINING SET FOR MACHINE LEARNING ALGORITHM | 2020 | | RU2790033C2 |
Dates
2022-04-18—Published
2020-03-04—Filed