FIELD: data processing.
SUBSTANCE: the disclosed group of inventions relates to training a machine learning model for translation. Disclosed are a method and a server for generating a training data set for training such a model. The method includes: obtaining a corpus of texts in a source language and a corpus of texts in a target language; generating a first version of a translation of a phrase from the source-language corpus into the target language, together with a first validity score for that version; generating a second version of the translation of the phrase into the target language, together with a second validity score; and replacing the phrase in the target language with the first or second version of the translation if the first or second validity score exceeds the baseline validity score associated with that phrase in the target-language corpus.
EFFECT: higher accuracy of the sentences in the target-language corpus of texts, higher quality of translation.
20 cl, 7 dwg
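The replacement step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. The names `Pair`, `Translator`, `refine_pair` and `build_training_set` are hypothetical, and so is the assumption that each source phrase is already paired with a target phrase and its baseline validity score. When both versions exceed the baseline, the sketch keeps the higher-scoring one and stores its score with the pair. The abstract does not specify either choice.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical interface: a translator maps a source phrase to
# (translation, validity_score). The abstract does not define it.
Translator = Callable[[str], Tuple[str, float]]


@dataclass
class Pair:
    source: str      # phrase from the source-language corpus
    target: str      # associated phrase from the target-language corpus
    baseline: float  # baseline validity score of the target phrase


def refine_pair(pair: Pair, first: Translator, second: Translator) -> Pair:
    """Replace the target phrase if a translation version beats the baseline."""
    # First and second versions of the translation with their scores.
    candidates = [first(pair.source), second(pair.source)]
    # Assumption: prefer the higher-scoring version when both qualify.
    best_text, best_score = max(candidates, key=lambda c: c[1])
    if best_score > pair.baseline:
        # Assumption: the winning score is stored as the pair's new score.
        return Pair(pair.source, best_text, best_score)
    # Neither version exceeds the baseline: keep the original pair.
    return pair


def build_training_set(
    pairs: List[Pair], first: Translator, second: Translator
) -> List[Pair]:
    """Apply the replacement step to every pair in the corpus."""
    return [refine_pair(p, first, second) for p in pairs]
```

In practice `first` and `second` would wrap two different translation models or decoding strategies. Any callable that returns a translation and a score fits this sketch.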
Title | Year | Author | Number |
---|---|---|---|
SYSTEM AND METHOD FOR TRAINING MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS | 2023 | | RU2832419C2 |
METHOD AND SERVER FOR TRAINING MACHINE LEARNING ALGORITHM IN TRANSLATION | 2020 | | RU2770569C2 |
METHOD AND SYSTEM FOR TRAINING CHATBOT SYSTEM | 2023 | | RU2820264C1 |
METHOD AND SYSTEM FOR TRANSLATING SOURCE SENTENCE IN FIRST LANGUAGE BY TARGET SENTENCE IN SECOND LANGUAGE | 2017 | | RU2692049C1 |
SYSTEM AND METHOD FOR TRAINING MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS | 2023 | | RU2829065C1 |
METHOD AND SYSTEM FOR CHECKING MEDIA CONTENT | 2022 | | RU2815896C2 |
METHOD AND SERVER FOR TEACHING A NEURAL NETWORK TO FORM A TEXT OUTPUT SEQUENCE | 2020 | | RU2798362C2 |
METHOD AND SERVER FOR PERFORMING CONTEXT-SENSITIVE TRANSLATION | 2021 | | RU2812301C2 |
METHOD AND SERVER FOR PERFORMING PROBLEM-ORIENTED TRANSLATION | 2021 | | RU2820953C2 |
METHOD AND SYSTEM FOR RECOGNIZING USER'S SPEECH FRAGMENT | 2021 | | RU2808582C2 |
Dates
2025-02-24—Published
2023-10-10—Filed