MULTISTAGE TRAINING OF MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS

Russian patent published in 2024 - IPC G06N3/08 G06F16/00

Abstract RU 2824338 C2

FIELD: physics.

SUBSTANCE: the invention relates to a system and a method for training a machine learning model to rank in-use digital objects. The method includes: obtaining, by a processor, a first plurality of training digital objects, each of which is associated with a past user action parameter indicating the actions of past users with that training digital object; training the machine learning model, at a first training stage and based on the first plurality of training digital objects, to determine a predicted user action parameter for an in-use digital object, the predicted user action parameter indicating the actions of future users with the in-use digital object; obtaining, by the processor, a second plurality of training digital objects, each of which is associated (a) with a training search query used to generate that training digital object and (b) with a first label indicating the degree of relevance of that training digital object to the training search query; training the machine learning model, at a second training stage following the first training stage and based on the second plurality of training digital objects, to determine a synthesized label for the in-use digital object, the synthesized label indicating the degree of relevance of the in-use digital object to an in-use search query; applying, by the processor, the machine learning model to the first plurality of training digital objects to augment a given object from that plurality with a synthesized label, thereby forming a first augmented plurality of training digital objects; and training the machine learning model, based on the first augmented plurality of training digital objects, to determine a relevance parameter of the in-use digital object, the relevance parameter indicating the degree of relevance of the in-use digital object to the in-use search query. A training digital object from the first plurality of training digital objects contains an indication of a digital document associated with document metadata, and training the machine learning model at the first training stage further includes: converting the document metadata into a text representation made up of tokens; preprocessing the text representation to mask a number of its tokens; and training the machine learning model, based on the first plurality of training digital objects, to determine a given masked token from the context provided by its neighboring tokens. The relevance parameter of the in-use digital object further indicates a semantic relevance parameter, that is, the degree of semantic relevance of the in-use search query to the content of the in-use digital object.
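
The metadata-masking step of the first training stage can be illustrated with a minimal Python sketch. It is not taken from the patent: the metadata field names (`title`, `url`), the whitespace tokenizer, the `[MASK]` symbol, the 15% masking rate and all function names are assumptions made for illustration. The sketch covers only the data preparation (text representation, masking, prediction targets and neighboring context), not the model that would be trained on it.

```python
import random

MASK = "[MASK]"  # assumed mask symbol; the abstract does not name one


def metadata_to_tokens(metadata):
    """Flatten document metadata into tokens (whitespace tokenizer assumed)."""
    text = " ".join(str(metadata[key]) for key in sorted(metadata))
    return text.lower().split()


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace a random subset of tokens with MASK.

    Returns the masked sequence and a {position: original_token} map,
    which plays the role of the prediction targets.
    """
    if not tokens:
        return [], {}
    rng = random.Random(seed)
    # Mask at least one token, never more than the sequence holds.
    n_masked = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = sorted(rng.sample(range(len(tokens)), n_masked))
    targets = {pos: tokens[pos] for pos in positions}
    masked = [MASK if i in targets else tok for i, tok in enumerate(tokens)]
    return masked, targets


def context_window(masked, position, width=2):
    """Neighboring tokens available when predicting one masked position."""
    left = masked[max(0, position - width):position]
    right = masked[position + 1:position + 1 + width]
    return left, right


# Hypothetical metadata of one training document, for illustration only.
doc_metadata = {"title": "Cheap flights to Paris", "url": "example.com/flights/paris"}
tokens = metadata_to_tokens(doc_metadata)
masked, targets = mask_tokens(tokens)

for position, original in targets.items():
    left, right = context_window(masked, position)
    print(f"predict {original!r} at {position} from {left} + {right}")
```

Each masked position, its neighboring context and the original token form one training example for the masked-token objective; which positions are masked depends on the seed.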

EFFECT: high relevance of the search results that a search engine returns in response to a user query, achieved through accurate ranking of those results on the search engine results page (SERP) by the machine learning model.

23 cl, 6 dwg

Similar patents RU2824338C2

Title Year Author Number
METHOD AND SYSTEM FOR CHECKING MEDIA CONTENT 2022
  • Gorb Roman Viktorovich
  • Yudin Sergej Mikhajlovich
  • Zobnin Aleksej Igorevich
  • Oreshin Pavel Evgenevich
RU2815896C2
METHOD AND SYSTEM FOR TRAINING CHATBOT SYSTEM 2023
  • Zinov Nikolaj Aleksandrovich
  • Korenev Artem Arkadevich
RU2820264C1
METHOD AND SYSTEM FOR RANKING SET OF DOCUMENTS FROM SEARCH RESULT 2021
  • Svetlov Vsevolod Aleksandrovich
  • Gushchenko-Cheverda Ivan Ilich
RU2821294C2
METHOD AND SERVER FOR TEACHING A NEURAL NETWORK TO FORM A TEXT OUTPUT SEQUENCE 2020
  • Petrov Aleksey Sergeevich
  • Gubanov Sergey Dmitrievich
  • Gaydaenko Sergey Aleksandrovich
RU2798362C2
METHOD AND SERVER FOR DETERMINING TRAINING SET FOR MACHINE LEARNING ALGORITHM (MLA) TRAINING 2020
  • Dorogush Anna Veronika Yurevna
  • Alipov Vyacheslav Vyacheslavovich
  • Kruchinin Dmitriy Andreevich
  • Oganesyan Dmitry Alekseevich
RU2817726C2
METHOD AND SERVER FOR REPEATED TRAINING OF MACHINE LEARNING ALGORITHM 2019
  • Pevtsov Sergey Evgenievich
  • Kostin Mikhail Yurievich
  • Chigin Anton Olegovich
  • Vasilyev Dmitry Sergeevich
RU2743932C2
METHOD AND SYSTEM FOR RANKING DIGITAL OBJECTS BASED ON TARGET CHARACTERISTIC RELATED TO THEM 2019
  • Ustimenko Aleksey Ivanovich
  • Vorobyev Aleksandr Leonidovich
  • Gusev Gleb Gennadevich
  • Serdyukov Pavel Viktorovich
RU2757174C2
METHODS AND SERVERS FOR RANKING DIGITAL DOCUMENTS IN RESPONSE TO A QUERY 2020
  • Volynets Eduard Mechislavovich
  • Pastushyk Dzianis Sergeevich
  • Grechnikov Yevgeny Aleksandrovich
RU2775815C2
METHOD AND SYSTEM FOR GENERATING TRAINING DATA FOR MACHINE LEARNING ALGORITHM 2021
  • Biryukov Valentin Andreevich
  • Pavlichenko Nikita Vitalevich
  • Fedorova Valentina Pavlovna
RU2819647C2
METHOD OF ESTABLISHING TRAINING OBJECT FOR TRAINING MACHINE TRAINING ALGORITHM 2016
  • Gusev Gleb Gennadevich
  • Fedorova Valentina Pavlovna
  • Mishchenko Andrej Sergeevich
RU2637883C1

RU 2 824 338 C2

Authors

Bojmel Aleksandr Alekseevich

Soboleva Darya Mikhajlovna

Dates

2024-08-07 Published

2021-12-02 Filed