COMPUTER-IMPLEMENTED METHOD OF TRAINING NEURAL NETWORK TO DETERMINE GENRE AND SUBGENRE OF TEXT

Russian patent published in 2024 - IPC G06F17/00

Abstract RU 2831511 C1

FIELD: information technology.

SUBSTANCE: the invention relates to determining the genre of text, in particular to training a neural network to determine the genre and subgenre of text, including text of large volume and complex semantic structure. The proposed training method proceeds in three stages.

At the first stage, texts from a first group are provided, which relate to one genre and contain an end-to-end named entity, together with a dictionary containing said named entity and the words that fall within a predetermined step before and after it. The neural network is trained on a text from the first group: during training, it selects the named entity and the words and/or context structures that fall within the given step before and after the named entity, places them in a list, and compares the list with said dictionary; based on the comparison, the neural network outputs the matching result used to determine the genre of the text.

At the second stage, texts from a second group are provided, which relate to the same genre and contain different named entities. The neural network trained at the first stage is trained on a text from the second group, repeating the operations of the first stage starting with named entity selection.

At the third stage, after the neural network has been trained on at least two genres, texts from a third group are provided, which relate to said trained genres and contain different named entities, together with a combined dictionary obtained from the augmented dictionaries for said trained genres. The neural network is trained on a text from the third group, repeating the operations of the first stage starting with named entity selection, with the combined dictionary used for the comparison operation; at the output, the neural network outputs a comparison result used to determine the genre and subgenre of the text.
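The list-building and dictionary-comparison operations described in the abstract can be illustrated with a minimal Python sketch. It is not the patented neural network: it uses plain token matching in place of the network's named entity selection. The function names (`context_window`, `match_score`, `merge_dictionaries`, `best_genre`), the overlap-fraction score, and the sample data are all assumptions made for illustration; the abstract does not specify how the comparison result is computed.

```python
from typing import Dict, List, Set


def context_window(tokens: List[str], entity: str, step: int) -> List[str]:
    """Collect the entity and the words within `step` positions before and
    after each of its occurrences (hypothetical stand-in for the selection
    that the abstract attributes to the neural network)."""
    collected: List[str] = []
    for i, tok in enumerate(tokens):
        if tok == entity:
            lo = max(0, i - step)
            hi = min(len(tokens), i + step + 1)
            collected.extend(tokens[lo:hi])
    return collected


def match_score(collected: List[str], dictionary: Set[str]) -> float:
    """Assumed comparison: fraction of distinct collected words that are
    present in the genre dictionary."""
    unique = set(collected)
    if not unique:
        return 0.0
    return len(unique & dictionary) / len(unique)


def merge_dictionaries(per_genre: Dict[str, Set[str]]) -> Set[str]:
    """Build a combined dictionary as the union of per-genre dictionaries
    (an assumed reading of the third stage)."""
    merged: Set[str] = set()
    for words in per_genre.values():
        merged |= words
    return merged


def best_genre(collected: List[str], per_genre: Dict[str, Set[str]]) -> str:
    """Pick the genre whose dictionary overlaps most with the collected list."""
    return max(per_genre, key=lambda g: match_score(collected, per_genre[g]))


# Illustrative data only; not taken from the patent.
tokens = "the detective holmes examined the clue near the river".split()
dictionaries = {
    "detective": {"holmes", "detective", "clue", "examined"},
    "romance": {"love", "heart", "the"},
}
window = context_window(tokens, entity="holmes", step=2)
print(window)
print(best_genre(window, dictionaries))
```

In this sketch the step is counted in tokens and the score is a simple set overlap; a real system would rely on a trained named entity recognizer and a learned comparison.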

EFFECT: the proposed method reduces the total amount of training data and the training time needed for a neural network to determine the genre and subgenre of large text corpora, while maintaining high accuracy of results.

5 cl

Similar patents RU2831511C1

Title Year Author Number
METHOD AND SYSTEM FOR GENERATING TEXT 2023
  • Tikhonova Mariya Ivanovna
RU2817524C1
METHOD AND SYSTEM FOR CLASSIFYING AND FILTERING PROHIBITED CONTENT IN A NETWORK 2020
  • Prudkovskij Nikolaj Sergeevich
RU2738335C1
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION 2022
  • Tikhonova Mariya Ivanovna
RU2796208C1
METHOD FOR ATTRIBUTION OF PARTIALLY STRUCTURED TEXTS FOR FORMATION OF NORMATIVE-REFERENCE INFORMATION 2020
  • Fedosin Sergei Alekseevich
  • Plotnikova Natalia Pavlovna
  • Martynov Vladislav Aleksandrovich
  • Ryskin Konstantin Eduardovich
  • Kuznetsov Dmitrii Aleksandrovich
  • Deniskin Aleksandr Vladimirovich
  • Vechkanova Iuliia Sergeevna
  • Fediushkin Nikolai Alekseevich
  • Tsilikov Nikita Sergeevich
RU2750852C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2804747C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2802549C1
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION 2014
  • Nekhaj Ilya Vladimirovich
RU2665239C2
SENTIMENT ANALYSIS AT LEVEL OF ASPECTS AND CREATION OF REPORTS USING MACHINE LEARNING METHODS 2016
  • Mikhajlov Maksim Borisovich
  • Pasechnikov Konstantin Alekseevich
RU2635257C1
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS 2018
  • Indenbom Evgenij Mikhajlovich
RU2686000C1
METHOD OF EXTRACTING FACTS FROM TEXTS ON NATURAL LANGUAGE 2016
  • Starostin Anatolij Sergeevich
  • Smurov Ivan Mikhajlovich
  • Dzhumaev Stanislav Sergeevich
RU2637992C1

RU 2 831 511 C1

Authors

Aliev Rizvan Idrisovich

Grigorev Sergej Sergeevich

Kiparisov Aleksej Sergeevich

Maksimovich Andrej Vladimirovich

Strokin Aleksej Anatolevich

Strokin Nikita Alekseevich

Dates

2024-12-09 Published

2023-11-12 Filed