METHOD OF RECOGNIZING NATURE OF TEXT CONTENT
Russian patent published in 2024 - IPC G06N3/00 G06F17/40

Abstract RU 2827987 C1

FIELD: machine learning.

SUBSTANCE: invention relates to a method of recognizing the nature of text content. The method comprises the steps of: generating an initial set of text data sources containing content on a predetermined subject, wherein each source is assigned at least one content nature label and at least one content subject label; automatically parsing each source in the set to identify the author of the source and links to third-party sources, wherein sources not included in the available set of sources are considered third-party sources, and wherein the links to third-party sources are the names of third-party sources and URL links to third-party sources; searching for said third-party sources by the identified links; searching for third-party sources by the identified authors; selecting, from the found third-party sources, those sources whose subject is close to at least one of the content subjects of the initial set of sources; automatically assigning the corresponding content subject labels to the selected sources; forming an additional set of sources from the selected sources; automatically assigning to each source from the additional set at least one content nature label by comparing that source with sources from the initial set that have the same subject as that source; and generating a training set of sources by combining the initial set of sources and the labeled additional set of sources.
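
The steps above can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Source` record, the function names, the Jaccard word-overlap measure, and the `subject_threshold` value are all assumptions made for illustration, since the abstract does not say how subject closeness or source comparison is computed. The search for third-party sources by link and by author is left out; the `found` argument stands in for its results, and only URL links (not source names) are extracted.

```python
import re
from dataclasses import dataclass, field

URL_RE = re.compile(r"https?://[^\s<>]+")


@dataclass
class Source:
    # Hypothetical record type; field names are illustrative only.
    name: str
    url: str
    author: str
    text: str
    subjects: set = field(default_factory=set)  # content subject labels
    natures: set = field(default_factory=set)   # content nature labels


def tokens(text):
    """Lowercased word set; a stand-in for a real text representation."""
    return set(re.findall(r"\w+", text.lower()))


def similarity(a, b):
    """Jaccard overlap of word sets (an assumed measure, not the patent's)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def third_party_links(source, known_urls):
    """URL links in a source's text that point outside the known set."""
    urls = [u.rstrip(".,;)") for u in URL_RE.findall(source.text)]
    return [u for u in urls if u not in known_urls]


def build_training_set(initial, found, subject_threshold=0.2):
    """Label the found third-party sources and merge them with the initial set."""
    additional = []
    for cand in found:
        # Subject labels: copy those of initial sources that are close enough.
        for src in initial:
            if similarity(cand.text, src.text) >= subject_threshold:
                cand.subjects |= src.subjects
        if not cand.subjects:
            continue  # subject not close to any initial subject: not selected
        # Nature labels: take them from the most similar same-subject source.
        peers = [s for s in initial if s.subjects & cand.subjects]
        best = max(peers, key=lambda s: similarity(cand.text, s.text))
        cand.natures |= best.natures
        additional.append(cand)
    return initial + additional
```

In this sketch a candidate inherits its nature labels from the single closest initial source with a shared subject; a vote over several close sources, or a trained classifier, would be equally compatible with the wording of the abstract.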

EFFECT: high accuracy and speed of obtaining the end result.

4 claims

Similar patents RU2827987C1

Title Year Author Number
RETRIEVAL OF INFORMATION OBJECTS USING A COMBINATION OF CLASSIFIERS ANALYZING LOCAL AND NON-LOCAL SIGNS 2018
  • Indenbom Evgenij Mikhajlovich
RU2686000C1
METHOD AND SYSTEM FOR CREATING BRIEF SUMMARY OF DIGITAL CONTENT 2016
  • Sadovskij Aleksandr Anatolevich
RU2637998C1
NAMED ENTITIES FROM THE TEXT AUTOMATIC EXTRACTION 2014
  • Nekhaj Ilya Vladimirovich
RU2665239C2
DISTRIBUTED LEARNING MACHINE LEARNING MODELS FOR PERSONALIZATION 2018
  • Kudinov Mikhail Sergeevich
  • Piontkovskaya Irina Igorevna
  • Nevidomskii Aleksei Yurievich
  • Popov Vadim Sergeevich
  • Vytovtov Petr Konstantinovich
  • Polubotko Dmitry Valerievich
  • Malyugina Olga Valerievna
RU2702980C1
METHOD AND SYSTEM FOR CHECKING MEDIA CONTENT 2022
  • Gorb Roman Viktorovich
  • Yudin Sergej Mikhajlovich
  • Zobnin Aleksej Igorevich
  • Oreshin Pavel Evgenevich
RU2815896C2
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS 2020
  • Shavrina Tatyana Olegovna
RU2758683C2
SYSTEM FOR AUTOMATIC DETERMINATION OF SUBJECT MATTER OF TEXT DOCUMENTS BASED ON EXPLICABLE ARTIFICIAL INTELLIGENCE METHODS 2023
  • Sochenkov Ilia Vladimirovich
  • Zhebel Vladimir Viktorovich
  • Zubarev Denis Vladimirovich
  • Deviatkin Dmitrii Alekseevich
  • Iadrintsev Vasilii Vladimirovich
RU2823436C1
METHOD AND SYSTEM FOR GENERATING AN OBJECT CARD 2018
  • Akulov Yaroslav Viktorovich
RU2739554C1
SYSTEM FOR CREATING DOCUMENTS BASED ON TEXT ANALYSIS ON NATURAL LANGUAGE 2016
  • Danielyan Tatyana Vladimirovna
RU2639655C1
SYSTEM AND METHOD FOR AUTOMATED ASSESSMENT OF INTENTIONS AND EMOTIONS OF USERS OF DIALOGUE SYSTEM 2020
  • Fenogenova Alena Sergeevna
  • Shavrina Tatyana Olegovna
RU2762702C2

RU 2 827 987 C1

Authors

Nikanov Ivan Aleksandrovich

Sevastianov Ruslan Sergeevich

Merkulova Ekaterina Vladimirovna

Dates

2024-10-04 Published

2023-06-30 Filed