METHOD AND SYSTEM FOR OBTAINING A VECTOR REPRESENTATION OF AN ELECTRONIC DOCUMENT
Russian patent published in 2022 - IPC G06F40/284 G06F16/35 G06F40/30

Abstract RU 2775351 C1

FIELD: computing technology.

SUBSTANCE: a computer-implemented method for obtaining a vector representation of an electronic document, executed by a processing unit, comprises two stages. First, a model that assigns m-skip-n-grams to clusters is generated by: determining the list of m-skip-n-grams to be used; converting each m-skip-n-gram in the list into a vector representation; and clustering the m-skip-n-grams. Second, a text document is processed using the resulting model by: counting the occurrences of m-skip-n-grams in the document; determining the clusters of the document based on those occurrences; summing the number of occurrences of m-skip-n-grams from each cluster; and forming the vector representation of the document.

EFFECT: different meanings of words in the document can be preserved by matching words to multiple clusters.

10 cl, 6 dwg, 1 tbl
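
The stages listed under SUBSTANCE can be illustrated with a minimal Python sketch. The abstract does not say how m-skip-n-grams are vectorized or which clustering algorithm is used, so everything concrete here is an assumption made for illustration: the toy two-dimensional word vectors, the averaging of word vectors to embed a skip-gram, the small deterministic k-means routine, and all function names (`skip_ngrams`, `embed`, `kmeans`, `build_model`, `document_vector`) are hypothetical and are not taken from the patent.

```python
from collections import Counter
from itertools import combinations


def skip_ngrams(tokens, n=2, m=1):
    """Return all n-grams that skip at most m tokens in total (m-skip-n-grams)."""
    grams = []
    for i in range(len(tokens)):
        # The remaining n-1 tokens are drawn from the next n-1+m positions.
        window = range(i + 1, min(i + n + m, len(tokens)))
        for rest in combinations(window, n - 1):
            grams.append((tokens[i],) + tuple(tokens[j] for j in rest))
    return grams


def embed(gram, word_vectors):
    """Assumption: a skip-gram vector is the mean of its word vectors."""
    return tuple(sum(d) / len(gram) for d in zip(*(word_vectors[w] for w in gram)))


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])))


def kmeans(points, k, iterations=10):
    """Tiny k-means with deterministic initialisation (first k points)."""
    centroids = list(points[:k])
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return [nearest(p, centroids) for p in points]


def build_model(corpus, word_vectors, k, n=2, m=1):
    """Stage 1: list the skip-grams, vectorize them, cluster them."""
    vocab = sorted({g for doc in corpus for g in skip_ngrams(doc, n, m)})
    labels = kmeans([embed(g, word_vectors) for g in vocab], k)
    return dict(zip(vocab, labels))  # skip-gram -> cluster index


def document_vector(tokens, model, k, n=2, m=1):
    """Stage 2: count skip-gram occurrences and sum them per cluster."""
    counts = Counter(skip_ngrams(tokens, n, m))
    vector = [0] * k
    for gram, count in counts.items():
        if gram in model:  # skip-grams unknown to the model are ignored
            vector[model[gram]] += count
    return vector


# Hypothetical toy data, chosen only so the example runs.
TOY_WORD_VECTORS = {
    "river": (0.0, 2.0), "water": (0.0, 3.0), "bank": (1.0, 1.0),
    "money": (3.0, 0.0), "loan": (2.0, 0.0),
}
corpus = [["river", "bank", "water"], ["bank", "money", "loan"]]
model = build_model(corpus, TOY_WORD_VECTORS, k=2)
print(document_vector(corpus[0], model, k=2))
```

In this sketch the resulting document vector has one component per cluster, and each component is the total number of occurrences of the skip-grams assigned to that cluster. Because a word such as "bank" takes part in several skip-grams, it can contribute to more than one cluster, which is the property the EFFECT paragraph describes.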

Similar patents RU2775351C1

Title Year Author Number
METHOD AND SYSTEM FOR OBTAINING VECTOR REPRESENTATION OF ELECTRONIC TEXT DOCUMENT FOR CLASSIFICATION BY CATEGORIES OF CONFIDENTIAL INFORMATION 2021
  • Vyshegorodtsev Kirill Evgenevich
  • Obolenskij Ivan Aleksandrovich
  • Golovnya Maksim Sergeevich
RU2775358C1
METHOD AND SYSTEM FOR DETERMINING RESULT OF TASK EXECUTION IN CROWDSOURCED ENVIRONMENT 2019
  • Fedorova Valentina Pavlovna
  • Gusev Gleb Gennadievich
  • Drutsa Alexey Valerievich
RU2744032C2
METHOD AND SYSTEM OF SEMANTIC PROCESSING TEXT DOCUMENTS 2016
  • Mitelkov Dmitrij Vladimirovich
  • Novikov Andrej Yurevich
  • Satin Boris Borisovich
RU2630427C2
METHOD OF CONSTRUCTING SEMANTIC MODEL OF DOCUMENT 2011
  • Turdakov Denis Jur'Evich
  • Nedumov Jaroslav Rostislavovich
  • Sysoev Andrej Anatol'Evich
RU2487403C1
THEMATIC MODELS WITH A PRIORI TONALITY PARAMETERS BASED ON DISTRIBUTED REPRESENTATIONS 2018
  • Tutubalina Elena Viktorovna
  • Nikolenko Sergey Igorevich
RU2719463C1
METHOD FOR GENERATING MATHEMATICAL MODELS OF A PATIENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES 2017
  • Drokin Ivan Sergeevich
  • Bukhvalov Oleg Leonidovich
  • Sorokin Sergej Yurevich
RU2720363C2
AUTOMATED LEGAL ADVICE SYSTEM CONTROL METHOD 2019
  • Prikhodko Olga Viktorovna
  • Khyurri Ruslan Vladimirovich
RU2718978C1
METHOD OF CLASSIFYING DOCUMENTS BY CATEGORIES 2012
  • Lapshin Vladimir Anatol'Evich
  • Pshekhotskaja Ekaterina Aleksandrovna
  • Perov Dmitrij Vsevolodovich
RU2491622C1
AUTOMATIC DETERMINATION OF SET OF CATEGORIES FOR DOCUMENT CLASSIFICATION 2018
  • Nikita Orlov
  • Konstantin Anisimovich
RU2701995C2
AI TRANSACTION ADMINISTRATION SYSTEM 2020
  • Fehling, Ronny
  • Short, Samantha
  • De Goursac, Axel
  • Dubois, Raphael
  • Erlebach, Joerg
  • Von Funck, Karin
RU2777958C2

RU 2 775 351 C1

Authors

Vyshegorodtsev Kirill Evgenevich

Davidov Dmitrij Georgievich

Ryupichev Dmitrij Yurevich

Balashov Aleksandr Viktorovich

Dates

2022-06-29 Published

2021-06-01 Filed