METHOD AND SYSTEM FOR OBTAINING VECTOR REPRESENTATIONS OF DATA IN TABLE TAKING INTO ACCOUNT STRUCTURE OF TABLE AND ITS CONTENT
Russian patent published in 2025 - IPC G06F40/284 G06N3/04

Abstract RU 2839037 C1

FIELD: data processing.

SUBSTANCE: the group of inventions relates to data processing and can be used to obtain vector representations of data in a table based on the table's structure and content. The method comprises the following steps: obtaining data that includes text and a table structure, where the table is defined as a set consisting of a list of table header cells and a list of table body cells; marking each table body cell with tags that specify a table identifier, the list of atomic columns to which the cell belongs, and the list of atomic rows to which the cell belongs; supplementing the data of each table body cell with information from the corresponding header cells; tokenising the text in the table; performing positional encoding at the table row level; forming a vector representation for each token in the table by aggregating the token's vector representation with its positional vector representation; generating an attention matrix based on each cell's membership in a table column or row; storing the coordinates of the table cell boundaries in the sequence of table tokens; feeding the prepared text and positional vector representations of the tokens, together with the attention matrix, to the base model, which processes them to obtain contextualised vector representations of the tokens; and, using the stored cell boundary coordinates, applying pooling to obtain a vector representation of each table cell.
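
The preprocessing steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Cell` class and every function name are hypothetical, whitespace splitting stands in for a real tokeniser, header information is added by simple text prefixing, the row-level position is taken as the first atomic row of a cell, and mean pooling is assumed because the abstract does not say which pooling is used. The base model itself is omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    """Hypothetical cell record carrying the tags named in the abstract."""
    text: str
    rows: List[int]  # atomic rows the cell belongs to
    cols: List[int]  # atomic columns the cell belongs to


def add_header_text(headers: List[Cell], body: List[Cell]) -> List[Cell]:
    """Prefix each body cell with the text of header cells sharing a column.

    Text prefixing is an assumption; the abstract only says the cell data
    is supplemented with header information.
    """
    enriched = []
    for cell in body:
        prefix = [h.text for h in headers if set(h.cols) & set(cell.cols)]
        enriched.append(Cell(" ".join(prefix + [cell.text]), cell.rows, cell.cols))
    return enriched


def flatten(cells: List[Cell]) -> Tuple[List[str], List[int], List[Tuple[int, int]]]:
    """Tokenise all cells into one sequence and store cell boundaries.

    Returns the tokens, the index of the owning cell for each token, and a
    half-open (start, end) span per cell.
    """
    tokens: List[str] = []
    owner: List[int] = []
    spans: List[Tuple[int, int]] = []
    for index, cell in enumerate(cells):
        start = len(tokens)
        cell_tokens = cell.text.split()  # stand-in for a subword tokeniser
        tokens.extend(cell_tokens)
        owner.extend([index] * len(cell_tokens))
        spans.append((start, len(tokens)))
    return tokens, owner, spans


def row_positions(cells: List[Cell], owner: List[int]) -> List[int]:
    """Row-level position id per token: first atomic row of the owning cell."""
    return [min(cells[i].rows) for i in owner]


def attention_mask(cells: List[Cell], owner: List[int]) -> List[List[int]]:
    """mask[i][j] is 1 when the cells owning tokens i and j share a row or column."""
    def related(a: Cell, b: Cell) -> bool:
        return bool(set(a.rows) & set(b.rows) or set(a.cols) & set(b.cols))

    return [[int(related(cells[i], cells[j])) for j in owner] for i in owner]


def mean_pool(token_vectors: List[List[float]],
              spans: List[Tuple[int, int]]) -> List[List[float]]:
    """Average the token vectors inside each cell span (empty cell gives [])."""
    pooled = []
    for start, end in spans:
        rows = token_vectors[start:end]
        pooled.append([sum(column) / len(rows) for column in zip(*rows)])
    return pooled
```

In a two-by-two table, for instance, this mask lets tokens of the top-left cell attend to the cells in the same row and the same column, but not to the bottom-right cell, which shares neither. In a real pipeline the `token_vectors` passed to `mean_pool` would be the contextualised outputs of the base model.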

EFFECT: faster training of a language model when working with spreadsheet documents.

6 cl, 5 dwg

Similar patents RU2839037C1

Title Year Author Number
TEXT CLASSIFICATION METHOD AND SYSTEM 2022
  • Konodyuk Nikita Evgenevich
  • Tikhonova Mariya Ivanovna
RU2818693C2
METHOD AND DEVICE FOR DETERMINING FRAUDULENT TRANSACTIONS OF USER 2024
  • Vyshegorodtsev Kirill Evgenevich
  • Gubanov Dmitrij Nikolaevich
  • Saukov Pavel Aleksandrovich
  • Umerenko Grigorij Sergeevich
RU2839053C1
ADJUSTABLE TABLE STYLES FOR SPREADSHEETS 2006
  • Simkej Roj
  • Gejner Dehvid F.
  • Khouk Tom Dzh.
  • Chemberlehjn Benzhamin K.
  • Dzhajanti Paavani
  • Ehllis Charl'Z D.
RU2419851C2
METHOD AND SYSTEM FOR DETECTING OBFUSCATED MALICIOUS COMMANDS IN SYSTEM CONSOLE OF OPERATING SYSTEM 2024
  • Vyshegorodtsev Kirill Evgenevich
  • Nagornov Ivan Grigorevich
  • Balashov Aleksandr Viktorovich
  • Saukov Pavel Aleksandrovich
  • Levkina Ulyana Sergeevna
  • Novikov Evgenij Aleksandrovich
RU2838483C1
METHOD AND SYSTEM FOR TRAINING CHATBOT SYSTEM 2023
  • Zinov Nikolaj Aleksandrovich
  • Korenev Artem Arkadevich
RU2820264C1
METHOD AND SYSTEM FOR OBTAINING VECTOR REPRESENTATION OF ELECTRONIC TEXT DOCUMENT FOR CLASSIFICATION BY CATEGORIES OF CONFIDENTIAL INFORMATION 2021
  • Vyshegorodtsev Kirill Evgenevich
  • Obolenskij Ivan Aleksandrovich
  • Golovnya Maksim Sergeevich
RU2775358C1
METHOD AND DEVICE FOR GENERATING VIDEO CLIP FROM TEXT DESCRIPTION AND SEQUENCE OF KEY POINTS SYNTHESIZED BY DIFFUSION MODEL 2024
  • Demochkin Kirill Vladislavovich
  • Sobolev Konstantin Victorovich
  • Kuzhamuratov Arsen Rinatovich
  • Zhirnov Mikhail Denisovich
  • Bortnikov Mikhail Evgenievich
  • Chernyavskiy Alexey Stanislavovich
RU2823216C1
EXTRACTING INFORMATION FROM STRUCTURED DOCUMENTS CONTAINING TEXT IN NATURAL LANGUAGE 2015
  • Danielyan Tatiana Vladimirovna
  • Bulgakov Ilya Aleksandrovich
RU2607976C1
SYSTEM AND METHOD FOR TRAINING MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS 2023
  • Boimel Aleksandr Alekseevich
  • Gusev Daniil Vladimirovich
  • Kulunchakov Andrei Sergeevich
  • Mironov Artem Vladimirovich
RU2829065C1
METHOD FOR PREDICTION OF DIAGNOSIS BASED ON DATA PROCESSING CONTAINING MEDICAL KNOWLEDGE 2019
  • Tarasov Denis Stanislavovich
RU2723674C1

RU 2 839 037 C1

Authors

Volkov Maksim Aleksandrovich

Dates

2025-04-25 Published

2024-08-28 Filed