METHOD FOR EXTRACTING INFORMATION FROM UNSTRUCTURED TEXTS WRITTEN IN NATURAL LANGUAGE
Russian patent published in 2021 - IPC G06F40/00

Abstract RU 2751993 C1

FIELD: computing.

SUBSTANCE: the invention relates to a method for extracting information from unstructured texts written in a natural language. In the method, a set of texts is tokenised into sentences, words and word sequences; rare words are deleted; the remaining words are reduced to their initial form, free of typos; from the words in initial form, a set of words of certain parts of speech that are used to describe the target information is selected; the target information is deemed present in word sequences that contain all words from the selected set, and in all text documents that contain such marked word sequences; the number of text sources, the word occurrence threshold and the set of parts of speech are optimised to achieve a specified quality of information extraction.

EFFECT: increased quality of information extraction from text data sources.

3 cl, 4 dwg, 1 tbl
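
The steps listed under SUBSTANCE can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes that each sentence counts as one word sequence, and it replaces the morphological analysis, typo correction and part-of-speech tagging with small hypothetical lookup tables (LEMMAS, POS). The function name extract and the parameters target_words, min_count and allowed_pos are likewise illustrative. The optimisation step is not implemented; min_count and allowed_pos are merely exposed as the parameters such a step would tune.

```python
import re
from collections import Counter

# Toy lookup tables (assumptions): a real system would use a morphological
# analyser with spelling correction and a part-of-speech tagger instead.
LEMMAS = {"tumors": "tumor", "removed": "remove", "recovered": "recover"}
POS = {"tumor": "NOUN", "remove": "VERB", "patient": "NOUN", "recover": "VERB"}


def tokenise(text):
    """Split a text into sentences, each a list of lower-cased words."""
    sentences = re.split(r"[.!?]+", text)
    return [re.findall(r"[a-z]+", s.lower()) for s in sentences if s.strip()]


def extract(docs, target_words, min_count=2, allowed_pos=("NOUN", "VERB")):
    """Return one flag per document: True if target information is detected."""
    tokenised = [tokenise(d) for d in docs]

    # Delete rare words: keep only words seen at least min_count times.
    counts = Counter(w for doc in tokenised for sent in doc for w in sent)
    frequent = {w for w, c in counts.items() if c >= min_count}

    # Keep target words of the allowed parts of speech only.
    selected = {w for w in target_words if POS.get(w) in allowed_pos}

    flags = []
    for doc in tokenised:
        found = False
        for sent in doc:
            # Reduce the surviving words to their initial form.
            lemmas = {LEMMAS.get(w, w) for w in sent if w in frequent}
            # A sequence is marked when it contains every selected word.
            if selected and selected <= lemmas:
                found = True
                break
        flags.append(found)
    return flags


DOCS = [
    "The tumors were removed. The patient recovered.",
    "Tumors were found. The surgeon removed the tumors.",
    "The patient recovered well.",
]
print(extract(DOCS, {"tumor", "remove"}))  # [True, True, False]
```

Raising min_count to 3 in this toy corpus deletes "removed" as a rare word, so no document is flagged. This illustrates why the occurrence threshold is among the parameters the method optimises.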

Similar patents RU2751993C1

Title Year Author Number
WAY TO DEFINE AND CLASSIFY A CONCEPT BASED ON THE CONTEXT OF ITS USE 2022
  • Danilov Gleb Valerevich
  • Tsukanova Tatyana Vasilevna
  • Strunina Yuliya Vladimirovna
  • Ishankulov Timur Aleksandrovich
  • Kotik Konstantin Vladimirovich
  • Potapov Aleksandr Aleksandrovich
RU2795870C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2804747C1
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA 2022
  • Babak Nikita Grigorevich
  • Belorybkin Leonid Yurevich
  • Terenin Aleksej Alekseevich
  • Shabrova Anastasiya Igorevna
RU2802549C1
METHOD AND SYSTEM FOR EXTRACTING NAMED ENTITIES 2021
  • Vodolazskij Daniil Ivanovich
  • Gladkikh Prokhor Vladimirovich
  • Sorokin Semen Aleksandrovich
  • Cherkasov Roman Vladislavovich
  • Gazizov Kuat
RU2823914C2
METHOD OF GENERATING AND USING RECURSIVE INDEX OF SEARCH ENGINES 2011
  • Serebrennikov Oleg Aleksandrovich
RU2459242C1
METHOD AND SYSTEM FOR DETERMINING ACTIVITY OF ACCOUNTS IN COMPUTING ENVIRONMENT 2023
  • Uskov Svyatoslav Aleksandrovich
  • Kravchenko Andrej Alekseevich
  • Drachukov Andrej Aleksandrovich
  • Zhirov Dmitrij Viktorovich
RU2824919C1
EXPANDING OF INFORMATION SEARCH POSSIBILITY 2015
  • Danielyan Tatyana Vladimirovna
  • Indenbom Evgenij Mikhajlovich
RU2618375C2
SYSTEM FOR CREATING DOCUMENTS BASED ON TEXT ANALYSIS ON NATURAL LANGUAGE 2016
  • Danielyan Tatyana Vladimirovna
RU2639655C1
METHOD FOR PREDICTING SPEECH IMPAIRMENTS DURING NEUROSURGICAL INTERVENTIONS ACCORDING TO INTRAOPERATIVE REGISTRATION OF CORTICOCORTICAL EVOKED POTENTIALS 2022
  • Danilov Gleb Valerevich
  • Bykanov Andrej Egorovich
  • Ishankulov Timur Aleksandrovich
  • Titov Oleg Yurevich
  • Pitskhelauri David Ilich
RU2806013C1
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION 2022
  • Tikhonova Mariya Ivanovna
RU2796208C1

Authors

Danilov Gleb Valerevich

Shifrin Mikhail Abramovich

Potapov Aleksandr Aleksandrovich

Strunina Yuliya Vladimirovna

Tsukanova Tatyana Vasilevna

Pronkina Tatyana Evgenevna

Kosyrkova Aleksandra Vyacheslavovna

Melchenko Semen Andreevich

Dates

2021-07-21 Published

2020-09-09 Filed