FIELD: information technology.
SUBSTANCE: a method of searching for semantically similar electronic documents stored on data storage devices includes loading two electronic documents; determining search parameters by setting rules for generating a plurality of unique words, which form a plurality of weighted unique words and weighted links between said words; constructing a semantic network for each document and searching for semantically similar documents by comparing the semantic networks; additionally setting rules for generating stylistic images of the documents by determining the size of the transition frequency matrices and selecting the elements of the transition frequency matrices, wherein the elements of the transition frequency matrices are either bigrams or trigrams; generating a transition frequency matrix for each document and comparing the transition frequency matrices of the documents for similarity by calculating a similarity coefficient.
EFFECT: higher accuracy of searching for similar electronic documents in an array of documents of different styles.
2 dwg
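The stylistic comparison step described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that the abstract does not confirm: bigrams are taken to be pairs of adjacent letters within a word, the transition frequency matrix is stored sparsely as a counter keyed by letter pair, and the similarity coefficient is taken to be cosine similarity. The function names `transition_matrix` and `similarity_coefficient` are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def transition_matrix(text):
    """Sparse transition frequency matrix of a document.

    Assumption: elements are character bigrams, i.e. transitions
    between adjacent letters inside a word. Keys are (from, to) pairs,
    values are how often that transition occurs.
    """
    counts = Counter()
    # [^\W\d_]+ matches runs of letters only (no digits, no underscore).
    for word in re.findall(r"[^\W\d_]+", text.lower()):
        counts.update(zip(word, word[1:]))
    return counts


def similarity_coefficient(a, b):
    """Cosine similarity of two sparse matrices (an assumed coefficient)."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document with no bigrams is treated as dissimilar
    return dot / (norm_a * norm_b)


doc_a = transition_matrix("The method compares documents by style.")
doc_b = transition_matrix("The method compares documents by meaning.")
score = similarity_coefficient(doc_a, doc_b)
```

Under these assumptions the score lies between 0 and 1, with higher values indicating more similar letter-transition patterns. Switching to trigrams would only change the keys of the counter, for example by zipping each word with two shifted copies of itself.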
Title | Year | Author | Number |
---|---|---|---|
METHOD OF SEARCHING FOR ELECTRONIC DOCUMENTS SIMILAR ON SEMANTIC CONTENT, STORED ON DATA STORAGE DEVICES | 2009 | | RU2420800C2 |
METHOD OF SEARCHING FOR SIMILAR FILES PLACED ON DATA STORAGE DEVICES | 2018 | | RU2663474C1 |
METHOD AND SYSTEM FOR PARAPHRASING TEXT | 2023 | | RU2814808C1 |
EXPANDING OF INFORMATION SEARCH POSSIBILITY | 2015 | | RU2618375C2 |
METHOD AND SYSTEM FOR GENERATING TEXT | 2023 | | RU2817524C1 |
METHOD FOR AUTOMATIC ITERATIVE CLUSTERISATION OF ELECTRONIC DOCUMENTS ACCORDING TO SEMANTIC SIMILARITY, METHOD FOR SEARCH IN PLURALITY OF DOCUMENTS CLUSTERED ACCORDING TO SEMANTIC SIMILARITY AND COMPUTER-READABLE MEDIA | 2014 | | RU2556425C1 |
METHOD AND SYSTEM FOR ARRANGING DIALOGUE WITH USER IN USER-FRIENDLY CHANNEL | 2018 | | RU2688758C1 |
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION | 2022 | | RU2796208C1 |
EXTRACTION OF ENTITIES FROM TEXTS IN NATURAL LANGUAGE | 2015 | | RU2626555C2 |
COMPREHENSIVE AUTOMATIC PROCESSING OF TEXT INFORMATION | 2014 | | RU2662699C2 |
Authors
Dates
2015-12-20—Published
2014-02-03—Filed