FIELD: information technology.
SUBSTANCE: all electronic files of reference documents are first converted to a predetermined format, comprehensible fragments, referred to as clauses, are selected in each document, and the converted files are stored in a database. Each electronic file of an analysed document is converted to the predetermined format, with clauses likewise selected. Matches between the clauses selected in the electronic file of the analysed document and the clauses selected in the electronic files of the reference documents are detected. For each reference document, the relative number of clauses of the analysed document that match clauses of that reference document is counted. Each relative number of matches is then compared with a predetermined threshold value to determine whether text excerpts of any of the reference documents are present in the electronic file of the analysed document.
EFFECT: a broader range of technical means, achieved by providing a relatively fast and universal method for detecting, in a document, expressions, phrases or even text excerpts taken from other documents.
5 cl, 2 dwg
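
The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the clause splitting rule (punctuation and line breaks), the normalization (lower-casing, whitespace collapsing), the in-memory dictionary standing in for the database, the `>=` comparison, and all function names and the threshold value of 0.3 are assumptions chosen only for illustration.

```python
import re


def split_clauses(text):
    """Convert text to a uniform form and split it into clauses.

    Assumption: a clause is a fragment delimited by ., !, ?, ; or a newline.
    """
    clauses = []
    for part in re.split(r"[.!?;\n]+", text):
        normalized = " ".join(part.lower().split())
        if normalized:
            clauses.append(normalized)
    return clauses


def build_database(reference_docs):
    """Store the clause set of every reference document (a dict stands in for the database)."""
    return {name: set(split_clauses(text)) for name, text in reference_docs.items()}


def relative_matches(analysed_text, database):
    """For each reference document, return the share of analysed clauses found in it."""
    clauses = split_clauses(analysed_text)
    if not clauses:
        return {name: 0.0 for name in database}
    return {
        name: sum(clause in ref_clauses for clause in clauses) / len(clauses)
        for name, ref_clauses in database.items()
    }


def detect_excerpts(analysed_text, database, threshold=0.3):
    """List reference documents whose relative match count reaches the threshold (hypothetical value)."""
    shares = relative_matches(analysed_text, database)
    return [name for name, share in shares.items() if share >= threshold]


# Hypothetical sample data, for illustration only.
references = {
    "ref_a": "The cat sat on the mat. Dogs bark loudly.",
    "ref_b": "Rain falls in spring.",
}
database = build_database(references)
analysed = "The cat sat on the mat. Birds sing. Dogs bark loudly! Snow melts."
shares = relative_matches(analysed, database)
flagged = detect_excerpts(analysed, database)
```

In this sample, two of the four clauses of the analysed text occur in `ref_a` and none in `ref_b`, so only `ref_a` reaches the assumed threshold. Exact matching of normalized clauses keeps the lookup fast because each reference document is held as a set.
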
Title | Year | Author | Number |
---|---|---|---|
METHOD OF IDENTIFYING ARRAYS OF BINARY DATA | 2015 | | RU2601191C1 |
METHOD OF CLASSIFYING DOCUMENTS BY CATEGORIES | 2012 | | RU2491622C1 |
METHOD FOR GENERATION OF ELECTRONIC DOCUMENT AND ITS COPIES | 2013 | | RU2543928C1 |
AUTOMATED LEGAL ADVICE SYSTEM CONTROL METHOD | 2019 | | RU2718978C1 |
METHOD FOR AUTOMATED ANALYSIS OF REFERENCE FORMS | 2013 | | RU2581766C2 |
METHOD OF AUTOMATED VECTOR IMAGE ANALYSIS | 2016 | | RU2633156C1 |
SYSTEM OF AUTOMATED ANALYSIS OF DOWN-LOADING FROM DATA BASES | 2013 | | RU2546583C2 |
METHOD TO DETECT TEXT OBJECTS | 2012 | | RU2498401C2 |
METHOD FOR CHECKING INTEGRITY AND AUTHENTICITY OF ELECTRONIC DOCUMENTS IN TEXT FORMAT STORED AS HARD COPY | 2015 | | RU2591655C1 |
METHOD OF ASSOCIATING PREVIOUSLY UNKNOWN FILE WITH COLLECTION OF FILES DEPENDING ON DEGREE OF SIMILARITY | 2009 | | RU2420791C1 |
Dates
2013-02-10—Published
2011-11-18—Filed