FIELD: computer engineering.
SUBSTANCE: the invention relates to computer engineering. The result is achieved by a method of hashing files for fast duplicate search, which comprises the following stages: a protocol parser receives original files and their metadata from external sources; the metadata and the original files are stored in a database; the database is scanned for files whose hash has not yet been calculated; each such file is hashed, wherein a file smaller than a given size is hashed in full, and for a file larger than the given size only its first or last blocks of the given size are hashed; the file size is appended to the obtained hash, and the resulting hash is stored in the database with a reference to the file; specific files are then selected from the database by their hashes and searched for duplicates.
EFFECT: faster and more accurate search for duplicate files and reduced load on the CPU and data storage system.
3 cl, 1 dwg
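The hashing and duplicate-selection stages described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract names neither a hash function nor a block size, so SHA-256 and the 1 MiB `BLOCK_SIZE` are assumptions, as are the function names `partial_file_hash` and `group_candidates`, the `use_last_block` switch (the abstract says "first or last blocks"; this sketch hashes a single block), and the in-memory dictionary standing in for the database.

```python
import hashlib
import os
from collections import defaultdict

# Hypothetical threshold; the abstract only speaks of "a given size".
BLOCK_SIZE = 1024 * 1024


def partial_file_hash(path, block_size=BLOCK_SIZE, use_last_block=False):
    """Hash a small file in full, or one edge block of a large file,
    then append the file size to the digest."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()  # assumption: the abstract names no hash function
    with open(path, "rb") as f:
        if size > block_size and use_last_block:
            f.seek(size - block_size)  # position at the start of the last block
        # A small file is read completely; a large file contributes one block.
        digest.update(f.read(block_size))
    return f"{digest.hexdigest()}:{size}"


def group_candidates(paths, **hash_options):
    """Group paths by hash key; a dict stands in for the database here."""
    groups = defaultdict(list)
    for path in paths:
        groups[partial_file_hash(path, **hash_options)].append(path)
    # Only keys shared by two or more files indicate possible duplicates.
    return [group for group in groups.values() if len(group) > 1]
```

Appending the size means two files must agree in both the hashed block and the total length to share a key. For files larger than the block size, files that differ only outside the hashed block still share a key, so in this sketch matching keys are best treated as duplicate candidates.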
Title | Year | Author | Number
METHOD FOR PROCESSING AUDIO CONTENT AND SYSTEM FOR ITS IMPLEMENTATION | 2022 | Pangaev Dmitrij Viktorovich | RU2797759C1
METHOD AND SYSTEM FOR AUTOMATIC DETERMINATION OF FUZZY DUPLICATES OF VIDEO CONTENT | 2018 | Slipenchuk Pavel Vladimirovich | RU2677368C1
IMPLEMENTATION OF PARALLEL REHASHING OF HASH-TABLES FOR MULTITHREADED APPLICATIONS | 2009 | Malakhov Anton Aleksandrovich | RU2517238C2
METHODS AND DEVICE FOR EFFICIENT IMPLEMENTATION OF DATABASE SUPPORTING FAST COPYING | 2018 | Baird, Leemon C., Iii; Harmon, Mance | RU2740865C1
METHOD AND DEVICE FOR ANALYSIS OF DATA PACKETS | 2012 | | RU2601201C2
METHODS AND DEVICE FOR EFFECTIVE IMPLEMENTATION OF DATABASE SUPPORTING FAST COPYING | 2018 | Berd, Limon S., Iii; Kharmon, Mans | RU2785613C2
AUTHENTICATION IN PROTECTED COMPUTERIZED GAME SYSTEM | 2003 | | RU2302276C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami | RU2761066C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami | RU2804029C2
METHOD OF IDENTIFYING ARRAYS OF BINARY DATA | 2015 | Ryabokon Vladimir Vladimirovich; Lebedenko Evgenij Viktorovich | RU2601191C1