METHOD OF HASHING FILES FOR FAST SEARCH OF DUPLICATES
Russian patent published in 2024 - IPC G06F16/172

Abstract RU 2825549 C1

FIELD: computer engineering.

SUBSTANCE: the invention relates to computer engineering. The result is achieved by a method of hashing files for fast duplicate search, comprising the following stages: a protocol parser receives original files and their metadata from external sources; the metadata and original files are stored in a database; the database is scanned for files whose hash has not yet been calculated; each such file is hashed, wherein a file smaller than a given size is hashed completely, and for a file larger than the given size only its first or last blocks of the given size are hashed; the file size is appended to the obtained hash, and the result is stored in the database with a reference to the file; specific files are selected from the database according to their hashes and searched for duplicates.
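
The hashing and duplicate-selection stages described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract names neither a hash function nor a block size, so SHA-256, the 1 MiB `BLOCK_SIZE`, the `use_last_block` switch, and the function names `partial_file_hash` and `group_by_hash` are all assumptions made for illustration. An in-memory dictionary stands in for the database, and a file exactly equal to the threshold is treated here as a small file.

```python
import hashlib
import os
from collections import defaultdict

# Assumed threshold (the "given size"); the abstract does not state a value.
BLOCK_SIZE = 1024 * 1024


def partial_file_hash(path, block_size=BLOCK_SIZE, use_last_block=False):
    """Return '<hex digest>-<file size>' for the file at `path`.

    Files no larger than `block_size` are hashed completely; for larger
    files only the first block (or the last, if `use_last_block`) is read.
    """
    size = os.path.getsize(path)
    digest = hashlib.sha256()  # assumption: the hash function is not specified
    with open(path, "rb") as f:
        if size > block_size and use_last_block:
            f.seek(size - block_size)  # jump to the start of the trailing block
        # Reads the whole file when it is small, otherwise exactly one block.
        digest.update(f.read(block_size))
    # Appending the size separates files that share a block but differ in length.
    return f"{digest.hexdigest()}-{size}"


def group_by_hash(paths, **kwargs):
    """Group paths by their key and keep only groups with more than one file."""
    groups = defaultdict(list)
    for path in paths:
        groups[partial_file_hash(path, **kwargs)].append(path)
    return {key: files for key, files in groups.items() if len(files) > 1}
```

Calling `group_by_hash(["a.bin", "b.bin", "c.bin"])` (hypothetical file names) returns a mapping from each shared key to the files that produced it. In this sketch, only one block of a file above the threshold is read, so equal keys for such files indicate a match on that block and on the file size, not a byte-for-byte comparison.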

EFFECT: faster and more accurate search for duplicate files and reduced load on the CPU and data storage system.

3 cl, 1 dwg

Similar patents RU2825549C1

Title | Year | Authors | Number
METHOD FOR PROCESSING AUDIO CONTENT AND SYSTEM FOR ITS IMPLEMENTATION | 2022 | Pangaev Dmitrij Viktorovich | RU2797759C1
METHOD AND SYSTEM FOR AUTOMATIC DETERMINATION OF FUZZY DUPLICATES OF VIDEO CONTENT | 2018 | Slipenchuk Pavel Vladimirovich | RU2677368C1
IMPLEMENTATION OF PARALLEL REHASHING OF HASH-TABLES FOR MULTITHREADED APPLICATIONS | 2009 | Malakhov Anton Aleksandrovich | RU2517238C2
METHODS AND DEVICE FOR EFFICIENT IMPLEMENTATION OF DATABASE SUPPORTING FAST COPYING | 2018 | Baird, Leemon C., Iii; Harmon, Mance | RU2740865C1
METHOD AND DEVICE FOR ANALYSIS OF DATA PACKETS | 2012 | Knott Kristof | RU2601201C2
METHODS AND DEVICE FOR EFFECTIVE IMPLEMENTATION OF DATABASE SUPPORTING FAST COPYING | 2018 | Berd, Limon S., Iii; Kharmon, Mans | RU2785613C2
AUTHENTICATION IN PROTECTED COMPUTERIZED GAME SYSTEM | 2003 | Dzhekson Mark D. | RU2302276C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami | RU2761066C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami | RU2804029C2
METHOD OF IDENTIFYING ARRAYS OF BINARY DATA | 2015 | Ryabokon Vladimir Vladimirovich; Lebedenko Evgenij Viktorovich | RU2601191C1

RU 2 825 549 C1

Authors

Matveev Lev Lazarevich

Dates

2024-08-27 Published

2024-02-29 Filed