FIELD: computing technology.
SUBSTANCE: a method for clustering executable files, implemented on a computing device, comprises the stages of: obtaining a set of executable files; determining the format of each executable file; and, separately for each file format: finding repeating sequences of a set length in the files; determining the most frequent sequences; attributing files containing at least one of the most frequent sequences to one family; excluding all files attributed to this family from further processing; repeating the search for the most frequent sequences, attributing files containing at least one of the most frequent sequences to the next family and excluding said files from further processing, until all files are attributed to some family or until the remaining files contain no repeating sequences; and, in response to the remaining files containing no repeating sequences, attributing each of said files to a separate family.
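The stages listed above can be illustrated with a minimal Python sketch. It is not the claimed implementation and rests on several assumptions that the abstract does not state: a "repeating" sequence is taken to be a byte sequence that occurs in at least two files, the "most frequent" sequences are those found in the largest number of remaining files, and the format is guessed from magic bytes only. The names `detect_format`, `byte_sequences`, `cluster_one_format` and `cluster_executables` are hypothetical.

```python
from collections import Counter, defaultdict


def detect_format(data: bytes) -> str:
    """Hypothetical format check based on leading magic bytes only."""
    if data[:2] == b"MZ":
        return "pe"
    if data[:4] == b"\x7fELF":
        return "elf"
    return "unknown"


def byte_sequences(data: bytes, length: int) -> set[bytes]:
    """All distinct byte sequences of the given length found in one file."""
    return {data[i:i + length] for i in range(len(data) - length + 1)}


def cluster_one_format(files: dict[str, bytes], length: int) -> list[list[str]]:
    """Greedy family assignment for files that share one format."""
    remaining = {name: byte_sequences(data, length) for name, data in files.items()}
    families: list[list[str]] = []
    while remaining:
        # Count in how many remaining files each sequence occurs.
        counts: Counter = Counter()
        for seqs in remaining.values():
            counts.update(seqs)
        # Assumption: a sequence "repeats" if at least two files contain it.
        if not counts or max(counts.values()) < 2:
            break
        top = max(counts.values())
        most_frequent = {seq for seq, c in counts.items() if c == top}
        # Files holding at least one most frequent sequence form one family.
        family = sorted(n for n, seqs in remaining.items() if seqs & most_frequent)
        families.append(family)
        # Exclude the assigned files from further processing.
        for name in family:
            del remaining[name]
    # Files left without repeating sequences each become a separate family.
    families.extend([name] for name in sorted(remaining))
    return families


def cluster_executables(files: dict[str, bytes], length: int) -> dict[str, list[list[str]]]:
    """Group files by detected format, then cluster each group separately."""
    by_format: dict[str, dict[str, bytes]] = defaultdict(dict)
    for name, data in files.items():
        by_format[detect_format(data)][name] = data
    return {fmt: cluster_one_format(group, length) for fmt, group in by_format.items()}
```

In this sketch, sequences tied for the highest count are all treated as most frequent, so files matching any of them land in the same family; a stricter variant could pick a single sequence per round.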
EFFECT: ensured automatic clustering of executable files.
17 cl, 7 dwg
Dates
2022-08-29—Published
2021-03-29—Filed