METHOD FOR DETECTING ANOMALIES IN MULTIDIMENSIONAL DATA
Russian patent published in 2022 - IPC G06N3/08 G06F11/30

Abstract RU 2773010 C1

FIELD: computer technology.

SUBSTANCE: the invention relates to a method for detecting anomalies in multidimensional data in a computing system. A personal computer is started in a mode of controlled normal operation, and a training sample is formed as a two-dimensional numeric array. The minimum sufficient number of training-sample rows, Nmin, is selected for training the autoencoder. If the number of rows N in the generated training sample exceeds Nmin, the sample is compressed as follows: the compression ratio Kcmp = N/Nmin is calculated; an empty two-dimensional numeric array Tcmp is created; the number of repetitions R_i of each unique row i in the training sample is found; for each unique row i, Rmin_i = R_i/Kcmp is calculated and rounded to the nearest integer, and if rounding yields 0, Rmin_i is set to 1; each unique row i is appended Rmin_i times to the end of Tcmp; Tcmp is then used as the training sample.

A first autoencoder is formed with an input layer, at least one hidden layer and an output layer; the input and output layers are the same size, and the hidden layer is smaller than the input layer. The first autoencoder is trained on the training sample. A rounding accuracy for IRE values is selected, and the instantaneous reconstruction error IRE_j is calculated for each row j of the training sample and rounded to the selected accuracy. From the resulting IRE values, two one-dimensional numeric arrays are formed: IREu, containing only the unique IRE values in ascending order, and IREc, containing the number of repetitions of each unique IRE value in the training sample. If IREu contains fewer than three elements and its first element is 0, the first element is set to 0.01. The instantaneous reconstruction error threshold IREm is set to IREu[0], the first element of the array.

If IREu contains more than one element, a value of the outlier criterion Kout is selected. If IREu contains exactly two elements and IREu[1] - IREu[0] ≤ Kout, IREm is set to IREu[1]. If IREu contains more than two elements, the following actions are performed: the largest value CNTm of the IREc array is found; a one-dimensional array of metrics M is calculated by a formula that is not reproduced in this abstract, where k is the number of an element of the IREu array; the array Msrt is obtained by sorting the elements of M in ascending order; IREm is set to the last element of IREu, and the variable Kan is set to 0; the elements of IREu are checked from the third element to the last, where k is the number of the current element and numbering starts from 0, and if two conditions (not reproduced in this abstract) are met simultaneously for the k-th element, the check stops and the current value of k is assigned to Kan; if Kan > 0, the elements of IREu are checked in reverse order, from the last element down to element number Kan, where n is the number of the current element, and if M[n] ≥ Msrt[Kan] for the n-th element, IREm is set to IREu[n-1], otherwise the check of IREu is terminated.

Outlier rows, that is, rows j of the training sample for which IRE_j > IREm, are removed from the training sample, followed by duplicate rows. The first autoencoder is deleted. A second autoencoder is formed with an input layer, at least one hidden layer and an output layer; the input and output layers are the same size, and the hidden layer is smaller than the input layer. The second autoencoder is trained on the resulting training sample. The instantaneous reconstruction error IRE_j is calculated for each row j of the training sample and rounded to the selected accuracy, and the threshold IREm is set to the largest IRE value for the training sample.

A test sample is formed as a two-dimensional numeric array and reconstructed by the trained second autoencoder. The IRE is calculated for each row of the test sample and rounded to the selected accuracy. If the IRE of a test-sample row exceeds IREm, the row is considered anomalous and is marked. A report is generated on the anomalous rows found in the test sample.
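The training-sample compression step can be illustrated with a minimal sketch. The function name `compress_training_sample` and the list-of-lists data layout are assumptions made for illustration; the abstract also does not say how exact halves are rounded, so rounding halves up is an assumption here.

```python
from collections import Counter


def compress_training_sample(rows, n_min):
    """Return a compressed copy of `rows` following the abstract's steps.

    `rows` is a list of equal-length numeric rows; `n_min` is Nmin.
    """
    n = len(rows)
    if n <= n_min:
        # Compression is applied only when N exceeds Nmin.
        return [list(r) for r in rows]

    k_cmp = n / n_min  # compression ratio Kcmp = N / Nmin

    # R_i: number of repetitions of each unique row (first-seen order).
    repetitions = Counter(tuple(r) for r in rows)

    t_cmp = []  # the initially empty array Tcmp
    for row, r_i in repetitions.items():
        # Round R_i / Kcmp to the nearest integer
        # (assumption: exact halves round up).
        r_min = int(r_i / k_cmp + 0.5)
        if r_min == 0:
            r_min = 1  # every unique row is kept at least once
        t_cmp.extend(list(row) for _ in range(r_min))
    return t_cmp
```

For example, a 12-row sample with three unique rows repeated 8, 3 and 1 times and Nmin = 4 gives Kcmp = 3 and a 5-row result (3 + 1 + 1). The result can exceed Nmin slightly because every unique row survives.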
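The construction of the IREu and IREc arrays and the threshold rules for short IREu arrays can be sketched as follows. The function names are hypothetical, and the IRE values are taken as given because the abstract does not define how IRE is computed. The branch for more than two unique values depends on a formula and two conditions that the abstract does not reproduce, so the sketch deliberately leaves it unimplemented.

```python
from collections import Counter


def build_ire_arrays(ire_values, decimals):
    """Round IRE values, then return (IREu, IREc).

    IREu holds the unique rounded values in ascending order;
    IREc holds how many times each of them occurs.
    """
    counts = Counter(round(v, decimals) for v in ire_values)
    ire_u = sorted(counts)
    ire_c = [counts[v] for v in ire_u]
    return ire_u, ire_c


def initial_threshold(ire_u, k_out):
    """Return IREm for an IREu array with one or two elements."""
    ire_u = list(ire_u)  # work on a copy
    if len(ire_u) > 2:
        # The metric array M and the stopping conditions are not
        # given in the abstract, so this case is not sketched.
        raise NotImplementedError("requires the formula from the full patent text")
    if ire_u[0] == 0:
        ire_u[0] = 0.01  # zero first element is replaced by 0.01
    ire_m = ire_u[0]  # IREm starts at the first element
    if len(ire_u) == 2 and ire_u[1] - ire_u[0] <= k_out:
        ire_m = ire_u[1]  # second value is close enough to be normal
    return ire_m
```

With IRE values `[0.104, 0.1, 0.31, 0.305]` rounded to one decimal place, IREu is `[0.1, 0.3]` and IREc is `[2, 2]`; a Kout of 0.5 then raises the threshold to 0.3, while a Kout of 0.1 leaves it at 0.1.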
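Cleaning the training sample before the second autoencoder is trained amounts to two filters: drop rows whose IRE exceeds the threshold, then drop duplicates. A minimal sketch, assuming the per-row IRE values have already been computed and rounded (the name `clean_training_sample` is hypothetical):

```python
def clean_training_sample(rows, ire_values, ire_m):
    """Remove outlier rows (IRE_j > IREm), then duplicate rows.

    `rows[j]` and `ire_values[j]` describe the same training row j.
    The first occurrence of each remaining row is kept.
    """
    seen = set()
    cleaned = []
    for row, ire in zip(rows, ire_values):
        if ire > ire_m:
            continue  # outlier: reconstruction error above the threshold
        key = tuple(row)
        if key in seen:
            continue  # duplicate of a row that was already kept
        seen.add(key)
        cleaned.append(list(row))
    return cleaned
```

Note that after this step the sample contains each surviving row exactly once, so the repetition counts introduced by compression no longer influence the second training run.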
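The final detection stage reduces to a comparison against the largest rounded IRE seen on the cleaned training sample. The sketch below assumes the IRE values produced by the second autoencoder are already available as plain lists; the function names and the report layout are illustrative assumptions, not part of the patent.

```python
def final_threshold(train_ire, decimals):
    """IREm: the largest rounded IRE over the training sample."""
    return max(round(v, decimals) for v in train_ire)


def find_anomalous_rows(test_ire, ire_m, decimals):
    """Return indices of test rows whose rounded IRE exceeds IREm."""
    return [j for j, v in enumerate(test_ire) if round(v, decimals) > ire_m]


def make_report(anomalous_rows):
    """Format a short plain-text report (layout is an assumption)."""
    if not anomalous_rows:
        return "No anomalous rows found."
    listed = ", ".join(str(j) for j in anomalous_rows)
    return f"Anomalous rows ({len(anomalous_rows)}): {listed}"
```

A usage example: training IREs `[0.11, 0.124, 0.09]` rounded to two decimals give a threshold of 0.12, and test IREs `[0.05, 0.121, 0.13, 0.4]` then flag rows 2 and 3. Row 1 is not flagged because the comparison is strict and 0.121 rounds to the threshold itself.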

EFFECT: reduced time to prepare the autoencoder for anomaly detection, and fewer type I (false positive) and type II (false negative) errors when detecting anomalies.

3 claims, 2 tables

Similar patents RU2773010C1

Title Year Author Number
METHOD FOR DETECTING ANOMALOUS NETWORK TRAFFIC 2023
  • Zmitrovich Nikolaj Leonidovich
RU2811840C1
INTELLIGENT AUDIO-ANALYTICAL DEVICE AND METHOD FOR SPACECRAFTS 2019
  • Szurley Joseph
  • Das Samarjit
RU2793797C2
METHOD OF DETECTING ANOMALY OF HYPERSPECTRAL IMAGE BASED ON "TRAINING-TRAINED" MODEL, COMPUTER DATA MEDIUM AND DEVICE 2023
  • Zhou, Zuofeng
  • Zheng, Xiangtao
RU2817001C1
ESTIMATION OF THE THICKNESS OF THE CARDIAC WALL BASED ON ECG RESULTS 2020
  • Tsoref, Liat
  • Auerbach, Shmuel
  • Amit, Matityahu
  • Amos, Yariv Avraham
  • Shalgi, Avi
RU2767883C1
METHOD AND SYSTEM FOR WARNING ABOUT UPCOMING ANOMALIES IN THE DRILLING PROCESS 2021
  • Simon Igor Vladimirovich
  • Koryabkin Vitalij Viktorovich
  • Makarov Viktor Aleksandrovich
  • Osmonalieva Oksana Taalaevna
  • Bajbolov Timur Serikbaevich
  • Semenikhin Artem Sergeevich
  • Chebunyaev Igor Aleksandrovich
  • Vasilev Vasilij Olegovich
  • Golitsyna Mariya Vadimovna
  • Stiven Lord
RU2772851C1
METHOD FOR DETECTION OF ANOMALIES IN SHAPE OF ELECTRICAL SIGNAL 2021
  • Nedorezov Dmitrii Aleksandrovich
RU2786156C1
METHOD AND SYSTEM FOR STATIC ANALYSIS OF EXECUTABLE FILES BASED ON PREDICTIVE MODELS 2020
  • Prudkovskij Nikolaj Sergeevich
RU2759087C1
SYSTEM AND METHOD FOR DETERMINING ANOMALY SOURCE IN CYBER-PHYSICAL SYSTEM HAVING CERTAIN CHARACTERISTICS 2018
  • Lavrentev Andrej Borisovich
  • Vorontsov Artem Mikhajlovich
  • Filonov Pavel Vladimirovich
  • Shalyga Dmitrij Konstantinovich
  • Shkulev Vyacheslav Igorevich
  • Demidov Nikolaj Nikolaevich
  • Ivanov Dmitrij Aleksandrovich
RU2724075C1
METHOD FOR OBTAINING LOW-DIMENSIONAL NUMERIC REPRESENTATIONS OF SEQUENCES OF EVENTS 2020
  • Babaev Dmitrij Leonidovich
  • Ovsov Nikita Pavlovich
  • Kireev Ivan Aleksandrovich
RU2741742C1
METHOD FOR DIAGNOSING A COMPLEX OF ON-BOARD EQUIPMENT OF AIRCRAFT BASED ON MACHINE LEARNING AND A DEVICE FOR ITS IMPLEMENTATION 2023
  • Bukirev Aleksandr Sergeevich
  • Savchenko Andrej Yurevich
  • Ippolitov Sergej Viktorovich
  • Kryachkov Vyacheslav Nikolaevich
  • Resnyanskij Sergej Nikolaevich
RU2816667C1

RU 2 773 010 C1

Authors

Guzev Oleg Yurevich

Gurina Anastasiya Olegovna

Dates

2022-05-30 Published

2021-09-08 Filed