FIELD: information storage.
SUBSTANCE: invention relates to a method, an apparatus and a computer-readable data storage medium for compression of a neural network model, and to a method for translation of a language corpus. The method includes obtaining a set of training samples including multiple pairs of training samples, wherein each pair of training samples includes source data and target data corresponding to the source data; training an initial teacher model using said source data as input data and using said target data as control data; training one or more intermediate teacher models based on said set of training samples and the initial teacher model, wherein said one or more intermediate teacher models form a set of teacher models; training multiple candidate student models based on said set of training samples, the initial teacher model and the set of teacher models, wherein the multiple candidate student models form a set of student models; estimating the accuracy of the output results of the multiple candidate student models using a set of control data and selecting one of the multiple candidate student models as a target student model in accordance with the accuracy, wherein the number of model parameters of any of the intermediate teacher models is less than that of the initial teacher model, and the number of model parameters of the candidate student models is less than that of any of the intermediate teacher models.
EFFECT: increased efficiency of compression of a neural network model.
18 cl, 9 dwg
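The staged distillation described in SUBSTANCE can be illustrated compactly. Below is a minimal PyTorch sketch, not the patent's implementation: the helper names (`make_model`, `train_supervised`, `distill`, `accuracy`), the toy classification data and the temperature `T` are all illustrative assumptions. It trains an initial teacher on (source, target) pairs, distills it into a smaller intermediate teacher, distills that into still smaller candidate students, and selects the most accurate candidate on a control set.

```python
# Minimal sketch of staged teacher-to-student distillation (assumed details).
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_model(hidden):
    # Parameter count shrinks with `hidden`, mirroring the patent's
    # constraint: initial teacher > intermediate teacher > student.
    return nn.Sequential(nn.Linear(16, hidden), nn.ReLU(), nn.Linear(hidden, 4))

def train_supervised(model, x, y, epochs=50):
    # Train the initial teacher: source data as input, target data as control.
    opt = torch.optim.Adam(model.parameters())
    for _ in range(epochs):
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model

def distill(student, teacher, x, T=2.0, epochs=50):
    # Fit the student to the teacher's softened output distribution.
    opt = torch.optim.Adam(student.parameters())
    teacher.eval()
    with torch.no_grad():
        soft_targets = F.softmax(teacher(x) / T, dim=-1)
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.kl_div(F.log_softmax(student(x) / T, dim=-1),
                        soft_targets, reduction="batchmean")
        loss.backward()
        opt.step()
    return student

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(-1) == y).float().mean().item()

# Toy training pairs (source, target) and a held-out control set.
x, y = torch.randn(256, 16), torch.randint(0, 4, (256,))
xc, yc = torch.randn(64, 16), torch.randint(0, 4, (64,))

teacher = train_supervised(make_model(128), x, y)        # initial teacher
mid = distill(make_model(64), teacher, x)                # intermediate teacher
candidates = [distill(make_model(h), mid, x) for h in (32, 16)]
target_student = max(candidates, key=lambda m: accuracy(m, xc, yc))
```

The single intermediate stage here is the simplest case; the patent allows one or more intermediate teacher models, each smaller than the one before it.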
Title | Year | Number
---|---|---
METHOD AND SERVER FOR WAVEFORM GENERATION | 2021 | RU2803488C2
METHOD AND SERVER FOR TRAINING MACHINE LEARNING ALGORITHM IN TRANSLATION | 2020 | RU2770569C2
USE OF AUTOENCODERS FOR LEARNING TEXT CLASSIFIERS IN NATURAL LANGUAGE | 2017 | RU2678716C1
GENERATING PSEUDO-CT FROM MR-DATA USING A REGRESSION MODEL BASED ON FEATURES | 2016 | RU2703344C1
METHOD FOR GENERATING MATHEMATICAL MODELS OF A PATIENT USING ARTIFICIAL INTELLIGENCE TECHNIQUES | 2017 | RU2720363C2
TRAINING OF DNN-STUDENT BY MEANS OF OUTPUT DISTRIBUTION | 2014 | RU2666631C2
METHOD OF INTERPRETING ARTIFICIAL NEURAL NETWORKS | 2018 | RU2689818C1
OPTICAL CHARACTER RECOGNITION BY MEANS OF COMBINATION OF NEURAL NETWORK MODELS | 2020 | RU2768211C1
CHARACTER RECOGNITION USING A HIERARCHICAL CLASSIFICATION | 2018 | RU2693916C1
SPEAKER VERIFICATION | 2017 | RU2697736C1
Authors

Dates
Filed: 2019-11-26
Published: 2021-06-21