FIELD: computer engineering.
SUBSTANCE: the invention relates to methods of organizing parallel computations. The result is achieved by a method comprising the steps of: obtaining tensors of input data and input weights from high-throughput memory; splitting the tensors of input data and weights into several parts; reserving the required amount of matrix memory; loading the parts of the input data in turn from high-throughput memory into vector memory; dividing each part of the input data into several sub-parts, taking into account the reserved amount of memory, and loading the sub-parts into the memory; loading the parts of the weights in turn into memory, in parallel with the loading of the sub-parts of the input data; processing the input data and weights read from the matrix memory, whereby sub-parts of the resultant tensor are obtained; and loading parts of the resultant tensor into high-throughput memory in parallel with the processing of the next sub-parts of the input data and parts of the weights by at least one matrix multiplier based on systolic arrays.
EFFECT: improving the performance of the tiling process of the convolution operation.
6 cl, 2 dwg
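The loop structure of the claimed steps can be illustrated with a minimal sketch. It assumes the convolution has already been lowered to a matrix product (an assumption, the abstract does not say how the convolution is expressed), uses plain Python lists in place of the memories, and runs sequentially, so the overlap of loading and computing described above is not modeled. The names `split`, `tiled_matmul`, `part_rows`, `sub_rows` and `part_cols` are hypothetical and do not come from the patent.

```python
def split(n, size):
    """Return (start, stop) ranges covering 0..n in chunks of at most `size`."""
    return [(i, min(i + size, n)) for i in range(0, n, size)]


def tiled_matmul(x, w, part_rows, sub_rows, part_cols):
    """Multiply x (M x K) by w (K x N) tile by tile.

    part_rows : rows of x per input part (moved to vector memory)
    sub_rows  : rows per sub-part, assumed to be chosen from the reserved memory
    part_cols : columns of w per weight part
    """
    m, n = len(x), len(w[0])
    out = [[0] * n for _ in range(m)]  # stands in for high-throughput memory
    for p0, p1 in split(m, part_rows):
        vector_mem = x[p0:p1]  # one part of the input data
        for s0, s1 in split(p1 - p0, sub_rows):
            sub = vector_mem[s0:s1]  # one sub-part of that input part
            for c0, c1 in split(n, part_cols):
                w_part = [row[c0:c1] for row in w]  # one part of the weights
                # Plain dot products stand in for the systolic-array multiplier.
                tile = [[sum(a * b for a, b in zip(r, col))
                         for col in zip(*w_part)] for r in sub]
                # Write the result sub-part back to its place in the output.
                for i, row in enumerate(tile):
                    out[p0 + s0 + i][c0:c1] = row
    return out
```

For any positive tile sizes the result should equal an untiled product, which makes the sketch easy to check against a naive reference. In hardware, the inner loads and the write-back would be issued concurrently with the multiplier (for example through double buffering), which is where the claimed performance gain would come from.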
Title | Year | Author | Number |
---|---|---|---|
METHOD FOR CONSTRUCTING PROCESSORS FOR OUTPUT IN CONVOLUTIONAL NEURAL NETWORKS BASED ON DATA-FLOW COMPUTING | 2020 | | RU2732201C1 |
MATRIX-VECTOR MULTIPLIER WITH A SET OF REGISTERS FOR STORING VECTORS CONTAINING MULTIPORT MEMORY | 2019 | | RU2795887C2 |
METHOD OF PROCESSING DATA BY MEANS OF A NEURAL NETWORK SUBJECTED TO DECOMPOSITION TAKING INTO ACCOUNT THE AMOUNT OF MEMORY OF A COMPUTING DEVICE (VERSIONS), AND A COMPUTER-READABLE MEDIUM | 2023 | | RU2820172C1 |
VECTOR COMPUTING DEVICE | 2024 | | RU2830044C1 |
METHOD AND DEVICE FOR MAP GENERALIZATION | 2022 | | RU2803880C1 |
SPECIALIZED COMPUTING SYSTEM DESIGNED FOR INFERENCE IN DEEP NEURAL NETWORKS BASED ON STREAM PROCESSORS | 2022 | | RU2793084C1 |
METHOD OF DETERMINING VECTOR-MATRIX TRANSFORMATION RESULTS IN CONCURRENT ACOUSTOOPTICAL PROCESSING UNITS | 0 | | SU1735836A1 |
METHOD AND DEVICE FOR ENCODING AND DECODING DATA | 2005 | | RU2370886C2 |
METHOD AND DEVICE FOR IMPROVING SPEECH SIGNAL USING FAST FOURIER CONVOLUTION | 2022 | | RU2795573C1 |
CONTROL VECTOR COMPUTER SYSTEM | 0 | | SU1120340A1 |
Dates
2024-11-11—Published
2024-07-15—Filed