FIELD: computer technologies.
SUBSTANCE: invention relates to methods of processing and analyzing audio recordings and can be used to improve the quality and intelligibility of speech recordings. Method of restoring voice in speech recordings comprises steps of: receiving audio data of a speech recording containing a voice audio signal; applying a probabilistic diffusion model, trained on a denoising score matching task for unconditional speech generation, wherein the probabilistic diffusion model is applied to the audio data of the speech recording in the form of a waveform containing random Gaussian noise; performing iterative sampling of the waveform using a conditional score function, which is the sum of the unconditional score function estimated by the probabilistic diffusion model and a log-likelihood term, so that each iteration yields a sample with less noise for the next iteration, until a denoised speech waveform is obtained; and outputting the processed voice audio signal containing the denoised speech waveform. Also disclosed are a system and a machine-readable medium for implementing the method.
EFFECT: high quality and intelligibility of speech recordings.
18 cl, 2 dwg, 5 tbl
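The iterative sampling step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation. It treats the conditional function from the abstract as a score function (a gradient of a log-density), reads the log-likelihood term as the gradient of an assumed Gaussian log-likelihood, and replaces the trained diffusion model with an analytic stand-in for a standard normal prior. The function names, the noise schedule `sigmas`, the step size `eps`, and the annealed Langevin update are all illustrative assumptions.

```python
import numpy as np


def unconditional_score(x, sigma):
    """Hypothetical stand-in for the trained diffusion model's score estimate.

    Assumption: the clean-signal prior is N(0, I), so the prior perturbed with
    noise level sigma is N(0, (1 + sigma^2) I) and its score is known exactly.
    A real system would evaluate a neural network here.
    """
    return -x / (1.0 + sigma ** 2)


def log_likelihood_grad(x, y, sigma, obs_std):
    """Gradient with respect to x of an assumed Gaussian log-likelihood.

    Assumption: y = x + noise with standard deviation obs_std. The variance is
    inflated by sigma^2 to account for the noise still present in x.
    """
    return (y - x) / (obs_std ** 2 + sigma ** 2)


def conditional_score(x, y, sigma, obs_std):
    """Conditional score = unconditional score + log-likelihood gradient."""
    return unconditional_score(x, sigma) + log_likelihood_grad(x, y, sigma, obs_std)


def restore(y, obs_std, sigmas, steps_per_level=20, eps=2e-5, seed=0):
    """Annealed Langevin sampling guided by the conditional score."""
    rng = np.random.default_rng(seed)
    # Start from a waveform of pure Gaussian noise at the largest noise level.
    x = sigmas[0] * rng.standard_normal(y.shape)
    for sigma in sigmas:  # noise levels in decreasing order
        step = eps * (sigma / sigmas[-1]) ** 2
        for _ in range(steps_per_level):
            z = rng.standard_normal(y.shape)
            score = conditional_score(x, y, sigma, obs_std)
            # Each update moves the sample toward higher conditional density.
            x = x + 0.5 * step * score + np.sqrt(step) * z
    return x


# Toy usage: a synthetic "clean" signal corrupted by additive Gaussian noise.
rng = np.random.default_rng(1)
clean = rng.standard_normal(16000)
noisy = clean + 0.5 * rng.standard_normal(16000)
restored = restore(noisy, obs_std=0.5, sigmas=np.geomspace(1.0, 0.01, 10))
```

Because the prior here is a toy Gaussian, the sketch only shows how the two terms combine inside the sampling loop. It says nothing about the restoration quality of a trained speech model.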
Title | Year | Author | Number |
---|---|---|---|
METHOD AND DEVICE FOR IMPROVING SPEECH SIGNAL USING FAST FOURIER CONVOLUTION | 2022 | | RU2795573C1 |
METHOD FOR IMPROVING A SPEECH SIGNAL WITH A LOW DELAY, A COMPUTING DEVICE AND A COMPUTER-READABLE MEDIUM THAT IMPLEMENTS THE ABOVE METHOD | 2023 | | RU2802279C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823016C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823015C1 |
METHOD OF MULTIMODAL CONTACTLESS CONTROL OF MOBILE INFORMATION ROBOT | 2020 | | RU2737231C1 |
AUDIO ENCODER AND AUDIO SIGNAL ENCODING METHOD | 2016 | | RU2707144C2 |
METHOD AND DISCRIMINATOR FOR CLASSIFYING DIFFERENT SIGNAL SEGMENTS | 2009 | | RU2507609C2 |
DEVICE, METHOD, OR COMPUTER PROGRAM FOR GENERATING AN EXTENDED-BAND AUDIO SIGNAL USING A NEURAL NETWORK PROCESSOR | 2018 | | RU2745298C1 |
SYSTEM AND METHOD FOR AUTOMATED ASSESSMENT OF INTENTIONS AND EMOTIONS OF USERS OF DIALOGUE SYSTEM | 2020 | | RU2762702C2 |
METHOD OF DETERMINING RISK OF DEVELOPMENT OF INDIVIDUAL'S DISEASE BY THEIR VOICE AND HARDWARE-SOFTWARE COMPLEX FOR METHOD REALISATION | 2013 | | RU2559689C2 |
Dates
2024-07-17—Published
2023-07-04—Filed