FIELD: computer engineering.
SUBSTANCE: invention relates to computer engineering, in particular to processing audio data. The result is achieved by the steps of: obtaining a first voice characteristic from any audio data sample using a voice extraction network in an initial voice conversion model; obtaining a first semantic characteristic based on the first voice characteristic and a linear spectrogram corresponding to the audio data sample using a voice removal network in the initial voice conversion model, wherein the first semantic characteristic is a characteristic of the audio data sample that is associated with semantic information rather than with the speaker’s voice; obtaining synthesized audio data based on the first semantic characteristic and a second voice characteristic of the target audio data corresponding to the audio data sample using a vocoder in the initial voice conversion model; and obtaining a trained voice conversion model by training the initial voice conversion model based on the target audio data and the synthesized audio data corresponding to each audio data sample.
EFFECT: high accuracy in preserving the content information of the initial audio signal during voice conversion.
18 cl, 8 dwg
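
The training steps listed under SUBSTANCE can be illustrated with a minimal data-flow sketch. It is not the claimed implementation: the abstract does not disclose the network architectures or the loss function, so the three "networks" below are hypothetical toy stand-ins (simple arithmetic on small spectrogram-like lists), the loss is assumed to be a mean squared error, the second voice characteristic is assumed to come from the same extraction step, and the vocoder output is kept in the spectrogram domain for simplicity. All function names are illustrative.

```python
from typing import List

Vector = List[float]
Spectrogram = List[Vector]  # frames x frequency bins


def extract_voice(spec: Spectrogram) -> Vector:
    """Toy stand-in for the voice extraction network: per-bin mean over frames."""
    n_frames = len(spec)
    return [sum(frame[k] for frame in spec) / n_frames for k in range(len(spec[0]))]


def remove_voice(voice: Vector, spec: Spectrogram) -> Spectrogram:
    """Toy stand-in for the voice removal network: subtract the voice vector
    from every frame, leaving a speaker-independent 'semantic' residue."""
    return [[x - v for x, v in zip(frame, voice)] for frame in spec]


def vocode(semantic: Spectrogram, voice: Vector) -> Spectrogram:
    """Toy stand-in for the vocoder: recombine semantic content with a voice vector."""
    return [[s + v for s, v in zip(frame, voice)] for frame in semantic]


def mse(synth: Spectrogram, target: Spectrogram) -> float:
    """Assumed training criterion (mean squared error); the abstract names none."""
    diffs = [(a - b) ** 2 for fs, ft in zip(synth, target) for a, b in zip(fs, ft)]
    return sum(diffs) / len(diffs)


def training_loss(sample_spec: Spectrogram, target_spec: Spectrogram) -> float:
    """One forward pass in the order given in the abstract."""
    first_voice = extract_voice(sample_spec)                 # first voice characteristic
    first_semantic = remove_voice(first_voice, sample_spec)  # first semantic characteristic
    second_voice = extract_voice(target_spec)                # assumed source of the second voice characteristic
    synthesized = vocode(first_semantic, second_voice)       # synthesized audio data
    return mse(synthesized, target_spec)                     # compared against the target audio data


sample = [[1.0, 2.0], [3.0, 4.0]]
loss = training_loss(sample, sample)  # sample used as its own target
```

In an actual model, the value returned by `training_loss` would drive a parameter update of the initial voice conversion model; the toy functions here have no trainable parameters, so only the order of operations is shown.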
Title | Year | Author | Number |
---|---|---|---|
UNCONTROLLED VOICE RESTORATION USING UNCONDITIONED DIFFUSION MODEL WITHOUT TEACHER | 2023 | | RU2823017C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823015C1 |
METHOD FOR SPEECH SYNTHESIS WITH TRANSMISSION OF ACCURATE INTONATION OF THE CLONED SAMPLE | 2020 | | RU2754920C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823016C1 |
TEXT-DEPENDENT VOICE CONVERSION METHOD | 2010 | | RU2427044C1 |
METHOD FOR AUDIOVISUAL RECOGNITION OF PERSONAL PROTECTION EQUIPMENT ON HUMAN FACE | 2022 | | RU2791415C1 |
METHOD AND SERVER FOR SPEECH SYNTHESIS IN TEXT | 2015 | | RU2632424C2 |
METHOD OF RE-SOUNDING AUDIO MATERIALS AND APPARATUS FOR REALISING SAID METHOD | 2012 | | RU2510954C2 |
METHOD AND SERVER FOR WAVEFORM GENERATION | 2021 | | RU2803488C2 |
SPEECH SYNTHESIS AND CODING METHODS | 2010 | | RU2557469C2 |
Authors
Dates
2024-11-26—Published
2022-12-20—Filed