FIELD: speech synthesis.
SUBSTANCE: the invention relates to speech synthesis methods using artificial neural networks and can be applied to synthesise the speech of a selected speaker while accurately conveying the intonation of the cloned sample. A training dataset consisting of text and a corresponding audio recording of the selected speaker's speech is prepared in advance. A neural network is deep-trained on this dataset to output a mel spectrogram of the selected speaker's voice; a vocoder then converts the mel spectrogram into an audio file in WAV format. Tacotron2 is used as the deep-learning neural network and WaveGlow as the vocoder. During deep training, the Tacotron2 network is modified on the basis of the prepared dataset by increasing the number of model weights and expanding its memory capacity. The trained neural network and vocoder are then reused to convert arbitrary text uploaded by the user into the speech of the selected speaker, producing an audio file of that text voiced by the selected speaker.
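The two-stage flow described above (text to mel spectrogram, mel spectrogram to WAV) can be illustrated with a minimal sketch. This is not the patented implementation: `stub_acoustic_model` and `stub_vocoder` are hypothetical placeholders standing in for the trained Tacotron2 and WaveGlow networks, and the sample rate, mel channel count, and hop length are assumptions that the abstract does not state. Only the WAV encoding step is real, using Python's standard `wave` module.

```python
import io
import math
import struct
import wave
from typing import Callable, List, Sequence

SAMPLE_RATE = 22050  # assumption: not stated in the abstract
N_MELS = 80          # assumption: not stated in the abstract

Mel = List[List[float]]  # frames x mel channels


def stub_acoustic_model(text: str) -> Mel:
    """Hypothetical stand-in for the trained Tacotron2 model.

    Produces one dummy mel frame per input character.
    """
    return [[(ord(ch) % N_MELS) / N_MELS] * N_MELS for ch in text]


def stub_vocoder(mel: Mel, hop: int = 256) -> List[float]:
    """Hypothetical stand-in for WaveGlow.

    Emits `hop` samples of a sine tone per mel frame (hop is an assumption).
    """
    samples: List[float] = []
    for i, frame in enumerate(mel):
        freq = 110.0 + 440.0 * frame[0]
        for n in range(hop):
            t = (i * hop + n) / SAMPLE_RATE
            samples.append(0.3 * math.sin(2 * math.pi * freq * t))
    return samples


def to_wav_bytes(samples: Sequence[float], rate: int = SAMPLE_RATE) -> bytes:
    """Encode float samples in [-1, 1] as a mono 16-bit PCM WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(rate)
        pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
        w.writeframes(struct.pack("<%dh" % len(pcm), *pcm))
    return buf.getvalue()


def synthesize(
    text: str,
    acoustic_model: Callable[[str], Mel] = stub_acoustic_model,
    vocoder: Callable[[Mel], List[float]] = stub_vocoder,
) -> bytes:
    """Text -> mel spectrogram -> waveform -> WAV bytes."""
    mel = acoustic_model(text)
    audio = vocoder(mel)
    return to_wav_bytes(audio)


if __name__ == "__main__":
    with open("output.wav", "wb") as f:
        f.write(synthesize("arbitrary text uploaded by the user"))
```

Passing the two stages as callables mirrors the reuse step in the abstract: once trained, the same acoustic model and vocoder can be applied to any new input text without changing the surrounding pipeline.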
EFFECT: the technical result of the invention is accurate conveyance of the intonation of the selected speaker's cloned speech sample in natural language.
1 cl
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SERVER FOR WAVEFORM GENERATION | 2021 | | RU2803488C2 |
UNCONTROLLED VOICE RESTORATION USING UNCONDITIONED DIFFUSION MODEL WITHOUT TEACHER | 2023 | | RU2823017C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823015C1 |
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR | 2021 | | RU2823016C1 |
TECHNOLOGY FOR ANALYZING ACOUSTIC DATA FOR SIGNS OF COVID-19 DISEASE | 2021 | | RU2758649C1 |
METHOD FOR DIAGNOSING A PATIENT FOR SIGNS OF RESPIRATORY INFECTION BY MEANS OF CNN WITH AN ATTENTION MECHANISM AND A SYSTEM FOR ITS IMPLEMENTATION | 2021 | | RU2758648C1 |
METHOD FOR DIAGNOSING SIGNS OF BRONCHOPULMONARY DISEASES ASSOCIATED WITH COVID-19 VIRUS DISEASE | 2021 | | RU2758550C1 |
METHOD AND SERVER FOR SPEECH SYNTHESIS IN TEXT | 2015 | | RU2632424C2 |
METHOD OF RE-SOUNDING AUDIO MATERIALS AND APPARATUS FOR REALISING SAID METHOD | 2012 | | RU2510954C2 |
METHOD AND SYSTEM FOR DETECTING FRAUDULENT CALLS AND ALERTING SUBSCRIBERS THEREABOUT | 2022 | | RU2820329C2 |
Dates
2021-09-08—Published
2020-08-17—Filed