
(19) RU (11) 2 754 920 (13) C1

(51) IPC: G10L13/00 (2006-01-01), G10L15/16 (2006-01-01), G10L25/30 (2013-01-01)

(21) (22) Application: 2020127476, 2020-08-17

(24) Start date: 2020-08-17

(22) Actual filing date: 2020-08-17

(45) Published: 2021-09-08

(72) Inventors: Tagunov Petr Vladimirovich, Gonta Vladislav Aleksandrovich

(73) Holder: Avtonomnaya Nekommercheskaya Organizatsiya Podderzhki I Razvitiya Nauki, Upravleniya I Sotsialnogo Razvitiya Lyudej V Oblasti Razrabotki I Vnedreniya Iskusstvennogo Intellekta

METHOD FOR SPEECH SYNTHESIS WITH TRANSMISSION OF ACCURATE INTONATION OF THE CLONED SAMPLE

Russian patent published in 2021 - IPC G10L13/00 G10L15/16 G10L25/30

Abstract RU 2754920 C1

FIELD: speech synthesis.

SUBSTANCE: the invention relates to methods for speech synthesis using artificial neural networks and can be applied to synthesising the speech of a selected speaker while conveying the accurate intonation of the cloned sample. A training dataset consisting of text and the corresponding audio recording of the selected speaker's speech is prepared in advance. A neural network is deep-trained on this dataset to produce a mel spectrogram of the selected speaker's voice at its output, and a vocoder then converts the mel spectrogram into an audio file in the WAV format. The Tacotron2 network is used as the deep-learning neural network, and the WaveGlow neural network is used as the vocoder. During deep training on the prepared dataset, the Tacotron2 network is modified by increasing the number of model weights and expanding its memory capacity. The trained neural network and vocoder are then reused to convert arbitrary text uploaded by the user into the speech of the selected speaker, yielding an audio file of that text voiced by the selected speaker.

EFFECT: the technical result of the invention is the accurate transmission of the intonation of the cloned speech sample of the selected speaker in a natural language.
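The data flow described in the abstract (text, then mel spectrogram, then waveform, then WAV file) can be illustrated with a minimal Python sketch. It is not the patented implementation: both model stages are hypothetical placeholders standing in for the trained Tacotron2 network and the WaveGlow vocoder, and the sample rate, mel channel count, and hop length are assumed values that the abstract does not specify. Only the shape of the pipeline is taken from the source.

```python
import math
import struct
import wave
from typing import Callable, List

# Assumed values for illustration only; the abstract does not state them.
SAMPLE_RATE = 22050   # audio samples per second
N_MELS = 80           # mel channels per spectrogram frame
HOP_LENGTH = 256      # audio samples represented by one spectrogram frame

Mel = List[List[float]]  # list of frames, each with N_MELS values


def placeholder_acoustic_model(text: str) -> Mel:
    """Hypothetical stand-in for the trained Tacotron2 stage (text -> mel).

    Emits a fixed number of empty frames per character; a real model
    predicts the frame count and the frame contents.
    """
    frames_per_char = 5
    return [[0.0] * N_MELS for _ in range(len(text) * frames_per_char)]


def placeholder_vocoder(mel: Mel) -> List[float]:
    """Hypothetical stand-in for the WaveGlow stage (mel -> waveform).

    Emits a 220 Hz tone of matching duration instead of speech.
    """
    n_samples = len(mel) * HOP_LENGTH
    return [0.3 * math.sin(2.0 * math.pi * 220.0 * i / SAMPLE_RATE)
            for i in range(n_samples)]


def write_wav(path: str, samples: List[float]) -> None:
    """Write mono 16-bit PCM samples in [-1, 1] to a WAV file."""
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav_file.setframerate(SAMPLE_RATE)
        wav_file.writeframes(struct.pack("<%dh" % len(pcm), *pcm))


def synthesize(
    text: str,
    path: str,
    acoustic_model: Callable[[str], Mel] = placeholder_acoustic_model,
    vocoder: Callable[[Mel], List[float]] = placeholder_vocoder,
) -> int:
    """Run text -> mel -> waveform -> WAV and return the sample count."""
    mel = acoustic_model(text)
    samples = vocoder(mel)
    write_wav(path, samples)
    return len(samples)
```

Calling `synthesize("hello", "out.wav")` writes a short tone rather than speech. Swapping real trained models in for the two placeholder callables would leave the surrounding flow unchanged, which is the reuse step the abstract describes for arbitrary user text.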

1 cl

Similar patents RU2754920C1

Title	Year	Author	Number
METHOD FOR DETERMINING PARKINSONIAN SIGNS BY VOICE USING ARTIFICIAL INTELLIGENCE	2023	Khasanova Diana Magomedovna; Khasanov Ildar Akramovich; Zalialova Zuleikha Abdullazianovna; Sukhachev Pavel Sergeevich; Smirnova Anna Sergeevna	RU2841464C2
METHOD AND SERVER FOR WAVEFORM GENERATION	2021	Kirichenko Vladimir Vladimirovich; Molchanov Aleksandr Aleksandrovich; Chernenkov Dmitry Mikhailovich; Babenko Artem Valerevich; Aliev Vladimir Andreevich; Baranchuk Dmitry Aleksandrovich	RU2803488C2
UNCONTROLLED VOICE RESTORATION USING UNCONDITIONED DIFFUSION MODEL WITHOUT TEACHER	2023	Andreev Pavel Konstantinovich; Iashchenko Anastasia Sergeevna; Shchekotov Ivan Sergeevich; Babaev Nicholas Andrew	RU2823017C1
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR	2021	Ahmed, Ahmed Mustafa Mahmoud; Pia, Nicola; Fuchs, Guillaume; Multrus, Markus; Korse, Srikanth; Gupta, Kishan; Buethe, Jan	RU2823015C1
AUDIO DATA GENERATOR AND METHODS OF GENERATING AUDIO SIGNAL AND TRAINING AUDIO DATA GENERATOR	2021	Ahmed, Ahmed Mustafa Mahmoud; Pia, Nicola; Fuchs, Guillaume; Multrus, Markus; Korse, Srikanth; Gupta, Kishan; Buethe, Jan	RU2823016C1
METHODS AND SERVERS FOR TRAINING MODEL TO DETECT SPEAKER CHANGE	2024	Gritskevich Evgenii Marianovich	RU2841235C1
TECHNOLOGY FOR ANALYZING ACOUSTIC DATA FOR SIGNS OF COVID-19 DISEASE	2021	Samsonov Pavel Romanovich; Mikhajlov Dmitrij Mikhajlovich; Chumanskaya Vera Vasilevna; Dvoryankin Sergej Vladimirovich	RU2758649C1
METHOD FOR DIAGNOSING A PATIENT FOR SIGNS OF RESPIRATORY INFECTION BY MEANS OF CNN WITH AN ATTENTION MECHANISM AND A SYSTEM FOR ITS IMPLEMENTATION	2021	Samsonov Pavel Romanovich; Mikhajlov Dmitrij Mikhajlovich; Chumanskaya Vera Vasilevna	RU2758648C1
METHOD FOR DIAGNOSING SIGNS OF BRONCHOPULMONARY DISEASES ASSOCIATED WITH COVID-19 VIRUS DISEASE	2021	Samsonov Pavel Romanovich; Mikhajlov Dmitrij Mikhajlovich; Chumanskaya Vera Vasilevna	RU2758550C1
METHOD AND SERVER FOR SPEECH SYNTHESIS IN TEXT	2015	Edrenkin Ilya Vladimirovich	RU2632424C2

RU 2 754 920 C1

Authors

Tagunov Petr Vladimirovich

Gonta Vladislav Aleksandrovich

Dates

Published: 2021-09-08

Filed: 2020-08-17