(19)

(11)

2 823 216

(13)

(51)

IPC

G06T13/00(2011-01-01)

G06F40/00(2020-01-01)

G06N20/00(2019-01-01)

(21) (22)

Application

2024101227, 2024-01-18

(24)

Start date

2024-01-18

(22)

Actual filing date

2024-01-18

(45)

Published

2024-07-22

(72)

Inventor

Demochkin Kirill Vladislavovich, Sobolev Konstantin Victorovich, Kuzhamuratov Arsen Rinatovich, Zhirnov Mikhail Denisovich, Bortnikov Mikhail Evgenievich, Chernyavskiy Alexey Stanislavovich

(73)

Holder

Samsung Electronics Co., Ltd.

METHOD AND DEVICE FOR GENERATING VIDEO CLIP FROM TEXT DESCRIPTION AND SEQUENCE OF KEY POINTS SYNTHESIZED BY DIFFUSION MODEL

Russian patent published in 2024 - IPC G06T13/00 G06F40/00 G06N20/00

Abstract RU 2823216 C1

FIELD: physics.

SUBSTANCE: the invention relates to machine learning models that synthesize video clips from text descriptions. A method of generating a video clip from a text description includes the steps of: receiving a text description of the video clip to be generated; obtaining, from the received text description, a vector representation e of the text description in the vector space of a pre-trained neural network model that links images and text descriptions; obtaining a vector representation, of dimension 2×N×L, of a sequence of key points of the video clip to be generated, the sequence being synthesized by a trained diffusion motion model based on the vector representation e, where 2 is the number of coordinates (the first coordinate refers to the frame height H and the second to the frame width W), N is the number of key points in each frame, and L is the number of frames in the video clip; mapping the vector representation of the sequence of key points into a series of L two-dimensional key-point images, each of which corresponds to one frame of the video clip to be generated; and generating the sequence of video clip frames using a pre-trained stable diffusion model, wherein the generation of each frame by the stable diffusion model is additionally controlled by a controlling neural network model based on the two-dimensional key-point image of the corresponding frame.
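The mapping step, which turns the 2×N×L key-point representation into L two-dimensional key-point images, can be illustrated with a minimal sketch. This is not the patented implementation: the function name `keypoints_to_images`, the assumption that coordinates are normalized to [0, 1], and the choice to mark each key point as a single pixel in a binary image are all hypothetical and serve only to make the tensor layout concrete.

```python
from typing import List

Keypoints = List[List[List[float]]]  # shape [2][N][L]: coordinate, key point, frame
Image = List[List[int]]              # shape [H][W], 1 where a key point lands


def keypoints_to_images(kp: Keypoints, height: int, width: int) -> List[Image]:
    """Rasterize a 2 x N x L key-point array into L binary H x W images.

    Assumption (not from the source): coordinates are normalized to [0, 1],
    with kp[0] measured along the frame height and kp[1] along the frame width.
    """
    if len(kp) != 2:
        raise ValueError("expected exactly 2 coordinate planes")
    ys, xs = kp
    n_points = len(ys)
    n_frames = len(ys[0]) if n_points else 0

    images: List[Image] = []
    for t in range(n_frames):
        img = [[0] * width for _ in range(height)]
        for n in range(n_points):
            # Scale to pixel indices and clamp to the frame bounds.
            row = min(max(int(round(ys[n][t] * (height - 1))), 0), height - 1)
            col = min(max(int(round(xs[n][t] * (width - 1))), 0), width - 1)
            img[row][col] = 1
        images.append(img)
    return images


# Hypothetical input: two key points tracked over three frames (N = 2, L = 3).
kp = [
    [[0.0, 0.5, 1.0], [1.0, 1.0, 1.0]],  # height coordinates
    [[0.0, 0.5, 1.0], [0.0, 0.0, 0.0]],  # width coordinates
]
frames = keypoints_to_images(kp, height=5, width=5)
```

Each of the resulting `frames[t]` images would then play the role of the per-frame control input described in the abstract; how the controlling neural network model actually consumes it is not specified here.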

EFFECT: possibility of synthesizing a high-quality video clip from a text description without using reference video clips.

18 cl, 9 dwg

Similar patents RU2823216C1

Title	Year	Author	Number
METHOD OF SYNTHESIZING VIDEO FROM INPUT FRAME USING AUTOREGRESSIVE METHOD, USER ELECTRONIC DEVICE AND COMPUTER-READABLE MEDIUM FOR REALIZING SAID METHOD	2023	Demochkin Kirill Vladislavovich Sobolev Konstantin Victorovich Kuzhamuratov Arsen Rinatovich Gabdullina Svetlana Alexandrovna Chernyavskiy Alexey Stanislavovich	RU2829010C1
METHOD AND SYSTEM FOR GENERATING TRAINING DATA FOR MACHINE LEARNING ALGORITHM	2023	Pavlichenko Nikita Vitalevich Ustalov Dmitrij Alekseevich	RU2831408C2
METHOD AND SYSTEM FOR TRAINING CHATBOT SYSTEM	2023	Zinov Nikolaj Aleksandrovich Korenev Artem Arkadevich	RU2820264C1
METHOD AND SYSTEM FOR OBTAINING VECTOR PRESENTATIONS OF DATA IN TABLE TAKING INTO ACCOUNT STRUCTURE OF TABLE AND ITS CONTENT	2024	Volkov Maksim Aleksandrovich	RU2839037C1
METHOD OF CONTROLLING ON-BOARD SYSTEMS OF UNMANNED VEHICLES USING NEURAL NETWORKS BASED ON ARCHITECTURE OF TRANSFORMERS	2024	Karim Atef Abdelmagid Abdo Eldakruri Khegazi Mostafa Ajman Akhmed Mokhamed Rashid Bader	RU2841111C1
METHOD AND SYSTEM FOR AUTHENTICATING FACE ON IMAGE	2024	Mikheyushkin Vladimir Igorevich Mityagin Kirill Sergeevich Sosulnikov Mikhail Vyacheslavovich Kononykhin Danil Aleksandrovich Varfolomeeva Anna Andreevna Telegina Kseniya Antonovna	RU2840316C1
METHOD AND SYSTEM FOR RECOGNIZING USER'S SPEECH FRAGMENT	2021	Ershov Vasily Alekseevich Kuralenok Igor Evgenevich	RU2808582C2
SYSTEM AND METHOD FOR TRAINING MACHINE LEARNING MODELS FOR RANKING SEARCH RESULTS	2023	Boimel Aleksandr Alekseevich Gusev Daniil Vladimirovich Kulunchakov Andrei Sergeevich Mironov Artem Vladimirovich	RU2829065C1
METHOD AND SYSTEM FOR SEARCHING GRAPHIC IMAGES	2022	Shulga Sergej Aleksandrovich	RU2807639C1
METHODS AND SERVERS FOR TRAINING MODEL TO DETECT SPEAKER CHANGE	2024	Gritskevich Evgenii Marianovich	RU2841235C1

RU 2 823 216 C1

Authors

Demochkin Kirill Vladislavovich

Sobolev Konstantin Victorovich

Kuzhamuratov Arsen Rinatovich

Zhirnov Mikhail Denisovich

Bortnikov Mikhail Evgenievich

Chernyavskiy Alexey Stanislavovich

Dates

2024-07-22—Published

2024-01-18—Filed