DIALOG DETECTOR Russian patent published in 2023 - IPC G10L25/81

Abstract RU 2807170 C2

FIELD: computer technology.

SUBSTANCE: the invention relates to extracting audio features in a dialogue detector in response to an input audio signal. The technical result is improved performance of audio feature extraction when several context windows are used, each containing a different number of frames, to represent a frame in different contexts. The technical result is achieved by dividing the input audio signal into a plurality of frames; extracting frame audio features from each frame; determining a set of context windows, where each context window includes a number of frames surrounding the current frame; deriving, for each context window, a corresponding contextual audio feature for the current frame based on the frame audio features of the frames in that context window; concatenating the contextual audio features to form a combined feature vector representing the current frame; and obtaining, using the combined feature vector, a speech confidence score representing the probability that dialogue is present in the current frame, wherein the number of frames in one or more of the context windows is determined adaptively based on the extracted frame audio features.

EFFECT: improved performance of audio feature extraction when using several context windows.

12 cl, 12 dwg
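
The sequence of steps named in the abstract (framing, per-frame features, context windows, concatenation, confidence score) can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not specify the frame features, the window sizes, the adaptation rule, or the classifier, so everything below is an assumption chosen for illustration: energy and zero-crossing rate as frame features, the mean as the contextual statistic, a hypothetical energy-threshold rule for the adaptive window, and a logistic function with caller-supplied weights as the scorer.

```python
import math

def split_into_frames(signal, frame_len):
    """Divide the signal into non-overlapping frames, dropping a partial tail."""
    n = len(signal) // frame_len
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n)]

def frame_features(frame):
    """Illustrative frame features (assumed): mean energy and zero-crossing rate."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return [energy, crossings / (len(frame) - 1)]

def context_feature(feats, idx, half_width):
    """Mean of each frame feature over the frames surrounding frame idx."""
    lo = max(0, idx - half_width)
    hi = min(len(feats), idx + half_width + 1)
    window = feats[lo:hi]
    return [sum(f[k] for f in window) / len(window) for k in range(len(feats[0]))]

def adaptive_half_width(feats, idx, short=1, long=4, energy_threshold=0.01):
    """Hypothetical adaptation rule: use a longer context when the frame is quiet."""
    return long if feats[idx][0] < energy_threshold else short

def combined_vector(feats, idx, fixed_half_widths=(1, 2)):
    """Concatenate contextual features from fixed windows plus one adaptive window."""
    widths = list(fixed_half_widths) + [adaptive_half_width(feats, idx)]
    vec = []
    for w in widths:
        vec.extend(context_feature(feats, idx, w))
    return vec

def speech_confidence(vec, weights, bias=0.0):
    """Map the combined vector to a score in (0, 1) with a logistic function (assumed)."""
    z = bias + sum(w * v for w, v in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))

# Usage: a synthetic sawtooth signal, 16-sample frames, placeholder weights.
signal = [((i % 8) - 3.5) / 4 for i in range(160)]
feats = [frame_features(f) for f in split_into_frames(signal, 16)]
vec = combined_vector(feats, idx=5)
score = speech_confidence(vec, weights=[0.5] * len(vec))
```

In a real detector the weights would come from a trained classifier; the placeholder values here only show how the combined vector feeds the score.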

Similar patents RU2807170C2

Title	Year	Author	Number
VOLUME EQUALIZER CONTROLLER AND CONTROL METHOD	2014		RU2612728C1
LOUDNESS EQUALIZER CONTROLLER AND CONTROL METHOD	2021	Wang, Jun Lu, Lie Seefeldt, Alan	RU2826268C2
LOUDNESS EQUALIZER CONTROLLER AND CONTROL METHOD	2024	Wang, Jun Lu, Lie Seefeldt, Alan	RU2836703C1
VOLUME EQUALIZER CONTROLLER AND CONTROL METHOD	2014	Wang Jun Lu Lie Seefeldt Alan	RU2715029C2
VOLUME LEVELING CONTROLLER AND THE CONTROL METHOD	2014	Wang Jun Lu Lie Seefeldt Alan	RU2746343C2
TWO-WAY MEDIA ANALYTICS	2019	Bai, Yanning Gerrard, Mark William Han, Richard Wolters, Martin	RU2768224C1
SYSTEM, METHOD AND PERSISTENT MACHINE-READABLE DATA MEDIUM FOR GENERATING, ENCODING AND PRESENTING ADAPTIVE AUDIO SIGNAL DATA	2020	Robinson, Charles Q. Tsingos, Nicolas R. Chabanne, Christophe	RU2820838C2
SYSTEM AND METHOD FOR GENERATING, ENCODING AND PRESENTING ADAPTIVE AUDIO SIGNAL DATA	2012	Robinson, Charles Q. Tsingos, Nicolas R. Chabanne, Christophe	RU2731025C2
SYSTEM AND METHOD FOR GENERATING, CODING AND PRESENTING ADAPTIVE SOUND SIGNAL DATA	2012	Robinson Charlz K Tsingos Nikolas R Shabanne Kristof	RU2617553C2
SYSTEM, METHOD AND PERMANENT MACHINE-READABLE DATA MEDIUM FOR GENERATION, CODING AND PRESENTATION OF ADAPTIVE AUDIO SIGNAL DATA	2020	Robinson, Charles Q. Tsingos, Nicolas R. Chabanne, Christophe	RU2741738C1

RU 2 807 170 C2

Authors

Lu, Lie

Liu, Xin

Dates

2023-11-10—Published

2020-04-13—Filed