DIALOG DETECTOR Russian patent published in 2023 - IPC G10L25/81 

Abstract RU 2807170 C2

FIELD: computer technology.

SUBSTANCE: the invention relates to extracting audio features in a dialogue detector in response to an input audio signal. The technical result is improved performance of audio feature extraction when several context windows are used, each containing a different number of frames, so that the current frame is represented in different contexts. The technical result is achieved by dividing the input audio signal into a plurality of frames; extracting frame audio features from each frame; defining a set of context windows, where each context window contains a number of frames surrounding the current frame; deriving, for each context window, a corresponding contextual audio feature for the current frame based on the frame audio features of the frames in that context window; concatenating the contextual audio features to form a combined feature vector representing the current frame; and obtaining, from the combined feature vector, a speech confidence score representing the probability that dialogue is present in the current frame, wherein the number of frames in one or more of the context windows is determined adaptively based on the extracted frame audio features.
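
The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not name the frame feature, the contextual statistics, the adaptation rule, or the classifier, so everything concrete here is an assumption made for illustration. In particular, RMS energy as the frame feature, mean and standard deviation as the contextual features, a centered window, the threshold rule in `adaptive_length`, and the logistic scoring with placeholder weights are all hypothetical choices.

```python
import math

def split_frames(signal, frame_len):
    """Divide the signal into non-overlapping frames (a trailing partial frame is dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def frame_feature(frame):
    """Assumed frame audio feature: RMS energy of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def adaptive_length(feats, t, short=3, long=7, threshold=0.5):
    """Hypothetical adaptation rule: use a short window when the frame
    feature changes sharply from the previous frame, a long one otherwise."""
    if t > 0 and abs(feats[t] - feats[t - 1]) > threshold:
        return short
    return long

def context_feature(feats, t, n_frames):
    """Assumed contextual feature: mean and standard deviation of the frame
    features in a window centered on frame t, clipped at the signal edges."""
    half = n_frames // 2
    window = feats[max(0, t - half):t + half + 1]
    mean = sum(window) / len(window)
    var = sum((f - mean) ** 2 for f in window) / len(window)
    return [mean, math.sqrt(var)]

def combined_vector(feats, t, window_lengths):
    """Concatenate the contextual features of all context windows."""
    vec = []
    for n in window_lengths:
        vec.extend(context_feature(feats, t, n))
    return vec

def speech_confidence(vec, weights, bias=0.0):
    """Placeholder classifier: logistic function of a weighted sum.
    The weights are illustrative, not trained values."""
    z = bias + sum(w * v for w, v in zip(weights, vec))
    return 1.0 / (1.0 + math.exp(-z))

# Usage on a synthetic signal (a sine wave stands in for real audio).
signal = [math.sin(0.1 * n) for n in range(400)]
frames = split_frames(signal, 20)
feats = [frame_feature(f) for f in frames]
t = 10  # index of the current frame
lengths = [5, 11, adaptive_length(feats, t)]  # two fixed windows, one adaptive
vec = combined_vector(feats, t, lengths)
score = speech_confidence(vec, weights=[0.1] * len(vec))
```

Each context window contributes two values, so three windows give a six-element combined vector. In a real detector the placeholder weights would be replaced by a trained classifier, and the adaptation rule by whatever criterion the claims define.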

EFFECT: improved performance of audio feature extraction when using several context windows.

12 cl, 12 dwg

Similar patents RU2807170C2

Title Year Author Number
VOLUME EQUALIZER CONTROLLER AND CONTROL METHOD 2014
RU2612728C1
LOUDNESS EQUALIZER CONTROLLER AND CONTROL METHOD 2021
  • Wang, Jun
  • Lu, Lie
  • Seefeldt, Alan
RU2826268C2
VOLUME EQUALIZER CONTROLLER AND CONTROL METHOD 2014
  • Wang Jun
  • Lu Lie
  • Seefeldt Alan
RU2715029C2
VOLUME LEVELING CONTROLLER AND THE CONTROL METHOD 2014
  • Wang Jun
  • Lu Lie
  • Seefeldt Alan
RU2746343C2
TWO-WAY MEDIA ANALYTICS 2019
  • Bai, Yanning
  • Gerrard, Mark William
  • Han, Richard
  • Wolters, Martin
RU2768224C1
SYSTEM, METHOD AND PERSISTENT MACHINE-READABLE DATA MEDIUM FOR GENERATING, ENCODING AND PRESENTING ADAPTIVE AUDIO SIGNAL DATA 2020
  • Robinson, Charles Q.
  • Tsingos, Nicolas R.
  • Chabanne, Christophe
RU2820838C2
SYSTEM AND METHOD FOR GENERATING, ENCODING AND PRESENTING ADAPTIVE AUDIO SIGNAL DATA 2012
  • Robinson, Charles Q.
  • Tsingos, Nicolas R.
  • Chabanne, Christophe
RU2731025C2
SYSTEM AND METHOD FOR GENERATING, CODING AND PRESENTING ADAPTIVE SOUND SIGNAL DATA 2012
  • Robinson Charlz K
  • Tsingos Nikolas R
  • Shabanne Kristof
RU2617553C2
SYSTEM, METHOD AND PERMANENT MACHINE-READABLE DATA MEDIUM FOR GENERATION, CODING AND PRESENTATION OF ADAPTIVE AUDIO SIGNAL DATA 2020
  • Robinson, Charles Q.
  • Tsingos, Nicolas R.
  • Chabanne, Christophe
RU2741738C1
AUDIO ENCODER AND AUDIO DECODER WITH METADATA INFORMATION ON PROGRAM OR SUBSTREAMS STRUCTURE 2014
  • Riedmiller Jeffrey
  • Ward Michael
RU2624099C1

RU 2 807 170 C2

Authors

Lu, Lie

Liu, Xin

Dates

2023-11-10 Published

2020-04-13 Filed