FIELD: computing technology.
SUBSTANCE: the invention relates to computing technology for processing audio data. The technical result is achieved by a method for diarisation of a speech audio signal, comprising the steps of: receiving digital audio signals containing voice data recorded synchronously by at least two microphones; determining a difference signal for the two microphones from the digital audio signal data received from them; determining the values of the envelope function of the difference signal; determining the values of the envelope function of the original audio signal from the digital audio signal data received from one of the microphones; determining a characteristic value of the audio signal from the envelope value of the difference signal and the envelope value of the original audio signal; and, based on the characteristic value, labelling the digital audio signal data to indicate which audio signal source each data block belongs to.
EFFECT: the audio signal can be labelled (segmented) with a small margin of error and low power consumption using data received from two microphones, including in real time.
8 cl, 4 dwg
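The steps listed under SUBSTANCE can be illustrated with a minimal Python sketch. The abstract does not say how the envelope or the characteristic value is computed, so several choices here are assumptions made only for illustration: the envelope is a rectified signal smoothed by a one-pole low-pass filter, the characteristic value is the ratio of the two envelopes averaged over a block, and a fixed threshold assigns each block to source "A" (nearer the first microphone) or "B" (nearer the second). The names `envelope`, `diarise`, `synthetic_pair`, `alpha`, `block` and `threshold` are hypothetical, and silent blocks are not handled separately.

```python
import math


def envelope(samples, alpha=0.05):
    """Rectify and smooth with a one-pole low-pass filter (assumed estimator)."""
    env, level = [], 0.0
    for s in samples:
        level += alpha * (abs(s) - level)
        env.append(level)
    return env


def diarise(mic_a, mic_b, block=200, threshold=1.5, eps=1e-9):
    """Label each block of samples as source 'A' or 'B'.

    The characteristic value is assumed to be the ratio of the
    difference-signal envelope to the envelope of mic_a.
    """
    # Difference signal of the two synchronously recorded channels.
    diff = [a - b for a, b in zip(mic_a, mic_b)]
    env_diff = envelope(diff)
    env_ref = envelope(mic_a)  # envelope of the original signal (one microphone)

    labels = []
    for start in range(0, len(diff) - block + 1, block):
        d = sum(env_diff[start:start + block]) / block
        r = sum(env_ref[start:start + block]) / block
        ratio = d / (r + eps)  # assumed characteristic value
        # A source close to mic_b is weak in mic_a, so the ratio grows large.
        labels.append("B" if ratio > threshold else "A")
    return labels


def synthetic_pair(n=1600, switch=800, leak=0.2):
    """Toy test signal: a tone heard mainly by mic A, then mainly by mic B."""
    tone = [math.sin(math.pi * i / 20) for i in range(n)]
    mic_a = [t if i < switch else leak * t for i, t in enumerate(tone)]
    mic_b = [leak * t if i < switch else t for i, t in enumerate(tone)]
    return mic_a, mic_b


if __name__ == "__main__":
    a, b = synthetic_pair()
    print(diarise(a, b))
```

The sketch uses only per-sample additions and multiplications, with no spectral analysis or trained model. That is in keeping with the low-power, real-time effect claimed, though the patented method may differ in every detail that the abstract leaves open.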
Title | Year | Author | Number |
---|---|---|---|
CUSTOMIZED OUTPUT WHICH IS OPTIMIZED FOR USER PREFERENCES IN DISTRIBUTED SYSTEM | 2020 | | RU2821283C2 |
METHOD OF RE-SOUNDING AUDIO MATERIALS AND APPARATUS FOR REALISING SAID METHOD | 2012 | | RU2510954C2 |
METHOD FOR HYBRID GENERATIVE-DISCRIMINATIVE SEGMENTATION OF SPEAKERS IN AUDIO-FLOW | 2013 | | RU2530314C1 |
UNCONTROLLED VOICE RESTORATION USING UNCONDITIONED DIFFUSION MODEL WITHOUT TEACHER | 2023 | | RU2823017C1 |
METHOD FOR COMPENSATING FOR HEARING LOSS IN TELEPHONE SYSTEM AND IN MOBILE TELEPHONE APPARATUS | 2013 | | RU2568281C2 |
METHOD AND SYSTEM FOR RECOGNIZING REPLAYED SPEECH FRAGMENT | 2020 | | RU2767962C2 |
METHOD FOR RECOGNIZING SPOKEN WORDS | 2005 | | RU2296376C2 |
METHOD AND APPARATUS FOR PROVIDING AUDIBLE, VISUAL OR TACTILE SIDETONE FEEDBACK NOTIFICATION TO USER OF COMMUNICATION DEVICE WITH MULTIPLE MICROPHONES | 2009 | | RU2482617C2 |
MUSIC SYNTHESIZER WITH SPATIAL METADATA OUTPUT | 2022 | | RU2834365C2 |
SYNCHRONOUS COMPREHENSION OF SEMANTIC OBJECTS FOR HIGHLY ACTIVE INTERFACE | 2004 | | RU2352979C2 |
Authors
Dates
2021-11-15—Published
2020-10-23—Filed