FIELD: computer engineering.
SUBSTANCE: the result is achieved by the following steps: obtaining a punctuation training data set comprising first input data, which includes audio data and text data representing speech, and a first label including a sequence of control tokens; training a model on the punctuation training data set to form a trained punctuation model; obtaining a speaker change training data set comprising second input data, which includes second audio data and second text data, and a second label including a second sequence of control tokens; further training the trained punctuation model on the speaker change training data set to form a speaker change model; obtaining working text data and corresponding working audio data; and generating a second working sequence of tokens from the working audio data and the working text data using the speaker change model.
EFFECT: high accuracy of speech recognition and speaker diarization.
16 cl, 6 dwg
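
The data layout described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the token names, the `TrainingExample` class, the `render` helper, and the assumption of one control token per word are all hypothetical, since the abstract does not specify them. The sketch only shows that the punctuation stage and the speaker change stage can share one example shape (audio, text, label), and how a control-token sequence might be applied to text.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical control-token vocabulary; the abstract does not name the tokens.
NONE, COMMA, PERIOD, SPEAKER_CHANGE = "<none>", "<comma>", "<period>", "<speaker_change>"


@dataclass
class TrainingExample:
    audio: List[float]  # stand-in for audio samples or features
    text: List[str]     # words representing the speech
    label: List[str]    # assumed: one control token per word


def render(words: List[str], tokens: List[str]) -> str:
    """Apply one control token per word to produce readable text."""
    if len(words) != len(tokens):
        raise ValueError("expected one control token per word")
    marks = {COMMA: ",", PERIOD: "."}
    lines, current = [], []
    for word, token in zip(words, tokens):
        if token == SPEAKER_CHANGE:
            # Assumed convention: this token ends the current speaker's turn.
            current.append(word)
            lines.append(" ".join(current))
            current = []
        else:
            current.append(word + marks.get(token, ""))
    if current:
        lines.append(" ".join(current))
    return "\n".join(lines)


# Stage 1 (punctuation) and stage 2 (speaker change) examples share one shape.
punct_example = TrainingExample(
    [0.0] * 4, ["hello", "how", "are", "you"], [COMMA, NONE, NONE, PERIOD]
)
speaker_example = TrainingExample(
    [0.0] * 4, ["hello", "hi", "there", "friend"], [SPEAKER_CHANGE, NONE, NONE, PERIOD]
)
```

Under these assumptions, `render(punct_example.text, punct_example.label)` gives `hello, how are you.`, while the speaker change example is split into two turns, one per line.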
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SYSTEM FOR PARAPHRASING TEXT | 2023 | | RU2814808C1 |
TEXT CLASSIFICATION METHOD AND SYSTEM | 2022 | | RU2818693C2 |
METHOD AND SYSTEM FOR GENERATING TEXT | 2023 | | RU2817524C1 |
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION | 2022 | | RU2796208C1 |
METHOD AND SERVER FOR SPEECH SYNTHESIS IN TEXT | 2015 | | RU2632424C2 |
METHOD AND SERVER FOR PROCESSING TEXT SEQUENCE IN MACHINE PROCESSING TASK | 2020 | | RU2775820C2 |
METHOD AND SERVER FOR PERFORMING PROBLEM-ORIENTED TRANSLATION | 2021 | | RU2820953C2 |
SYSTEM AND METHODOLOGY OF AUTOMATIC LANGUAGE LEARNING ON BASIS OF SYNTACTIC MODELS FREQUENCY | 2015 | | RU2632656C2 |
METHOD AND SYSTEM FOR DETECTING FRAUDULENT CALLS AND ALERTING SUBSCRIBERS THEREABOUT | 2022 | | RU2820329C2 |
METHOD FOR ATTRIBUTION OF PARTIALLY STRUCTURED TEXTS FOR FORMATION OF NORMATIVE-REFERENCE INFORMATION | 2020 | | RU2750852C1 |
Dates
2025-06-04—Published
2024-05-31—Filed