FIELD: speech recognition.
SUBSTANCE: the invention relates to a method and a system for identifying the completion of a user's statement (utterance) in a digital audio signal. In the method, an electronic apparatus receives a set of features for each segment of the digital audio signal, wherein each set of features comprises at least acoustic-type features extracted from the corresponding segment, and the segments are associated with corresponding time intervals of a predetermined duration; the electronic apparatus receives an indication of the utterance completion time in the digital audio signal, that is, the point in time after which the user's utterance is complete; the electronic apparatus determines an adjusted utterance completion time by adding a predetermined time offset to the utterance completion time; the electronic apparatus determines labels for the corresponding sets of features based on the adjusted utterance completion time and the time intervals of the corresponding segments, wherein each label indicates whether the user's utterance was completed during the segment associated with the corresponding set of features; the electronic apparatus uses the sets of features and the corresponding labels to train a neural network (NN) to predict during which segment of the digital audio signal the user's utterance was completed.
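The labelling step described above can be illustrated with a minimal sketch. It is not the claimed implementation: the function name `label_segments`, its parameters, the use of milliseconds, and the half-open interval convention are all assumptions made for illustration. It assumes segments of equal duration laid end to end from time zero, and it marks with 1 only the segment whose time interval contains the adjusted completion time.

```python
from typing import List


def label_segments(
    num_segments: int,
    segment_ms: int,
    completion_ms: int,
    offset_ms: int,
) -> List[int]:
    """Return one 0/1 label per segment (hypothetical helper).

    A segment is labelled 1 if the adjusted completion time
    (completion_ms + offset_ms) falls inside its interval
    [start, start + segment_ms), and 0 otherwise.
    """
    # Adjusted completion time: the indicated time plus a fixed offset.
    adjusted_ms = completion_ms + offset_ms

    labels = []
    for i in range(num_segments):
        start = i * segment_ms          # segment start time (assumed layout)
        end = start + segment_ms        # segment end time, exclusive
        labels.append(1 if start <= adjusted_ms < end else 0)
    return labels


# Five 100 ms segments, statement indicated as complete at 230 ms,
# offset of 100 ms, so the adjusted time is 330 ms (fourth segment).
example_labels = label_segments(5, 100, 230, 100)
```

The resulting labels would be paired one-to-one with the per-segment feature sets to form training examples for the neural network. If the adjusted time falls beyond the last segment, this sketch labels every segment 0.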
EFFECT: increased accuracy in identifying the completion of a user's statement.
20 cl, 6 dwg
Authors
Dates
2021-12-14—Published
2018-12-18—Filed