FIELD: computer engineering.
SUBSTANCE: invention relates to a method and a system for transforming text by means of text generation. Disclosed is a method of generating text in a text paraphrasing system, comprising the steps of: obtaining a text fragment in a natural language; obtaining a target style for the text fragment, which characterizes the stylistic features inherent in said target style, and a text styling parameter characterizing the degree of text styling; and processing the text fragment. The processing includes at least splitting said fragment into text blocks and encoding each text block, wherein the encoding comprises tokenizing the text block and vectorizing the text block token by token. The vector representations of the tokens of each text block are processed to form, for that block, a set of candidate stylized paraphrased texts in vectorized form. Each candidate for each text block is decoded, wherein the decoding comprises at least converting the vectorized stylized texts into tokens and detokenizing them. The set of candidate stylized paraphrased texts is ranked and the best stylized paraphrased candidate is selected; the stylized paraphrased texts of the blocks are then combined, preserving their original order, into a paraphrased stylized text fragment, and said fragment is sent to the text paraphrasing system.
EFFECT: high originality and semantic accuracy of the paraphrased stylized text generated from a source text.
12 cl, 4 dwg, 4 tbl
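The steps named in the abstract (splitting into blocks, tokenization, vectorization, candidate generation, decoding, ranking, and order-preserving recombination) can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not disclose the model, so a toy lexicon substitution stands in for the generator, token ids stand in for vector representations, and a count of changed tokens stands in for the ranking criterion. All names here (`STYLE_LEXICON`, `Vocab`, `generate_candidates`, `originality`, `paraphrase`) are hypothetical, as is the choice of sentences as text blocks.

```python
import re

# Hypothetical style lexicon (assumption): target style -> {source word: styled word}.
STYLE_LEXICON = {
    "formal": {"get": "obtain", "big": "substantial", "show": "demonstrate"},
}


def split_into_blocks(fragment):
    """Split a fragment into text blocks; sentences are assumed as the block unit."""
    return [s for s in re.split(r"(?<=[.!?])\s+", fragment.strip()) if s]


def tokenize(block):
    """Words and punctuation marks become separate tokens."""
    return re.findall(r"\w+|[^\w\s]", block)


def detokenize(tokens):
    """Join tokens with spaces, then re-attach punctuation to the preceding word."""
    return re.sub(r"\s+([^\w\s])", r"\1", " ".join(tokens))


class Vocab:
    """Toy vectorizer: maps tokens to integer ids (a stand-in for real embeddings)."""

    def __init__(self):
        self.token_to_id = {}
        self.id_to_token = []

    def encode(self, tokens):
        ids = []
        for tok in tokens:
            if tok not in self.token_to_id:
                self.token_to_id[tok] = len(self.id_to_token)
                self.id_to_token.append(tok)
            ids.append(self.token_to_id[tok])
        return ids

    def decode(self, ids):
        return [self.id_to_token[i] for i in ids]


def generate_candidates(ids, vocab, style, strength):
    """Toy generator: candidate k restyles the first k replaceable tokens.

    `strength` in [0, 1] plays the role of the styling parameter and caps k.
    """
    lexicon = STYLE_LEXICON.get(style, {})
    positions = [p for p, i in enumerate(ids)
                 if vocab.id_to_token[i].lower() in lexicon]
    limit = round(strength * len(positions))
    candidates = []
    for k in range(limit + 1):
        cand = list(ids)
        for p in positions[:k]:
            styled = lexicon[vocab.id_to_token[ids[p]].lower()]
            cand[p] = vocab.encode([styled])[0]
        candidates.append(cand)
    return candidates


def originality(tokens, source_tokens):
    """Ranking proxy (assumption): number of tokens that differ from the source."""
    return sum(a != b for a, b in zip(tokens, source_tokens))


def paraphrase(fragment, style, strength):
    vocab = Vocab()
    results = []
    for block in split_into_blocks(fragment):          # processing: split into blocks
        source_tokens = tokenize(block)                 # encoding: tokenization
        ids = vocab.encode(source_tokens)               # encoding: vectorization
        candidates = generate_candidates(ids, vocab, style, strength)
        decoded = [vocab.decode(c) for c in candidates]  # decoding: ids -> tokens
        best = max(decoded, key=lambda t: originality(t, source_tokens))  # ranking
        results.append(detokenize(best))                # decoding: detokenization
    return " ".join(results)                            # combine in original order
```

With these toy definitions, `paraphrase("We get big results. They show progress.", "formal", 1.0)` returns `"We obtain substantial results. They demonstrate progress."`, while a styling parameter of `0.0` leaves the fragment unchanged. A real system would rank on semantic accuracy as well as originality; the single-criterion score is used only to keep the sketch short.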
Title | Year | Author | Number |
---|---|---|---|
METHOD AND SYSTEM FOR GENERATING TEXT | 2023 | | RU2817524C1 |
METHOD AND SYSTEM FOR DIGITAL ASSISTANT TEXT GENERATION | 2022 | | RU2796208C1 |
TEXT CLASSIFICATION METHOD AND SYSTEM | 2022 | | RU2818693C2 |
SYSTEM AND METHOD FOR AUGMENTATION OF THE TRAINING SAMPLE FOR MACHINE LEARNING ALGORITHMS | 2020 | | RU2758683C2 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2804747C1 |
METHOD AND SYSTEM FOR DEPERSONALIZATION OF CONFIDENTIAL DATA | 2022 | | RU2802549C1 |
SYSTEM AND METHOD FOR CORRECTING SPELLING ERRORS | 2020 | | RU2753183C1 |
SYSTEM AND METHOD FOR AUTOMATED ASSESSMENT OF INTENTIONS AND EMOTIONS OF USERS OF DIALOGUE SYSTEM | 2020 | | RU2762702C2 |
SYSTEM FOR IDENTIFYING REPHRASING USING MACHINE TRANSLATION TECHNOLOGY | 2004 | | RU2368946C2 |
METHOD AND SYSTEM FOR RETRIEVING NAMED ENTITIES | 2020 | | RU2760637C1 |
Authors
Dates
2024-03-04—Published
2023-06-22—Filed