FIELD: bioinformatics; biotechnology.
SUBSTANCE: a computer-implemented method for determining the sequence of at least a portion of at least one template target nucleic acid by determining whether two mutated sequence reads originate from the same mutated sequence. A plurality of mutated sequence reads is obtained, each mutated sequence read corresponding to a subsequence of a mutated sequence. A common minimizer function is applied to each mutated sequence read to determine one or more corresponding minimizers for that read. The positions of the one or more corresponding minimizers in each mutated sequence read are determined, as are the positions of one or more mutations in each mutated sequence read. For at least two mutated sequence reads that share a common minimizer, the number of mutations at the same position and/or at mismatched positions is counted with the corresponding minimizers aligned, in order to determine an indicator correlated with the probability that the at least two mutated sequence reads originate from the same mutated sequence. The at least two mutated sequence reads are assembled based on that indicator. The sequence of at least a portion of the at least one template target nucleic acid is determined from the assembly. Also described is a method for determining at least a portion of the sequence of at least one template target nucleic acid molecule, in which regions of at least one mutated template target nucleic acid molecule are first sequenced to obtain a plurality of mutated sequence reads and the above method is then applied to those reads.
EFFECT: accurate sequencing of nucleic acids, as well as fast and accurate sequence assembly from short sequence reads.
29 cl, 5 dwg
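The comparison step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes that the minimizer is the lexicographically smallest k-mer in each window of `w` consecutive k-mers, that mutation positions are already known for each read (as read-local, 0-based indices), and that a plain pair of counts stands in for the indicator. The function names `minimizers`, `count_mutation_agreement`, and `compare_reads` and the parameters `k` and `w` are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple


def minimizers(seq: str, k: int = 5, w: int = 4) -> Set[Tuple[str, int]]:
    """Return (k-mer, position) pairs chosen as window minimizers.

    Assumption: the minimizer of a window is its lexicographically smallest
    k-mer; ties go to the leftmost position. Reads shorter than k + w - 1
    yield an empty set.
    """
    kmers = [(seq[i:i + k], i) for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        picked.add(min(kmers[start:start + w]))
    return picked


def count_mutation_agreement(
    len_a: int, muts_a: Set[int], pos_a: int,
    len_b: int, muts_b: Set[int], pos_b: int,
) -> Tuple[int, int]:
    """Count mutations at the same / mismatched positions in the overlap.

    The reads are aligned so that the shared minimizer (at pos_a in read A
    and pos_b in read B) coincides; all positions are then expressed in
    read A's coordinates.
    """
    offset = pos_a - pos_b
    lo = max(0, offset)                # start of the overlap in A's coordinates
    hi = min(len_a, len_b + offset)    # end of the overlap (exclusive)
    in_a = {m for m in muts_a if lo <= m < hi}
    in_b = {m + offset for m in muts_b if lo <= m + offset < hi}
    same = len(in_a & in_b)            # mutated in both reads
    mismatched = len(in_a ^ in_b)      # mutated in exactly one read
    return same, mismatched


def compare_reads(
    seq_a: str, muts_a: Set[int],
    seq_b: str, muts_b: Set[int],
    k: int = 5, w: int = 4,
) -> Optional[Tuple[int, int]]:
    """Return (same, mismatched) for the best alignment on a shared minimizer.

    "Best" is taken here to mean the largest same - mismatched, which is an
    illustrative choice only. Returns None if no minimizer is shared.
    """
    positions_b: Dict[str, List[int]] = {}
    for kmer, pos in minimizers(seq_b, k, w):
        positions_b.setdefault(kmer, []).append(pos)

    results = []
    for kmer, pos_a in sorted(minimizers(seq_a, k, w)):
        for pos_b in sorted(positions_b.get(kmer, [])):
            results.append(count_mutation_agreement(
                len(seq_a), muts_a, pos_a, len(seq_b), muts_b, pos_b))
    return max(results, key=lambda r: r[0] - r[1], default=None)
```

In this sketch, a pair of reads with many shared mutation positions and few mismatched ones would be treated as likely to come from the same mutated sequence and passed on to assembly together. How the counts are turned into a probability-like indicator is left open here, as the abstract does not specify it.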
Title | Year | Author | Number |
---|---|---|---|
CRISPR-CAS SYSTEM COMPONENTS, METHODS AND COMPOSITIONS FOR SEQUENCE MANIPULATION | 2013 | | RU2701662C2 |
METHOD OF DETECTION OF NUCLEIC ACID SEQUENCE VARIANT USING SHIFT TERMINATION ANALYSIS | 2000 | | RU2200762C2 |
DETECTION OF MUTATIONS AND PLOIDY IN CHROMOSOMAL SEGMENTS | 2015 | | RU2717641C2 |
GENOMIC SELECTION AND SEQUENCING USING CODED MICROCARRIERS | 2010 | | RU2609630C2 |
LIBRARIES FOR NEXT GENERATION SEQUENCING | 2014 | | RU2698125C2 |
METHOD OF DETECTING MUTATIONS IN COMPLEX DNA MIXTURES | 2014 | | RU2613489C2 |
NUCLEOTIDE SEQUENCES ASSOCIATED WITH INCREASING OR REDUCING OF OVULATORY RATE IN MAMMALIANS | 2001 | | RU2283866C2 |
METHODS OF SEQUENCING OF THREE-DIMENSIONAL STRUCTURE OF THE ANALYZED GENOME REGION | 2011 | | RU2603082C2 |
A DEEP LEARNING FRAME FOR IDENTIFYING SEQUENCE PATTERNS THAT CAUSE SEQUENCE SPECIFIC ERRORS (SSE) | 2019 | | RU2745733C1 |
SEQUENCE GRAPH TOOL FOR DETERMINING VARIATIONS IN REGIONS OF SHORT TANDEM REPEATS | 2020 | | RU2825664C2 |
Authors
Dates
2023-07-11—Published
2020-09-29—Filed