FIELD: bioinformatics.
SUBSTANCE: the described embodiments relate to methods, devices, and systems for genotyping repeat sequences, including short tandem repeats (STRs), that are medically significant. The methods include aligning reads to a repeat sequence represented by a sequence graph and using the aligned reads to genotype the repeat sequence. The sequence graph is a directed graph that includes at least one self-loop representing a repeat subsequence. In some embodiments, the reads are paired-end reads, and both mate reads of each read pair can be used to genotype the repeat sequences. Some embodiments can be used to genotype repeats of degenerate codons. Some embodiments can be used to genotype repeat sequences that each include two or more repeat subsequences. Some embodiments can be used to genotype nucleotide sequences that each include at least one repeat subsequence and another genetic variant, such as an insertion, deletion, or substitution.
EFFECT: the invention broadens the range of tools available for genotyping.
31 cl, 7 dwg, 1 tbl, 1 ex
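The sequence graph described in the abstract can be illustrated with a minimal sketch. It assumes a single repeat locus modeled as three nodes (left flank, repeat unit, right flank) with a self-loop on the repeat node, and it counts repeat units only for reads that spell a path through the graph exactly. The class and function names (`SequenceGraph`, `build_str_graph`, `count_repeat_units`) and the example sequences are hypothetical and are not taken from the patent; the exact-match traversal stands in for the read alignment step, which in practice would have to tolerate sequencing errors.

```python
from dataclasses import dataclass, field


@dataclass
class SequenceGraph:
    """Directed graph whose nodes carry sequences (hypothetical structure)."""
    nodes: dict = field(default_factory=dict)   # node name -> sequence
    edges: set = field(default_factory=set)     # (source, target) pairs

    def add_node(self, name, seq):
        self.nodes[name] = seq

    def add_edge(self, src, dst):
        self.edges.add((src, dst))

    def self_loops(self):
        # A self-loop is an edge from a node back to itself.
        return sorted(u for (u, v) in self.edges if u == v)


def build_str_graph(left_flank, unit, right_flank):
    """Build left -> repeat -> right, with a self-loop on the repeat node."""
    g = SequenceGraph()
    g.add_node("left", left_flank)
    g.add_node("repeat", unit)
    g.add_node("right", right_flank)
    g.add_edge("left", "repeat")
    g.add_edge("repeat", "repeat")  # self-loop: the unit may be traversed again
    g.add_edge("repeat", "right")
    return g


def count_repeat_units(graph, read):
    """Return how many times the read traverses the repeat node.

    Assumption: exact matching only. The read must spell
    left flank + n copies of the unit + right flank, with n >= 1.
    Returns None if the read does not spell such a path.
    """
    left = graph.nodes["left"]
    unit = graph.nodes["repeat"]
    right = graph.nodes["right"]
    if len(read) < len(left) + len(right):
        return None
    if not read.startswith(left) or not read.endswith(right):
        return None
    middle = read[len(left):len(read) - len(right)]
    n, remainder = divmod(len(middle), len(unit))
    if remainder != 0 or n < 1 or middle != unit * n:
        return None
    return n


graph = build_str_graph("ACGT", "CAG", "TTGA")
spanning_read = "ACGT" + "CAG" * 5 + "TTGA"
repeat_count = count_repeat_units(graph, spanning_read)
```

Here a read that spans the whole locus yields a repeat count directly; reads that only partly cover the repeat, and the use of both mates of a read pair, are outside the scope of this sketch.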
Title | Year | Author | Number |
---|---|---|---|
SEQUENCE GRAPH TOOL FOR DETERMINING VARIATIONS IN REGIONS OF SHORT TANDEM REPEATS | 2020 | | RU2825664C2 |
SET OF PROBES FOR ANALYZING DNA SAMPLES AND METHODS FOR THEIR USE | 2016 | | RU2753883C2 |
SUPPRESSING ERRORS IN SEQUENCED DNA FRAGMENTS BY USING EXCESSIVE READING WITH UNIQUE MOLECULAR INDICES (UMI) | 2016 | | RU2704286C2 |
ANIMALS OTHER THAN HUMANS CHARACTERIZED IN EXPANSION OF HEXANUCLEOTIDE REPEATS AT LOCUS C9orf72 | 2017 | | RU2760877C2 |
METHODS AND SYSTEMS FOR OBTAINING SETS OF UNIQUE MOLECULAR INDICES WITH HETEROGENEOUS LENGTH OF MOLECULES AND CORRECTING ERRORS THEREIN | 2018 | | RU2766198C2 |
A DEEP LEARNING FRAME FOR IDENTIFYING SEQUENCE PATTERNS THAT CAUSE SEQUENCE SPECIFIC ERRORS (SSE) | 2019 | | RU2745733C1 |
BIOINFORMATIC SYSTEMS, DEVICES AND METHODS FOR PERFORMING SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2750706C2 |
WHOLE GENOME SEQUENCING DATA PROCESSING SYSTEM | 2023 | | RU2806429C1 |
BIOINFORMATION SYSTEMS, DEVICES AND METHODS FOR SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2799750C2 |
DETECTION OF MUTATIONS AND PLOIDY IN CHROMOSOMAL SEGMENTS | 2015 | | RU2717641C2 |
Authors
Dates
2023-07-07—Published
2020-03-06—Filed