FIELD: biotechnology.
SUBSTANCE: the invention is a data processing system (pipeline) for whole-genome sequencing, intended for automatic analysis of germline samples with an average coverage of 30x, from a set of raw reads to results ready for analysis by a clinical interpreter. The system comprises interconnected modules. The SNP/INDEL module maps reads to the reference genome and genotypes SNPs and INDELs; it consists of three blocks: the first maps the original paired reads, supplied in .fastq.gz format, to the human reference genome (hg19 or hg38, as selected) and produces a file in bam format; the second performs SNP/INDEL calling and generates vcf and, optionally, gvcf files containing information on the polymorphisms of the analysed sample; the third collects statistics on the completion of each step of the SNP/INDEL module and generates a separate report in html format for quality control of the analysed samples. The cohort calling module takes as input a set of .gvcf files obtained by running the SNP/INDEL module on several samples and returns a single .vcf file containing the resulting SNP/INDEL set for all samples. The SNP/INDEL annotation module is launched separately for each resulting .vcf file; for each variant recorded in the file, it attaches information from various databases containing clinically relevant information about the variants and aggregates it into an ACMG rank. A further module determines Mt and Y haplogroups.
EFFECT: expanded functionality for analysing data from the complete human genome while reducing the time to obtain genetic analysis results ready for interpretation.
6 cl, 7 dwg, 2 tbl
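The data flow between the modules described in the abstract can be illustrated with a minimal Python sketch. It only plans which files each module would consume and produce; it does not run any aligner or variant caller. All names here (`SampleOutputs`, `CohortPlan`, `plan_sample`, `plan_cohort`, the `cohort.vcf` file name and the `.qc.html` suffix) are hypothetical and do not come from the patent. The sketch also assumes that the cohort .vcf is annotated alongside the per-sample .vcf files, which the abstract does not state explicitly.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, List, Optional

# The two reference builds named in the abstract.
SUPPORTED_REFERENCES = ("hg19", "hg38")


@dataclass(frozen=True)
class SampleOutputs:
    """Files the SNP/INDEL module is described as producing for one sample."""
    bam: Path             # block 1: reads mapped to the reference genome
    vcf: Path             # block 2: called SNPs/INDELs
    gvcf: Optional[Path]  # block 2: optional, needed for cohort calling
    qc_report: Path       # block 3: html quality-control report


@dataclass(frozen=True)
class CohortPlan:
    """Planned inputs and outputs for a multi-sample run."""
    reference: str
    per_sample: Dict[str, SampleOutputs]
    cohort_inputs: List[Path]      # .gvcf files fed to the cohort module
    cohort_vcf: Path               # single .vcf returned for all samples
    annotation_inputs: List[Path]  # each .vcf is annotated separately


def plan_sample(sample_id: str, out_dir: Path, emit_gvcf: bool = True) -> SampleOutputs:
    """Name the per-sample artefacts (file naming is an assumption)."""
    return SampleOutputs(
        bam=out_dir / f"{sample_id}.bam",
        vcf=out_dir / f"{sample_id}.vcf",
        gvcf=out_dir / f"{sample_id}.gvcf" if emit_gvcf else None,
        qc_report=out_dir / f"{sample_id}.qc.html",
    )


def plan_cohort(sample_ids: List[str], out_dir: Path, reference: str = "hg38") -> CohortPlan:
    """Chain the modules: per-sample calling, cohort calling, annotation."""
    if reference not in SUPPORTED_REFERENCES:
        raise ValueError(f"reference must be one of {SUPPORTED_REFERENCES}")
    per_sample = {s: plan_sample(s, out_dir) for s in sample_ids}
    # Only samples that emitted a .gvcf can take part in cohort calling.
    cohort_inputs = [o.gvcf for o in per_sample.values() if o.gvcf is not None]
    cohort_vcf = out_dir / "cohort.vcf"
    # Assumption: the cohort .vcf is annotated like any other .vcf.
    annotation_inputs = [o.vcf for o in per_sample.values()] + [cohort_vcf]
    return CohortPlan(reference, per_sample, cohort_inputs, cohort_vcf, annotation_inputs)
```

For example, `plan_cohort(["s1", "s2"], Path("out"))` would list two .gvcf inputs for the cohort module and three .vcf files for annotation. Keeping the plan separate from execution is a design choice of this sketch, not something the patent describes.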
| Title | Year | Author | Number |
|---|---|---|---|
| WHOLE GENOME SEQUENCING DATA PROCESSING SYSTEM | 2023 | | RU2806429C1 |
| METHOD OF DETECTING COPY NUMBER VARIATIONS (CNV) BASED ON SEQUENCING DATA OF COMPLETE HUMAN EXOME AND LOW-COVERAGE GENOME | 2023 | | RU2822040C1 |
| BIOINFORMATION SYSTEMS, DEVICES AND METHODS FOR SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2799750C2 |
| SYSTEM AND METHOD OF INTERPRETING DATA AND PROVIDING RECOMMENDATIONS TO USER BASED ON GENETIC DATA THEREOF AND DATA ON COMPOSITION OF INTESTINAL MICROBIOTA | 2017 | | RU2699284C2 |
| BIOINFORMATIC SYSTEMS, DEVICES AND METHODS FOR PERFORMING SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2750706C2 |
| CALCULATION OF THE BURDEN OF TUMOUR MUTATIONS USING TUMOUR RNA SEQUENCING DATA AND CONTROLLED MACHINE LEARNING | 2020 | | RU2759205C1 |
| METHOD FOR ANALYSING MITOCHONDRIAL DNA FOR NON-INVASIVE PRENATAL TESTING | 2021 | | RU2772912C1 |
| METHOD FOR DETECTION AND ANALYSIS OF METHYLATION OF GENOMIC DNA REGIONS IN BIOLOGICAL SAMPLES OF PERIPHERAL BLOOD OF PATIENTS WITH NON-HODGKIN'S LYMPHOMA | 2022 | | RU2804962C1 |
| METHOD FOR ASSESSING RISK OF DISEASE IN USER BASED ON GENETIC DATA AND DATA ON COMPOSITION OF INTESTINAL MICROBIOTA | 2018 | | RU2699517C2 |
| GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | | RU2761066C2 |
Dates
2023-10-02—Published
2023-06-14—Filed