FIELD: biotechnology.
SUBSTANCE: a system for secure data transmission is described. The data are obtained by genome sequencing and processed to produce a variant call format (hereinafter VCF) file, which includes (i) chromosome number data, (ii) chromosome position data specifying the position of a nucleotide variant in the genome, (iii) the reference base, (iv) the alternative base, (v) the quality of the variant determination, and (vi) the nature of the variant determination. The system includes a transmitting station comprising a hardware-software compression module configured to compress the VCF file into an annotated VCF file that contains the non-redundant variant data from the VCF file. Compression is performed by comparing the variants in the VCF file against a database of known variants indexed by chromosome; if a variant is found in that database, it is reduced to a record containing (i) the chromosome number data and (ii) the chromosome position data of the nucleotide variant. The transmitting station also comprises a hardware-software encoding module configured to encode the annotated VCF file by converting each record containing chromosome number data and chromosome position data into a coordinate system according to an encoding scheme. The system further includes a hardware-software memory module for storing the encoded annotated VCF file, a hardware-software input/output unit of the transmitting station for receiving the encoded annotated VCF file, a hardware-software decoding module for decoding the encoded annotated VCF file using the same encoding scheme, and a hardware-software filling module for filling in the decoded annotated VCF file using the database of known variants indexed by chromosome. Corresponding methods of data transmission are also presented.
EFFECT: the invention facilitates, for example accelerates, the secure transmission of large amounts of data while reducing the resources needed for such transmission.
14 cl, 5 dwg
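
The compress, encode, decode and fill steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Variant` record, the `CHROMS` ordering, the `STRIDE` constant and the "chromosome index times stride plus position" coordinate scheme are all assumptions chosen for illustration, and the abstract does not disclose the actual encoding scheme. Quality and nature fields are omitted for brevity, and the toy encoding provides no cryptographic protection by itself.

```python
from typing import Dict, List, NamedTuple, Tuple, Union


class Variant(NamedTuple):
    """Simplified VCF record (quality and nature fields omitted)."""
    chrom: str
    pos: int
    ref: str
    alt: str


# Hypothetical known-variant database: chromosome -> position -> (ref, alt).
KnownDB = Dict[str, Dict[int, Tuple[str, str]]]

# Assumed encoding scheme: one integer coordinate per (chromosome, position).
CHROMS = [str(i) for i in range(1, 23)] + ["X", "Y"]
STRIDE = 1_000_000_000  # larger than any human chromosome length


def encode(chrom: str, pos: int) -> int:
    """Map (chromosome, position) to a single coordinate."""
    return CHROMS.index(chrom) * STRIDE + pos


def decode(coord: int) -> Tuple[str, int]:
    """Invert encode() using the same scheme."""
    idx, pos = divmod(coord, STRIDE)
    return CHROMS[idx], pos


def compress(variants: List[Variant], known: KnownDB) -> List[Union[int, Variant]]:
    """Reduce known variants to an encoded coordinate; keep unknown ones whole."""
    out: List[Union[int, Variant]] = []
    for v in variants:
        if known.get(v.chrom, {}).get(v.pos) == (v.ref, v.alt):
            out.append(encode(v.chrom, v.pos))
        else:
            out.append(v)
    return out


def fill(records: List[Union[int, Variant]], known: KnownDB) -> List[Variant]:
    """Decode coordinates and restore ref/alt from the shared database."""
    out: List[Variant] = []
    for r in records:
        if isinstance(r, int):
            chrom, pos = decode(r)
            ref, alt = known[chrom][pos]
            out.append(Variant(chrom, pos, ref, alt))
        else:
            out.append(r)
    return out


known = {"1": {12345: ("A", "G")}, "X": {500: ("C", "T")}}
variants = [
    Variant("1", 12345, "A", "G"),  # known: sent as a coordinate only
    Variant("2", 777, "G", "C"),    # unknown: sent in full
    Variant("X", 500, "C", "T"),    # known: sent as a coordinate only
]
packed = compress(variants, known)
restored = fill(packed, known)
```

In this sketch both stations must hold the same known-variant database and the same encoding scheme; only variants absent from the database travel with their full base data.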
Title | Year | Author | Number |
---|---|---|---|
WHOLE GENOME SEQUENCING DATA PROCESSING SYSTEM | 2023 | | RU2804535C1 |
WHOLE GENOME SEQUENCING DATA PROCESSING SYSTEM | 2023 | | RU2806429C1 |
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | | RU2761066C2 |
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | | RU2804029C2 |
BIOINFORMATIC SYSTEMS, DEVICES AND METHODS FOR PERFORMING SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2750706C2 |
BIOINFORMATION SYSTEMS, DEVICES AND METHODS FOR SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2799750C2 |
SYSTEM AND METHOD OF INTERPRETING ALLELES USING GRAPH-BASED REFERENCE GENOME | 2019 | | RU2809124C2 |
SPLICING SITES CLASSIFICATION BASED ON DEEP LEARNING | 2018 | | RU2780442C2 |
METHOD FOR JOINT DATA COMPRESSION AND ENCRYPTION IN GENOME ALIGNMENT | 2020 | | RU2747625C1 |
VIRTUAL SETS OF FRAGMENTS OF NUCLEOTIDE SEQUENCES | 2004 | | RU2390561C2 |
Dates
2021-08-12—Published
2015-11-18—Filed