METHOD OF COMPRESSING GENOME SEQUENCE DATA Russian patent published in 2024 - IPC G16B50/50 

Abstract RU 2815860 C1

FIELD: data processing; biotechnology.

SUBSTANCE: the group of inventions relates to compressing, based on a reference sequence, genome sequence data obtained by a sequencer. Sequences of nucleotides or bases that were previously aligned to a reference sequence are classified as exactly matched, inexactly matched or unmatched with respect to the reference sequence, and are then encoded according to that classification. For each inexactly matched sequence, the classification step includes comparing the number of mismatches between said sequence and the reference sequence with a reference threshold value; depending on the result of that comparison, the inexactly matched sequences are encoded using separately defined encoding processes of said method. The group of inventions includes a computer-implemented method, a system, a machine-readable data storage device and a hardware processor, each for compressing genome sequence data.
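
ILLUSTRATION: the classification described above can be sketched roughly as follows. This is only a minimal sketch under stated assumptions, not the patented method: the abstract does not disclose the actual encoding processes, so the record layouts, the function names encode_read and decode_read, the restriction to substitutions (no insertions or deletions) and the "at most threshold" rule are all hypothetical.

```python
from typing import Optional


def encode_read(read: str, reference: str, pos: Optional[int], threshold: int) -> tuple:
    """Choose an encoding for one aligned read.

    Hypothetical scheme for illustration only; handles substitutions,
    not insertions or deletions.
    """
    if pos is None:
        # Unmatched read: nothing to refer to, so keep the bases verbatim.
        return ("unmapped", read)
    window = reference[pos:pos + len(read)]
    if len(window) < len(read):
        # Read runs past the end of the reference; treat it as unmatched.
        return ("unmapped", read)
    # Collect (offset, read base) for every position that differs.
    mismatches = [(i, b) for i, (b, r) in enumerate(zip(read, window)) if b != r]
    if not mismatches:
        # Exact match: position and length are enough.
        return ("exact", pos, len(read))
    if len(mismatches) <= threshold:
        # Few mismatches (assumed rule): store only the differences.
        return ("diff", pos, len(read), mismatches)
    # Many mismatches (assumed rule): a difference list would not pay off.
    return ("raw", pos, read)


def decode_read(record: tuple, reference: str) -> str:
    """Rebuild the original read from a record produced by encode_read."""
    kind = record[0]
    if kind == "unmapped":
        return record[1]
    if kind == "raw":
        return record[2]
    pos, length = record[1], record[2]
    bases = list(reference[pos:pos + length])
    if kind == "diff":
        for offset, base in record[3]:
            bases[offset] = base
    return "".join(bases)
```

In this sketch every branch keeps enough information to rebuild the read exactly, which mirrors the lossless property claimed in the abstract; the threshold only decides which of the two hypothetical encodings an inexactly matched read receives.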

EFFECT: the group of inventions provides fast, lossless compression and decompression of data with a high compression ratio.

32 claims, 7 drawings

Similar patents RU2815860C1

Title Year Author Number
METHOD FOR COMPRESSING GENOME SEQUENCE DATA 2020
  • Rizk, Gijom Aleksandr Paskal
RU2807474C1
FAST DETECTION OF GENE FUSIONS 2020
  • Deshpande, Viraj
  • Shlezinger, Jokhann Feliks Vilgelm
  • Truong, Shon
  • Roddi, Dzhon Kuper
  • Ryule, Majkl
  • Katru, Severin
  • Mekho, Rami
RU2818363C1
SEQUENCE GRAPH-BASED TOOL FOR DETERMINING VARIATION IN SHORT TANDEM REPEAT AREAS 2020
  • Dolzhenko, Egor
  • Eberle, Michael A.
RU2799654C2
SEQUENCE GRAPH TOOL FOR DETERMINING VARIATIONS IN REGIONS OF SHORT TANDEM REPEATS 2020
  • Dolzhenko, Egor
  • Eberle, Michael A.
RU2825664C2
FLEXIBLE SEED EXTENSION FOR HASH TABLE-BASED GENOMIC MAPPING 2020
  • Ruehle, Michael
RU2796915C1
METHOD FOR DETERMINING INDICATOR CORRELATED WITH PROBABILITY THAT TWO MUTATED SEQUENCE READINGS ARE FROM THE SAME SEQUENCE CONTAINING MUTATION 2020
  • Darling, Aaron Erl
RU2799778C1
SECURE GENOMIC DATA TRANSMISSION 2015
  • Agraval Vartika
  • Dimitrova Nevenka
  • Krasinski Rejmond Dzh.
RU2753245C2
SYSTEM AND METHOD FOR SECONDARY ANALYSIS OF NUCLEOTIDE SEQUENCING DATA 2017
  • Garsiya, Frantsisko Khose
  • Rachi, Kome
  • Dej, Aaron
  • Karni, Majkl Dzh.
RU2741807C2
DETECTION OF SOMATIC VARIATION OF NUMBER OF COPIES 2017
  • Chuang, Han-Yu
  • Zhao, Chen
RU2768718C2
TERMINAL AND METHOD OF COMMUNICATION 2020
  • Esioka, Sokhei
  • Nagata, Satosi
RU2802817C1

RU 2 815 860 C1

Authors

Rizk, Gijom Aleksandr Paskal

Dates

2024-03-22 Published

2020-09-11 Filed