FIELD: biotechnology.
SUBSTANCE: a method of identifying one or more gene fusions in a biological sample is described. The method includes obtaining first data representing a plurality of aligned reads; identifying a plurality of potential fusions in the obtained first data; filtering the plurality of potential fusions to determine a filtered set of potential fusions; and, for each particular potential fusion in the filtered set: generating, by one or more computers, input data for a machine learning model, the input data including extracted feature data representing the particular potential fusion; providing the generated input data to the machine learning model, which has been trained to generate output reflecting the probability that a potential fusion is a confirmed gene fusion; and determining, based on the output, whether the particular potential fusion corresponds to a confirmed gene fusion. Also disclosed is a system for identifying one or more gene fusions in a biological sample, comprising one or more computers and one or more storage devices storing instructions that, when executed by the one or more computers, cause the one or more computers to perform the operations of the described method. In addition, a non-transitory machine-readable medium for identifying one or more gene fusions in a biological sample is provided, storing software with instructions executable by one or more computers that, when executed, cause the one or more computers to perform operations including the described method.
EFFECT: fast detection of gene fusions.
13 cl, 4 dwg
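The sequence of operations in the abstract (filter the potential fusions, extract features, score each with a trained model, decide from the output) can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `CandidateFusion` fields, the `min_support` filter, the 0.5 decision threshold, and the `toy_model` logistic scorer with its weights are all hypothetical stand-ins, since the abstract does not specify the features, the filter criteria, or the model.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class CandidateFusion:
    """Hypothetical summary of one potential fusion found in aligned reads."""
    gene_a: str
    gene_b: str
    split_reads: int       # reads spanning the putative breakpoint
    discordant_pairs: int  # read pairs whose mates map to different genes
    mean_mapq: float       # mean mapping quality of the supporting reads


def filter_candidates(candidates: Sequence[CandidateFusion],
                      min_support: int = 3) -> List[CandidateFusion]:
    """Keep candidates with enough supporting reads (assumed filter rule)."""
    return [c for c in candidates
            if c.split_reads + c.discordant_pairs >= min_support]


def extract_features(c: CandidateFusion) -> List[float]:
    """Turn a candidate into a numeric feature vector (assumed features)."""
    return [float(c.split_reads), float(c.discordant_pairs), c.mean_mapq / 60.0]


def toy_model(features: Sequence[float]) -> float:
    """Stand-in for a trained model: logistic score with made-up weights."""
    weights = (0.4, 0.3, 2.0)
    bias = -4.0
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def call_fusions(candidates: Sequence[CandidateFusion],
                 model: Callable[[Sequence[float]], float] = toy_model,
                 threshold: float = 0.5,
                 min_support: int = 3
                 ) -> List[Tuple[CandidateFusion, float, bool]]:
    """Filter, score, and classify each candidate as confirmed or not."""
    results = []
    for c in filter_candidates(candidates, min_support):
        probability = model(extract_features(c))
        results.append((c, probability, probability >= threshold))
    return results


if __name__ == "__main__":
    demo = [
        CandidateFusion("GENE_A", "GENE_B", 10, 5, 60.0),
        CandidateFusion("GENE_C", "GENE_D", 1, 0, 20.0),
        CandidateFusion("GENE_E", "GENE_F", 2, 1, 30.0),
    ]
    for fusion, prob, confirmed in call_fusions(demo):
        print(f"{fusion.gene_a}-{fusion.gene_b}: p={prob:.3f} confirmed={confirmed}")
```

Passing the model in as a callable keeps the filter and feature-extraction steps independent of whichever trained classifier is actually used.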
Title | Year | Author | Number |
---|---|---|---|
INCREMENTAL SECONDARY ANALYSIS OF NUCLEIC ACID SEQUENCES | 2021 | | RU2839343C1 |
HARDWARE-ACCELERATED GENERATION OF K-DIMENSIONAL GRAPH | 2021 | | RU2817560C1 |
METHOD OF COMPRESSING GENOME SEQUENCE DATA | 2020 | | RU2815860C1 |
METHOD AND SYSTEM FOR CORRECTING UNDESIRABLE BATCH EFFECTS IN MICROBIOME DATA | 2019 | | RU2742003C1 |
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | | RU2804029C2 |
FLEXIBLE SEED EXTENSION FOR HASH TABLE-BASED GENOMIC MAPPING | 2020 | | RU2796915C1 |
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA | 2017 | | RU2761066C2 |
COMPUTER-IMPLEMENTED INTEGRAL METHOD FOR ASSESSING QUALITY OF TARGET SEQUENCING RESULTS | 2018 | | RU2717809C1 |
SYSTEMS AND METHODS FOR IDENTIFICATION OF INTOXICATED CUSTOMERS ON THE PLATFORM OF AN ONLINE TO OFFLINE SERVICE | 2018 | | RU2753458C1 |
BIOINFORMATION SYSTEMS, DEVICES AND METHODS FOR SECONDARY AND/OR TERTIARY PROCESSING | 2017 | | RU2799750C2 |
Dates
2024-05-02—Published
2020-12-04—Filed