SPLICING SITES CLASSIFICATION BASED ON DEEP LEARNING Russian patent published in 2022 - IPC G16B40/30 

Abstract RU 2780442 C2

FIELD: biotechnology.

SUBSTANCE: a computer-implemented method for predicting the likelihood of splice sites in pre-mRNA genomic sequences is described. The method includes obtaining pre-mRNA genomic sequences by sequencing pre-mRNA transcripts, and training an atrous convolutional neural network (hereinafter ACNN) on training examples of pre-mRNA nucleotide sequences, including at least 50,000 training examples of donor splice sites, at least 50,000 training examples of acceptor splice sites, and at least 100,000 training examples of non-splicing sites. The trained ACNN produces triplet scores estimating the likelihood that each target nucleotide is a donor splice site, an acceptor splice site, or a non-splicing site. The training includes: inputting one-hot encoded training examples of nucleotide sequences, wherein each nucleotide sequence contains at least 401 nucleotides, with at least one target nucleotide and a context of at least 200 flanking nucleotides on each side, in the 5' direction and in the 3' direction from the target nucleotide; and adjusting, by backpropagation, the filter parameters of the ACNN so that it predicts, for each target nucleotide in the nucleotide sequence, the likelihood scores that the nucleotide is a donor splice site, an acceptor splice site, or a non-splicing site. The trained ACNN receives as input a one-hot encoded pre-mRNA nucleotide sequence of at least 401 nucleotides, which includes at least one target nucleotide and a context of at least 200 flanking nucleotides on each side.
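The input format described above (one-hot encoding, that is, one active state per nucleotide, over a window of 401 nucleotides with 200 flanking nucleotides on each side of the target) can be illustrated with a minimal sketch. The helper names `one_hot_encode` and `context_window`, the all-zero vector for unknown bases, the N-padding at sequence ends, and the treatment of U as T are assumptions made for illustration; the abstract does not specify them.

```python
# Illustrative sketch only: helper names, N-padding and the all-zero
# encoding of unknown bases are assumptions, not taken from the patent.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
    "U": (0, 0, 0, 1),  # assumption: RNA uracil shares the T channel
}
UNKNOWN = (0, 0, 0, 0)  # assumption: N and padding carry no active state

FLANK = 200  # minimum flanking context on each side, per the abstract


def one_hot_encode(seq):
    """Map each nucleotide to a 4-element vector with one active state."""
    return [ONE_HOT.get(base, UNKNOWN) for base in seq.upper()]


def context_window(seq, target, flank=FLANK):
    """Return the 2*flank + 1 nucleotides centred on index `target`.

    Positions that fall outside the sequence are padded with 'N'.
    """
    if not 0 <= target < len(seq):
        raise IndexError("target index outside the sequence")
    left = max(0, target - flank)
    right = min(len(seq), target + flank + 1)
    pad_left = "N" * (flank - (target - left))
    pad_right = "N" * (flank - (right - target - 1))
    return pad_left + seq[left:right] + pad_right


# Toy 600-nucleotide sequence; the target is the nucleotide at index 300.
seq = "ACGT" * 150
window = context_window(seq, target=300)
encoded = one_hot_encode(window)

print(len(window))     # window length: 200 + 1 + 200
print(encoded[FLANK])  # one-hot vector of the target nucleotide
```

The target nucleotide always sits at index `FLANK` of the window, so a network consuming these windows sees equal upstream and downstream context.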
A system for predicting the likelihood of splice sites in pre-mRNA genomic sequences is also described. The system includes one or more processors coupled to memory, the memory being loaded with computer instructions that, when executed on the processors, implement actions including: training an atrous convolutional neural network (ACNN) on training examples of pre-mRNA nucleotide sequences, including at least 50,000 training examples of donor splice sites, at least 50,000 training examples of acceptor splice sites, and at least 100,000 training examples of non-splicing sites, wherein the trained ACNN produces triplet scores estimating the likelihood that each target nucleotide is a donor splice site, an acceptor splice site, or a non-splicing site.
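Two operations named in the description can be sketched in a few lines: the atrous (dilated) convolution that gives the ACNN its name, and the normalisation of three raw outputs into a triplet of scores (donor, acceptor, neither). This is a toy illustration under stated assumptions, not the patented architecture: the function names, the single-filter layout, the use of softmax for normalisation, and the example weights are all hypothetical.

```python
import math


def dilated_conv1d(x, kernel, dilation):
    """Unpadded 1-D dilated convolution (cross-correlation form).

    x:      list of positions, each a list of channel values
    kernel: list of taps, each a list of per-channel weights
    Taps are applied `dilation` positions apart, so a kernel with k taps
    spans (k - 1) * dilation + 1 input positions.
    """
    span = (len(kernel) - 1) * dilation
    out = []
    for i in range(len(x) - span):
        total = 0.0
        for k, weights in enumerate(kernel):
            position = x[i + k * dilation]
            total += sum(w * v for w, v in zip(weights, position))
        out.append(total)
    return out


def softmax(logits):
    """Normalise raw outputs into scores that sum to 1 (assumed here)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# One-channel toy input with values 0..9 and a 3-tap difference filter.
x = [[float(i)] for i in range(10)]
kernel = [[1.0], [0.0], [-1.0]]

narrow = dilated_conv1d(x, kernel, dilation=1)  # each output spans 3 positions
wide = dilated_conv1d(x, kernel, dilation=4)    # each output spans 9 positions

# Hypothetical raw outputs for one target nucleotide.
donor, acceptor, neither = softmax([2.0, 0.5, -1.0])
print(len(narrow), len(wide))
print(donor, acceptor, neither)
```

Increasing the dilation widens the span of input positions seen by each output without adding filter weights, which is how a stack of such layers can cover hundreds of flanking nucleotides.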

EFFECT: the invention expands the range of tools for training deep convolutional neural networks.

41 cl, 59 dwg, 3 tbl

Similar patents RU2780442C2

Title Year Author Number
METHODS FOR TRAINING DEEP CONVOLUTIONAL NEURAL NETWORKS BASED ON DEEP LEARNING 2018
  • Gao, Khun
  • Farkh, Kaj-Khou
  • Sundaram, Laksshman
  • Makrej, Dzheremi Frensis
RU2767337C2
A DEEP LEARNING FRAMEWORK FOR IDENTIFYING SEQUENCE PATTERNS THAT CAUSE SEQUENCE-SPECIFIC ERRORS (SSE) 2019
  • Kashefagigi, Dorna
  • Kia, Amirali
  • Farkh, Kaj-Khou
RU2745733C1
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA 2017
  • Van Rojn, Piter
  • Makmillen, Robert Dzh.
  • Ryule, Majkl
  • Mekho, Rami
RU2761066C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA 2017
  • Van Rojn, Piter
  • Makmillen, Robert Dzh.
  • Ryule, Majkl
  • Mekho, Rami
RU2804029C2
METHOD FOR QUANTIFYING THE STATISTICAL ANALYSIS OF ALTERNATIVE SPLICING IN RNA-SEQ DATA 2020
  • Khajtovich Filipp Efimovich
  • Mazin Pavel Vladimirovich
RU2752663C1
COMPUTER-IMPLEMENTED INTEGRAL METHOD FOR ASSESSING QUALITY OF TARGET SEQUENCING RESULTS 2018
  • Milejko Vladislav Ajkovich
  • Kasyanov Artem Sergeevich
  • Kovtun Aleksej Sergeevich
RU2717809C1
ANTISENSE OLIGONUCLEOTIDE DIRECTED REMOVAL OF PROTEOLYTIC CLEAVAGE SITES, HCHWA-D MUTATIONS AND INCREASED NUMBER OF TRINUCLEOTIDE REPEATS 2014
  • Van Ron-Mom Vilgelmina Mariya Klasina
  • Evers Melvin Maurise
  • Pepers Barri Antonius
  • Artsma-Rus Annemike
  • Van Ommen Garrit-Yan Baudevijn
RU2692634C2
CALCULATION OF TUMOUR MUTATION BURDEN USING TUMOUR RNA SEQUENCING DATA AND SUPERVISED MACHINE LEARNING 2020
  • Buzdin Anton
  • Sorokin Maxim
  • Zotova Evgenia
  • Tkachev Victor
  • Garazha Andrew
RU2759205C1
GENE EDITING OF DEEP INTRON MUTATIONS 2016
  • Ruan, Guoxiang
  • Scaria, Abraham
RU2759335C2
EXON SKIP WITH PEPTIDONUCLEIC ACID DERIVATIVES 2017
  • Chung, Shin
  • Jung, Daram
  • Cho, Bongjun
  • Jang, Kangwon
  • Yoon, Heungsik
RU2786637C2

RU 2 780 442 C2

Authors

Dzhaganatan, Kishor

Farkh, Kaj-Khou

Kiriazopulu Panajotopulu, Sofiya

Makrej, Dzheremi Frensis

Dates

2022-09-23 Published

2018-10-15 Filed