(19) RU (11) 2 780 442 (13) C2
(51) IPC: G16B40/30 (2019-01-01)
(21) (22) Application: 2019139175, 2018-10-15
(24) Start date: 2018-10-15
(22) Actual filing date: 2018-10-15
(45) Published: 2022-09-23
(72) Inventors: Dzhaganatan, Kishor; Farkh, Kaj-Khou; Kiriazopulu Panajotopulu, Sofiya; Makrej, Dzheremi Frensis
(73) Holder: Illyumina, Ink.

SPLICING SITES CLASSIFICATION BASED ON DEEP LEARNING
Russian patent published in 2022 - IPC G16B40/30

Abstract RU 2780442 C2

FIELD: biotechnology.

SUBSTANCE: a computer-implemented method for predicting the likelihood of splice sites in pre-mRNA genomic sequences is described. The method includes obtaining pre-mRNA genomic sequences by sequencing pre-mRNA transcripts, and training an atrous convolutional neural network (hereinafter ACNN) on training examples of pre-mRNA nucleotide sequences, including at least 50,000 training examples of donor splice sites, at least 50,000 training examples of acceptor splice sites, and at least 100,000 training examples of non-splicing sites. The trained ACNN generates triplet scores for the likelihood that each nucleotide among the target nucleotides is a donor splice site, an acceptor splice site, or a non-splicing site. The training includes: inputting one-hot encoded training examples of nucleotide sequences, wherein each nucleotide sequence contains at least 401 nucleotides, with at least one target nucleotide and a context of at least 200 flanking nucleotides on each side, in the 5' and 3' directions from the target nucleotide; and adjusting, by backpropagation, the filter parameters of the ACNN to predict scores for the likelihood that each target nucleotide in the nucleotide sequence is a donor splice site, an acceptor splice site, or a non-splicing site. The trained ACNN receives as input a one-hot encoded pre-mRNA nucleotide sequence of at least 401 nucleotides that includes at least one target nucleotide and a context of at least 200 flanking nucleotides on each side.
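The input format described above can be illustrated with a minimal sketch. It is not taken from the patent: the channel order "ACGU", the use of "N" padding at sequence ends, the mapping of T to U, and the helper names extract_window and one_hot_encode are all assumptions made for illustration. The only values drawn from the abstract are the 200-nucleotide flank and the resulting 401-nucleotide window.

```python
# Assumed channel order for the four RNA bases; the abstract does not fix one.
NUCLEOTIDES = "ACGU"
# Minimum flanking context on each side of the target, per the abstract.
FLANK = 200


def extract_window(seq, target, flank=FLANK):
    """Return the target base plus `flank` bases on each side (2 * flank + 1 total).

    Positions that fall outside `seq` are filled with "N" (an assumed padding choice).
    """
    padded = "N" * flank + seq + "N" * flank
    return padded[target : target + 2 * flank + 1]


def one_hot_encode(seq):
    """Encode each base as a 4-element row with a single 1; "N" becomes all zeros."""
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    rows = []
    for base in seq.upper().replace("T", "U"):  # accept DNA-alphabet input as well
        row = [0] * len(NUCLEOTIDES)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


# Usage: a 401-position input centered on position 10 of a toy sequence.
toy_sequence = "ACGU" * 50
window = extract_window(toy_sequence, target=10)
encoded = one_hot_encode(window)
print(len(encoded), encoded[FLANK])
```

The target nucleotide always sits at index FLANK of the window, so a model consuming this input can locate it without extra bookkeeping.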
A system for predicting the likelihood of splice sites in pre-mRNA genomic sequences is also described. It includes one or more processors coupled to memory, the memory loaded with computer instructions that, when executed on the processors, implement actions including: training an atrous convolutional neural network (ACNN) on training examples of pre-mRNA nucleotide sequences, including at least 50,000 training examples of donor splice sites, at least 50,000 training examples of acceptor splice sites, and at least 100,000 training examples of non-splicing sites. The trained ACNN generates triplet scores for the likelihood that each nucleotide among the target nucleotides is a donor splice site, an acceptor splice site, or a non-splicing site.
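To make the idea of an atrous (dilated) convolution producing three scores per nucleotide concrete, here is a rough single-layer sketch in plain Python. It is not the patented architecture: the layer count, kernel width of 3, dilation of 4, random untrained weights, and the function names atrous_conv1d, softmax, and triplet_scores are all assumptions chosen for illustration. A real network of this kind would stack many such layers and learn the filter weights by backpropagation.

```python
import math
import random


def atrous_conv1d(x, kernel, dilation):
    """Dilated 1-D convolution (cross-correlation form) with zero "same" padding.

    x:      list of L positions, each a list of C_in channel values
    kernel: nested list shaped [K][C_in][C_out], with K odd
    """
    length, taps = len(x), len(kernel)
    c_out = len(kernel[0][0])
    half = (taps - 1) // 2
    out = []
    for pos in range(length):
        acc = [0.0] * c_out
        for tap in range(taps):
            src = pos + (tap - half) * dilation  # taps are spaced `dilation` apart
            if 0 <= src < length:
                for c_in, value in enumerate(x[src]):
                    for c in range(c_out):
                        acc[c] += value * kernel[tap][c_in][c]
        out.append(acc)
    return out


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def triplet_scores(x, kernel, dilation):
    """Per-position (donor, acceptor, neither) scores from one atrous layer."""
    return [softmax(row) for row in atrous_conv1d(x, kernel, dilation)]


def random_one_hot(rng):
    """One random one-hot row over four channels (stand-in for an encoded base)."""
    row = [0.0] * 4
    row[rng.randrange(4)] = 1.0
    return row


# Usage with random, untrained weights: 401 positions, 4 channels in, 3 out.
rng = random.Random(0)
x = [random_one_hot(rng) for _ in range(401)]
kernel = [[[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)] for _ in range(3)]
scores = triplet_scores(x, kernel, dilation=4)
print(scores[200])  # three probabilities for the central (target) position
```

With a kernel of width 3 and dilation 4, each output position sees inputs 4 bases to its left and right, which is how dilation widens the context without adding filter weights.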

EFFECT: the invention expands the range of tools for training deep convolutional neural networks.

41 cl, 59 dwg, 3 tbl

Similar patents RU2780442C2

Title	Year	Author	Number
METHODS FOR TRAINING DEEP CONVOLUTIONAL NEURAL NETWORKS BASED ON DEEP LEARNING	2018	Gao, Khun; Farkh, Kaj-Khou; Sundaram, Laksshman; Makrej, Dzheremi Frensis	RU2767337C2
A DEEP LEARNING FRAME FOR IDENTIFYING SEQUENCE PATTERNS THAT CAUSE SEQUENCE SPECIFIC ERRORS (SSE)	2019	Kashefagigi, Dorna; Kia, Amirali; Farkh, Kaj-Khou	RU2745733C1
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA	2017	Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami	RU2761066C2
GENOMIC INFRASTRUCTURE FOR LOCAL AND CLOUD PROCESSING AND ANALYSIS OF DNA AND RNA	2017	Van Rojn, Piter; Makmillen, Robert Dzh.; Ryule, Majkl; Mekho, Rami	RU2804029C2
METHOD FOR QUANTIFYING THE STATISTICAL ANALYSIS OF ALTERNATIVE SPLICING IN RNA-SEQ DATA	2020	Khajtovich Filipp Efimovich; Mazin Pavel Vladimirovich	RU2752663C1
ANTISENSE OLIGONUCLEOTIDE DIRECTED REMOVAL OF PROTEOLYTIC CLEAVAGE SITES, HCHWA-D MUTATIONS AND INCREASED NUMBER OF TRINUCLEOTIDE REPEATS	2014	Van Ron-Mom Vilgelmina Mariya Klasina; Evers Melvin Maurise; Pepers Barri Antonius; Artsma-Rus Annemike; Van Ommen Garrit-Yan Baudevijn	RU2692634C2
COMPUTER-IMPLEMENTED INTEGRAL METHOD FOR ASSESSING QUALITY OF TARGET SEQUENCING RESULTS	2018	Milejko Vladislav Ajkovich; Kasyanov Artem Sergeevich; Kovtun Aleksej Sergeevich	RU2717809C1
GENE EDITING OF DEEP INTRON MUTATIONS	2016	Ruan, Guoxiang; Scaria, Abraham	RU2759335C2
EXON SKIP WITH PEPTIDONUCLEIC ACID DERIVATIVES	2017	Chung, Shin; Jung, Daram; Cho, Bongjun; Jang, Kangwon; Yoon, Heungsik	RU2786637C2
CALCULATION OF THE BURDEN OF TUMOUR MUTATIONS USING TUMOUR RNA SEQUENCING DATA AND CONTROLLED MACHINE LEARNING	2020	Buzdin Anton; Sorokin Maxim; Zotova Evgenia; Tkachev Victor; Garazha Andrew	RU2759205C1

RU 2 780 442 C2

Authors

Dzhaganatan, Kishor

Farkh, Kaj-Khou

Kiriazopulu Panajotopulu, Sofiya

Makrej, Dzheremi Frensis

Dates

2022-09-23—Published

2018-10-15—Filed