![Page 1: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/1.jpg)
Alternative Splicing from ESTs
Eduardo Eyras
Bioinformatics UPF – February 2004
![Page 2: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/2.jpg)
Intro
ESTs
Prediction of Alternative Splicing from ESTs
![Page 3: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/3.jpg)
[Figure: gene with exons and introns (5’ to 3’) → Transcription → pre-mRNA → Splicing → mature mRNA (5’ cap, poly-A tail) → Translation → peptide]
![Page 4: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/4.jpg)
[Figure: the same gene (exons and introns, 5’ to 3’) → Transcription → pre-mRNA → different splicing → mature mRNA (5’ cap, poly-A tail) → Translation → different peptide]
![Page 5: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/5.jpg)
Alt splicing as a mechanism of gene regulation
Functional domains can be added or removed, increasing protein diversity
Can introduce early stop codons, resulting in truncated proteins or unstable mRNAs
It can modify the activity of transcription factors, affecting the expression of genes
It is observed in nearly all metazoans
Estimated to occur in 30%-40% of human genes
![Page 6: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/6.jpg)
Forms of alternative splicing
Exon skipping / inclusion
Alternative 3’ splice site
Alternative 5’ splice site
Mutually exclusive exons
Intron retention
[Legend: constitutive exons; alternatively spliced exons]
![Page 7: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/7.jpg)
How to study alternative splicing?
![Page 8: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/8.jpg)
ESTs (Expressed Sequence Tags)
Single-pass sequencing of a small (end) piece of cDNA
Typically 200-500 nucleotides long
It may contain coding and/or non-coding regions
![Page 9: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/9.jpg)
ESTs
[Figure: cDNA synthesis. Cells from a specific organ, tissue or developmental stage → mRNA extraction → add oligo-dT primer → reverse transcriptase (RNA/DNA hybrid) → ribonuclease H and DNA polymerase → double-stranded cDNA]
![Page 10: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/10.jpg)
ESTs
Clone cDNA into a vector
Multiple cDNA clones
Single-pass sequence reads: 5’ EST and 3’ EST
![Page 11: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/11.jpg)
Alternative Splicing from ESTs
[Figure: genomic sequence → primary transcript → splicing → splice variants → cDNA clones → 5’ and 3’ EST sequences]
![Page 12: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/12.jpg)
Alternative Splicing from ESTs
ESTs can also provide information about potential alternative splicing when aligned to the genome (and when aligned to mRNA data)
![Page 13: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/13.jpg)
EST sequencing
Is fast and cheap
Gives direct information about the gene sequence
Partial information
Resulting ESTs (DB searches):
Known gene
Similar to known gene
Contaminant
Novel gene
![Page 14: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/14.jpg)
ESTs provide expression data
eVOC Ontologies http://www.sanbi.ac.za/evoc/
Anatomical System
The tissue, organ or anatomical system from which the sample was prepared. Examples are digestive, lung and retina.
Cell Type
The precise cell type from which a sample was prepared. Examples are: B-lymphocyte, fibroblast and oocyte.
Pathology
The pathological state of the sample from which the sample was prepared. Examples are: normal, lymphoma, and congenital.
Developmental Stage
The stage during the organism's development at which the sample was prepared. Examples are: embryo, fetus, and adult.
Pooling
Indicates whether the tissue used to prepare the library was derived from single or multiple samples. Examples are pooled, pooled donor and pooled tissue.
![Page 15: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/15.jpg)
Linking the expression vocabulary to gene annotations
ESTs
Genes
![Page 16: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/16.jpg)
Normalized vs. non-normalized libraries
![Page 17: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/17.jpg)
The downside of ESTs
Cannot detect lowly/rarely expressed genes or non-expressed sequences (regulatory)
Random sampling: the more ESTs we sequence, the fewer new useful sequences we get
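The diminishing return of random sampling can be illustrated with a small calculation. This is a minimal sketch, assuming each EST is an independent random draw from a fixed pool of transcripts; the function name `expected_distinct` and the abundance values are hypothetical and only serve the illustration.

```python
from typing import List


def expected_distinct(abundances: List[float], n_ests: int) -> float:
    """Expected number of distinct transcripts seen after n_ests random draws.

    Assumption: each EST samples transcript i with probability
    abundances[i] / sum(abundances), independently of the other ESTs.
    """
    total = sum(abundances)
    return sum(1.0 - (1.0 - a / total) ** n_ests for a in abundances)


# Hypothetical library: one abundant transcript and nine rare ones.
library = [91.0] + [1.0] * 9

# New transcripts expected from each successive batch of 100 ESTs.
gains = [
    expected_distinct(library, n + 100) - expected_distinct(library, n)
    for n in (0, 100, 200)
]
```

Under these assumptions each successive batch of 100 ESTs is expected to reveal fewer new transcripts than the previous one, because most reads resample the abundant transcript.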
![Page 18: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/18.jpg)
Gene Hunting
Sequencing of the Human Genome (HGP)
EST Sequencing
![Page 19: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/19.jpg)
Origin of the ESTs
Science. 1991 Jun 21;252(5013):1651-6
Complementary DNA sequencing: expressed sequence tags and human genome project.
Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, et al.
Section of Receptor Biochemistry and Molecular Biology, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD.
Automated partial DNA sequencing was conducted on more than 600 randomly selected human brain complementary DNA (cDNA) clones to generate expressed sequence tags (ESTs). ESTs have applications in the discovery of new human genes, mapping of the human genome, and identification of coding regions in genomic sequences. Of the sequences generated, 337 represent new genes, including 48 with significant similarity to genes from other organisms, such as a yeast RNA polymerase II subunit; Drosophila kinesin, Notch, and Enhancer of split; and a murine tyrosine kinase receptor. Forty-six ESTs were mapped to chromosomes after amplification by the polymerase chain reaction. This fast approach to cDNA characterization will facilitate the tagging of most human genes in a few years at a fraction of the cost of complete genomic sequencing, provide new genetic markers, and serve as a resource in diverse biological research fields.
![Page 20: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/20.jpg)
EST-sequencing explosion
Merck and WashU (1994)
public ESTs
GenBank
dbEST
non-exclusivity (1992)
![Page 21: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/21.jpg)
Number of public entries: 20,039,613
Summary by organism
Homo sapiens (human) 5,472,005
Mus musculus + domesticus (mouse) 4,056,481
Rattus sp. (rat) 583,841
Triticum aestivum (wheat) 549,926
Ciona intestinalis 492,511
Gallus gallus (chicken) 460,385
Danio rerio (zebrafish) 450,652
Zea mays (maize) 391,417
Xenopus laevis (African clawed frog) 359,901
…
dbEST release 20 February 2004
![Page 22: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/22.jpg)
EST lengths
Human EST length distribution (dbEST Sep. 2003)
~ 450 bp
![Page 23: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/23.jpg)
Recover the mRNA from the ESTs
![Page 24: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/24.jpg)
What is an EST cluster?
A cluster is a set of fragmented EST data (plus mRNA data if known), consolidated according to sequence similarity
Clusters are indexed by gene such that all expressed data concerning a single gene is in a single index class, and each index class contains the information for only one gene. (Burke, Davison, Hide, Genome Research 1999).
![Page 25: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/25.jpg)
EST pre-processing
Vector
Repeats
Mitochondrial
Xenocontaminants
![Page 26: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/26.jpg)
EST Clustering
UniGene (NCBI) www.ncbi.nlm.nih.gov/UniGene
TIGR Human Gene Index www.tigr.org
(The Institute for Genomic Research)
StackDB www.sanbi.ac.za
(South African Bioinformatics Institute)
![Page 27: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/27.jpg)
UniGene
Species UniGene Entries
Homo sapiens 118,517
Mus musculus 82,482
Rattus norvegicus 43,942
Sus scrofa 20,426
Gallus gallus 11,970
Xenopus laevis 21,734
Xenopus tropicalis 17,102
…
![Page 28: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/28.jpg)
ESTs and the Genome
![Page 29: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/29.jpg)
ESTs aligned to the genome
Some advantages:
• It defines the location of exons and introns
• We can verify the splice sites of introns (e.g. GT-AG), and hence also check the correct strand of spliced ESTs
• It helps prevent chimeras
• It can avoid putting together ESTs from paralogous genes
• We can avoid including pseudogenes in our analysis
![Page 30: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/30.jpg)
Aligning ESTs to the Genome
Many ESTs: fast programs, fast computers
Nearly exact matches: Coverage >= 97%, Percent_id >= 97%
Splice sites: GT—AG, AT—AC, GC—AG
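The thresholds and splice-site pairs above can be sketched as two small checks. This is an illustration only, not the pipeline used for the slides: the function names, the 0-based half-open intron coordinates and the plain-string genome are assumptions made for the example.

```python
from typing import Optional

# Donor/acceptor dinucleotide pairs listed on the slide.
SPLICE_PAIRS = {("GT", "AG"), ("AT", "AC"), ("GC", "AG")}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def passes_filter(coverage: float, percent_id: float,
                  min_coverage: float = 97.0, min_id: float = 97.0) -> bool:
    """Keep only nearly exact EST-to-genome alignments."""
    return coverage >= min_coverage and percent_id >= min_id


def intron_strand(genome: str, start: int, end: int) -> Optional[str]:
    """Infer the strand of an intron at genome[start:end] from its splice sites.

    Returns "+" or "-" when the intron ends match one of SPLICE_PAIRS on that
    strand, and None when neither strand matches.
    """
    intron = genome[start:end].upper()
    if len(intron) < 4:
        return None
    if (intron[:2], intron[-2:]) in SPLICE_PAIRS:
        return "+"
    rc = reverse_complement(intron)
    if (rc[:2], rc[-2:]) in SPLICE_PAIRS:
        return "-"
    return None
```

For example, `intron_strand("AAAGTCCCCAGTTT", 3, 11)` inspects the intron `GTCCCCAG` and reports the forward strand, while the reverse complement of the same intron placed in a genome would be reported as the reverse strand.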
![Page 31: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/31.jpg)
Aligning ESTs to the Genome
Extra pre-processing of ESTs:
Clip poly-A tails / clip 20 bp from either end
Best in genome
Remove potential processed pseudogenes
Give preference to ESTs that are spliced
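The clipping step can be sketched as follows. The slide does not say how a poly-A tail is recognised or in which order the two clips are applied, so the minimum tail length (`min_tail`), the handling of a leading poly-T run (as seen in reads from the 3’ end), and applying both clips in sequence are all assumptions of this example.

```python
import re


def clip_est(seq: str, end_clip: int = 20, min_tail: int = 10) -> str:
    """Remove a poly-A tail (or leading poly-T run), then clip both ends.

    end_clip: number of bases removed from each end (20 bp on the slide).
    min_tail: hypothetical minimum run length treated as a poly-A/poly-T tail.
    """
    s = seq.upper()
    s = re.sub(r"A{%d,}$" % min_tail, "", s)   # trailing poly-A
    s = re.sub(r"^T{%d,}" % min_tail, "", s)   # leading poly-T
    if len(s) <= 2 * end_clip:
        return ""                              # nothing left after clipping
    return s[end_clip:len(s) - end_clip]
```

A read that becomes empty after clipping would simply be dropped before alignment.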
![Page 32: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/32.jpg)
Human ESTGenes
Genomic length distribution of aligned human ESTs
~400 bp; tail up to ~800 kb
![Page 33: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/33.jpg)
The Problem
What are the transcripts represented in this set of mapped ESTs?
ESTs
Genome
![Page 34: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/34.jpg)
Transcript predictions
ESTs
Predict Transcripts from ESTs
Merge ESTs according to splicing structure compatibility
![Page 35: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/35.jpg)
Representation
Every 2 ESTs in a genomic cluster may or may not represent the same splicing (redundant)
Sort by the smallest coordinate ascending and by the largest coordinate descending
The redundancy relation is a graph:
[Figure: the two relations between ESTs, extension (x, y) and inclusion (x, z)]
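The sorting rule can be written as a single sort key. A minimal sketch, assuming each EST is summarised by its genomic start and end; the names and coordinates are made up for the example.

```python
# Hypothetical ESTs: name -> (smallest coordinate, largest coordinate)
ests = {
    "z": (300, 1200),
    "y": (100, 500),
    "x": (100, 900),
}

# Smallest coordinate ascending, then largest coordinate descending.
order = sorted(ests, key=lambda name: (ests[name][0], -ests[name][1]))
```

With this ordering, an EST that spans another one starting at the same position is always visited first, which is convenient when testing inclusion.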
![Page 36: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/36.jpg)
Criteria of merging
Allow internal mismatches
Allow intron mismatches
Allow edge-exon mismatches
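A simplified version of the pairwise comparison might look like the sketch below. It is not the ClusterMerge implementation: the exon-list representation, the function names and the single `intron_slack` tolerance (standing in for "allow intron mismatches") are assumptions, and internal and edge-exon mismatches are not modelled.

```python
from typing import List, Optional, Tuple

Exon = Tuple[int, int]   # (start, end), inclusive genomic coordinates
EST = List[Exon]         # exons sorted by start


def introns(est: EST) -> List[Tuple[int, int]]:
    """Introns as (end of upstream exon, start of downstream exon)."""
    return [(est[i][1], est[i + 1][0]) for i in range(len(est) - 1)]


def compatible(x: EST, y: EST, intron_slack: int = 0) -> bool:
    """True if x and y overlap and show the same introns in the overlap."""
    lo = max(x[0][0], y[0][0])
    hi = min(x[-1][1], y[-1][1])
    if lo > hi:
        return False  # no genomic overlap
    in_x = [i for i in introns(x) if i[0] < hi and i[1] > lo]
    in_y = [i for i in introns(y) if i[0] < hi and i[1] > lo]
    if len(in_x) != len(in_y):
        return False
    return all(
        abs(a[0] - b[0]) <= intron_slack and abs(a[1] - b[1]) <= intron_slack
        for a, b in zip(in_x, in_y)
    )


def relation(x: EST, y: EST, intron_slack: int = 0) -> Optional[str]:
    """Relation of y to x, assuming x sorts before y.

    Returns "inclusion" if y lies within x, "extension" if y reaches
    beyond x, and None if the splicing structures are incompatible.
    """
    if not compatible(x, y, intron_slack):
        return None
    if x[0][0] <= y[0][0] and y[-1][1] <= x[-1][1]:
        return "inclusion"
    return "extension"
```

For instance, with `x = [(100, 150), (300, 400)]`, the EST `[(120, 150), (300, 350)]` is included in `x`, `[(320, 400), (500, 600)]` extends it, and the single-exon EST `[(140, 170)]` is incompatible because it reads into the intron of `x`.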
![Page 37: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/37.jpg)
Transitivity
[Figure: extension and inclusion relations among ESTs w, x, y and z]
This reduces the number of comparisons needed
![Page 38: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/38.jpg)
ClusterMerge graph
[Figure: graph over ESTs w, x, y and z]
Each node defines an inclusion sub-tree
Extensions form acyclic graphs
![Page 39: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/39.jpg)
Recovering the Solution
[Figure: example graph with nodes 1 to 9]
Mergeable sets of ESTs can be recovered as special paths in the graph
![Page 40: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/40.jpg)
Recovering the Solution
[Figure: the example graph with its root and leaves marked]
Root: does not extend any node
Leaf: not extended, and root of an inclusion tree
![Page 41: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/41.jpg)
Recovering the Solution
Any set of ESTs in a path from a root to a leaf is mergeable
![Page 42: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/42.jpg)
Recovering the Solution
Add the inclusion tree attached to each node in the path
![Page 43: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/43.jpg)
Recovering the Solution
Lists produced: (1,2,3,4,5,6,7,8) and (1,2,3,4,5,6,7,9)
This representation minimizes the necessary comparisons between ESTs
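Recovering the lists can be sketched as a walk from each root to each leaf of the extension graph, collecting the inclusion tree of every node on the way. The edges of the slide's figure are not recoverable from the text, so the graph below is a hypothetical one chosen to yield the same two lists; the function and variable names are also assumptions.

```python
from typing import Dict, List


def inclusion_subtree(node: int, included: Dict[int, List[int]]) -> List[int]:
    """The node plus everything included in it, recursively."""
    out = [node]
    for child in included.get(node, []):
        out.extend(inclusion_subtree(child, included))
    return out


def mergeable_lists(nodes: List[int],
                    extended_by: Dict[int, List[int]],
                    included: Dict[int, List[int]]) -> List[List[int]]:
    """One list per root-to-leaf path of the extension graph.

    nodes:       nodes of the extension graph
    extended_by: x -> nodes that extend x
    included:    x -> nodes included in x (the inclusion tree of x)
    """
    extending = {n for targets in extended_by.values() for n in targets}
    roots = [n for n in nodes if n not in extending]  # extend no other node
    results: List[List[int]] = []

    def walk(node: int, path: List[int]) -> None:
        path = path + [node]
        successors = extended_by.get(node, [])
        if not successors:  # leaf: not extended by any node
            members: List[int] = []
            for n in path:
                members.extend(inclusion_subtree(n, included))
            results.append(sorted(set(members)))
            return
        for nxt in successors:
            walk(nxt, path)

    for root in roots:
        walk(root, [])
    return results


# Hypothetical graph (not the one drawn on the slide).
extension_nodes = [1, 2, 6, 8, 9]
extended_by = {1: [2], 2: [6], 6: [8, 9]}
included = {1: [3], 2: [4, 5], 6: [7]}
lists = mergeable_lists(extension_nodes, extended_by, included)
```

Because the extension graph is acyclic, the recursion terminates, and each list is built without comparing every EST against every other.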
![Page 44: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/44.jpg)
How to build the graph
Mutual Recursion
Search graph (leaves)
Recursive search along extension branch
Search sub-graph
Inclusion => go up in the tree
![Page 45: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/45.jpg)
How to build the graph
[Figure: example with ESTs 1 to 6]
![Page 46: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/46.jpg)
How to build the graph
[Figure: example ESTs 1 to 6 and the corresponding graph]
![Page 47: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/47.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Leaves]
![Page 48: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/48.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Inclusion]
![Page 49: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/49.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Inclusion]
![Page 50: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/50.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Extension]
![Page 51: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/51.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Inclusion]
![Page 52: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/52.jpg)
How to build the graph
[Figure: example ESTs 1 to 6, their graph, and EST 7. Step: Place 7]
![Page 53: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/53.jpg)
How to build the graph
[Figure: example graph with EST 7 placed. Step: Inclusion]
![Page 54: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/54.jpg)
How to build the graph
[Figure: example graph with EST 7 placed. Step: tagged as visited - skip]
![Page 55: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/55.jpg)
How to build the graph
[Figure: example graph with EST 7 placed]
Possible sub-trees beyond 1 or 3 remain unseen!
The representation minimizes the necessary comparisons
![Page 56: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/56.jpg)
Deriving the transcripts from the lists
Internal Splice Sites: external coordinates of the 5’ and 3’ exons are not allowed to contribute
![Page 57: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/57.jpg)
Deriving the transcripts from the lists
Splice sites are set to the most common coordinate
5’ and 3’ coordinates are set to the exon coordinate that extends the potential UTR the most
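These two rules can be sketched for a single splice site and for the transcript ends. This assumes the coordinates reported by the ESTs of one mergeable list have already been grouped per splice site, and that "extends the potential UTR the most" means the outermost genomic coordinate; the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def consensus_splice_site(coords: Iterable[int]) -> int:
    """Most common coordinate reported for one splice site."""
    return Counter(coords).most_common(1)[0][0]


def transcript_ends(starts: Iterable[int], ends: Iterable[int]) -> Tuple[int, int]:
    """Outermost start and end among the ESTs' terminal exons."""
    return min(starts), max(ends)


# Hypothetical coordinates from five ESTs supporting the same splice site.
site = consensus_splice_site([1500, 1500, 1502, 1500, 1499])

# Hypothetical outer coordinates of the first and last exons of three ESTs.
ends = transcript_ends([1000, 1010, 990], [5000, 5200, 5100])
```

Ties between equally common coordinates are not handled specially here; a real implementation would need an explicit rule for them.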
![Page 58: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/58.jpg)
Single exon transcripts
Reject resulting single exon transcripts when using ESTs
![Page 59: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/59.jpg)
Annotation with ESTs
ESTs aligned to the genome can provide information about UTRs and alternative splicing
![Page 60: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/60.jpg)
Annotation with ESTs
EST-Transcripts at www.ensembl.org
![Page 61: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/61.jpg)
Annotation with ESTs
![Page 62: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/62.jpg)
Results for Human and Mouse
Human EST-genes (assembly ncbi33):
38,581 Genes
122,247 Transcripts (42% with full CDS)
Mouse EST-genes (assembly ncbi30):
32,848 Genes
103,664 Transcripts (36% with full CDS)
![Page 63: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/63.jpg)
How many transcripts are conserved?
Is Alternative Splicing conserved?
![Page 64: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/64.jpg)
EST-transcript pairs
42,625 transcript pairs (in 18,242 gene pairs)
Of the gene pairs:
78% with one transcript pair conserved
22% with more than one transcript pair conserved
For 22% of the gene pairs some form of alt. splicing is conserved
![Page 65: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/65.jpg)
Conservation of Alt. Splicing
Take gene-pairs with more than one transcript-pair
19% of alt. variants in human are conserved in mouse
32% of alt. variants in mouse are conserved in human
$$\%\,\text{conservation} = \frac{\sum \left(\text{number of paired transcripts} - 1\right)}{\sum \left(\text{number of transcripts} - 1\right)}$$
∑ = sum over genes in a gene pair with more than one variant (subtracting the ‘main’ transcript form)
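As a worked example of the conservation measure, the sketch below applies it to made-up counts. The input format (one `(paired, total)` tuple per gene), the function name and the multiplication by 100 to express the ratio as a percentage are assumptions of this illustration.

```python
from typing import List, Tuple


def percent_conservation(genes: List[Tuple[int, int]]) -> float:
    """Percentage of alternative variants conserved in the other species.

    genes: one (paired_transcripts, total_transcripts) tuple per gene of a
    gene pair with more than one variant. One transcript per gene is
    subtracted as the 'main' form.
    """
    conserved = sum(paired - 1 for paired, _ in genes)
    alternative = sum(total - 1 for _, total in genes)
    return 100.0 * conserved / alternative


# Hypothetical counts: gene A has 4 transcripts, 2 of them paired;
# gene B has 3 transcripts, all 3 paired.
value = percent_conservation([(2, 4), (3, 3)])  # (1 + 2) / (3 + 2)
```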
![Page 66: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/66.jpg)
How many predicted ‘novel’ genes are validated by Human-Mouse comparison?
![Page 67: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/67.jpg)
Novel genes
ESTGenes not in Ensembl: 13,174
Human ESTGenes validated by comparison to mouse: 18,242
ESTGenes with at least one complete ORF: 24,201
![Page 68: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/68.jpg)
Novel genes
984 ESTGenes not in Ensembl, validated by comparison to mouse, with a complete ORF
![Page 69: Alternative Splicing from ESTs Eduardo Eyras Bioinformatics UPF – February 2004](https://reader035.vdocuments.pub/reader035/viewer/2022062407/56649d495503460f94a2616f/html5/thumbnails/69.jpg)
THE END