Ch2. Genome Organization and Evolution (continued), 阮雪芬, Jan 02, 2003, NTUST



Pick out Genes in Genomes

• Open reading frames (ORFs)
– Run from a start codon to a stop codon
– A potential protein-coding region

• Approaches to identify protein-coding regions
– Detection of regions similar to known coding regions from other organisms
– Ab initio methods

• Gene identification is more complete and accurate for bacteria than for eukaryotes
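The ORF definition above (start codon to stop codon) can be illustrated with a minimal sketch. It scans only the three forward-strand reading frames and treats ATG as the only start codon; both are simplifying assumptions, and the function name `find_orfs` and the `min_codons` parameter are hypothetical. Real gene finders also scan the reverse strand and use statistical models.

```python
# Minimal ORF scan on the forward strand only (an illustrative simplification).
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for each ORF found.

    start is the 0-based index of the ATG; end is exclusive and
    includes the stop codon. min_codons counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the first ATG seen since the last stop codon
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


example = "CCATGAAATTTGGGTAACC"
print(find_orfs(example))  # [(2, 17, 2)]
```

In the toy sequence, the only ORF is ATG AAA TTT GGG TAA in the third reading frame.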

Pick out Genes in Genomes

• A framework for ab initio gene identification in eukaryotic genomes

Genomes of Prokaryotes

• Most prokaryotic cells contain
– A single large circular piece of double-stranded DNA (< 5 Mb)
– Plasmids

• In E. coli, only ~11% of the DNA is non-coding.

The Genome of the Bacterium E. coli (大腸桿菌)

• Strain K-12 contains 4639221 bp in a single circular DNA molecule, with no plasmids.

• An inventory reveals
– 4285 protein-coding genes
– 122 structural RNA genes
– Non-coding repeat sequences
– Regulatory elements
– Transcription/translation guides
– Transposases
– Prophage remnants
– Insertion sequence elements
– Patches of unusual composition

The Genome of the Bacterium E. coli

• The average size of an ORF is 317 amino acids.

• There are 630-700 operons. Operons vary in size, although few contain more than five genes. Genes within an operon tend to have related functions.
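As a rough consistency check, the figures quoted on these slides (genome length, number of protein-coding genes, average ORF length) can be combined to estimate the protein-coding fraction of the K-12 genome. This is a back-of-the-envelope sketch: it ignores stop codons, RNA genes, and any overlap between genes, so it only approximately matches the ~11% non-coding figure.

```python
genome_bp = 4_639_221   # K-12 genome length, from the slide
protein_genes = 4285    # protein-coding genes, from the slide
mean_orf_aa = 317       # average ORF length in amino acids, from the slide

# Three nucleotides encode one amino acid.
coding_bp = protein_genes * mean_orf_aa * 3
coding_fraction = coding_bp / genome_bp

print(f"estimated protein-coding bp: {coding_bp}")
print(f"estimated protein-coding fraction: {coding_fraction:.1%}")
```

The estimate comes out near 88%, which is in line with the statement that only a small fraction of the E. coli genome is non-coding.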

The Genome of the Bacterium E. coli

• Several features of E. coli
– It can synthesize all components of proteins and nucleic acids, and cofactors.
– It has metabolic flexibility.
– It has a wide range of transporters.
– Even for specific metabolic reactions there are many cases of multiple enzymes.
– It does not possess a complete range of enzymatic capacity.

The genome of the archaeon Methanococcus jannaschii (古甲烷球菌)

• Methanococcus jannaschii was collected from a hydrothermal vent 2600 m deep off the coast of Baja California, Mexico, in 1983.

• It is a thermophilic organism.

• The genome was sequenced in 1996 by The Institute for Genomic Research (TIGR). It was the first archaeal genome sequenced.

The genome of the archaeon Methanococcus jannaschii

• It contains a large chromosome consisting of a circular double-stranded DNA molecule 1664976 bp long.

• 1743 predicted coding regions.

• Some RNA genes contain introns.

• As in other prokaryotic genomes, there is little non-coding DNA.

• In archaea, proteins involved in transcription, translation, and regulation are more similar to those of eukaryotes.

• Archaeal proteins involved in metabolism are more similar to those of bacteria.

The genome of one of the simplest organisms: Mycoplasma genitalium (黴漿菌)

• An infectious bacterium.

• Its genome was sequenced in 1995 by TIGR, The Johns Hopkins University and The University of North Carolina.

• The gene repertoire includes genes that encode proteins for
– DNA replication
– Transcription
– Translation
– Adhesins
– Other molecules for defence against the host's immune system
– Transport proteins

Genomes of Eukaryotes

• In eukaryotic cells, the majority of DNA is in the nucleus, separated into bundles of nucleoproteins, the chromosomes.

• Each chromosome contains a single double-stranded DNA molecule.

• Nuclear genomes of different species vary widely in size.

• Eukaryotic species vary in the number of chromosomes and the distribution of genes among them.
– Human chromosome 2 corresponds to a fusion of chimpanzee chromosomes 12 and 13.

Genomes of Eukaryotes

• Saccharomyces cerevisiae (baker's yeast)
– Protein-protein interaction
– Yeast two-hybrid system

Yeast Two-hybrid System

• Useful in the study of various interactions

• The technology was originally developed during the late 1980s in the laboratory of Dr. Stanley Fields (see Fields and Song, 1989, Nature).

Yeast Two-hybrid System

[Figure labels: GAL4 DNA-binding domain; GAL4 DNA-activation domain. Source: Nature, 2000]

Yeast Two-hybrid System

• Library-based yeast two-hybrid screening method (Nature, 2000)

Protein-protein Interactions on the Web

• Yeast
– http://depts.washington.edu/sfields/yplm/data/index.html
– http://portal.curagen.com
– http://mips.gsf.de/proj/yeast/CYGD/interaction/
– http://www.pnas.org/cgi/content/full/97/3/1143/DC1
– http://dip.doe-mbi.ucla.edu/
– http://genome.c.kanazawa-u.ac.jp/Y2H

• C. elegans
– http://cancerbiology.dfci.harvard.edu/cancerbiology/ResLabs/Vidal/

• H. pylori
– http://pim/hybrigenics.com

• Drosophila
– http://gifts.univ-mrs.fr/FlyNets/Flynets_home_page.html

Yeast Protein Linkage Map Data

• New protein-protein interactions in yeast
– Stanley Fields Lab: http://depts.washington.edu/sfields/yplm/data
– List of interactions with links to YPD

Genomes of Eukaryotes

• Caenorhabditis elegans
– The genome was completed in 1998
– The first full DNA sequence of a multicellular organism
– XX genotype: a self-fertilizing hermaphrodite
– XO genotype: a male

Genomes of Eukaryotes

• Drosophila melanogaster
– Its genome sequence was announced in 1999 by a collaboration between Celera Genomics and the Berkeley Drosophila Genome Project.
– Although insects are not very closely related to mammals, the fly genome is useful in the study of human disease.
– It contains homologues of 289 human genes implicated in various diseases, for example cancer and cardiovascular disease.

Genomes of Eukaryotes

• Arabidopsis thaliana
– A flowering plant
– ~125 Mbp DNA

Genomes of Eukaryotes - Human

• In Feb 2001, the International Human Genome Sequencing Consortium and Celera Genomics published, separately, drafts of the human genome.

• 22 chromosome pairs + X, Y

• Protein-coding genes
– ~32000 genes in all

Genomes of Eukaryotes - Human

• Human protein-coding genes, by function
– Nucleic acid binding
– Transcription factor binding
– Cell cycle regulator
– Chaperone
– Motor
– Actin binding
– Defense/immunity protein
– Enzyme
– Enzyme activator
– Enzyme inhibitor
– Apoptosis
– Signal transduction
– Storage protein
– Cell adhesion
– Structural protein
– Transporter
– Ligand binding or carrier
– Tumour suppressor
– Unclassified

Genomes of Eukaryotes - Human

• Repeat sequences
– Make up 50% of the genome
– Contain
  • Transposable elements
  • Retroposed pseudogenes
  • Simple "stutters"
  • Segmental duplications
  • Blocks of tandem repeats

Genomes of Eukaryotes - Human

• RNA
– 497 transfer RNA genes
– Genes for 28S and 5.8S ribosomal RNAs
– Small nucleolar RNAs
– Spliceosomal snRNAs

SNPs

• Single-nucleotide polymorphisms (SNPs)
– A genetic variation between individuals, limited to a single base pair, which can be substituted, inserted or deleted.
– Sickle-cell anaemia is an example of a disease caused by a specific SNP
  • An A→T mutation in the beta-globin gene changes a Glu→Val
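The sickle-cell example can be made concrete with a short sketch. The affected codon is GAG (Glu), and the A→T substitution at its middle base yields GTG (Val); the specific codons are well-established facts that the slide does not spell out. The helper `apply_snp` and the two-entry codon table are hypothetical, written only for this illustration.

```python
# Only the two codons needed here; a full codon table is omitted for brevity.
CODON_TO_AA = {"GAG": "Glu", "GTG": "Val"}


def apply_snp(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


normal = "GAG"                      # Glu codon in normal beta-globin
mutant = apply_snp(normal, 1, "T")  # A->T substitution at the middle base

print(normal, CODON_TO_AA[normal], "->", mutant, CODON_TO_AA[mutant])
# GAG Glu -> GTG Val
```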

SNPs

• Single-nucleotide polymorphisms (SNPs)
– Nearly 1.8 million SNPs
– Occurring on average every 2000 base pairs
– Not all SNPs are linked to disease
– The A, B, and O alleles of the genes for blood groups illustrate these possibilities.
  • The A and B alleles differ by four SNP substitutions.

ABO Blood Groups

The human ABO blood groups illustrate the effect of glycosyltransferases.

[Figure labels: N-acetylgalactosamine; Galactose]

Evolution of Genomes

• Synonymous nucleotide substitution

• Non-synonymous nucleotide substitution
– Ka = the number of non-synonymous nucleotide substitutions
– Ks = the number of synonymous nucleotide substitutions
– A high Ka/Ks ratio indicates possibly functional changes
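The distinction between synonymous and non-synonymous changes can be sketched by comparing two aligned coding sequences codon by codon with the standard genetic code. This is a simplified illustration, not a Ka/Ks estimator: it counts differing codons rather than individual substitutions, and it does not normalize by the number of synonymous and non-synonymous sites as standard methods do. The function name `count_codon_changes` and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_changes(seq_a, seq_b):
    """Count (synonymous, non_synonymous) codon differences.

    Both inputs must be gap-free, aligned coding sequences of equal
    length, with a length that is a multiple of three.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned codon by codon")
    syn = nonsyn = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1      # same amino acid: synonymous
        else:
            nonsyn += 1   # different amino acid: non-synonymous
    return syn, nonsyn


# Toy sequences: GAG->GTG changes Glu to Val; CTT->CTC keeps Leu.
print(count_codon_changes("ATGGAGCTT", "ATGGTGCTC"))  # (1, 1)
```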

Databases of Aligned Gene Families

Example - The Effect of RGD Mimetic Peptide in Breast Cancer Cell Line MCF7

Introduction

• RGD has been used as an inhibitor of integrin-ligand interaction.

• Loss of integrin-mediated signaling induces apoptosis.

[Figure panels: Control; Aggregation; Cell Death]

RGD (Arg-Gly-Asp) is the smallest motif that binds the integrin receptor on the cell surface, and it plays an important role in the cell cycle.

Introduction

[Diagram labels: Human breast cancer cell MCF-7; Cell Apoptosis; Genomic Study; Proteomics; Bioinformatics; Our Study]

The Structures of RGD Mimetic Peptides

[Figure: chemical structures of the linear RGD peptide (Arg-Gly-Asp) and Cyclic-RGD (residue labels: Arg, Gly, Asp, Trp, Pro, Cys, Tpa)]

[Figure panel labels: RGD; cRGD; control; 1 mM; 5 mM; 0.5 mM; 1 mM; control]

cDNA Microarray

[Figure panels: C-RGD, 6 hr; C-RGD, 24 hr; C-RGD, 48 hr; C-RGD, 72 hr]

Apoptosis

• 34 genes in total; only 19 remain after filtering

• 11 genes show an expression fold change greater than 2 (up or down)

Apoptosis Regulator

Description | GenBank accession no. | 6 h fold change | 24 h fold change | 48 h fold change | 72 h fold change

Group 1
caspase 10, apoptosis-related cysteine protease | U60519 | - | - | - | 0.471
CASP8 and FADD-like apoptosis regulator | U97075 | - | - | - | 0.355
nucleoside diphosphate kinase type 6 (inhibitor of p53-induced apoptosis-alpha) | AF051941 | - | - | - | 0.376

Group 2
caspase 3, apoptosis-related cysteine protease | U13738 | - | 2.301 | - | -
CASP8 and FADD-like apoptosis regulator | AF005775 | - | 2.272 | - | -

Group 3
caspase 9, apoptosis-related cysteine protease | U60521 | - | - | 2.519 | -

Group 4
caspase 4, apoptosis-related cysteine protease | Z48810 | 2.615 | - | 2.796 | 2.819

Group 5
inhibitor of apoptosis protein | AAF19819 | - | - | - | 5.249
caspase 7, apoptosis-related cysteine protease | U67319 | - | - | - | 2.19
caspase 4, apoptosis-related cysteine protease | U28976 | - | - | - | 2.603

Group 6
CASP8 and FADD-like apoptosis regulator | AF015450 | - | - | - | 6.912
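The two-fold cutoff used to select these genes can be sketched as a simple filter over the fold-change values in the table above. Two assumptions are made here and are not stated in the slides: a "-" entry is read as "no reported change" at that time point, and a fold change of 0.5 or below is treated as the down-regulated counterpart of a two-fold increase. The names `classify` and `changed_at` are hypothetical.

```python
# Fold changes transcribed from the table above; None marks a "-" entry.
TIME_POINTS_H = (6, 24, 48, 72)
FOLD_CHANGES = {
    "U60519":   (None, None, None, 0.471),
    "U97075":   (None, None, None, 0.355),
    "AF051941": (None, None, None, 0.376),
    "U13738":   (None, 2.301, None, None),
    "AF005775": (None, 2.272, None, None),
    "U60521":   (None, None, 2.519, None),
    "Z48810":   (2.615, None, 2.796, 2.819),
    "AAF19819": (None, None, None, 5.249),
    "U67319":   (None, None, None, 2.19),
    "U28976":   (None, None, None, 2.603),
    "AF015450": (None, None, None, 6.912),
}


def classify(fold, threshold=2.0):
    """Label one fold-change value as 'up', 'down', or 'unchanged'."""
    if fold is None:
        return "unchanged"  # assumption: "-" means no reported change
    if fold >= threshold:
        return "up"
    if fold <= 1 / threshold:
        return "down"
    return "unchanged"


def changed_at(hours):
    """Return {accession: 'up' | 'down'} for genes changed at a time point."""
    idx = TIME_POINTS_H.index(hours)
    labels = {acc: classify(values[idx]) for acc, values in FOLD_CHANGES.items()}
    return {acc: label for acc, label in labels.items() if label != "unchanged"}


for hours in TIME_POINTS_H:
    print(hours, "h:", changed_at(hours))
```

Under these assumptions, most of the changes appear at 72 h, with three accessions down-regulated and five up-regulated.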

Apoptosis Regulator

[Figure: six time-course panels; x-axis: time (hour) at 6, 24, 48, 72; y-axis: normalized intensity (log scale, 0.01 to 10)]

Caspase Pathway in cRGD-treated MCF7 Cell

[Diagram labels: Caspase 10; Caspase 9; Caspase 8 and FADD; Caspase 4; Caspase 7; Caspase 3]

Searching and Clustering of RGD-containing Protein in Swiss-Prot Database

• In the Swiss-Prot database, there are 541 human RGD-containing proteins, including 5 caspase proteins.

• Caspase 8 was clustered with integrin beta4.

• Caspase 1, caspase 2, caspase 3 and caspase 7 were clustered together.
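The search step can be sketched as a plain substring scan for the RGD tripeptide in protein sequences. The sequences below are hypothetical toy strings, not real Swiss-Prot entries, and the function name `find_motif` is an assumption; the slides do not say which tool was actually used for the search or the clustering.

```python
def find_motif(sequence, motif="RGD"):
    """Return 1-based start positions of every occurrence of `motif`."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert to 1-based residue numbering
        start = sequence.find(motif, start + 1)
    return positions


# Hypothetical toy sequences, NOT real Swiss-Prot entries.
toy_proteins = {
    "toy_protein_1": "MKRGDSTLLRGDA",
    "toy_protein_2": "MSTNPKPQRK",
}

hits = {}
for name, seq in toy_proteins.items():
    positions = find_motif(seq)
    if positions:
        hits[name] = positions

print(hits)  # {'toy_protein_1': [3, 10]}
```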

Please pass the genes: horizontal gene transfer

• Horizontal gene transfer is the acquisition of genetic material by one organism from another.
– Direct uptake
– Via a viral carrier

Genome Databases

• PIR

• Entrez Genomes

Exercises

• Weblem 2.1

• Weblem 2.9

• Weblem 3.1

Deadline: Jan 16