experiment guide of bioinformatics - zhejiang...

17
Experiment Guide of Bioinformatics --农学院生物信息学教学实验课 王煜 (Yu Wang) Email: [email protected]

Upload: others

Post on 18-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

Experiment Guide of Bioinformatics--农学院生物信息学教学实验课

王煜 (Yu Wang)

Email: [email protected]

Page 2: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

Biological Data

collection

classification storage

analysis

processing

Research Contents in Bioinformatics

New Technology and methods of Maths and Information

2

Page 3: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

Contents

Introduction of biological database Reference, DNA, protein, structure, et al.

Public database, personal database, et al.

Raw data, annotation information for raw data

Search/Query in database (genBank) Reference (PubMed)

DNA sequences

Protein sequences

3

Analysis and explanation of biological data Homologous sequence analysis and multiple sequence alignment

Phylogenetic analysis

Gene prediction and annotation

Genome analysis

Page 4: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

4

Biological database genBank/EMBL/DDBJ http://www.ncbi.nlm.nih.gov/

dbEST/UniGene/dbSTS/dbGSS/HTG/Taxonomy/dbSNP

SWISS-PROT/UniProt/PIR/Prosite http://www.ebi.ac.uk/swissprot/

PDB http://www.rcsb.org

Specialized species database: rice (RGAP), arabidopsis (TAIR), et al.

Gene annotation database

GO http://www.geneontology.org/

KEGG http://www.genome.ad.jp/kegg/

PKR http://www.nih.go.jp/mirror/Kinases/

COG http://www.ncbi.nlm.nih.gov/COG

PubMed/Web of science/Agricola

Submission

BankIt

Sequin

Page 5: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

5

Abbreviation Full name

PRI 灵长类序列 (primate sequences)

ROD 啮齿类序列 (rodent sequences)

MAM 其它哺乳类序列 (other mammalian sequences)

VRT 其它脊椎动物序列 (other vertebrate sequences)

INV 无脊椎动物序列 (invertebrate sequences)

PLN 植物、真菌和海藻类序列 (plant, fungal, and algalsequences)

BCT 细菌序列 (bacterial sequences)

VRL 病毒序列 (viral sequences)

PHG 噬菌体序列 (bacteriophage sequences)

The divisions(分支) of GenBank

GenBank

Page 6: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

6

Abbreviation Full name

SYN 合成序列 (synthetic sequences)UNA 未注释的序列 (unannotated sequences)EST 表达序列标签 (expressed sequence tags)PAT 已专利的序列 (patent sequences)STS 序列标签位点 (sequence tagged sites)GSS 基因组序列 (genome survey sequences)HTG 高产出基因组序列 (high throughput genomic

sequences)HTC 高产出cDNA序列 (high throughput cDNA sequences)

Page 7: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

GenBank Format

FASTA Format

7

Page 8: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

8

Search/Query in databaseText-based database searching

Noun, descriptive words, phrase

Boolean logical operators: AND, OR, NOT

Entrez, Sequence Retrieval System (SRS), DBGET

Tips:1. Quote the words in a phrase:“disease resistance”

2. disease resistance = disease AND resistance

3. enzyme* = enzyme + enzymes

Sequence-based database searching

Essence: sequence alignment

Purpose: function analysis, evolution, detect SNPs and INDELs, elongation and location the sequence

BLAST (blastn, blastp, tblastx, psi-blast)/FASTA/BLAT

Tips:Identity and similarity

E value

Bl2seq, Primer-BLAST

Page 9: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

9

A snapshot for Entrez

Page 10: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

10

Blastn alignment

Blastp alignment

Page 11: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

11

Gene prediction and annotationFGENESH, GenScan, Glimmer, Genemark, Augustus

Web-server (FGENESH) or standalone package (Augustus, Glimmer)

Steps:Eliminate repetitive sequence

Confirm ORF (open reading frame)

Promoter

Tips:Difference between prokaryotes and eukaryotes

Difference between different softwares

comprehensive consideration all results

GO, KEGG, COG

Page 12: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

12

A snapshot for FGENESH

Page 13: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

13

http://www.ebi.ac.uk/Tools/

Page 14: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

14

Multiple sequence alignmentClustalW, MAFFT, MUSCLE, T-Coffee

Web-server or standalone package (ClustalX)

Visualization of alignment results

Gendoc/Bioedit

Phylogenetic analysisMEGA5.0, phylip, PUAP*, MrBayes

NJ, maximum parsimony, maximum likelihood, Bayesian, UPGMA

Page 15: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

15

A snapshot for ClustalW2

Page 16: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

16

Multi-alignment (MSF format in Gendoc)

Phylogenetic Tree (NJ methods in MEGA)

Page 17: Experiment Guide of Bioinformatics - Zhejiang Universityibi.zju.edu.cn/bioinplant/tools/bioinformatics... · Contents Introduction of biological database Reference, DNA, protein,

Happy new year! Please submit your homework in PDF format to

[email protected] at the 8th week in winter term.

17