![Page 1: BIOINFORMATIK I UEBUNGEN HUBERT HACKL icbi.at/bioinf](https://reader033.vdocuments.pub/reader033/viewer/2022051400/55204d8349795902118d7269/html5/thumbnails/1.jpg)
BIOINFORMATICS I EXERCISES
HUBERT HACKL
icbi.at/bioinf
Organization
• 3 exercises
• Short introduction, followed by lab work
• Lab report (in teams of 2 students, electronic: doc, pdf, ...)
• Exercises must be submitted by 22 May 2014 at the latest
Schedule
Exercise goals
• Getting to know biological databases (NCBI, …)
• Working with protein and DNA/RNA sequences
• Sequence alignment (BLAST)
• Working with genome browsers (UCSC, Ensembl)
• Solving practical examples with online analysis tools (no programming exercise)
Biological information flow
Chromosomes, chromatin, DNA
DNA
Nomenclature of nucleic acids

| Base | Symbol | Occurrence |
|---|---|---|
| Adenine | A | DNA, RNA |
| Guanine | G | DNA, RNA |
| Cytosine | C | DNA, RNA |
| Thymine | T | DNA |
| Uracil | U | RNA |

| Symbol | Meaning | Description |
|---|---|---|
| R | A or G | puRine |
| Y | C or T | pYrimidine |
| W | A or T | Weak hydrogen bonds |
| S | G or C | Strong hydrogen bonds |
| M | A or C | aMino groups |
| K | G or T | Keto groups |
| H | A, C, or T (U) | not G (H follows G) |
| B | G, C, or T (U) | not A (B follows A) |
| V | G, A, or C | not T (U) (V follows U) |
| D | G, A, or T (U) | not C (D follows C) |
| N | G, A, C, or T (U) | aNy nucleotide |
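The exercises themselves use online tools only, but the ambiguity symbols on this slide can be illustrated with a short sketch. It assumes the DNA alphabet (T rather than U); the names `IUPAC`, `iupac_to_regex`, and `matches`, and the example pattern `TATAWR`, are made up for this illustration.

```python
import re

# IUPAC nucleotide ambiguity codes, transcribed from the symbol table on this slide.
# Assumption: DNA alphabet (T instead of U).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "S": "CG",
    "M": "AC", "K": "GT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def iupac_to_regex(pattern: str) -> str:
    """Translate an IUPAC pattern into an equivalent regular expression."""
    parts = []
    for symbol in pattern.upper():
        bases = IUPAC[symbol]  # raises KeyError for symbols outside the table
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def matches(pattern: str, sequence: str) -> bool:
    """Return True if the whole sequence is compatible with the IUPAC pattern."""
    return re.fullmatch(iupac_to_regex(pattern), sequence.upper()) is not None


print(iupac_to_regex("TATAWR"))
print(matches("TATAWR", "TATAAG"))
```

Expanding each symbol into a character class keeps the table as the single source of truth, so supporting RNA would only require swapping T for U in the dictionary.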
Nomenclature
+ strand 5'-ACGGTCGCTGTCGGTAGC-3'
- strand 3'-TGCCAGCGACAGCCATCG-5'
e.g. in FASTA format:
`>gene sequence|gi12345|chr17|-`
`GCTACCGACAGCGACCGT`
DNA sequences are always written from 5' to 3'
Positions in the genome (genome assembly) are given per chromosome, e.g. human GRCh37/hg19:
chr11:1-100
chr11:49,686,777-49,689,777
Positions on a chromosome start at position 1 for both (!) strands
+ strand 5'-ACGGTCGCTG…………TCGGTAGC-3'
- strand 3'-TGCCAGCGAC…………AGCCATCG-5'
chr11:1 2523 2529
chr11:1 2523 2529
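The two conventions on this slide (the minus strand is reported 5' to 3', and positions are counted per chromosome from 1) can be checked with a small sketch. The function names are hypothetical, and the region parser assumes 1-based coordinates with an inclusive end, which the slide does not state explicitly.

```python
import re
from typing import Tuple

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Region strings such as "chr11:49,686,777-49,689,777" (thousands separators allowed).
REGION = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_region(region: str) -> Tuple[str, int, int]:
    """Split a region string into (chromosome, start, end)."""
    m = REGION.match(region)
    if m is None:
        raise ValueError(f"not a valid region: {region!r}")
    start = int(m.group(2).replace(",", ""))
    end = int(m.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinates in {region!r}")
    return m.group(1), start, end


def region_length(region: str) -> int:
    """Length in bp, assuming 1-based coordinates with an inclusive end."""
    _, start, end = parse_region(region)
    return end - start + 1


# The plus strand from the slide yields the FASTA sequence shown for the minus strand.
print(reverse_complement("ACGGTCGCTGTCGGTAGC"))
print(parse_region("chr11:49,686,777-49,689,777"))
print(region_length("chr11:1-100"))
```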
Regulation of transcription
mRNA processing
Translation, genetic code and reading frames
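As a rough illustration of reading frames, the sketch below translates a DNA string in the three forward frames using the standard genetic code. The compact table layout (bases ordered T, C, A, G) and the function names are choices made for this example, and unknown codons are rendered as `X` by assumption.

```python
# Standard genetic code, listed with the codon bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str, frame: int = 0) -> str:
    """Translate the coding strand starting at offset `frame` (0, 1, or 2).

    Stop codons are shown as '*'; codons with unknown symbols become 'X'.
    A trailing incomplete codon is ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_forward_frames(dna: str) -> list:
    """Translations of the three reading frames on the given strand."""
    return [translate(dna, frame) for frame in range(3)]


print(three_forward_frames("ATGGCCTAA"))
```

The three frames on the opposite strand would be obtained by applying the same function to the reverse complement of the sequence.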
Peptide chain, amino acid sequence, proteins
Protein sequences are always written from the N-terminal end to the C-terminal end
backbone
sidechains
E.g. SCD sequence in FASTA format
Projects
• 1990: Start of the Human Genome Project (HGP)
  – complete human genome (HG)
  – 3,000,000,000 bp
  – 20 institutes
  – 2.5000 scientists
• 2001: First draft version of the HG published (Lander et al., Venter et al.)
• 2003: Final version of the HG published, end of the HGP
• 2003: Start of the ENCODE Project
  – Encyclopedia of DNA Elements
  – functional elements of the DNA
• 2008: Start of the 1000 Genomes Project
  – detailed catalog of genetic variation
  – 1000 anonymous donors
• 2010: Status of the ENCODE Project
  – final phase
  – data available through UCSC
• 2010: Status of the 1000 Genomes Project
  – 4 "highly covered" individuals
  – 1000 Genomes Browser
National Library of Medicine (NLM)
National Center for Biotechnology Information (NCBI)
• NIH (National Institutes of Health) campus in Bethesda, Maryland, USA (founded 1836, budget > $30 billion)
PubMed
• www.pubmed.gov
• Database developed to provide access to citations and abstracts of biomedical literature
• 2012: 21 million records from more than 5,000 journals
• > 700 million online searches per year
GenBank
• Database for managing sequence data
• Freely accessible
• Daily data exchange with EBI and DDBJ
• New release every two months
• 2012: > 149 million sequences (137 billion bp)
• > 205,000 species
• > 1,150 complete genomes
Entrez
• Text-based query system for > 30 databases
  – PubMed
  – OMIM
  – Nucleotide
  – Protein
  – Gene
  – dbSNP
  – GEO
  – ...
• Results are precomputed and linked
• More than 5,000,000 searches per day
• Batch mode available
• LinkOut service to external databases
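Entrez can also be queried programmatically through NCBI's E-utilities, which the slides do not cover. The sketch below only builds an ESearch URL and makes no network request. The helper name `esearch_url` and the example query are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities (not mentioned on the slide).
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for one Entrez database.

    db     : Entrez database name, e.g. "gene", "pubmed", "protein"
    term   : query string in the same syntax as the web search box
    retmax : maximum number of record IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical example query: the human insulin gene in the Gene database.
url = esearch_url("gene", "INS[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

Opening the printed URL in a browser returns the matching record IDs. For the exercises, the web interface gives the same results.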
RefSeq
• Best, comprehensive, non-redundant set of sequences
• For genomic DNA (NG_), transcript mRNA (NM_), other RNA (NR_) and protein (NP_)
• For major research organisms (2645 organisms)
• Based on GenBank derived sequences
• Ongoing curation by NCBI staff and collaborators, with review status indicated on each record (computational XM_, XP_)
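The accession prefixes listed above can be turned into a small lookup, sketched here under the assumption that the first three characters of an accession identify the record type. The dictionary covers only the prefixes named on this slide; the function name and the example accessions are illustrative.

```python
# RefSeq accession prefixes as listed on the slide.
# Value: (molecule type, True if the record is computational according to the slide)
REFSEQ_PREFIXES = {
    "NG_": ("genomic DNA", False),
    "NM_": ("transcript mRNA", False),
    "NR_": ("other RNA", False),
    "NP_": ("protein", False),
    "XM_": ("transcript mRNA", True),
    "XP_": ("protein", True),
}


def classify_accession(accession: str) -> str:
    """Describe a RefSeq accession based on its three-character prefix."""
    prefix = accession[:3].upper()
    if prefix not in REFSEQ_PREFIXES:
        return "unknown prefix (not covered by this sketch)"
    molecule, computational = REFSEQ_PREFIXES[prefix]
    return f"{molecule} (computational)" if computational else molecule


# The accession strings below serve only as format examples.
for acc in ["NM_000207.3", "NP_000198.1", "XM_123456.1"]:
    print(acc, "->", classify_accession(acc))
```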
Gene
• One record represents one single gene from an organism
• Gene-specific information such as map, sequence, expression, structure, function, homology, publications, links
• Can have one or more RefSeq transcripts assigned (NM_)
• Official gene symbol and name, GeneID, aliases and other designations
OMIM
• Online Mendelian Inheritance in Man
• Bibliographic, disease-centered compendium
• Originally published in book form (MIM, Johns Hopkins University)
• Daily updates
• For physicians, scientists, students, and educators
• Links to many databases (literature, sequences, ...)
Insulin
• Polypeptide hormone
• Produced in the beta cells of the islets of Langerhans in the pancreas
• 51 amino acids (2 chains)
• A chain with 21 amino acids
• B chain with 30 amino acids
• Porcine insulin (differs from human insulin in 1 amino acid)
• Bovine insulin (differs from human insulin in 3 amino acids)
• Glucose transport into the cell and blood glucose regulation
• Inhibits lipolysis and promotes lipogenesis in fat cells
• Promotes glycogen synthesis in liver and muscle cells
Proinsulin
From preproinsulin to insulin
Insulin as a drug
• Use of porcine and bovine insulin
• Antibody formation and allergic reactions possible
• Supply for one diabetic patient: 50 pancreata/year
• Biotechnological production using recombinant DNA technology
• Different durations of action (e.g. dissociation of insulin hexamers) and insulin analogs
Exercise 1-1: Find the differences between the insulin sequences of pig and human
1.2 Show that the C-peptide sequence is less conserved than the A-chain and B-chain
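The exercise is meant to be solved with NCBI and BLAST. As a cross-check only, a position-by-position comparison of two equally long chains might look like the sketch below. The mature A- and B-chain sequences are entered by hand from commonly cited values and should be verified against the NCBI Protein records; the function name is hypothetical.

```python
# Mature insulin chains, entered by hand (assumption: verify against NCBI Protein).
HUMAN = {
    "A": "GIVEQCCTSICSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKT",
}
PIG = {
    "A": "GIVEQCCTSICSLYQLENYCN",
    "B": "FVNQHLCGSHLVEALYLVCGERGFFYTPKA",
}


def differences(reference: str, other: str) -> list:
    """List (position, reference residue, other residue) for each mismatch.

    Positions are 1-based. Both sequences must have the same length,
    because this sketch does not perform an alignment.
    """
    if len(reference) != len(other):
        raise ValueError("sequences differ in length; align them first")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(reference, other), start=1)
        if a != b
    ]


for chain in ("A", "B"):
    print(chain, differences(HUMAN[chain], PIG[chain]))
```

For sequences of unequal length, such as the full preproinsulin records including the C-peptide, a real alignment (for example with BLAST) is needed instead of this direct comparison.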
Exercise 1-2: Find information on SICKLE CELL ANEMIA and KABUKI SYNDROME
2.1 Which genes/proteins are involved?
2.2 On which chromosome (arm, cytogenetic band) are the genes located?
2.3 What are the position and strand on the human reference genome assembly?
2.4 Can these genes also be found in the mouse (location)?
2.5 Are there known common mutations, i.e. non-synonymous SNPs?
2.6 What is the function of the encoded proteins?
2.7 Find recent publications