A Bioinformatics Tool for Analyzing G-quadruplexes in the mRNA Untranslated Regions

Zachary Zappala

Upload: lazar

Post on 30-Jan-2016



TRANSCRIPT

Page 1: A Bioinformatics Tool for Analyzing G-quadruplexes in the mRNA Untranslated Regions

ザカレ ザッパァ
Zachary Zappala

Page 2

But why?

To map the theoretical existence of Quadruplex-forming G-Rich Sequences (QGRS) in cytoplasmic mRNAs

Page 3

Utility Belt

• Access to NCBI Entrez Gene
• PHP/MySQL/C++/JavaScript
• Laptop (Dell Latitude D505, 1.6 GHz, 512 MB DDR)
• Internet server
• Perseverance
• Starbucks Frappuccinos™

Page 4

Bioinformatics?

• A hybrid of information sciences and biology
• Similar to, but not the same as, computational biology
• Enlists the help of databases and tools to analyze large masses of data to find patterns that are not easily discernible by the human eye
• Tools like NCBI BLAST are especially well known

Page 5

…Biology? In computer science?

• Well, it’s not that complicated
• The biology involved in this project is transcription/translation
• Genetics!
• Quick overview:
  • DNA (double helix; 2 nucleotide strands)
  • RNA (single nucleotide strands)
  • DNA is transcribed into RNA, which travels to ribosomes, which translate the RNA data into amino acids, the building blocks of proteins
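The quick overview can be sketched in a few lines of Python. This is only an illustration, not part of the tool: the function names and the example sequence are made up, and the codon table lists just a handful of entries from the standard genetic code.

```python
# Minimal sketch of the "DNA -> RNA -> amino acids" overview.
# Only a few codons of the standard genetic code are listed here.
CODON_TABLE = {
    "AUG": "M",  # methionine, also the usual start codon
    "GGC": "G",  # glycine
    "GCU": "A",  # alanine
    "UGG": "W",  # tryptophan
    "AAA": "K",  # lysine
    "UAA": "*", "UAG": "*", "UGA": "*",  # stop codons
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (T is replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon.

    Codons missing from the small table above are shown as '?'.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino = CODON_TABLE.get(mrna[i:i + 3], "?")
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


dna = "CCATGGGCGCTTGGAAATAAGG"  # made-up example sequence
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

Note that the bases before the AUG and after the stop codon are never translated; those flanking stretches are what the rest of the talk is about.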

Page 6

Eukaryotic mRNA!

[Figure: DNA in the nucleus → (transcription) → RNA in the nucleus (pre-mRNA) → (RNA processing: SPLICING!) → RNA in the cytoplasm (mRNA), drawn as 5’ UTR | CDS | 3’ UTR. Additional label: “Gene expression regulation factors?”]

There are 3 sections in cytoplasmic eukaryotic mRNAs:
• The 5’ UTR
• The coding sequence (CDS)
• The 3’ UTR
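Given the position of the coding sequence, the three sections fall out of simple slicing. The sketch below assumes the CDS is described by 1-based, inclusive start and end positions (the style used in GenBank feature tables); `split_mrna`, `MrnaRegions` and the example sequence are hypothetical names and data for illustration.

```python
from typing import NamedTuple


class MrnaRegions(NamedTuple):
    utr5: str
    cds: str
    utr3: str


def split_mrna(sequence: str, cds_start: int, cds_end: int) -> MrnaRegions:
    """Split an mRNA into 5' UTR, CDS and 3' UTR.

    cds_start and cds_end are 1-based, inclusive positions of the CDS.
    """
    if not (1 <= cds_start <= cds_end <= len(sequence)):
        raise ValueError("CDS coordinates fall outside the sequence")
    return MrnaRegions(
        utr5=sequence[:cds_start - 1],
        cds=sequence[cds_start - 1:cds_end],
        utr3=sequence[cds_end:],
    )


# Made-up 18-nt example with the CDS at positions 6..14.
regions = split_mrna("GGCGGAUGAAAUAAGGUU", cds_start=6, cds_end=14)
print(regions)
```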

Page 7

G-quadrawhats?

• That’s right: G-quadruplexes
• A type of secondary structure that forms in single-stranded nucleotide sequences (e.g. mRNA)

…GG~GG~GG~GG…

• Plates (tetrads) form between 4 guanine molecules that line up

Page 8

Why is this important?

• Since all this work is theoretical, it’s important to know that there could be an application
• QGRS in pre-mRNA have already been shown to play an important role in pre-mRNA splicing (Kikin, D’Antonio, Bagga 2006)
• So, what about cytoplasmic mRNA?
  • Gene expression control (since not all mRNAs become proteins)
  • Internal Ribosomal Entry Sites (IRES)
    • Allow ribosomes to enter and start translation somewhere other than the beginning of the 5’ UTR

Page 9

But how to predict?

• Prior research has given several clues to what constitutes a strong QGRS
  • For instance, it is known that only one loop can have a length of zero
  • Also, the more tetrad plates that form, the more likely it is that the QGRS will exist

QGRS Motif:

G_x N_y1 G_x N_y2 G_x N_y3 G_x

(x = number of guanines in each G-group, i.e. the number of tetrads; y1, y2, y3 = lengths of the three loops)

• G-score is assigned using a straightforward function
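A brute-force search for the motif above might look like the following Python sketch. It is not the Mapper's code: the limits (`min_g=2`, `max_len=30`) are assumed defaults, `find_qgrs` and `Qgrs` are hypothetical names, and `toy_score` is a stand-in that only mirrors the "more tetrads is better" clue from the slide (plus an assumed penalty for uneven loops). It is not the actual G-score function.

```python
from typing import List, NamedTuple, Tuple


class Qgrs(NamedTuple):
    start: int                    # 0-based index of the first G
    tetrads: int                  # x in the motif
    loops: Tuple[int, int, int]   # (y1, y2, y3)

    @property
    def length(self) -> int:
        return 4 * self.tetrads + sum(self.loops)


def find_qgrs(seq: str, min_g: int = 2, max_len: int = 30) -> List[Qgrs]:
    """Enumerate every G_x N_y1 G_x N_y2 G_x N_y3 G_x candidate (overlaps included)."""
    seq = seq.upper()
    hits = []
    for start in range(len(seq)):
        for x in range(min_g, max_len // 4 + 1):
            g = "G" * x
            if not seq.startswith(g, start):
                break  # a longer G-run cannot start here either
            budget = max_len - 4 * x  # nucleotides left for the three loops
            for y1 in range(budget + 1):
                p2 = start + x + y1
                if not seq.startswith(g, p2):
                    continue
                for y2 in range(budget - y1 + 1):
                    p3 = p2 + x + y2
                    if not seq.startswith(g, p3):
                        continue
                    for y3 in range(budget - y1 - y2 + 1):
                        p4 = p3 + x + y3
                        if not seq.startswith(g, p4):
                            continue
                        if (y1, y2, y3).count(0) > 1:
                            continue  # only one loop may have length zero
                        hits.append(Qgrs(start, x, (y1, y2, y3)))
    return hits


def toy_score(hit: Qgrs) -> int:
    """Illustrative ranking only: NOT the G-score used by the tool."""
    return 10 * hit.tetrads - (max(hit.loops) - min(hit.loops))


hits = find_qgrs("GGAGGCUGGAGG")  # made-up sequence: GG A GG CU GG A GG
for hit in hits:
    print(hit, hit.length, toy_score(hit))
```

Enumerating loop lengths explicitly, rather than relying on a single regular expression, makes it easy to enforce the zero-length-loop rule and the overall length limit on every candidate, including overlapping ones.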

Page 10

Divining the QGRS in an mRNA sequence

• When a gene is requested by the user, data is parsed from NCBI and given as parameters to a C++ program
• The C++ program executes and saves data from the sequence, which is then picked up again by the PHP program to be displayed to the user
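The hand-off described here (front end passes parameters, analysis program saves results, front end reads them back) can be mimicked in a few lines. In this Python sketch both halves are stand-ins: `analyze` plays the role of the C++ program and `display` the role of the PHP page. The JSON file format, the function names and the reported fields are all assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path


def analyze(sequence: str, cds_start: int, cds_end: int, out_path: Path) -> None:
    """Stand-in for the analysis program: takes parameters, saves results to a file."""
    result = {
        "length": len(sequence),
        "utr5_length": cds_start - 1,            # 1-based CDS start assumed
        "utr3_length": len(sequence) - cds_end,  # 1-based inclusive CDS end assumed
        "g_count": sequence.upper().count("G"),
    }
    out_path.write_text(json.dumps(result))


def display(out_path: Path) -> str:
    """Stand-in for the web front end: reads the saved results and formats them."""
    result = json.loads(out_path.read_text())
    return ", ".join(f"{key}={value}" for key, value in result.items())


with tempfile.TemporaryDirectory() as workdir:
    results_file = Path(workdir) / "results.json"
    analyze("GGCGGAUGAAAUAAGGUU", cds_start=6, cds_end=14, out_path=results_file)
    summary = display(results_file)

print(summary)
```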

Page 11

Program Flow!

[Figure: program flow diagram with the labels “Behind the scenes”, “Interface” and “PHP Session Variable”]

Page 12

Screenshots!

Page 13

…more screenshots!

Page 14

…and more screenshots!

Page 15

But that’s not all!

• In order to not overload you with pretty pictures, let’s just say that you can also view direct data tables and a sequence view that block out the QGRS locations
• Program executes in a small time frame, and due to the nature of mRNA there are not many abnormal situations
  • Poor internet connections do tend to slow display…but that’s your ISP
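A sequence view that blocks out QGRS locations boils down to wrapping each hit in some kind of marker. The Python sketch below uses square brackets where a web page would more likely use coloured HTML spans; `mark_regions` and the example coordinates are hypothetical.

```python
from typing import Iterable, Tuple


def mark_regions(sequence: str,
                 regions: Iterable[Tuple[int, int]],
                 open_mark: str = "[",
                 close_mark: str = "]") -> str:
    """Wrap each (start, end) region in markers.

    Coordinates are 0-based and end-exclusive; regions must not overlap.
    """
    pieces = []
    cursor = 0
    for start, end in sorted(regions):
        if start < cursor:
            raise ValueError("regions overlap")
        pieces.append(sequence[cursor:start])
        pieces.append(open_mark + sequence[start:end] + close_mark)
        cursor = end
    pieces.append(sequence[cursor:])
    return "".join(pieces)


# Made-up sequence with one 12-nt hit starting at index 2.
marked = mark_regions("AUGGAGGCUGGAGGAU", [(2, 14)])
print(marked)
```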

Page 16

Successes & Failures

• Research on the NRAS oncogene has shown, using crystallography, that a QGRS exists at position -222 bp relative to the CDS (within the 5’ UTR)
  • When QGRS Mapper 2 analyzed the same gene, it predicted a QGRS at the same position
• Incomplete NCBI entries have prevented full verification of the reported data
• Unfortunately, not enough data is available for research to be done on IRES sites
  • All IRES sites must be determined empirically as no strict pattern has been shown to exist yet
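A position such as -222 is counted from the start of the CDS rather than from the start of the mRNA. The conversion can be sketched as below, assuming 1-based mRNA positions and the common convention that the first base of the CDS is +1 and the base just before it is -1 (there is no position 0). The function name and the numbers in the example are made up and are not taken from the NRAS record.

```python
def cds_relative(position: int, cds_start: int) -> int:
    """Convert a 1-based mRNA position to a CDS-relative coordinate.

    The first base of the CDS is +1; the base just upstream of it is -1.
    """
    if position < 1 or cds_start < 1:
        raise ValueError("positions are 1-based")
    if position < cds_start:
        return position - cds_start       # inside the 5' UTR: negative
    return position - cds_start + 1       # inside or after the CDS: positive


# Hypothetical numbers: a CDS starting at mRNA position 300.
print(cds_relative(78, cds_start=300))
```

Any negative result lies in the 5’ UTR, which is how a CDS-relative coordinate alone tells you the hit is upstream of the coding sequence.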

Page 17

Think of tomorrow…

• Currently analyzing various oncogenes, especially NRAS, to find out whether the Mapper successfully maps the conserved QGRS
• GRS UTRdb is currently being built as well, making it possible for large calculations to be applied to mapped data
  • The design of this database is…

Page 18

Entity Relationship Diagrams!

• ORACLE Certified!
• Shows the relationships between different tables in the database
• Is currently being populated, and is not yet public
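As a rough illustration of the kind of table relationships an entity relationship diagram captures, here is a hypothetical three-table layout built with Python's `sqlite3` module. Every table name, column name and row is an assumption made up for the example; it is not the actual GRS UTRdb design, and SQLite merely stands in for the project's MySQL server.

```python
import sqlite3

# Hypothetical schema: one gene has many UTRs, one UTR has many QGRS hits.
SCHEMA = """
CREATE TABLE gene (
    gene_id   INTEGER PRIMARY KEY,
    symbol    TEXT NOT NULL
);
CREATE TABLE utr (
    utr_id    INTEGER PRIMARY KEY,
    gene_id   INTEGER NOT NULL REFERENCES gene(gene_id),
    kind      TEXT NOT NULL CHECK (kind IN ('5UTR', '3UTR')),
    sequence  TEXT NOT NULL
);
CREATE TABLE qgrs (
    qgrs_id   INTEGER PRIMARY KEY,
    utr_id    INTEGER NOT NULL REFERENCES utr(utr_id),
    start_pos INTEGER NOT NULL,
    length    INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Made-up placeholder rows, only to show the joins working.
conn.execute("INSERT INTO gene VALUES (1, 'EXAMPLE1')")
conn.execute("INSERT INTO utr VALUES (1, 1, '5UTR', 'AUGGAGGCUGGAGGAU')")
conn.execute("INSERT INTO qgrs VALUES (1, 1, 3, 12)")

# Count predicted QGRS per gene and UTR type.
rows = conn.execute("""
    SELECT gene.symbol, utr.kind, COUNT(qgrs.qgrs_id)
    FROM gene
    JOIN utr USING (gene_id)
    LEFT JOIN qgrs USING (utr_id)
    GROUP BY gene.symbol, utr.kind
""").fetchall()
print(rows)
```

Keeping hits in their own table is what makes aggregate questions (how many QGRS per gene, per UTR type) a single query over mapped data.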

Page 19

Want to try mapping yourself?

• Go to http://bioinformatics.ramapu.edu/QGRS2/index.php
• While the Mapper program is publicly available, the database is still not ready for public access

Page 20

Related References