MyVariant.info: Community-Aggregated Variant Annotation as a Service (NGS 2016, Barcelona)
Post on 13-Jan-2017
Jiwen (Kevin) Xin, Cyrus Afrasiabi, Sean D. Mooney, Andrew I. Su, Chunlei Wu
kevinxin@scripps.edu
The Scripps Research Institute, La Jolla, CA, USA
NGS 2016
04/05/2016
MyVariant.info: Community-aggregated Variant Annotations as a Service
Schematic view of MyVariant.info architecture
Each data source is updated individually. Colors indicate their different updating
schedules.
MyVariant.info for the end users:
http://MyVariant.info (currently v1 API, two endpoints)
http://MyVariant.info/v1/query?q=<query>
Input: any query term(s)
Output: matching variant hits
http://MyVariant.info/v1/variant/<variantid>
Input: HGVS id(s)
Output: matching variant object(s)
Both endpoints support batch mode via POST.
Simple API. No sign-up. No API key.
Try our live API and documentation.
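As a rough illustration of the two endpoints, the sketch below only builds the request URLs and a batch POST request; it sends nothing over the network. The helper names are hypothetical, and the `ids` form parameter used for batch mode is an assumption not stated on the slide, so check it against the live API documentation.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

BASE = "http://myvariant.info/v1"


def variant_url(hgvs_id):
    """GET URL for a single variant; '>' is percent-encoded, ':' is kept."""
    return f"{BASE}/variant/{quote(hgvs_id, safe=':')}"


def query_url(q):
    """GET URL for the query endpoint with a URL-encoded query string."""
    return f"{BASE}/query?{urlencode({'q': q})}"


def batch_variant_request(hgvs_ids):
    """Prepare (but do not send) a batch POST to the variant endpoint.

    Assumption: ids are passed as a comma-separated 'ids' form field.
    """
    body = urlencode({"ids": ",".join(hgvs_ids)}).encode("ascii")
    return Request(f"{BASE}/variant", data=body, method="POST")


print(variant_url("chr1:g.31349647C>T"))
print(query_url("dbnsfp.genename:BTK"))
req = batch_variant_request(["chr1:g.31349647C>T", "chr12:g.111351981C>T"])
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs offline.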
Retrieving a single variant
http://myvariant.info/v1/variant/chr1:g.31349647C>T
Integrated annotations across resources in a well-formatted data structure
Always up to date
Filtering returned fields
http://myvariant.info/v1/variant/chr1:g.31349647C>T
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp.clinvar
http://myvariant.info/v1/variant/chr1:g.31349647C>T?fields=dbnsfp.clinvar,dbsnp.gmaf,clinvar.hgvs.coding
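The `fields` parameter takes a comma-separated list of dotted paths into the variant object. The sketch below builds such a URL and shows how a dotted path can be read out of a nested dictionary on the client side. Both helper names are hypothetical, and the small document used to exercise `get_path` is made-up placeholder data, not a real API response.

```python
from urllib.parse import quote


def variant_url_with_fields(hgvs_id, fields=None):
    """Build a variant URL, optionally restricted to the given dotted fields."""
    url = "http://myvariant.info/v1/variant/" + quote(hgvs_id, safe=":")
    if fields:
        url += "?fields=" + ",".join(fields)
    return url


def get_path(doc, dotted):
    """Follow a dotted path such as 'clinvar.hgvs.coding'; None if absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


print(variant_url_with_fields(
    "chr1:g.31349647C>T",
    ["dbnsfp.clinvar", "dbsnp.gmaf", "clinvar.hgvs.coding"],
))

# Placeholder document with made-up values, only to exercise get_path.
placeholder = {"dbsnp": {"gmaf": 0.5}}
print(get_path(placeholder, "dbsnp.gmaf"))
print(get_path(placeholder, "clinvar.hgvs.coding"))
```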
Making flexible queries
• All variants with dbNSFP annotation: http://myvariant.info/v1/query?q=_exists_:dbnsfp
• All non-synonymous variants on gene "BTK": http://myvariant.info/v1/query?q=dbnsfp.genename:BTK
• All variants within a genomic range: http://myvariant.info/v1/query?q=chr1:69000-70000
• Query Wellderly variants together with other annotation sources: http://myvariant.info/v1/query?q=_exists_:wellderly AND cadd.polyphen.cat:possibly_damaging&fields=wellderly,cadd.polyphen
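Queries like the ones above contain spaces and colons, so they need URL-encoding when sent from code rather than typed into a browser. A minimal sketch using only the standard library follows; the helper names (`exists`, `all_of`, `query_url`) are hypothetical conveniences, not part of any official client.

```python
from urllib.parse import urlencode


def exists(field):
    """Clause matching variants that carry the given annotation field."""
    return f"_exists_:{field}"


def all_of(*clauses):
    """Combine clauses with boolean AND."""
    return " AND ".join(clauses)


def query_url(q, fields=None):
    """Build a URL-encoded query, optionally limiting the returned fields."""
    params = {"q": q}
    if fields:
        params["fields"] = ",".join(fields)
    return "http://myvariant.info/v1/query?" + urlencode(params)


print(query_url(exists("dbnsfp")))
print(query_url("dbnsfp.genename:BTK"))
print(query_url("chr1:69000-70000"))
print(query_url(
    all_of(exists("wellderly"), "cadd.polyphen.cat:possibly_damaging"),
    fields=["wellderly", "cadd.polyphen"],
))
```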
Many more ways of querying, across resources
• Full-text queries
• Wildcard queries
• Range queries
• Boolean queries
• Regex queries
• Field existing/missing
• Faceting
• Paging
• Sorting
• Batch queries
• Support JSONP, CORS …
MyVariant.info stats (as of April 2016)
• total (334,293,820)
• dbNSFP (82,030,830; v3.0)
• dbSNP (145,132,257; v144)
• ClinVar (131,383; 201602)
• EVS (1,977,300; v2)
• CADD (226,932,858; v1.3)
• MutDB (420,221)
• gwassnps (15,243; from UCSC)
• COSMIC (1,024,498; v68 from UCSC)
• DOCM (1,119)
• SNPedia (5,907)
• EMVClass (12,066)
• Wellderly (21,240,519)
• ExAC (10,195,872; v0.3)
• GRASP (2,212,148; v2.0.0.0)
MyVariant.info official Python/R Clients
myvariant Python client hosted on PyPI (initial release in Aug 2015)
myvariant R client hosted on Bioconductor (initial release in Oct 2015)
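A minimal sketch of how the Python client might be used for a batch lookup. It assumes the `myvariant` package from PyPI is installed and that its `MyVariantInfo` class and `getvariants` method behave as in the client's public documentation; the wrapper names `fetch_annotations` and `join_fields` are hypothetical. The import is deferred so that the module loads without the package or a network connection.

```python
def join_fields(fields):
    """Turn a list of dotted field names into the comma-separated form."""
    return ",".join(fields)


def fetch_annotations(hgvs_ids, fields):
    """Fetch annotations for several HGVS ids in one batch call.

    Requires the third-party 'myvariant' package and network access.
    """
    import myvariant  # deferred: third-party, installed with `pip install myvariant`

    client = myvariant.MyVariantInfo()
    return client.getvariants(hgvs_ids, fields=join_fields(fields))


# Example call (not executed here because it needs network access):
# fetch_annotations(["chr1:g.31349647C>T"], ["cadd.consequence", "exac.af"])
```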
Use case 2: An example workflow for variant prioritization
input variants
output variants
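# Step 1 (vars is assumed to be a list of annotated-variant data frames): keep protein-altering or splice-affecting variants
filter1 <- lapply(vars, function(i) subset(i, cadd.consequence %in% c("NON_SYNONYMOUS", "STOP_GAINED", "STOP_LOST", "CANONICAL_SPLICE", "SPLICE_SITE")))
# Step 2: keep variants that are rare in ExAC (allele frequency < 0.01)
filter2 <- lapply(filter1, function(i) subset(i, exac.af < 0.01))
# Step 3: keep variants that are also rare in 1000 Genomes (allele frequency < 0.01)
filter3 <- lapply(filter2, function(i) subset(i, sapply(dbnsfp.1000gp1.af, function(j) j < 0.01 )))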
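For readers using Python instead of R, the same three filters can be sketched over a list of annotation dictionaries. This is an illustration under stated assumptions, not the authors' workflow: each field is assumed to hold a single scalar value, variants missing a value are dropped (mirroring how `subset` treats NA), and the sample records and their ids are made up.

```python
RARE_AF = 0.01
DAMAGING = {
    "NON_SYNONYMOUS", "STOP_GAINED", "STOP_LOST",
    "CANONICAL_SPLICE", "SPLICE_SITE",
}


def get_path(doc, dotted):
    """Follow a dotted path such as 'exac.af'; None if any key is absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def prioritize(variants):
    """Keep damaging variants that are rare in both ExAC and 1000 Genomes."""
    kept = []
    for variant in variants:
        consequence = get_path(variant, "cadd.consequence")
        exac_af = get_path(variant, "exac.af")
        kg_af = get_path(variant, "dbnsfp.1000gp1.af")
        if not isinstance(consequence, str) or consequence not in DAMAGING:
            continue  # filter 1: consequence type
        if exac_af is None or exac_af >= RARE_AF:
            continue  # filter 2: ExAC allele frequency
        if kg_af is None or kg_af >= RARE_AF:
            continue  # filter 3: 1000 Genomes allele frequency
        kept.append(variant)
    return kept


# Made-up records for illustration only.
sample = [
    {"_id": "var_a", "cadd": {"consequence": "STOP_GAINED"},
     "exac": {"af": 0.001}, "dbnsfp": {"1000gp1": {"af": 0.002}}},
    {"_id": "var_b", "cadd": {"consequence": "SYNONYMOUS"},
     "exac": {"af": 0.001}, "dbnsfp": {"1000gp1": {"af": 0.002}}},
    {"_id": "var_c", "cadd": {"consequence": "NON_SYNONYMOUS"},
     "exac": {"af": 0.2}, "dbnsfp": {"1000gp1": {"af": 0.002}}},
    {"_id": "var_d", "cadd": {"consequence": "SPLICE_SITE"},
     "exac": {"af": 0.001}},
]
print([v["_id"] for v in prioritize(sample)])
```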
Use case 3
For curators/data providers:
A platform for
integrating with other resources (saving repetitive effort)
distributing your valuable data (under your own source field)
Use case 4
For variant curation itself:
Identify discrepancies
Serve as the basis for a community-engaged curation process
Linked data
URI (Uniform Resource Identifier):
Provides a unique identifier for anything or any concept on the web
Connective: connecting data, concepts, applications and ultimately people
URL (Uniform Resource Locator):
Provides a unique identifier for web pages
Text files, images, music, videos
Interactive: Twitter, Facebook, blogs
Why Linked Data?
Providing a unique identifier for a concept
Gene name
e.g. CDK2
genename (database1)
gene_name (database2)
{'gene': {'name': …}} (database3)
URI: http://identifiers.org/hgnc.symbol
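One way to picture the idea: a small lookup table ties each database's own field name to the shared URI, so that code can find the gene symbol without knowing each naming scheme. The table and function names below are hypothetical, the three database labels are the placeholders from the slide, and the records are toy data built around the slide's CDK2 example.

```python
GENE_SYMBOL_URI = "http://identifiers.org/hgnc.symbol"

# (database, dotted field path) -> URI of the concept the field holds.
CONTEXT = {
    ("database1", "genename"): GENE_SYMBOL_URI,
    ("database2", "gene_name"): GENE_SYMBOL_URI,
    ("database3", "gene.name"): GENE_SYMBOL_URI,
}


def get_path(doc, dotted):
    """Follow a dotted path such as 'gene.name'; None if any key is absent."""
    current = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def value_for(record, database, uri):
    """Return the value of whichever field in this database maps to the URI."""
    for (db, path), field_uri in CONTEXT.items():
        if db == database and field_uri == uri:
            return get_path(record, path)
    return None


# Toy records: three naming schemes, one concept.
records = {
    "database1": {"genename": "CDK2"},
    "database2": {"gene_name": "CDK2"},
    "database3": {"gene": {"name": "CDK2"}},
}
for db, record in records.items():
    print(db, value_for(record, db, GENE_SYMBOL_URI))
```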
Data discrepancy: an example
http://myvariant.info/v1/variant/chr12:g.111351981C>T?fields=clinvar.rsid,dbsnp.rsid,evs.rsid
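Once the three `rsid` fields are returned side by side, checking whether the sources agree is a short comparison. The sketch below assumes each source reports its rsid as a single string under `<source>.rsid`; the function names are hypothetical and the documents are made-up placeholders, not the actual response for the variant above.

```python
def rsids_by_source(doc, sources=("clinvar", "dbsnp", "evs")):
    """Collect the rsid reported by each source that has one."""
    found = {}
    for source in sources:
        section = doc.get(source)
        if isinstance(section, dict) and section.get("rsid") is not None:
            found[source] = section["rsid"]
    return found


def has_discrepancy(doc):
    """True when at least two sources report different rsids."""
    return len(set(rsids_by_source(doc).values())) > 1


# Placeholder documents with made-up rsids.
agreeing = {
    "clinvar": {"rsid": "rs0000001"},
    "dbsnp": {"rsid": "rs0000001"},
    "evs": {"rsid": "rs0000001"},
}
conflicting = {
    "clinvar": {"rsid": "rs0000001"},
    "dbsnp": {"rsid": "rs0000002"},
}
print(has_discrepancy(agreeing), has_discrepancy(conflicting))
```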
Acknowledgement
Funding and Support: U54GM114833, U01HG008473
Washington U: Ben Ainscough, Obi Griffith
TSRI: Chunlei Wu, Andrew Su, Jiwen Xin, Cyrus Afrasiabi, Ginger Tsueng, Adam Mark, Greg Stupp, Tim Putman
STSI: Eric Topol, Ali Torkamani, Galina Erikson
U. Washington: Sean Mooney, Moritz Juchler, Nikhil Gopal
OICR: Robin Haw
UC Berkeley: Chris Mungall
UCSD: Trish Whetzel
MyVariant.info
MyVariant.info Clients
API: https://myvariant.info
Python Client: https://pypi.python.org/pypi/myvariant/
R Client: http://bioconductor.org/packages/release/bioc/html/myvariant.html
Jupyter Notebook Tutorial for Python Client (Focus on ClinVar): https://cdn.rawgit.com/SuLab/myvariant.info/master/docs/ipynb/myvariant_clinvar_demo.html