Big data, Semantic Web and Ontologies
Mélanie Courtot, PhD
Nov 12th 2014
mcourtot@sfu.ca
About me
Overview
• Big Data
  – Big Data is BIG
  – Issues in research
• Semantic Web
  – Standards: URIs, RDF, SPARQL, OWL
  – Linked data
• Ontologies
  – Definition and reasoning
  – OBO Foundry
  – Examples of existing ontologies
  – Pharmacovigilance
  – Publishing ontologies on the Semantic Web
• IRIDA
  – The IRIDA platform
  – Adding standards to IRIDA
• Take home message
Big data
Big data is data that is too large and complex for conventional data tools to process.
2005
2013
What is a Zettabyte?
1,000,000,000,000 gigabytes
1,000,000,000 terabytes
1,000,000 petabytes
1,000 exabytes
1 zettabyte
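The scale of a zettabyte can be checked with a few lines of arithmetic. The sketch below assumes decimal (SI) units, where each unit is 1000 times the previous one; the `UNITS` list and the `convert` helper are illustrative names, not part of the talk.

```python
# Decimal (SI) storage units; each one is 1000 times the previous one.
UNITS = ["byte", "kilobyte", "megabyte", "gigabyte",
         "terabyte", "petabyte", "exabyte", "zettabyte"]

def convert(value, from_unit, to_unit):
    """Convert a quantity between two SI storage units."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    if steps >= 0:
        # Going to a smaller unit: multiply (stays an exact integer).
        return value * 1000 ** steps
    # Going to a larger unit: divide.
    return value / 1000 ** (-steps)

for unit in ["gigabyte", "terabyte", "petabyte", "exabyte"]:
    print(f"1 zettabyte = {convert(1, 'zettabyte', unit):,} {unit}s")
```

Binary units (zebibytes, powers of 1024) would give slightly different figures; the decimal convention is an assumption here.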
How big is big?
• Facebook: 25 terabytes of logged data per day; Google (2008): 20 petabytes per day
• Over 90% of all the data in the world was created in the past 2 years [1]
• Today: 3.2 zettabytes; 2020: 40 zettabytes [2]
• Good news: jobs! [3]

1. http://www-01.ibm.com/software/data/bigdata/
2. http://barnraisersllc.com/2012/12/38-big-facts-big-data-companies/
3. http://www.webopedia.com/quick_ref/important-big-data-facts-for-it-professionals.html
https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century
Issues with research data (1): data availability
http://www.nature.com/news/scientists-losing-data-at-a-rapid-rate-1.14416
Issues with research data (2): data reproducibility
http://www.firstwordpharma.com/node/931605#axzz3IalL2lzU
A solution: the Semantic Web
"The Semantic Web is an... extension of the current web in which... information is given well-defined meaning,... better enabling computers and people to work in cooperation."
The Semantic Web. Tim Berners-Lee, James Hendler and Ora Lassila. Scientific American, May 2001
http://www.scientificamerican.com/article/the-semantic-web/
The Semantic Web in a nutshell
Adds to Web standards and practices (currently only for documents and services), encouraging:
• Unambiguous names for things, classes, and relationships
• Well organized and documented in ontologies
• With data expressed using uniform knowledge representation languages (e.g. OWL)
• To enable computationally assisted exploitation of information
• That can be easily integrated from different sources
Some Semantic Web successes
• In February 2011, the Watson system by IBM made international headlines for beating the best humans in the quiz show Jeopardy!
• A significant number of very prominent websites are powered by Semantic Web technologies, including the New York Times, Thomson Reuters, BBC, and Google's Freebase.
• The Speech Interpretation and Recognition Interface (Siri), launched by Apple in 2011 as an intelligent personal assistant for the new generation of iPhone smartphones, draws heavily on work on ontologies, knowledge representation, and reasoning.
http://130.108.5.60/faculty/pascal/pub/crc-handbook-13.pdf
Uniform Resource Identifiers (URIs)
• Two different uses:
  – Unambiguous name for something
  – Location of a document
• Examples:
  – http://example.org/wiki/Main_Page
  – ftp://example.org/resource.txt
  – mailto:[email protected]
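To make the structure of these identifiers concrete, here is a minimal sketch that splits two of the example URIs into their parts using Python's standard `urllib.parse`. The `describe` helper is a hypothetical name introduced for illustration. The scheme tells a client how a document at that location would be fetched, while the full string also serves as a name.

```python
from urllib.parse import urlparse

def describe(uri):
    """Split a URI into scheme, authority and path."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,      # e.g. http, ftp
        "authority": parts.netloc,   # the host that owns the name
        "path": parts.path,          # the resource within that host
    }

for uri in ["http://example.org/wiki/Main_Page",
            "ftp://example.org/resource.txt"]:
    print(uri, "->", describe(uri))
```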
Resource Description Framework (RDF)
• Resources (= nodes)
  – Identified by a Uniform Resource Identifier (URI)
• Properties (= edges)
  – Identified by a Uniform Resource Identifier (URI)
  – Binary relations between 2 resources
http://elmonline.ca/sw/sparql/social.ttl
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://www.linkedin.com/in/mcourtot> a foaf:Person ;
    foaf:name "Melanie Courtot" ;
    foaf:knows <http://elmonline.ca/luke> ;
    foaf:knows <http://www.linkedin.com/pub/mark-wilkinson/1/674/665> .
SPARQL
A query language for RDF

SELECT ?person
WHERE {
  <http://www.linkedin.com/in/mcourtot> <http://xmlns.com/foaf/0.1/knows> ?person .
}

----------------------------------------------------------
| person                                                 |
==========================================================
| <http://www.linkedin.com/pub/mark-wilkinson/1/674/665> |
| <http://elmonline.ca/luke>                             |
----------------------------------------------------------

• An excellent tutorial by Luke McCarthy: http://elmonline.ca/sw/sparql/
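The idea behind the query can be sketched without any RDF library: a graph is a set of (subject, predicate, object) triples, and a query is a pattern in which some positions are left open. The code below is a simplified illustration in plain Python, not how a SPARQL engine is implemented; `TRIPLES` and `match` are hypothetical names, and `None` stands in for a variable such as `?person`.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ME = "http://www.linkedin.com/in/mcourtot"

# The social graph from the RDF slide, as plain (subject, predicate, object) tuples.
TRIPLES = [
    (ME, RDF_TYPE, FOAF + "Person"),
    (ME, FOAF + "name", "Melanie Courtot"),
    (ME, FOAF + "knows", "http://elmonline.ca/luke"),
    (ME, FOAF + "knows", "http://www.linkedin.com/pub/mark-wilkinson/1/674/665"),
]

def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None matches anything."""
    return [
        (s, p, o) for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

# Equivalent of: SELECT ?person WHERE { <ME> foaf:knows ?person . }
people = [o for (_, _, o) in match(TRIPLES, subject=ME, predicate=FOAF + "knows")]
print(people)
```

A real deployment would load the Turtle file into a triple store and run the SPARQL query there; the pattern-matching principle is the same.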
The Web Ontology Language (OWL)
• Knowledge representation language
• Based on Description Logics: fragments of first-order logic that are decidable and have well-defined computational properties
• Sound, complete, terminating reasoners available
Linked open data cloud
Biological resources in LOD
Examples of issues in linking data incorrectly
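• <http://dbpedia.org/resource/Welsh> owl:sameAs
  – <http://sw.cyc.com/2006/07/27/cyc/EthnicGroupOfWelsh>
  – <http://sw.cyc.com/2006/07/27/cyc/Welsh-TheWord>
  – <http://sw.cyc.com/2006/07/27/cyc/WelshLanguage>
  – <http://sw.cyc.com/2006/07/27/cyc/Welshing-Cheating>
• owl:sameAs asserts identity, yet these four URIs name an ethnic group, a word, a language, and an act of cheating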
Ontologies
• Representation of important things in a specific domain
  – Describes types of entities (e.g. cells), relations between them (e.g. prokaryotic cells and eukaryotic cells are cells), and their instances (e.g. the specific cells in my sample)
• An active computational artifact
  – A mathematical model based on a subset of first-order logic
  – Tools can automatically process ontologies
• A communication tool
  – Provides a dictionary for collaborators, a shared understanding
  – Allows data sharing
Reasoning is critical
• Prokaryotic cell and Eukaryotic cell are declared disjoint
• Fungal cell is a Eukaryotic cell
• Spore is a Fungal cell and a Prokaryotic cell
=> Unsatisfiability
Solution: clarify spore (sensu Mycetozoa) AND actinomycete-type spore
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0022006
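The spore example can be reproduced with a toy check. The sketch below is not a Description Logic reasoner: it only follows named subclass links and tests declared disjointness, which is enough for this case. The dictionary layout and function names are assumptions made for illustration; the "Cell" superclass comes from the earlier statement that prokaryotic and eukaryotic cells are cells.

```python
# Asserted subclass axioms: class -> set of direct superclasses.
SUBCLASS_OF = {
    "Eukaryotic cell": {"Cell"},
    "Prokaryotic cell": {"Cell"},
    "Fungal cell": {"Eukaryotic cell"},
    "Spore": {"Fungal cell", "Prokaryotic cell"},
}
# Pairs of classes declared disjoint (no instance can belong to both).
DISJOINT = [("Prokaryotic cell", "Eukaryotic cell")]

def ancestors(cls, subclass_of):
    """Return cls plus all of its direct and indirect superclasses."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def unsatisfiable_classes(subclass_of, disjoint):
    """Classes that inherit from both members of some disjoint pair."""
    classes = set(subclass_of) | {p for ps in subclass_of.values() for p in ps}
    result = set()
    for cls in classes:
        supers = ancestors(cls, subclass_of)
        if any(a in supers and b in supers for a, b in disjoint):
            result.add(cls)
    return result

print(unsatisfiable_classes(SUBCLASS_OF, DISJOINT))
```

Only Spore is flagged, because it is the only class that ends up under both disjoint branches. An OWL reasoner performs the same kind of detection over much richer axioms.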
Logics
• Simple example based on http://arxiv.org/pdf/1201.4089v1.pdf
• Ontology file available from http://www.sfu.ca/~mcourtot/course/20141112BigDataSemWebOntologies/ontology.owl
• Manipulation done using Protégé: http://protege.stanford.edu
Family ontology
Logics of a grandfather
Reasoning
Inferred class hierarchy
Explanations
A wrong assertion
Unsatisfiability
OBO Foundry
A subset of biological and biomedical ontologies whose developers have agreed in advance to accept a common set of principles, reflecting best practice in ontology development, designed to ensure:
• tight connection to the biomedical basic sciences
• compatibility
• interoperability, common relations
• formal robustness
• support for logic-based reasoning
| Granularity \ Relation to time | Continuant: independent | Continuant: dependent | Occurrent |
|---|---|---|---|
| Organ and organism | Organism (NCBI Taxonomy?); Anatomical Entity (FMA, CARO) | Organ Function (FMP, CPRO); Phenotypic Quality (PaTO) | Organism-Level Process (GO) |
| Cell and cellular component | Cell (CL); Cellular Component (FMA, GO) | Cellular Function (GO) | Cellular Process (GO) |
| Molecule | Molecule (ChEBI, SO, RnaO, PrO) | Molecular Function (GO) | Molecular Process (GO) |

Slide credit: Barry Smith
Minimum Information to Reuse an External Ontology Term (MIREOT)
• OBO and Semantic Web promote reuse of resources
• Biological resources (e.g., FMA for anatomy), taken together, are too big for current tool support
• MIREOT used across the OBO library
  – OBI: 400 mireoted terms (140 GO, 55 ChEBI, 50 PATO)
  – PR (Protein Ontology): 23,000 mireoted terms
• http://ontofox.hegroup.org
Examples of OBO ontologies
• OBI, the Ontology for Biomedical Investigations
• VO, the Vaccine Ontology
• AERO, the Adverse Event Reporting Ontology
Ontology for Biomedical Investigations (OBI)
• OBI is a multi-community project driven by the practical needs of its members, with the goal of building a high-quality, interoperable reference ontology
• OBI high-level classes, solidified over several years, are in place and cover all aspects of biomedical investigations
• OBI is expanded to enable member applications and based on term requests
High level class hierarchy (partial)
Slide credit: OBI Consortium
Slide credit: Alan Ruttenberg
Slide credit: OBI Consortium
Representing vaccine data – the Vaccine Ontology (VO)
Picture credit: Yongqun He
Representing pharmacovigilance data
• The Adverse Event Reporting Ontology (AERO)
• Encodes existing clinical guidelines (Brighton Collaboration)
Background and problem statement
• Surveillance of Adverse Events Following Immunization is important
  – Detection of issues with vaccine
  – Importance of vaccine-risk communication
• Analysis of adverse event (AE) reports is a subjective, time-consuming and costly process
  – Manual review of the textual reports
Workflow
• Hypothesis: the AERO I developed can be used to annotate and classify a dataset
• VAERS dataset
  – Vaccine Adverse Event Reporting System
  – 6032 reports: ~5800 negative, ~230 positive
  – Post H1N1 immunization 2009/2010
  – Manually classified for anaphylaxis
• MedDRA (Medical Dictionary for Regulatory Activities) is used to represent clinical findings
Automated Diagnosis workflow
Results
At best cut-off point:
• Sensitivity: 57%
• Specificity: 97%
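For readers unfamiliar with the two measures, the sketch below shows how they are computed from a confusion matrix. The counts are hypothetical, chosen only so the arithmetic reproduces the reported percentages; they are not the study's actual figures.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positive reports that the classifier flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of actual negative reports that the classifier clears."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts for illustration only (not taken from the VAERS study).
tp, fn = 57, 43   # 100 truly positive reports
tn, fp = 97, 3    # 100 truly negative reports

print(f"Sensitivity: {sensitivity(tp, fn):.0%}")
print(f"Specificity: {specificity(tn, fp):.0%}")
```

With a dataset dominated by negative reports, high specificity means few false alarms, while moderate sensitivity means some true cases are still missed.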
AE classification can be improved through the use of ontologies
• Manual analysis: 3 months for 12 medical officers
• Ontology-based analysis: once data is collected (2 months), almost instantaneous (2h on a laptop)
=> Could allow for earlier detection of safety issues and better understanding of adverse events
2h automated vs. 3 months manual
http://dx.doi.org/10.1371/journal.pone.0092632
IRI dereferencing
Ontobee: publishing biomedical resources on the Semantic Web
HTML for humans …
… RDF for machines
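One common way to serve both audiences from a single IRI is HTTP content negotiation: the client states the format it wants in the `Accept` header. The sketch below only builds the two requests with Python's standard `urllib.request` and does not send them. Whether a given server, Ontobee included, selects its response from this header is an assumption here, and the term IRI is used purely as an illustrative example.

```python
from urllib.request import Request

def build_request(iri, want_rdf):
    """Prepare a GET request asking for RDF/XML or HTML for the same IRI."""
    accept = "application/rdf+xml" if want_rdf else "text/html"
    return Request(iri, headers={"Accept": accept})

# Illustrative OBO-style term IRI (an assumption, not taken from the slides).
term_iri = "http://purl.obolibrary.org/obo/OBI_0000070"

human = build_request(term_iri, want_rdf=False)
machine = build_request(term_iri, want_rdf=True)

print(human.full_url, "->", human.get_header("Accept"))
print(machine.full_url, "->", machine.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual dereferencing.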
The Integrated Rapid Infectious Disease Analysis (IRIDA) project
• Goal: automate infectious disease outbreak detection and investigation
• Issues:
  – Integrate whole-genome sequencing (WGS), clinical and lab info
  – Provide relevant tools and validate pipeline
• Methods:
  – Data standards for information exchange
  – Analysis pipeline (Galaxy based)
  – User interface
  – Additional tools: IslandViewer, GenGIS
Building the IRIDA data standards
• Interview with key personnel at BCCDC
• Review of existing resources
• Identify "holes", i.e., missing bits
• Collect existing data
• Liaise with implementation team
• Generate cohesive resource
• Validate
Relevant data standards
• TypON, the typing ontology
• OBI, the Ontology for Biomedical Investigations
• NGSOnto, Next Generation Sequencing Ontology
• NIAID GSC-BRC core metadata
• MIxS ontology
• TRANS, Pathogen Transmission Ontology
• ExO, Exposure Ontology
• EPO, Epidemiology Ontology
• IDO, Infectious Disease Ontology
• Food: USDA, EFSA?
Relevant international efforts
• MIxS standard
• Global Microbial Identifier
• Global Alliance for Genomics and Health
• NCBI BioSample
• European Nucleotide Archive
• …
Remaining challenges
• Trust, provenance
  – Ability to track origin of data to assess whether it is trustworthy
• Data sharing, reuse, policy
  – Social and legal issues in getting access to data
• Confidentiality
  – Privacy concerns when linking data
Take home message
Big data is a big challenge, but one we can meet if we handle it properly: that will be your responsibility.
• DO NOT build a black box
• DO annotate and describe your data
• DO make your data openly available
Acknowledgements
• Drs. Fiona Brinkman, Will Hsiao, Ryan Brinkman
• The Brinkman^2 labs
• Alan Ruttenberg, Barry Smith, Chris Mungall & OBO
• Colleagues at the Public Health Agency of Canada (Ms Lafleche, Dr Law)
• The IRIDA consortium and the IRIDA ontology working group (Emma Griffiths and Damion Dooley)