adding more semantics to the social web

SIG-SWO Invited Lecture, NII
Adding more semantics to the Social Web: structure and content
John Breslin, Leader, Unit for Social Software, Insight, NUI Galway
ジョン・ブレスリン、上級講師、アイルランド国立大学ゴールウェイ校 (John Breslin, Senior Lecturer, National University of Ireland Galway)
10th July 2015
Slides http://bit.ly/ジョン・ブレスリン

Upload: john-breslin

Post on 06-Aug-2015



John Breslin ジョン・ブレスリン @johnbreslin @jonburesurin

1986: SW radio

•  Before the Web and satellite phones, the world’s news was communicated and shared by shortwave radio
    •  I loved it!
•  Stations “crowdsourced” SINPO data collection, and “gamified” this process by returning QSL cards
    •  SINPO = signal-interference-noise-propagation-overall report

1990: MAP.COM

•  My first “social software” program from 25 years ago
•  Fusing data from user logins on a VAX/VMS mainframe with computer locations in terminal rooms
•  Terminal numbers were my unique identifiers

1998: Set up a gaming forum
2000: Co-founded boards.ie from this

•  Ireland’s largest discussion forum site
•  2.5 million visitors/month (~40% of Irish population)
•  Irish people seeking information, or just chatting about sports, TV, politics, health, whatever
•  Spin-off site adverts.ie
    •  Classified ads
    •  Bought by Schibsted Media Group yesterday

2004: Joined DERI at NUI Galway, founded the SIOC project

•  Semantically Interlinked Online Communities

•  Enables interoperability and exchange of social content:
    •  Blogs, forums, wikis

•  More later…

2011: Co-founder of StreamGlider, Inc.

•  Real-time streaming newsreader for the iPad
•  With Nova Spivack and the late Bill McDaniel

•  Supports social, multimedia, news

•  Can be used as an enterprise dashboard or event display

2013-2015: Startup ecosystem activation

•  Co-founder of NUI Galway Entrepreneurship Society

•  Co-founder of Startup Galway, Galway City Innovation District and the PorterShed

•  Advisor to Irish startups including AYLIEN, BirdLeaf, BuilderEngine and Pocket Anatomy

•  Author of yearly series on “Talented 38 Tech Women”

Big fan of Japanese culture!

•  Set up the mangatoanime.com discussion forum in 2001
•  First anime seen in 1978: Battle of the Planets 科学忍者隊ガッチャマン
•  Fave manga: Battle Angel Alita 銃夢
•  Established the first Isao Tomita 冨田勲 website in 1996
    •  Synthesizer musician known for classical reworkings, film / TV works
    •  Interviewed Tomita in 1999
•  Also like 喜多郎, 坂本龍一, YMO, ピタゴラスイッチ, 本田

National University of Ireland Galway アイルランド国立大学ゴールウェイ校

•  Galway is a small city in the West of Ireland

•  NUI Galway was established in 1845:
    •  One of Ireland’s seven universities
•  105 hectares (260 acres)
•  120 links with universities around the world
•  17,300 students:
    •  12,500 undergraduates, 3,600 postgraduates, 1,200 other
•  2,541 staff:
    •  1,078 academics, 1,015 admin and support, 448 research
•  90,000 alumni in over a hundred countries

Notable people and interesting connections to NUI Galway

•  Alice Perry, first female graduate engineer in the world, 1849
•  Michael O’Shaughnessy, chief engineer of San Francisco who commissioned / named the Golden Gate Bridge, graduated in 1884
    •  Also, two Galway men founded Menlo Park in Silicon Valley in 1854
•  JRR Tolkien J・R・R・トールキン was an external examiner in 1949
•  John Ryan, Macrovision inventor, studied here in the 1960s
•  Current Irish President Higgins and Taoiseach Kenny are graduates
•  Honorary degrees given to Nelson Mandela ネルソン・マンデラ, Hillary Clinton ヒラリー・クリントン, Enya エンヤ
•  Actor Martin Sheen マーティン・シーン studied here in 2006
•  The JK Rowling J・K・ローリング (ハリー・ポッター) charity Lumos partnered with NUI Galway last week to help orphaned children worldwide stay with their families

“Electron” [I am a lecturer in Electronic Engineering]

•  The term “electron” was coined by George Johnstone Stoney
    •  Professor of Natural Philosophy, QCG (NUI Galway) from 1852-1857
    •  Calculated charge (1874)
    •  Proposed the term “electron” (1891)
•  Electronic: 1,170,000,000 Google results
•  Email: 7,510,000,000 Google results

Insight Centre for Data Analytics インサイト

•  Ireland’s largest multi-institution ICT research institute, funded by SFI

•  200 researchers
•  8 institutions
•  30 partners
•  €88M in funding

Subsumes DERI (NUI Galway), Clarity (UCD / DCU), Clique (UCD / NUI Galway), TRIL (UCD), 4C (UCC)

Unit for Social Software (USS)

•  Established in 2005 as a sub-cluster of Semantic Web at DERI

•  7 PhDs in progress (including 2 industry-based PhDs, 1 part-time PhD), plus 1 PhD and 1 MSc submitting

•  9 PhD alumni, 3 MSc alumni
•  6 postdoc alumni, 2 RA alumni, 15 visiting researcher alumni (including Yuki Matsuoka PhD from NII)

Background knowledge on the Social Semantic Web ソーシャル・セマンティック・ウェブ

Social platforms are like data silos

Images from pidgintech.com

Many isolated communities of users and their data

Need ways to connect these islands

Allowing users to easily travel from one to another

Enabling users to easily bring their data with them

A two-way street: the Semantic Web can help the Social Web, and vice versa

•  Can use the Semantic Web to describe people, content objects and the connections that bind them all together so that social sites can interoperate via semantics

•  In the other direction, object-centered social websites can serve as rich social data sources for semantic applications
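As a rough illustration of the first point, the sketch below describes a person, two accounts on different sites, and a post as subject-predicate-object triples using FOAF and SIOC terms. The example.org identifiers and the helper functions are hypothetical, and plain tuples stand in for a real RDF store.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
FOAF = "http://xmlns.com/foaf/0.1/"
SIOC = "http://rdfs.org/sioc/ns#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, for illustration only.
alice = "http://example.org/people/alice#me"
acct_a = "http://site-a.example.org/user/alice"
acct_b = "http://site-b.example.org/u/alice99"
post = "http://site-a.example.org/post/1"

triples = [
    (alice, RDF_TYPE, FOAF + "Person"),
    (acct_a, RDF_TYPE, SIOC + "UserAccount"),
    (acct_b, RDF_TYPE, SIOC + "UserAccount"),
    (acct_a, SIOC + "account_of", alice),   # both accounts belong to one person
    (acct_b, SIOC + "account_of", alice),
    (post, RDF_TYPE, SIOC + "Post"),
    (post, SIOC + "has_creator", acct_a),
]

def accounts_of(person, graph):
    """Return every account, on any site, that is linked to the person."""
    return sorted(s for s, p, o in graph
                  if p == SIOC + "account_of" and o == person)

def posts_by(person, graph):
    """Return posts created through any of the person's accounts."""
    accounts = set(accounts_of(person, graph))
    return sorted(s for s, p, o in graph
                  if p == SIOC + "has_creator" and o in accounts)

print(accounts_of(alice, triples))
print(posts_by(alice, triples))
```

Because both sites point at the same person identifier, content from either silo can be gathered with one lookup, which is the interoperability the slide refers to.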

“I think we could...have both Semantic Web technology supporting online communities, but at the same time also online communities can also support Semantic Web data by being the sources of people voluntarily connecting things together.” – Tim Berners-Lee ティム・バーナーズ=リー

Image from tinyurl.com/highway2

Object-centred sociality (AKA social objects)

•  Users are connected via a common object:
    •  Their job, university, hobbies, interests, a date…
•  “According to this theory, people don’t just connect to each other. They connect through a shared object. […] Good services allow people to create social objects that add value.” – Jyri Engestrom
    •  Flickr or Instagram = photos
    •  YouTube or Vimeo = videos
    •  WordPress or Tumblr = posts
    •  etc.

The social objects that connect us to others can be represented by semantics

For example

What is the Social Semantic Web (SSW)? ソーシャル・セマンティック・ウェブ

Some SSW vocabularies

•  FOAF
•  SIOC
    •  Created at NUI Galway
•  Online Presence Ontology [OPO]
    •  Co-created at NUI Galway
•  Semantic Cloud of Tags [SCOT]
    •  Created at NUI Galway
•  Meaning of a Tag [MOAT]
•  Facebook OGP
    •  Contributions to RDF version from NUI Galway
•  schema.org
    •  RDF version at NUI Galway

Facebook Open Graph Protocol schema.org

OPO

SCOT

Semantically Interlinked Online Communities (SIOC)

Creating an ontology needs more than just a spec page: community, evangelism, etc.

A range of SIOC modules were created to extend SIOC Core while avoiding clutter
•  SIOC Access (sioce)
•  SIOC Actions (sioca)
•  SIOC Argumentation (siocr)
•  SIOC Chat (siocc)
•  SIOC Mining (siocm)
•  SIOC Quotes (siocq)
•  SIOC Services (siocs)
•  SIOC Types (sioct)
•  SWAN/SIOC (swansioc)
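To make the Core-plus-modules idea concrete, here is a minimal sketch that types one resource with both a Core class (sioc:Post) and a more specific class from the Types module (sioct:MicroblogPost), then writes it out as Turtle. The `to_turtle` helper and the example.org URLs are hypothetical; object terms are passed in already formatted as Turtle, so no escaping is attempted.

```python
PREFIXES = {
    "sioc": "http://rdfs.org/sioc/ns#",      # SIOC Core
    "sioct": "http://rdfs.org/sioc/types#",  # SIOC Types module
}

def to_turtle(subject, types, properties):
    """Serialise one resource as Turtle.

    `types` are prefixed class names; `properties` is a list of
    (prefixed predicate, pre-formatted Turtle object term) pairs.
    """
    lines = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    lines.append("")
    body = [f"    a {', '.join(types)}"]
    for predicate, obj in properties:
        body.append(f"    {predicate} {obj}")
    lines.append(f"<{subject}>")
    lines.append(" ;\n".join(body) + " .")
    return "\n".join(lines)

doc = to_turtle(
    "http://example.org/status/42",
    ["sioc:Post", "sioct:MicroblogPost"],
    [
        ("sioc:has_creator", "<http://example.org/user/john>"),
        ("sioc:content", '"Hello from Tokyo"'),
    ],
)
print(doc)
```

A consumer that only understands SIOC Core can still treat the resource as a sioc:Post, while one that knows the Types module can distinguish microblog posts from other posts.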

The foundations are there, so what have we been aiming for since 2008?

1.  Continue dissemination of Social Semantic Web ontologies to increase the level and quality of social semantic data

2.  Transition social semantic data (existing and future) into knowledge

Continued dissemination of the Social Semantic Web… …and making more data

Some applications already using SIOC

Impact: RDFa in Drupal 7

•  Drupal has a 6-7% market share of content management systems

•  Drupal 7 release has Semantic Web support built-in:
    •  NUI Galway hosted and sponsored the Semantic Drupal “hackathon” that introduced this RDFa support
    •  Used on energy.gov, london.gov.uk, www.iq.harvard.edu, software.intel.com…

•  RDFa (SIOC, FOAF, Dublin Core, SKOS) data used for blog posts, forums, etc.

•  Efforts are currently underway in Drupal 8 to replace some of these terms with types from the schema.org vocabulary (recommended by four major search engines)
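For readers unfamiliar with RDFa, the following sketch shows the general shape of such markup: ordinary HTML carrying `about`, `typeof`, `property` and `rel` attributes that name SIOC and Dublin Core terms. It is an illustration only, not Drupal's actual template output; the `render_post` function and the example URLs are hypothetical.

```python
from html import escape

# RDFa prefix declarations for the vocabularies used in the fragment.
PREFIX = (
    "sioc: http://rdfs.org/sioc/ns# "
    "dc: http://purl.org/dc/terms/"
)

def render_post(post_url, title, body, author_url, author_name):
    """Render a blog post as an HTML fragment carrying RDFa attributes."""
    return (
        f'<div prefix="{PREFIX}" about="{escape(post_url)}" typeof="sioc:Post">\n'
        f'  <h2 property="dc:title">{escape(title)}</h2>\n'
        f'  <a rel="sioc:has_creator" href="{escape(author_url)}">'
        f'{escape(author_name)}</a>\n'
        f'  <div property="sioc:content">{escape(body)}</div>\n'
        f'</div>'
    )

fragment = render_post(
    "http://example.org/node/1",
    "Rugby & RDFa",
    "Great game last night.",
    "http://example.org/user/7",
    "John",
)
print(fragment)
```

A browser renders this as a normal post, while an RDFa parser can extract the post's type, title, content and creator from the same page.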

Image from tinyurl.com/drupaper

RDFa on london.gov.uk

How much SIOC data is out there?

Images (this one and later backgrounds) from publicdomainpictures.net

Sindice 2012: classes

•  Total instances of SIOC classes: 7.7M
    •  Up 200k in three months
•  Most occurrences: sioc:Item (2.2M)
    •  Followed by: UserAccount (1.6M), MicroblogPost (1.3M), Post (800k), User (700k), Comment (400k)…
    •  Note: 1 billion foaf:Person instances!!!
•  Used on most [distinct] sites:
    •  Item (7k), UserAccount (7k), Post (3k)…
    •  Consistent with findings by Mika and Potter in 2012: Item (20k), UserAccount (15k), Post (5k), BlogPost (3k) and Comment (3k) in that order

Sindice 2012: predicates

•  Total instances of SIOC predicates: 22.5M
    •  Up 400k in three months
•  Most occurrences: sioc:follows (4.6M)
    •  Followed by: topic (4M), account_of (3.5M), has_creator (2.7M), links_to (1.5M), has_discussion (1.3M)...
•  Used on most [distinct] sites:
    •  has_creator (8k), num_replies (7k), name (2k), account_of (1.5k), reply_of (1.5k)...

Sindice 2012: namespaces

•  SIOC data is being generated from 10k distinct domains (2k SLDs) (plus 2k domains for the SIOC Types module)
    •  Increasing by about 100 domains a month
    •  No doubt helped by Drupal!
•  FOAF data is being generated from 3M distinct domains (100k SLDs)
    •  Increasing by over 1000 domains a month

Web Data Commons: RDFa data sets from December 2014

1.  foaf:Image (143,818,149 Entities)

2.  og:"article" (65,233,945 Entities)

3.  gd:Breadcrumb (56,755,178 Entities)

4.  foaf:Document (35,991,377 Entities)

5.  sioc:Item (34,880,432 Entities)

6.  skos:Concept (26,315,007 Entities)

7.  og:"website" (23,429,568 Entities)

8.  sioc:Post (19,457,818 Entities)

9.  sioc:Comment (18,946,600 Entities)

10.  gd:Review-aggregate (14,970,496 Entities)

11.  sioc:UserAccount (14,846,680 Entities)

•  Bizer et al., 2012-2014
•  2.01 billion web pages
•  20.48 billion RDF triples
•  SIOC available from 6-7% of the PLDs (pay-level domains) with RDFa
•  Top RDFa classes listed above

•  Lots of SSW terms still used
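To see how much of this top-11 list comes from each vocabulary, the entity counts quoted above can be tallied by prefix. This is simple arithmetic over the figures on the slide; the `totals_by_prefix` helper is a hypothetical name.

```python
from collections import defaultdict

# Entity counts copied from the ranked list above (December 2014 RDFa data).
TOP_RDFA_CLASSES = [
    ("foaf:Image", 143_818_149),
    ('og:"article"', 65_233_945),
    ("gd:Breadcrumb", 56_755_178),
    ("foaf:Document", 35_991_377),
    ("sioc:Item", 34_880_432),
    ("skos:Concept", 26_315_007),
    ('og:"website"', 23_429_568),
    ("sioc:Post", 19_457_818),
    ("sioc:Comment", 18_946_600),
    ("gd:Review-aggregate", 14_970_496),
    ("sioc:UserAccount", 14_846_680),
]

def totals_by_prefix(rows):
    """Sum entity counts per vocabulary prefix (the text before the colon)."""
    totals = defaultdict(int)
    for name, count in rows:
        prefix = name.split(":", 1)[0]
        totals[prefix] += count
    return dict(totals)

totals = totals_by_prefix(TOP_RDFA_CLASSES)
for prefix, count in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{prefix:5s} {count:>12,d}")
```

Within this list, the four SIOC classes together account for roughly 88 million entities, second only to FOAF and on a par with the Open Graph types.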

2008-2010: Online Presence Ontology

•  OPO aims to unify presence information and status notification processes across different services:
    •  Twitter, Facebook, Foursquare, etc.

•  Help solve the information overload issue by providing a means to identify to whom / which community presence information should be directed: “sharing spaces”

•  Collaborative effort between Université Paris-Sud XI, Orsay, University of Belgrade, NUI Galway and Université Paris-Sorbonne
    •  Leveraged in NUI Galway’s collaboration with Cisco

2009: SWAN/SIOC W3C IG Note

Collaboration with Harvard Medical School

2010: SIOC and FOAF in Facebook Open Graph RDF

2011: SSW RDF mappings for schema.org

From social semantic data to social semantic knowledge… …and adding more semantics to the content as well as structures

SMOB (Semantic Microblogging) [Google Grant for Passant]

Structure and display opinions and arguments to support their (re)use

[Figure labels: Original Discussion; Ontology; Semantic Enrichment; Semantically Enriched RDFa; Querying; Queryable User Interface with Barchart]

Schneider, Web Science 2010, ACM SAC 2011, CSCW 2013

Adding discussion summaries to open collaboration systems

Schneider, WikiSym 2012

Augmenting social media items with metadata using related web content

[Figure labels: tags? topic? location?]

Kinsella, ECIR 2011, ESWC 2011, Web Science 2010

[Figure: an example post and the related web content used to enrich it]

Example post: “Last night I saw Connacht play at The Sportsground. The match started well for Connacht with a great try but after half time the opposition closed the gap. Finally we managed to hold out for the win. It was a great game from both sides. Here's a clip of the first try.”

Tasks: tag prediction, geolocation, topic classification

Related content:
•  Pages linking to the post (href): “...didn’t see the match but here’s a summary from John..”; “...This review of the Connacht match shows that they are getting back in form!...”
•  Linked YouTube video (href): Title: Fionn Carr try; Category: Sport; Tags: rugby, try, carr, connacht
•  Check-in by JohnSmith (John Smith): “I’m at the Galway Sportsground”

Enhanced topics and tags on items (that can propagate to a user’s profile)
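The idea of borrowing metadata from related web content can be sketched with a deliberately naive heuristic: collect tags from items the post links to, and boost the ones that also appear in the post text. This is an assumption-laden illustration, not the method from the cited papers; `predict_tags` and the data layout are hypothetical.

```python
from collections import Counter

def predict_tags(post_text, related_items, top_n=3):
    """Rank candidate tags taken from related content.

    Each tag scores 1 per related item that carries it, plus 1 more
    if the tag also occurs as a word in the post itself.
    """
    words = {w.strip(".,!?'\"").lower() for w in post_text.split()}
    scores = Counter()
    for item in related_items:
        for tag in item["tags"]:
            tag = tag.lower()
            scores[tag] += 1
            if tag in words:
                scores[tag] += 1
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [tag for tag, _ in ranked[:top_n]]

post_text = (
    "Last night I saw Connacht play at The Sportsground. "
    "Here's a clip of the first try."
)
related = [
    {"source": "youtube", "tags": ["rugby", "try", "carr", "connacht"]},
]
print(predict_tags(post_text, related))
```

Tags such as "rugby" never occur in the post text, yet they become available for it through the linked video, which is the kind of enrichment the slide describes.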

Aggregated, interoperable and multi-domain SSW user interest profiles

Orlandi, IEEE WI 2013, I-SEMANTICS 2012, SWJ 2011; S2E Gift Funding from Cisco Foundation

Extract semantic entities from social content…

…categorise, reweight and rank…

…and use these to build SSW user interest profiles with associated provenance info
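The three steps above can be sketched as a small aggregation routine. It assumes entity extraction has already produced (entity, source) pairs, reweights each mention by a per-source weight, normalises, ranks, and keeps the contributing sources as provenance. The weighting scheme, function name and sample data are all assumptions for illustration, not the approach of the cited work.

```python
from collections import defaultdict

def build_profile(mentions, source_weights):
    """Build a ranked interest profile from entity mentions.

    mentions: list of (entity, source) pairs from already-extracted content.
    source_weights: weight per source; unknown sources default to 1.0.
    Returns a list of (entity, normalised weight, sorted sources).
    """
    weights = defaultdict(float)
    provenance = defaultdict(set)
    for entity, source in mentions:
        weights[entity] += source_weights.get(source, 1.0)
        provenance[entity].add(source)
    total = sum(weights.values()) or 1.0
    ranked = sorted(weights, key=lambda e: (-weights[e], e))
    return [(e, weights[e] / total, sorted(provenance[e])) for e in ranked]

mentions = [
    ("Rugby", "twitter"),
    ("Rugby", "facebook"),
    ("Semantic Web", "twitter"),
]
profile = build_profile(mentions, {"twitter": 1.0, "facebook": 0.5})
for entity, weight, sources in profile:
    print(entity, round(weight, 2), sources)
```

Keeping the source list alongside each interest is what lets a later consumer judge where a claimed interest came from.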

[Figure: Privacy Preference Ontology (PPO) diagram, centred on ppo:PrivacyPreference]
•  Restrictions: ppo:appliesToResource (rdfs:Resource), ppo:appliesToStatement (rdf:Statement), ppo:appliesToNamedGraph (trix:Graph)
•  Conditions: ppo:hasCondition (ppo:Condition), with ppo:hasProperty (rdf:Property), ppo:hasLiteral (rdfs:Literal), ppo:resourceAsSubject and ppo:resourceAsObject (rdfs:Resource), ppo:classAsSubject and ppo:classAsObject (rdfs:Class)
•  Access Test Queries: ppo:hasAccessSpace (ppo:AccessSpace), ppo:hasAccessQuery (rdfs:Literal). This rdfs:Literal represents a SPARQL query as a String.
•  Access Control Privileges: ppo:hasAccess (acl:Access)

And with all this social semantic data, who gets to see it? → Privacy Preference Ontology (PPO)

Sacco, IEEE TrustCom 2011; S2E Gift Funding from Cisco Foundation

•  PPM supports two main tasks:
    •  A user creates his or her privacy preferences
    •  A requester logs into the other user’s PPM, which in turn gives back a faceted profile, filtered based on the privacy preferences
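A minimal sketch of the filtering step follows. Each preference names the property it restricts and an access test on the requester. In PPO the access test is a SPARQL query held as a string; here it is replaced by a plain Python predicate, which is an assumption made to keep the example self-contained. The class, function and sample data are hypothetical, and statements with no matching preference are hidden by default (also an assumption).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]

@dataclass
class PrivacyPreference:
    # Restriction: the property whose statements this preference covers.
    applies_to_property: str
    # Access test: stands in for PPO's SPARQL access query (assumption).
    access_test: Callable[[Dict[str, str]], bool]

def filtered_profile(profile: List[Triple],
                     preferences: List[PrivacyPreference],
                     requester: Dict[str, str]) -> List[Triple]:
    """Return only the statements the requester is allowed to see."""
    visible = []
    for triple in profile:
        _, prop, _ = triple
        for pref in preferences:
            if pref.applies_to_property == prop and pref.access_test(requester):
                visible.append(triple)
                break  # one matching preference is enough
    return visible

user_a = "http://example.org/a#me"
profile = [
    (user_a, "foaf:name", "User A"),
    (user_a, "foaf:phone", "tel:+353-00-000000"),
    (user_a, "foaf:interest", "Rugby"),
]
preferences = [
    PrivacyPreference("foaf:name", lambda r: True),
    PrivacyPreference("foaf:phone",
                      lambda r: r.get("works_at") == "NUI Galway"),
]

colleague = {"webid": "http://example.org/b#me", "works_at": "NUI Galway"}
stranger = {"webid": "http://example.org/c#me", "works_at": "Elsewhere"}
print(filtered_profile(profile, preferences, colleague))
print(filtered_profile(profile, preferences, stranger))
```

The same stored profile therefore yields a different "facet" for each requester, depending on which access tests they satisfy.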

[Figure labels: User A; User B (Requester); WebID; Privacy Preference Manager; Privacy Preferences; Private Social Semantic Data]

Sacco, 2nd Prize Award in I-Semantics 2012 Demo Track; Led to collaboration with the late George Thomas’ team in the US Department of Health and Human Services

Privacy Preference Manager (PPM)

New SSW topics of interest

•  Semantically Enabled Social Hub to Control Personal Data and Ownership

•  Scalable Topic-Level Sentiment Analysis on Streaming Feeds (with AYLIEN)

•  Social Semantic User Modeling in Online Social Networks for Recommendation (using SAP HANA) [next slide]

•  High-Level Cross-Medium Open-Set Authorship Identification (with AYLIEN)

•  Semantic Crisis Management Framework

“Who’s learning what in MOOCs?” …and where are they learning it

•  Social profile data (currently from LinkedIn search) and MOOC data (currently from the Coursera API) combined with geographic LOD
•  Semantic model based on a newly proposed resume ontology along with GeoNames and a proposed schema.org extension for online courses

[Figure labels: job-title terms (manager, scientist, software, analyst, analytics, senior, data, engineer, developer, student, research) and course titles (The Data Scientist’s Toolbox, Intro to Data Science, Data Analysis, Machine Learning, Computing for Data Analysis, R Programming)]

Piao, work in progress

Finding out more about current and future efforts

Join the Social [Web] working group and interest group at the W3C

•  www.w3.org/Social/WG
    •  Social data syntax
    •  Social API
    •  Federation protocol
•  www.w3.org/Social/IG
    •  Use cases to drive social standards for both businesses and consumers
    •  Social architecture report
    •  Social vocabularies

Stay tuned for future activities organised by IFIP WG 12.7

•  International Federation for Information Processing Working Group on Social Networking Semantics and Collective Intelligence

•  John Breslin is vice-chair and a co-founding member

The Social Semantic Web ソーシャル・セマンティック・ウェブ

•  Read our first book on this topic, published by Springer in 2009
•  Recommended reading for seminars and courses offered by the Utrecht Graduate School of Humanities, Anna University Chennai, and the Technical University of Munich
•  Authors from NUI Galway (Breslin, Passant, Decker)

Social Semantic Web Mining ソーシャル・セマンティック・ウェブ・マイニング

•  A follow-up book published by Morgan & Claypool in 2015
•  Combines the structures put in place by the Social Semantic Web with knowledge derived from the content of those structures
•  Co-authors from University of Southampton and the University of Chile (Omitola, Ríos, Breslin)

Come to Galway ゴールウェイに来ます

Interested in collaborating and/or being a visiting researcher at NUI Galway?
•  SFI ISCA Japan Bilateral Short-Term Visits
    •  国際戦略協力賞ー日本 (International Strategic Cooperation Award, Japan)
    •  Funding available for Japanese researchers to visit NUI Galway
    •  Researchers of all levels are eligible to apply
    •  http://bit.ly/iscajapan
•  JSPS Bilateral Programs
    •  “Open partnership”
    •  Joint research projects
    •  Joint seminars
    •  Deadline 8 September 2015
•  JSPS Long-Term Awards
    •  Postdoctoral fellowships [Deadline 4 September 2015]
    •  Talented researchers abroad (lecturers, professors) [probably May 2016]

ありがとうございます! (Thank you very much!) Any questions?

•  Ask me now…
•  Or email me later at [email protected]

•  Thanks to Science Foundation Ireland’s ISCA Japan for funding my visit to NII!