IBM’s Watson: Is the World’s Trivia Champion the Future of Search?

Josh Dreller, VP Media Technology & Analytics, Fuor Digital (@fuordigital)

Upload: mediapost

Post on 17-May-2015


Page 4: Sis sat 1000 josh dreller

Two Types of Innovation

• Incremental – improving current projects and advancing them in a linear fashion

• Grand Challenges – pushing the limits of science
  – Must be Difficult (has to be a challenge)
  – Must be Inspiring
  – Irresistible Vision – “just has to be done”

Page 5: Sis sat 1000 josh dreller

Previous IBM Grand Challenges

Page 6: Sis sat 1000 josh dreller

Open Question Answering

• The way normal humans communicate
• Natural Language – very ambiguous, but at the heart of human intelligence

Last night I shot an elephant in my pajamas. How it got into my pajamas I’ll never know.

Page 7: Sis sat 1000 josh dreller

The Ultimate Advancement: Computers that can communicate with humans in Natural Language

Page 8: Sis sat 1000 josh dreller

What Can Search Learn From Watson?

• The only way to push forward is to take huge leaps and look for self-imposed challenges even if we can’t prove out the business case right now

• What kind of Grand Challenges could Search create?
  – A non-spammable Search Engine?
  – No need for Search Engine Optimization?
  – ??

Page 10: Sis sat 1000 josh dreller

The Jeopardy! Challenge: A compelling and notable way to drive and measure the technology of automatic Question Answering along 5 Key Dimensions

Broad/Open Domain

Complex Language

High Precision

Accurate Confidence

High Speed

$800: In cell division, mitosis splits the nucleus & cytokinesis splits this liquid cushioning the nucleus

$200: If you're standing, it's the direction you should look to check out the wainscoting.

$1000: Of the 4 countries in the world that the U.S. does not have diplomatic relations with, the one that’s farthest north

Page 11: Sis sat 1000 josh dreller

Jeopardy Reaction

• Not too inspired at first
• Weren’t interested in a circus sideshow
• In 2009, IBM set up a mock studio at their New York research facility
• Sparring matches with ex-Jeopardy winners
• Eventually saw the potential and thought it was something special

Page 12: Sis sat 1000 josh dreller

A Very Simple Question for a Computer

ln((12,546,798 * π) ^ 2) / 34,567.46 = ?

= .00885

Greater than or less than 1?

50/50 Shot
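The arithmetic on this slide can be checked in a few lines of Python. This is a sketch that assumes the slide's `In` is the natural logarithm and `P` is π; the variable names are illustrative. Under that assumption, the stated .00885 is reproduced when the logarithm itself is squared, while squaring inside the logarithm gives about 0.00101. Either way the result is far below 1.

```python
import math

x = 12_546_798 * math.pi

# Reading 1: take the logarithm, square it, then divide.
log_squared = math.log(x) ** 2 / 34_567.46

# Reading 2: square inside the logarithm, then divide.
log_of_square = math.log(x ** 2) / 34_567.46

print(f"{log_squared:.5f}")    # 0.00885
print(f"{log_of_square:.5f}")  # 0.00101
```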

Page 13: Sis sat 1000 josh dreller

Real Language is Real Hard

• Chess
  – A finite, mathematically well-defined search space
  – Limited number of moves and states
  – Grounded in explicit, unambiguous mathematical rules

• Human Language
  – Ambiguous, contextual and implicit
  – Grounded only in human cognition
  – Seemingly infinite number of ways to express the same meaning

Page 14: Sis sat 1000 josh dreller

The Opposite of Current Computer Language

• Questions not designed for a computer to answer
  – Slang
  – Crafty questions
  – Shorthand
  – Rhyme
  – Regionalism
  – Anagrams

• Complex Language!

Page 15: Sis sat 1000 josh dreller

Structured vs. Unstructured Data

Where was Einstein born?

Structured:

Person        Born In
A. Einstein   Ulm

Unstructured:

One day, from among his city views of Ulm, Otto chose a watercolor to send to Albert Einstein as a remembrance of Einstein’s birthplace.
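The contrast on this slide can be sketched in Python. The function names and the naive capitalized-word heuristic are assumptions made for illustration, not anything described in the deck: the structured record answers with a lookup, while the sentence only yields candidates that still need evidence.

```python
import re
from typing import List, Optional, Set

# Structured: the fact is stored as an explicit (person -> born in) record.
born_in = {"A. Einstein": "Ulm"}

# Unstructured: the same fact is only implied by a sentence.
sentence = (
    "One day, from among his city views of Ulm, Otto chose a watercolor "
    "to send to Albert Einstein as a remembrance of Einstein's birthplace."
)

def answer_structured(person: str) -> Optional[str]:
    """A direct key lookup answers the question, if the key matches exactly."""
    return born_in.get(person)

def candidate_places(text: str, known_people: Set[str]) -> List[str]:
    """Naive candidate generation: capitalized words that are not known people.

    This only proposes candidates. It does not establish that any of them
    is the birthplace; that step needs further evidence.
    """
    words = re.findall(r"\b[A-Z][a-z]+\b", text)
    return [w for w in words if w not in known_people]

print(answer_structured("A. Einstein"))
print(candidate_places(sentence, {"Albert", "Einstein", "Otto"}))
```

With this heuristic the sentence produces `Ulm` alongside noise such as the sentence-initial `One`, which is why candidates taken from unstructured text have to be vetted before one is returned as an answer.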

Page 16: Sis sat 1000 josh dreller

Common Sense Knowledge Base

• An ontology of classes and individuals
• Parts and materials of objects
• Properties of objects (such as color and size)
• Functions and uses of objects
• Locations of objects and layouts of locations
• Locations of actions and events
• Durations of actions and events
• Preconditions of actions and events
• Effects (post conditions) of actions and events
• Subjects and objects of actions
• Behaviors of devices
• Stereotypical situations or scripts
• Human goals and needs
• Emotions
• Plans and strategies
• Story themes
• Contexts

Can a can CanCan?

Page 17: Sis sat 1000 josh dreller

What Can Search Learn From Watson?

• We need to focus on what computers aren’t good at, not what they are good at

• Keywords and Links are not savvy enough. Natural language is key to a next generation search engine

• Most of human knowledge is kept in unstructured data sources or based on common sense context

Page 19: Sis sat 1000 josh dreller

The DeepQA Project

• Dr. David Ferrucci
• 25-30 full-time researchers from many disciplines
• 2007-2011
• Millions of dollars
• Post-Jeopardy implications

Page 20: Sis sat 1000 josh dreller

Speed Results

• Deployed Watson on 2,880 IBM POWER 750 computer cores

• Went from 2 hours per question on a single CPU to an average of just 3 seconds – fast enough to compete with the best.
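As a rough back-of-the-envelope check, the figures on this slide imply the speedup below. The sketch assumes the "single CPU" is comparable to one of the 2,880 cores and treats "2 hours" and "3 seconds" as exact, which they are not.

```python
single_cpu_seconds = 2 * 60 * 60   # "2 hours per question"
parallel_seconds = 3               # "an average of just 3 seconds"
cores = 2_880

speedup = single_cpu_seconds / parallel_seconds
efficiency = speedup / cores       # fraction of ideal linear speedup

print(speedup)               # 2400.0
print(round(efficiency, 2))  # 0.83
```

A 2,400x speedup on 2,880 cores is close to linear scaling, which is consistent with a workload that splits into largely independent tasks.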

Page 21: Sis sat 1000 josh dreller

Example Question

IN 1698, THIS COMET DISCOVERER TOOK A SHIP CALLED THE PARAMOUR PINK ON THE FIRST PURELY SCIENTIFIC SEA VOYAGE

Question Analysis:
  Keywords: 1698, comet, paramour, pink, …
  AnswerType(comet discoverer)
  Date(1698)
  Took(discoverer, ship)
  Called(ship, Paramour Pink)
  …

Primary Search:
  Related Content (Structured & Unstructured)

Candidate Answer Generation:
  Wilhelm Tempel, HMS Paramour, Isaac Newton, Halley’s Comet, Pink Panther, Christiaan Huygens, Peter Sellers, Edmond Halley

Evidence Retrieval

Evidence Scoring (Spatial, Temporal, Lexical, Taxonomic):
  [0.58 0 -1.3 … 0.97]
  [0.71 1 13.4 … 0.72]
  [0.12 0 2.0 … 0.40]
  [0.84 1 10.6 … 0.21]
  [0.33 0 6.3 … 0.83]
  [0.21 1 11.1 … 0.92]
  [0.91 0 -8.2 … 0.61]
  [0.91 0 -1.7 … 0.60]

Merging & Ranking:
  1) Edmond Halley (0.85)
  2) Christiaan Huygens (0.20)
  3) Peter Sellers (0.05)
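The last two stages on this slide, evidence scoring followed by merging and ranking, can be sketched as follows. Everything numeric here is invented for illustration: the per-dimension scores, the weights, and the bias are assumptions, and the logistic combination is one simple way to turn several scores into a single confidence, not a description of Watson's actual model.

```python
import math

# Hypothetical evidence scores per candidate, one value per scoring
# dimension named on the slide. The numbers are invented for illustration.
DIMENSIONS = ("spatial", "temporal", "lexical", "taxonomic")
evidence = {
    "Peter Sellers":      (0.1, 0.0, 0.3, 0.0),
    "Edmond Halley":      (0.6, 0.9, 0.8, 0.9),
    "Christiaan Huygens": (0.2, 0.4, 0.1, 0.7),
}

# Assumed weights and bias; a real system would learn these from data.
WEIGHTS = (1.0, 1.5, 1.0, 2.0)
BIAS = -3.0

def confidence(scores):
    """Combine evidence scores into a 0..1 confidence with a logistic function."""
    z = BIAS + sum(w * s for w, s in zip(WEIGHTS, scores))
    return 1.0 / (1.0 + math.exp(-z))

def rank(candidates):
    """Return (answer, confidence) pairs, most confident first."""
    scored = [(name, confidence(s)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

for name, conf in rank(evidence):
    print(f"{name}: {conf:.2f}")
```

With these made-up inputs, Edmond Halley ranks first, Christiaan Huygens second, and Peter Sellers last, mirroring the ordering on the slide.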

Page 22: Sis sat 1000 josh dreller

Confidence is Key

• Watson only rings in if it can reach a statistically significant confidence in time
• Some questions take longer than others
• Some questions can only be answered with less confidence than others
• Watson manages risk in betting based on confidence
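The two decisions described on this slide, whether to ring in and how much to bet, can be sketched as simple functions of confidence. The threshold, time limit, and betting rule below are hypothetical values chosen for illustration; the deck does not state the ones Watson used.

```python
def should_buzz(confidence, elapsed_seconds, threshold=0.5, time_limit=3.0):
    """Ring in only if the best answer is confident enough and ready in time."""
    return elapsed_seconds <= time_limit and confidence >= threshold

def wager(bankroll, confidence, max_fraction=0.5):
    """Scale the bet with confidence; bet nothing at or below a coin flip."""
    edge = max(0.0, 2.0 * confidence - 1.0)   # 0 at 50% confidence, 1 at 100%
    return round(bankroll * max_fraction * edge)

print(should_buzz(0.85, 2.1))   # confident and in time
print(should_buzz(0.20, 2.1))   # in time, but not confident enough
print(should_buzz(0.85, 4.0))   # confident, but too slow
print(wager(10_000, 0.85))
```

The design point is that both decisions depend on a calibrated confidence: if the number does not reflect how often the answer is actually right, neither the buzz threshold nor the bet size means much.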

Page 23: Sis sat 1000 josh dreller

Embarrassingly Parallel Computing

• Def: “Little or no effort is required to separate the problem into a number of parallel tasks”
• Works on many algorithms at once and comes back together with confidence scores.
  – different from distributed computing problems (such as Google’s MapReduce) that require communication between tasks, especially communication of intermediate results.
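A minimal sketch of the idea, using Python's standard `concurrent.futures`: each candidate is scored as an independent task, and results are only combined at the end. The two scorers are toy stand-ins invented for this example. Threads keep the sketch short; CPU-bound scoring in CPython would more likely use `ProcessPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy, independent scorers (hypothetical). Each looks at one candidate
# answer and needs nothing from any other task while it runs.
def lexical_score(candidate):
    return 1.0 if "halley" in candidate.lower() else 0.0

def length_score(candidate):
    return min(len(candidate) / 20.0, 1.0)

SCORERS = (lexical_score, length_score)

def score_candidate(candidate):
    """Run every scorer on one candidate and average the results."""
    scores = [scorer(candidate) for scorer in SCORERS]
    return candidate, sum(scores) / len(scores)

def score_all(candidates, workers=4):
    """Score candidates in parallel; tasks share no intermediate state."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(score_candidate, candidates))

results = score_all(["Edmond Halley", "Christiaan Huygens", "Peter Sellers"])
print(max(results, key=results.get))
```

Because no task waits on another's intermediate result, adding workers requires no change to the scoring code.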

Page 24: Sis sat 1000 josh dreller

What Can Search Learn From Watson?

• We can’t be bound by the constraints of current technology

• Two-fold process of first coming up with answers, then vetting them with more evidence

• Ideas like Parallel Processing will allow us to jump ahead

Page 26: Sis sat 1000 josh dreller

Ken Jennings & Brad Rutter

Page 27: Sis sat 1000 josh dreller

The Best Human Performance: Our Analysis Reveals the Winner’s Cloud

[Figure: scatter plot in which each dot represents an actual historical human Jeopardy! game. Labeled regions: Winning Human Performance, Grand Champion Human Performance, 2007 QA Computer System. One axis runs from More Confident to Less Confident.]

Top human players are remarkably good.

Computers?

Page 28: Sis sat 1000 josh dreller

DeepQA: Incremental Progress in Precision and Confidence 6/2007-11/2010

[Figure: Precision (0% to 100%) plotted against % Answered (0% to 100%), one curve per system version: Baseline, 12/2007, 5/2008, 8/2008, 12/2008, 5/2009, 10/2009, 4/2010, 11/2010. Annotation: Now Playing in the Winners Cloud.]
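One point on a Precision versus % Answered curve can be computed as sketched below: answer only the most confident fraction of questions and measure how many of those are right. The function name and the sample data are invented for illustration and are not taken from the DeepQA results.

```python
def precision_at(answered_fraction, results):
    """Precision when only the most confident fraction of questions is answered.

    `results` is a list of (confidence, was_correct) pairs, one per question.
    """
    ranked = sorted(results, key=lambda r: r[0], reverse=True)
    n_answered = max(1, round(answered_fraction * len(ranked)))
    answered = ranked[:n_answered]
    return sum(1 for _, correct in answered if correct) / n_answered

# Invented sample in which confident answers tend to be right.
sample = [(0.95, True), (0.90, True), (0.80, True), (0.70, False),
          (0.60, True), (0.50, False), (0.40, True), (0.30, False),
          (0.20, False), (0.10, False)]

for fraction in (0.2, 0.5, 1.0):
    print(f"{fraction:.0%} answered -> precision {precision_at(fraction, sample):.0%}")
```

When confidence is well calibrated, precision falls as the answered fraction grows, which is the downward-sloping shape such curves take.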

Page 29: Sis sat 1000 josh dreller

Confidence Bar

Page 30: Sis sat 1000 josh dreller

What Can Search Learn From Watson?

• Confidence bar would be a great addition to SERPs

• We must benchmark what is “good” and then aim higher

• These things take time (and money)

Page 32: Sis sat 1000 josh dreller

TJ Watson Research Center
Yorktown, NY
Two Games: Aired February 14-16, 2011

Page 33: Sis sat 1000 josh dreller

The End – Humans Win

Page 35: Sis sat 1000 josh dreller

Financial Industry

• Generates large amounts of data, growing 70% per year
• Not just numbers, but all info that would influence the biz landscape (news, articles, blogs, etc.)
• The recent financial crisis showed the failures that come from not understanding interdependencies

Page 36: Sis sat 1000 josh dreller

DeepQA in Continuous Evidence-Based Diagnostic Analysis

Considers and synthesizes a broad range of evidence improving quality, reducing cost

[Figure: patient evidence (Symptoms, Family History, Patient History, Medications, Tests/Findings, Notes/Hypotheses) and Huge Volumes of Texts, Journals, References, DBs etc. feed Diagnosis Models. Candidate diagnoses: Renal failure, UTI, Diabetes, Influenza, Hypokalemia, Esophagitis. Chart labels: Symp, FamHist, Meds, Find, Confidence. Captions on the slide: Most Confident Diagnosis: Diabetes and Esophagitis; Most Confident Diagnosis: Diabetes; Most Confident Diagnosis: Influenza; Most Confident Diagnosis: UTI.]

Page 37: Sis sat 1000 josh dreller

What Can Search Learn From Watson?

• Even the most daunting task can be overcome

• It’s not company versus company, it’s stretching human knowledge

• How can search engines help other industries?

Page 38: Sis sat 1000 josh dreller

Sources

• “What is Watson?” presentation by Adam Lally, IBM Research

• Jeopardy website with videos: http://www.jeopardy.com/minisites/watson/

• NYTimes article: “What Is I.B.M.’s Watson?” http://www.nytimes.com/2010/06/20/magazine/20Computer-t.html?_r=2&ref=opinion

• Wired magazine: “IBM’s Watson Supercomputer Wins Practice Jeopardy Round” http://www.wired.com/epicenter/2011/01/ibm-watson-jeopardy/#

• More technical: AI magazine “Building Watson: An Overview of the DeepQA Project” http://www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf

Page 39: Sis sat 1000 josh dreller


Thank You