bigdata ibm partners - computerworldweb.idg.no/app/web/online/event/channelworld/cw... · bigdata...
TRANSCRIPT
BigData IBM Partners "Fish insight from the sea of data"
Big data comes in one size: Big!
90% of the world's total data volume has been produced
in the last two years.
In this session we take a practical approach to Big Data:
what kinds of data exist,
how can the volume be handled practically and efficiently and, not least,
what opportunities open up for end users if they turn noise into
insight?
Trond Bjerkvold
Chief Consultant
Servers
IBM
Christopher Conradi
Technical professional Cognos
Business Intelligence
and Watson
IBM
Room 1
10:00-10:30
Think of... ...all books written
...all films directed
...all pictures taken
...all music composed
...all paintings painted
...all text messages sent
up to 2011
10%
The world's total data volume as of 2013
Where does all the data come from?
1911 IBM Is Founded
1911 A culture of THINK
1911 The Punched Card Tabulator
1914 The Accessible Workforce
1914 The Professional Sales Force
1911 A Commitment to Employee Education
1917 Patents and Innovation
1924 The Making of IBM
1928 The IBM Punched Card
1928 Optimization of Global Railways
1931 Pioneering Machine-Aided Translation
1934 The Automation of Personal Banking
1935 Radiotype Wireless Data Transmission
1937 Automated Test Scoring
1937 The Social Security System
1938 The Optimization of Oil Supplies
1940s Preservation of Culture
1944 1st Corporate Pure Research Lab
1945 The Origins of Computer Science
1946 1st Commercial Electronic Calculator
1949 Creation of the World Trade Corporation
1950s Popularizing Math and Science
1951 IBM 700: Computing comes to Business
1952 1st Magnetic Tape Storage
1953 Equal Opportunity Workforce
1956 Corporate Design Program
1956 1st Magnetic Hard Disk
1957 FORTRAN Programming Language
1958 1st Salaried Workforce
1959 The Mainframe
1959 Smarter Healthcare Management
1960 1st Online Reservation System
1961 The Selectric Typewriter
1962 Pioneering Speech Recognition
1963 1st National Air Defense Network
1963 Predictive Crime Fighting
1964 System 360: Computer Systems
1965 Exploring Undersea Frontiers
1967 DRAM: Invention of On-Demand Data
1967 Fractal Geometry
1968 Information Management System
1969 Magnetic Stripe Technology
1969 The Apollo Missions
1969 Securing Online Transactions
1970 Relational Database
1971 The Floppy Disk
1971 Environmental Responsibility Leadership
1972 The Invention of Magneto-Optical Disk
1972 Innovating the Self-Service Kiosk (ATM)
1973 The UPC Barcode
1976 Tracking Infectious Diseases
1977 Cryptography for a Connected World
1980 RISC Architecture
1981 The PC: Personal Computing
1981 Excimer Laser Surgery
1981 IGF - Financing Technical Innovation
1981 The Networked Business Place
1986 Scanning Tunneling Microscope
1986 The Emergence of the CIO
1986 High-Temperature Superconductors
1987 The Rise of the Internet
1988 Optimizing the Food Supply
1990 Innovating the Fan Experience
1994 Silicon Germanium Chips
1995 e-business
1996 Deep Thunder
1996 The Application of Spintronics
1997 Bringing Order to Unstructured Data
1997 Copper Interconnects
1997 Deep Blue
1998 WebSphere
2000 LINUX: The Era of Open Innovation
2000 The Cell Broadband Engine
2001 The First Multi-Core, 1GHz Processor
2002 Nanotechnology
2003 The Invention of Service Science
2003 A Global Volunteer Network
2003 Preserving the Legacy of Film
2003 A Business and Its Beliefs
2004 Blue Gene
2004 The World Community Grid
2004 New Business Models for Telecom
The Mobilization of Relief Efforts
2004 Racetrack Memory
2005 Mapping of Humanity’s Family Tree
2005 Pioneering Genetic Privacy
2006 A Global Innovation Jam
2006 The Globally Integrated Enterprise
2006 Information-Based Medicine
2007 The Management of Transportation Flow
2008 Corporate Service Corps
2008 Breaking the Petaflop Barrier
2008 Smarter Planet
2008 Sustainable Cocoa
2009 Smarter Water Management
2009 The DNA Transistor
2009 Nationwide Smart Energy & Water Grid
2009 Medicine On Demand
2009 The Invention of Stream Computing
2011 A Computer Called “Watson”
Computer Intelligence
Time
Tabulating Era
Computing Era
Smart Systems Era
We Are Entering a New Era
[Figure: Watson diagnosis example. Panels and extracted items:]
Symptoms: difficulty swallowing, dizziness, anorexia, fever, dry mouth, thirst, frequent urination; no abdominal pain, no back pain, no cough, no diarrhea
Family History: Graves’ Disease (Thyroid Autoimmune), Oral cancer, Bladder cancer, Hemochromatosis, Purpura
Patient History: frequent UTI, cutaneous lupus, hyperlipidemia, osteoporosis, hypothyroidism
Medications: pravastatin, Alendronate, levothyroxine, hydroxychloroquine
Findings: supine 120/80 mm Hg, urine dipstick: leukocyte esterase, urine culture: E. coli, heart rate: 88 bpm
Diagnosis Models (Confidence): UTI, Diabetes, Influenza, hypokalemia, Renal failure, Esophagitis
Symptoms
A 58-year-old woman complains of
dizziness, anorexia, dry mouth, increased
thirst, and frequent urination. She had
also had a fever. She reported no pain in
her abdomen or back, and no cough or
diarrhea.
A 58-year-old woman presented to her
primary care physician after several days
of dizziness, anorexia, dry mouth,
increased thirst, and frequent urination.
She had also had a fever and reported
that food would “get stuck” when she
was swallowing. She reported no pain in
her abdomen, back, or flank and no
cough, shortness of breath, diarrhea, or
dysuria.
Family History
Her family history included oral and
bladder cancer in her mother, Graves'
disease in two sisters, hemochromatosis
in one sister, and idiopathic
thrombocytopenic purpura in one sister.
Patient History
Her history was notable for cutaneous
lupus, hyperlipidemia, osteoporosis,
frequent urinary tract infections, a left
oophorectomy for a benign cyst, and
primary hypothyroidism, diagnosed a
year earlier.
Medications
Her medications were levothyroxine,
hydroxychloroquine, pravastatin, and
alendronate.
Findings
A urine dipstick was positive for
leukocyte esterase and nitrites. The
patient was given a prescription
for ciprofloxacin for a urinary tract
infection. Three days later, the patient reported
weakness and dizziness. Her supine
blood pressure was 120/80 mm Hg, and
her pulse was 88.
• Extract Symptoms from record
• Use paraphrasings mined from text to handle alternate
phrasings and variants
• Perform broad search for possible diagnoses
• Score Confidence in each diagnosis based on
evidence so far
• Identify negative Symptoms
• Reason with mined relations to explain away
symptoms (thirst is consistent w/ UTI)
• Extract Family History
• Use Medical Taxonomies to generalize medical
conditions to the granularity used by the models
• Extract Patient History
• Extract Medications
• Use database of drug side-effects
• Together, multiple diagnoses may best explain
symptoms
• Extract Findings: Confirms that UTI was present
Most Confident Diagnosis: Diabetes
Most Confident Diagnosis: UTI
Most Confident Diagnosis: Esophagitis
Most Confident Diagnosis: Influenza
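The steps listed above (extract evidence, search broadly for candidate diagnoses, score confidence, account for negative symptoms) can be illustrated with a toy sketch. It is not Watson's actual method: the symptom associations, the `NEGATION_PENALTY` value, and the scoring rule (share of a diagnosis's associated symptoms that were observed, minus a penalty for each explicitly absent one) are all hypothetical and have no medical authority. The sketch only shows how a confidence-like score per diagnosis can be computed from extracted evidence.

```python
# Toy illustration only: the associations and the penalty are hypothetical,
# not taken from Watson or from any medical source.
ASSOCIATED_SYMPTOMS = {
    "Diabetes": {"thirst", "frequent urination", "dry mouth", "dizziness"},
    "UTI": {"frequent urination", "fever", "back pain"},
    "Influenza": {"fever", "cough", "diarrhea", "anorexia"},
    "Esophagitis": {"difficulty swallowing", "anorexia", "chest pain"},
}

NEGATION_PENALTY = 0.5  # assumed penalty per explicitly absent symptom


def score(diagnosis, present, absent):
    """Return a confidence-like score for one candidate diagnosis."""
    expected = ASSOCIATED_SYMPTOMS[diagnosis]
    support = len(expected & present) / len(expected)
    penalty = NEGATION_PENALTY * len(expected & absent)
    return support - penalty


def rank(present, absent):
    """Rank all candidate diagnoses, highest score first."""
    scores = {d: score(d, present, absent) for d in ASSOCIATED_SYMPTOMS}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Evidence as extracted from the patient record on the slide.
present = {"dizziness", "anorexia", "dry mouth", "thirst",
           "frequent urination", "fever", "difficulty swallowing"}
absent = {"abdominal pain", "back pain", "cough", "diarrhea"}

ranking = rank(present, absent)
for diagnosis, value in ranking:
    print(f"{diagnosis}: {value:.2f}")
```

Re-running `rank` each time a new section of the record is extracted (family history, medications, findings) would shift the ranking, which is the behavior the sequence of "Most Confident Diagnosis" captions suggests.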
Watson at the doctor's office
Watson is: a knowledgeable, intelligent,
self-learning and adaptive
computer
that understands natural language.
These use Watson today
Value of information
Time
Present
It is all about this
IBM Business Analytics
Software
• IBM® InfoSphere® BigInsights
Enterprise Edition
Hardware
• IBM System x® iDataPlex® dx360 M3
• IBM System Storage® DS5300
2011: 2.6 petabytes
2015: 20+ petabytes
According to GigaOm, Netflix looks at 30 million
“plays” a day, including when you
pause, rewind and fast forward,
four million ratings by Netflix subscribers, three
million searches as well as the time of day when
shows are watched and on what devices.
SAN
SAN
DS8700
HS22
2x X5570 2.93 GHz
48GB RAM
World
HS22
2x X5570 2.93 GHz
48GB RAM
France
Loreen - Euphoria - 2012 Eurovision Song Contest, winning in Baku.
Photo: SVT
Tweets filtered by Eurovision-specific hashtags:
#eurovision #eurofrancetv…
Real-time tweet analysis
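As a minimal sketch of the hashtag filtering step, assuming tweets arrive as plain text strings: the function names and the sample tweets below are invented for illustration, and only `#eurovision` is tracked because the slide's hashtag list is truncated. The slides do not show the actual implementation, so this is a plain-Python stand-in rather than the real-time system used.

```python
import re

# Only the hashtag that appears in full on the slide is tracked here.
TRACKED_HASHTAGS = {"#eurovision"}

HASHTAG_PATTERN = re.compile(r"#\w+")


def extract_hashtags(text):
    """Return the set of lowercase hashtags found in a tweet's text."""
    return {tag.lower() for tag in HASHTAG_PATTERN.findall(text)}


def filter_tweets(tweets, tracked=TRACKED_HASHTAGS):
    """Yield only the tweets that carry at least one tracked hashtag."""
    for text in tweets:
        if extract_hashtags(text) & tracked:
            yield text


# Hypothetical sample tweets, invented for illustration.
sample = [
    "What a performance! #Eurovision",
    "Lunch was great today",
    "Voting now #eurovision #baku",
]
matched = list(filter_tweets(sample))
print(matched)
```

Because `filter_tweets` is a generator, it can consume an unbounded stream one tweet at a time, which is the property a real-time analysis needs.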
Square Kilometer Array
Raw data rate: 14 Exabytes / day
1 GB = 10 min full HDTV
1 Exabyte = 20,000 years full HDTV
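The conversion on this slide can be checked with a few lines of arithmetic. The sketch assumes decimal units (1 exabyte = 10^9 gigabytes) and 365-day years, neither of which the slide states; the 10 minutes per gigabyte and the 14 exabytes per day are taken from the slide.

```python
# Assumptions: decimal units (1 exabyte = 10**9 gigabytes), 365-day years.
GB_PER_EXABYTE = 10**9
HDTV_MINUTES_PER_GB = 10          # from the slide
MINUTES_PER_YEAR = 60 * 24 * 365
SKA_EXABYTES_PER_DAY = 14         # from the slide


def hdtv_years(exabytes):
    """Convert a data volume in exabytes to years of full HDTV video."""
    minutes = exabytes * GB_PER_EXABYTE * HDTV_MINUTES_PER_GB
    return minutes / MINUTES_PER_YEAR


years_per_exabyte = hdtv_years(1)
years_per_ska_day = hdtv_years(SKA_EXABYTES_PER_DAY)
print(f"1 exabyte is about {years_per_exabyte:,.0f} years of HDTV")
print(f"One day of raw SKA data is about {years_per_ska_day:,.0f} years of HDTV")
```

Under these assumptions one exabyte comes to roughly 19,000 years of video, which the slide rounds to 20,000, and a single day of raw Square Kilometer Array data corresponds to roughly 266,000 years.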
Trond Bjerkvold
Chief Consultant
Servers
IBM
Christopher Conradi
Technical professional Cognos
Business Intelligence
and Watson
IBM
+47 928 95 322
+47 932 10 752