TIARA: A Visual Exploratory Text Analytic System

Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology. TIARA: A Visual Exploratory Text Analytic System. Presenter: Wei-Hao Huang. Authors: Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong Qian, Lei Shi, Li Tan, Qiang Zhang. SIGKDD 2010.



Page 1: TIARA: A Visual Exploratory Text Analytic System

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

TIARA: A Visual Exploratory Text Analytic System

Presenter: Wei-Hao Huang

Authors: Furu Wei, Shixia Liu, Yangqiu Song, Shimei Pan, Michelle X. Zhou, Weihong Qian, Lei Shi, Li Tan, Qiang Zhang

SIGKDD 2010

Page 2: TIARA: A Visual Exploratory Text Analytic System

Outline
· Motivation
· Objectives
· Methodology
· Experiments
· Conclusions
· Comments

Page 3: TIARA: A Visual Exploratory Text Analytic System

Motivation
· Going through a large collection of text to locate needed information, or simply to reach a decision, is very costly and time-consuming.
· Although a number of text analysis technologies exist, their results are often abstract and complex and may not be consumable by users.

Page 4: TIARA: A Visual Exploratory Text Analytic System

Objectives
• To present an exploratory visual analytic system called TIARA (Text Insight via Automated Responsive Analytics).
• To combine text analytics and interactive visualization to help users explore and analyze large collections of text.

[Diagram labels: Documents, TIARA System]

Page 5: TIARA: A Visual Exploratory Text Analytic System

Methodology
· TIARA
─ Topic Analysis
─ Topic Ranking
─ Keyword-based Topic Summarization
─ Time-sensitive Keyword Extraction

Page 6: TIARA: A Visual Exploratory Text Analytic System

TIARA

Page 7: TIARA: A Visual Exploratory Text Analytic System

TIARA System Architecture

[Figure labels: Database, File system]

Page 8: TIARA: A Visual Exploratory Text Analytic System

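Topic Analysis
· Uses unsupervised learning methods.
· N is the number of documents.
· Each document is a sequence of words.
· V is the size of the vocabulary.
· K is the number of topics.
· The document-topic distribution matrix relates topics to documents.
· The topic-word distribution matrix relates vocabulary words to topics.

Example document-topic matrix (topics K1, K2 by documents N1, N2):

      N1   N2
K1    0    1
K2    1    1

Example topic-word matrix (words V1, V2 by topics K1, K2):

      K1   K2
V1    0.3  0.7
V2    0.8  0.1

Term frequencies in each cluster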
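To make the two matrices concrete, here is a minimal Python sketch of how they might be read once a topic model has produced them. The slide does not say which model is used, so the matrices below are hand-written toy values, and the names `vocab`, `doc_topic`, `topic_word`, `top_keywords` and `dominant_topic` are hypothetical. The sketch also assumes one row per document and one row per topic, which is the transpose of the layout in the tables above.

```python
# Hypothetical toy output of a topic model: K = 2 topics, N = 3 documents,
# vocabulary of V = 4 words. All values are made up for illustration.
vocab = ["meeting", "budget", "patient", "fever"]

# doc_topic[d][k]: share of document d attributed to topic k (rows sum to 1).
doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

# topic_word[k][v]: probability of word v under topic k (rows sum to 1).
topic_word = [
    [0.50, 0.40, 0.05, 0.05],
    [0.05, 0.05, 0.50, 0.40],
]


def top_keywords(topic_word_row, vocab, n):
    """Return the n most probable words for one topic."""
    ranked = sorted(range(len(vocab)), key=lambda v: topic_word_row[v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


def dominant_topic(doc_topic_row):
    """Return the index of the topic with the largest share in one document."""
    return max(range(len(doc_topic_row)), key=lambda k: doc_topic_row[k])


if __name__ == "__main__":
    for k, row in enumerate(topic_word):
        print("topic", k, top_keywords(row, vocab, 2))
    for d, row in enumerate(doc_topic):
        print("document", d, "dominant topic", dominant_topic(row))
```

The later steps in the methodology (ranking, summarization, keyword extraction) all start from these two matrices, which is why the sketch keeps them as plain nested lists.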

Page 9: TIARA: A Visual Exploratory Text Analytic System

Topic Ranking
· Topic rank is measured by a combination of both topic content coverage and topic variance.
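The slide does not give the ranking formula, so the following Python sketch is only one plausible reading. It assumes coverage is the mean weight of a topic across documents, variance is captured by the population standard deviation of that weight, and the two are combined as a product with exponents. The function names and the parameters `lam_cov` and `lam_var` are hypothetical.

```python
from statistics import mean, pstdev


def topic_rank_scores(doc_topic, lam_cov=1.0, lam_var=1.0):
    """Score each topic from a document-topic matrix (one row per document).

    Assumed form, not taken from the slide:
        score_k = coverage_k ** lam_cov * spread_k ** lam_var
    """
    n_topics = len(doc_topic[0])
    scores = []
    for k in range(n_topics):
        weights = [row[k] for row in doc_topic]
        coverage = mean(weights)   # how much content the topic covers overall
        spread = pstdev(weights)   # how unevenly the topic is distributed
        scores.append((coverage ** lam_cov) * (spread ** lam_var))
    return scores


def rank_topics(doc_topic, **kwargs):
    """Return topic indices ordered from highest to lowest score."""
    scores = topic_rank_scores(doc_topic, **kwargs)
    return sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)


if __name__ == "__main__":
    # Made-up matrix: topic 1 is a flat "background" topic present equally everywhere.
    toy = [
        [0.90, 0.05, 0.05],
        [0.10, 0.05, 0.85],
        [0.80, 0.05, 0.15],
    ]
    print(rank_topics(toy))
```

Under these assumptions a topic that appears with the same small weight in every document scores zero, so it is pushed to the bottom of the ranking.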

Page 10: TIARA: A Visual Exploratory Text Analytic System

Keyword-based Topic Summarization

Page 11: TIARA: A Visual Exploratory Text Analytic System

Time-sensitive Keyword Extraction

Page 12: TIARA: A Visual Exploratory Text Analytic System

Time-sensitive Keyword Extraction

Page 13: TIARA: A Visual Exploratory Text Analytic System

Experiments
· Time-sensitive keyword extraction procedure
─ Completeness
─ Distinctiveness
· Response Time
· Data set:
─ A personal email collection with 8,326 email messages.
─ Emergency room data set containing 23,501 patient records.

Page 14: TIARA: A Visual Exploratory Text Analytic System

Completeness
· Defined as whether we can recover the original keywords of a topic by combining the keywords associated with each time segment.
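One way to turn this definition into a number is sketched below in Python. It assumes completeness is the fraction of a topic's original keywords that reappear in the union of the per-segment keyword lists. The slide does not state the exact metric, so this formula and the name `completeness` are illustrative assumptions.

```python
def completeness(topic_keywords, segment_keywords):
    """Fraction of the topic's keywords recovered from the time segments.

    topic_keywords: keywords for the whole topic.
    segment_keywords: one keyword list per time segment.
    """
    original = set(topic_keywords)
    if not original:
        return 1.0  # nothing to recover
    recovered = set().union(*segment_keywords)
    return len(original & recovered) / len(original)


if __name__ == "__main__":
    # Made-up keywords for illustration only.
    topic = ["budget", "meeting", "review", "deadline"]
    segments = [["budget", "meeting"], ["meeting", "review"], ["travel"]]
    print(completeness(topic, segments))
```

A score of 1.0 would mean every original keyword shows up in at least one time segment.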

Page 15: TIARA: A Visual Exploratory Text Analytic System

Distinctiveness
· Defined as whether we can distinguish one topic segment from another based on their associated keywords to avoid redundancy.
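As a rough illustration, distinctiveness could be measured as one minus the average pairwise Jaccard similarity between the keyword sets of the segments. Jaccard similarity is a stand-in chosen here for simplicity, not a measure named on the slide, and the function names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two keyword collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def distinctiveness(segment_keywords):
    """1 minus the mean pairwise Jaccard similarity across segments.

    Returns 1.0 when there are fewer than two segments to compare.
    """
    pairs = list(combinations(segment_keywords, 2))
    if not pairs:
        return 1.0
    return 1.0 - sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up keywords for illustration only.
    segments = [["fever", "cough"], ["cough", "rash"], ["fracture", "fall"]]
    print(distinctiveness(segments))
```

Values near 1 indicate that segments have little keyword overlap, while values near 0 indicate redundant segments.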

Page 16: TIARA: A Visual Exploratory Text Analytic System

Completeness and Distinctiveness Results

Page 17: TIARA: A Visual Exploratory Text Analytic System

Response Time

Page 18: TIARA: A Visual Exploratory Text Analytic System

Conclusions
• TIARA tightly integrates text analytics with interactive visualization to support effective exploratory text analysis.
• Future work
─ Add sentence-based summaries
─ Support other languages
─ Improve performance

Page 19: TIARA: A Visual Exploratory Text Analytic System

Comments
· Advantages
─ To explore and analyze large text collections with interactive visualization
· Applications
─ Text mining