tweetool (0. 1 100 version) final report

Post on 23-Feb-2016

Page 1: Tweetool (0. 1  100  version) Final Report

Tweetool (0. 1 100 version) Final Report

A Twitter Recommend System based on Topic Modeling

Yilei Qian
Computer Science, University of Southern California
[email protected]

Page 2: Tweetool (0. 1  100  version) Final Report

Ideas

• Following too many accounts on Twitter
• Too much news every day
• Cannot find the interesting and valuable news
• Don’t know the names of the users you want to follow
• Need someone to recommend who to follow
• Need someone to recommend the hottest news
• Use topic modeling to re-rank all the users

Page 3: Tweetool (0. 1  100  version) Final Report

Traditional Method

Page 4: Tweetool (0. 1  100  version) Final Report

Traditional Method

Page 5: Tweetool (0. 1  100  version) Final Report

Traditional Method

Page 6: Tweetool (0. 1  100  version) Final Report

Topic Modeling

Page 7: Tweetool (0. 1  100  version) Final Report

Topic Modeling

Page 8: Tweetool (0. 1  100  version) Final Report

Topic Modeling

• A topic model is a type of statistical model for discovering the abstract "topics" that occur in a collection of documents.
• Commonly used in natural language processing.

Reference Papers:
Steyvers, M. and Griffiths, T., “Probabilistic Topic Models,” Handbook of Latent Semantic Analysis
Blei, D.M., Ng, A.Y., and Jordan, M.I., “Latent Dirichlet Allocation,” The Journal of Machine Learning Research, 2003

Page 9: Tweetool (0. 1  100  version) Final Report

Label based LDA

Step:
1. Build the LDA model
2. Train the model instance with the training documents
3. Run LDA on all the data based on the trained model instance

Problem:
1. Punctuation marks, e.g. “”,.={}() …
2. Frequent words, e.g. I, you, …
3. Other noise
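The noise problems listed on this slide are usually handled by cleaning each tweet before it reaches the topic model. The following is a minimal Python sketch of that idea, not the project's actual code (which was written in Java). The stop-word list, the treatment of URLs and @mentions as "other noise", and the name `clean_tweet` are all assumptions made for illustration.

```python
import re

# Illustrative stop-word list (an assumption; the slide only names "I" and "you").
STOP_WORDS = {"i", "you", "the", "a", "an", "and", "to", "of", "is", "in"}

# Treating URLs and @mentions as "other noise" is an assumption.
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
NON_WORD = re.compile(r"[^a-z0-9\s]")


def clean_tweet(text):
    """Return the lowercase content tokens of a tweet."""
    text = URL_OR_MENTION.sub(" ", text.lower())
    # Replace punctuation such as quotes, commas, braces and brackets with spaces.
    text = NON_WORD.sub(" ", text)
    # Drop frequent words that carry no topic information.
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

Note that this pattern also discards non-ASCII letters, so a real system would need a wider character class for tweets that are not in English.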

Page 10: Tweetool (0. 1  100  version) Final Report

Result Generate

1. By Angle
Value = (formula not preserved in the transcript)

2. By Distance
Value = (formula not preserved in the transcript)
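The two formulas on this slide did not survive in the transcript. One common reading, offered here only as an assumption, is that "angle" means the cosine of the angle between two topic vectors and "distance" means the Euclidean distance between them. The sketch below shows both measures and a ranking by angle; the function names are hypothetical and nothing here is confirmed as the formula Tweetool used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between u and v (0.0 means identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_by_angle(target, candidates):
    """Sort candidate names by descending cosine similarity to target."""
    return sorted(
        candidates,
        key=lambda name: cosine_similarity(target, candidates[name]),
        reverse=True,
    )
```

With the angle measure a larger value means a closer match, while with the distance measure a smaller value does, so a ranking has to sort in opposite directions for the two.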

Page 11: Tweetool (0. 1  100  version) Final Report

13-Dimension Topics

1. Art & Design
2. Book
3. Business
4. Charity
5. Entertainment
6. Family
7. Fashion
8. Food & Drink
9. Health
10. Music
11. News
12. Science & Technology
13. Sports
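To make the 13 dimensions concrete, the sketch below turns a user's tweets into a 13-element topic vector. It assumes each tweet has already been given exactly one topic label by the model, which is a simplification: LDA normally produces a distribution per document. The function name `topic_vector` is hypothetical.

```python
# The 13 topics from the slide, in slide order.
TOPICS = [
    "Art & Design", "Book", "Business", "Charity", "Entertainment",
    "Family", "Fashion", "Food & Drink", "Health", "Music",
    "News", "Science & Technology", "Sports",
]


def topic_vector(tweet_labels):
    """Return the share of a user's tweets that falls into each topic."""
    counts = [0] * len(TOPICS)
    for label in tweet_labels:
        counts[TOPICS.index(label)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(TOPICS)  # user with no labeled tweets
    return [c / total for c in counts]
```

For example, a user with two News tweets, one Sports tweet and one Music tweet gets 0.5 in the News dimension and 0.25 in each of the other two.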

Page 12: Tweetool (0. 1  100  version) Final Report

Languages & Tools

• Web UI: HTML + AJAX (unfinished) + CSS (unfinished) + Twitter REST API
• Android UI: Java, Android 2.1 (unfinished)
• Server Side: Java 1.6, Servlet 2.0, Spring 3.0, Hibernate 3.3
• Twitter API: Twitter4j 2.2.1 (300 requests per hour)
• Server: Tomcat 7.08
• Database: MySQL 5.5
• Data Package: JSON
• Development Platform: Eclipse 3.4
• Total code lines: 2000(+) + 2421 + 462 = 5000(+)
• Subversion: http://tweetool-yilei.googlecode.com/svn/trunk/tweetool-yilei-read-only
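The limit of 300 requests per hour means a crawler has to pace its calls. The sketch below is one way to do that with a sliding one-hour window, written in Python for brevity. It is not part of Twitter4j and not the project's code; the class name `HourlyRateLimiter` and its methods are hypothetical, and how Twitter itself counts the window is not stated in the slides.

```python
from collections import deque


class HourlyRateLimiter:
    """Sliding-window limiter: at most max_requests per window_seconds."""

    def __init__(self, max_requests=300, window_seconds=3600.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._stamps = deque()  # send times of requests still inside the window

    def seconds_until_allowed(self, now):
        """Return 0.0 if a request may be sent at time now, else the wait."""
        # Forget requests that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_requests:
            return 0.0
        # The oldest request must leave the window before the next is sent.
        return self._stamps[0] + self.window_seconds - now

    def record(self, now):
        """Register that a request was sent at time now."""
        self._stamps.append(now)
```

A caller would pass `time.monotonic()` as `now`, sleep for the returned number of seconds, send the request, and then call `record`. Taking the clock as a parameter keeps the class easy to test.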

Page 13: Tweetool (0. 1  100  version) Final Report

Architecture

[Architecture diagram. Labels: Mobile Device, HTML, Servlets, Work Flow (three boxes), Twitterfetch, LLDA, Tweetool, Hibernate DAO, DB, Application Context]

Page 14: Tweetool (0. 1  100  version) Final Report

Distributed Crawler & Computing

Page 15: Tweetool (0. 1  100  version) Final Report

Problems(endless T_T)

1. High noise in the topic model
   • Few words, odd marks, abbreviations
2. Unfamiliar with the Twitter API; a lot of bugs
3. Transaction problems
4. The ugly UI
5. Poor performance
6. Don’t have enough time. Many functions are unfinished
7. Tweetool system should be reconstructed !!!

Environment: 7000+ users, 220,000+ tweets

Page 16: Tweetool (0. 1  100  version) Final Report

Future Work

1. Try to finish it
2. Debug
3. Build a better training file
4. Add a feedback function
5. Better topic classification

Page 17: Tweetool (0. 1  100  version) Final Report

Web UI (Design Version)

Page 18: Tweetool (0. 1  100  version) Final Report

Android UI

[UI mockup. Main Menu: a title and four function buttons. News Menu: a title and three news items]