

Chaoyang University of Technology

Clustering web transactions using rough approximation

Source: Fuzzy Sets and Systems 148 (2004) 131–138

Authors: Supriya Kumar De, P. Radha Krishna

Adviser: R.C. Chen

Presenter: Yu-Hsiang Fu (傅昱翔)

Date: 2006/12/14



Outline

• Abstract
• Introduction
• Rough Set
• Rough Set Approximation
• Experimental Results
• Conclusions
• References


Abstract

• Web usage mining is the application of data mining techniques to web usage data.

• Its goal is to discover user access patterns from web access logs.

• Rough sets can be used to mine web log records effectively and discover web page access patterns.


Introduction (1/2)

• The WWW contains a huge amount of hyperlink, access, and usage information.

• Web Mining
– Web content mining
– Web structure mining
– Web usage mining


Introduction (2/2)

• User behavior
– A click stream is the sequence of clicks or pages requested as a visitor explores a Web site.

• Web transaction
– A user session is the click stream of page views for a single user across the entire web.

• Usage patterns differ between users, who navigate the same pages in different ways.


Rough Set (1/5)

• Rough set theory was introduced by Zdzislaw Pawlak in the early 1980s.

• Rough set theory deals with the classificatory analysis of data tables.

• Rough set methods support an efficient search for relevant tolerance relations and the extraction of interesting patterns from data.


Rough Set (2/5)

• Universe and Relation


Rough Set (3/5)

• Lower and Upper Approximation
– Lower approximation: the objects that surely belong to the set
– Upper approximation: the objects that possibly belong to the set
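As a minimal sketch of the two approximations, assuming the standard Pawlak definitions over a partition of the universe into equivalence classes (the function name `approximations` and the sample data are hypothetical, not taken from the slides):

```python
def approximations(classes, X):
    """Return (lower, upper) approximations of X for a partition `classes`.

    lower: union of the classes entirely contained in X ("surely" in X)
    upper: union of the classes that intersect X ("possibly" in X)
    """
    X = set(X)
    lower, upper = set(), set()
    for c in classes:
        c = set(c)
        if c <= X:   # the whole class lies inside X
            lower |= c
        if c & X:    # the class shares at least one object with X
            upper |= c
    return lower, upper


# Hypothetical universe {1, ..., 6} split into four equivalence classes.
classes = [{1, 2}, {3}, {4, 5}, {6}]
X = {1, 2, 3, 4}
lower, upper = approximations(classes, X)
print(lower, upper)
```

Here object 4 is in X but shares a class with 5, so the class {4, 5} contributes only to the upper approximation.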


Rough Set (4/5)

• Boundary and Negative region
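A hedged sketch of these regions in standard Pawlak notation, which is assumed here rather than taken from the slide: $U$ is the universe, $R$ an equivalence relation on $U$, $[x]_R$ the class of $x$, and $\underline{R}X$, $\overline{R}X$ the lower and upper approximations of a set $X \subseteq U$.

```latex
\begin{align*}
\underline{R}X &= \{\, x \in U : [x]_R \subseteq X \,\} \\
\overline{R}X  &= \{\, x \in U : [x]_R \cap X \neq \emptyset \,\} \\
\mathrm{BN}_R(X)  &= \overline{R}X \setminus \underline{R}X \\
\mathrm{NEG}_R(X) &= U \setminus \overline{R}X
\end{align*}
```

For example, with $U = \{1,\dots,6\}$, classes $\{1,2\}, \{3\}, \{4,5\}, \{6\}$, and $X = \{1,2,3,4\}$, the lower approximation is $\{1,2,3\}$ and the upper approximation is $\{1,2,3,4,5\}$, so the boundary region is $\{4,5\}$ and the negative region is $\{6\}$.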


Rough Set (5/5)


Rough Set Approximation (1/7)

• A user transaction is a sequence of items

• Let there be m users and the user transactions be

• Let U be the set of distinct n clicks (hyperlinks/URLs) clicked by users
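To make the setup concrete, here is a small sketch that builds the transactions and the click universe from hypothetical log data. The names `clicks_by_user` and `T` and the sample URLs are assumptions for illustration; only `m`, `n`, and `U` follow the wording above.

```python
# Hypothetical click logs: user id -> ordered list of clicked URLs.
clicks_by_user = {
    "u1": ["/home", "/news", "/contact"],
    "u2": ["/home", "/courses"],
    "u3": ["/news", "/home", "/news"],
}

# T: one transaction (sequence of clicked items) per user.
T = list(clicks_by_user.values())

# U: the distinct clicks (hyperlinks/URLs) over all transactions.
U = sorted({url for t in T for url in t})

m, n = len(T), len(U)  # m users, n distinct clicks
print(m, n, U)
```

Repeated clicks within a transaction (such as "/news" for u3) count once in U.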


Rough Set Approximation (2/7)


Rough Set Approximation (3/7)


Rough Set Approximation (4/7)


Rough Set Approximation (5/7)


Rough Set Approximation (6/7)


Rough Set Approximation (7/7)


Experimental Results (1/2)

• Log files from www.idrbt.ac.in.
– The web site consists of 62 web pages and 283 links.
– The log files record every click that users make.
– Session time is 30 min.
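One common way to apply a 30-minute session time is to start a new session whenever the gap between two consecutive clicks of the same user exceeds 30 minutes. The sketch below assumes that interpretation, which the slides do not spell out; `split_sessions` and the sample clicks are hypothetical.

```python
from datetime import datetime, timedelta

SESSION_TIMEOUT = timedelta(minutes=30)  # session time stated on the slide


def split_sessions(clicks, timeout=SESSION_TIMEOUT):
    """Split one user's time-ordered (timestamp, url) clicks into sessions.

    Assumption: a gap longer than `timeout` between clicks starts a new session.
    """
    sessions = []
    last_time = None
    for ts, url in clicks:
        if last_time is None or ts - last_time > timeout:
            sessions.append([])  # open a new session
        sessions[-1].append(url)
        last_time = ts
    return sessions


# Hypothetical clicks of a single user.
clicks = [
    (datetime(2006, 12, 14, 9, 0), "/home"),
    (datetime(2006, 12, 14, 9, 10), "/news"),
    (datetime(2006, 12, 14, 10, 0), "/home"),
    (datetime(2006, 12, 14, 10, 5), "/courses"),
]
sessions = split_sessions(clicks)
print(sessions)
```

The 50-minute gap between 9:10 and 10:00 splits the clicks into two sessions, each of which becomes one transaction.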


Experimental Results (2/2)

• Steps:
– First, the data is preprocessed and transformed.
– Second, the similarity upper approximation is computed for each transaction.
– Finally, the transactions are clustered using rough approximation (threshold = 0.5).
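The steps above can be sketched as follows. This is an illustration under stated assumptions, not the paper's exact algorithm: similarity is taken to be the Jaccard ratio of shared clicks, a transaction's similarity class holds every transaction at or above the threshold, and the upper approximation is re-applied until the set stops growing. All function names and sample data are hypothetical.

```python
def similarity(a, b):
    """Jaccard similarity of two transactions viewed as sets of clicks (assumed measure)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def similarity_class(i, transactions, threshold):
    """Indices of the transactions whose similarity to transaction i is >= threshold."""
    return {j for j, t in enumerate(transactions)
            if similarity(transactions[i], t) >= threshold}


def upper_approximation(members, transactions, threshold):
    """Union of the similarity classes of all transactions in `members`."""
    result = set()
    for i in members:
        result |= similarity_class(i, transactions, threshold)
    return result


def cluster(transactions, threshold=0.5):
    """Grow each transaction's similarity class until it is stable; collect distinct results."""
    clusters = set()
    for i in range(len(transactions)):
        current = similarity_class(i, transactions, threshold)
        while True:
            expanded = upper_approximation(current, transactions, threshold)
            if expanded == current:
                break
            current = expanded
        clusters.add(frozenset(current))
    return sorted(sorted(c) for c in clusters)


# Hypothetical preprocessed transactions (sets of page ids).
transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c", "d"}, {"x", "y"}, {"x", "y", "z"}]
clusters = cluster(transactions, threshold=0.5)
print(clusters)
```

Transactions 1 and 2 are not directly similar (Jaccard 0.25), yet both are similar to transaction 0, so the repeated upper approximation places all three in the same cluster.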


Conclusions

• This paper presented a novel algorithm that uses rough approximation to cluster the web transactions of user access.

• The approach is useful for finding interesting user access patterns in web logs.

• The results can help build adaptive web sites that respond to users’ behavior.


References

• Zdzislaw Pawlak, Jerzy Grzymala-Busse, Roman Slowinski, and Wojciech Ziarko, Rough Sets, Communications of the ACM, Vol. 38, No. 11, November 1995, 88–95
• Zdzislaw Pawlak, Rough Sets (Abstract), 262–264
• Zdzisław Pawlak, Andrzej Skowron, Rudiments of rough sets, Information Sciences 177 (2007) 3–27
• Nils Kammenhuber, Julia Luxenburger, Anja Feldmann, Gerhard Weikum, Web Search Clickstreams, IMC’06, October 25–27, 2006
• A. Jain, Data Clustering: A Review, ACM Computing Surveys, Vol. 31, No. 3, September 1999, 274–275, 281–285