Classical Music for Rock Fans?: Novel Recommendations for Expanding User Interests
Makoto Nakatsuji, Yasuhiro Fujiwara, Akimichi Tanaka, Toshio Uchiyama, Ko Fujimura (1), Toru Ishida (2)
(1) NTT Cyber Solutions Laboratories, NTT Corporation; (2) Department of Social Informatics, Kyoto University
CIKM 2010
2011. 2. 11.
Summarized and Presented by Kim Chung Rim, IDS Lab., Seoul National University
Copyright 2010 by CEBT
Contents
Introduction
Goal
Concept Explanation
Novelty
User Interest Model
User Similarity
Evaluation
Conclusion & Discussion
Introduction
Recommender systems are widely used by content providers
They increase the chance of commercial success
Many content providers adopt methods based on collaborative filtering (CF)
Weakness of basic CF method
It tends to recommend the types of items that the user has already accessed
Rock music is more likely to be recommended when the user has previously rated only rock music.
However, users may have interests beyond the items they have rated before
Users often need recommendations of other types of items
Goal
The goal of this paper is threefold
Introduce a new measure, ‘novelty’
Integrate taxonomy-based user similarity into the basic CF algorithm
Identify items with higher novelty for the active user
Concept - Novelty
Novel items are
items that cannot be easily discovered by the user
For example, a user who is interested in Rock music cannot easily discover interesting items in Classical music
Novelty is calculated using the taxonomy of items
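The transcript does not preserve the novelty formula. As a minimal sketch, assume novelty is the smallest number of taxonomy edges between the class of a candidate item and any class the user has already accessed. The names `parent`, `item_class`, `class_distance` and `novelty`, and the toy genres, are hypothetical and only serve the illustration.

```python
def ancestors(cls, parent):
    """Return the path [cls, parent of cls, ..., root] in the taxonomy."""
    path = []
    while cls is not None:
        path.append(cls)
        cls = parent.get(cls)
    return path

def class_distance(c1, c2, parent):
    """Number of edges between two classes via their lowest common ancestor."""
    depth1 = {c: d for d, c in enumerate(ancestors(c1, parent))}
    for d2, c in enumerate(ancestors(c2, parent)):
        if c in depth1:
            return depth1[c] + d2
    return float("inf")  # the classes share no ancestor

def novelty(item, accessed_items, item_class, parent):
    """Assumed definition: smallest distance to any class already accessed."""
    target = item_class[item]
    return min(
        (class_distance(target, item_class[i], parent) for i in accessed_items),
        default=float("inf"),
    )

# Hypothetical toy taxonomy: class -> parent class
parent = {"Hard Rock": "Rock", "Punk": "Rock", "Rock": "Music",
          "Baroque": "Classical", "Classical": "Music"}
item_class = {"artist_a": "Hard Rock", "artist_b": "Punk", "artist_c": "Baroque"}

near = novelty("artist_b", ["artist_a"], item_class, parent)  # Punk vs Hard Rock
far = novelty("artist_c", ["artist_a"], item_class, parent)   # Baroque vs Hard Rock
```

Under this assumption, a Baroque artist scores as more novel than a Punk artist for a user who has only accessed Hard Rock, which matches the rock-versus-classical example on the slide.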
Concept – User Interest Model
Users who are interested in some items are also interested in the classes that include those items
Therefore, the rating values of the items in a class can be said to reflect the user’s interest in that class
The authors calculate the user’s interest in a class C by simply aggregating the interest scores of all subclasses of C
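The aggregation described above can be sketched as follows. The exact scoring function is not given in the transcript, so this assumes a plain sum of rating values propagated from each item's class to all of its ancestors. The function name `class_interest` and the toy data are hypothetical.

```python
from collections import defaultdict

def class_interest(ratings, item_class, parent):
    """Aggregate one user's item ratings up a class taxonomy.

    ratings:    {item: rating} for a single user
    item_class: {item: class the item directly belongs to}
    parent:     {class: parent class}; root classes have no entry
    """
    interest = defaultdict(float)
    for item, rating in ratings.items():
        cls = item_class[item]
        # Credit the item's own class and every ancestor, so each class
        # accumulates the scores of all of its subclasses (assumed: plain sum).
        while cls is not None:
            interest[cls] += rating
            cls = parent.get(cls)
    return dict(interest)

# Hypothetical toy taxonomy and ratings
parent = {"Hard Rock": "Rock", "Punk": "Rock", "Rock": "Music",
          "Baroque": "Classical", "Classical": "Music"}
item_class = {"artist_a": "Hard Rock", "artist_b": "Punk", "artist_c": "Baroque"}
ratings = {"artist_a": 5.0, "artist_b": 3.0, "artist_c": 4.0}

scores = class_interest(ratings, item_class, parent)
```

Here the interest in Rock is the sum over its subclasses Hard Rock and Punk, and the interest in Music is the sum over Rock and Classical.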
Concept – User Similarity
User similarity is measured using both the user interest model and the original CF method (user rating behavior)
[Equation and symbol definitions not preserved in this transcript]
Concept – Similarity against Items
Similarity between users is calculated using the Pearson correlation
Any other similarity measure can be used instead, such as
Cosine similarity
Jaccard similarity
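For reference, two of the measures named above can be sketched in a few lines. This is a minimal version that assumes each user's ratings are a dict of item to rating and that the Pearson means are taken over co-rated items only (one common variant; the slide does not say which one the authors use). The function names are hypothetical.

```python
import math

def pearson_similarity(ratings_a, ratings_u):
    """Pearson correlation over the items both users have rated."""
    common = sorted(set(ratings_a) & set(ratings_u))
    if len(common) < 2:
        return 0.0  # not enough overlap to correlate
    xs = [ratings_a[i] for i in common]
    ys = [ratings_u[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant rater carries no correlation signal
    return cov / math.sqrt(var_x * var_y)

def jaccard_similarity(ratings_a, ratings_u):
    """Overlap of the sets of rated items, ignoring the rating values."""
    a, u = set(ratings_a), set(ratings_u)
    union = a | u
    return len(a & u) / len(union) if union else 0.0

# Hypothetical toy ratings
active = {"i1": 1, "i2": 2, "i3": 3}
other = {"i1": 2, "i2": 4, "i3": 6, "i4": 1}

p = pearson_similarity(active, other)
j = jaccard_similarity(active, other)
```

Pearson uses the rating values, while Jaccard only looks at which items were rated, so the two can disagree for the same pair of users.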
Concept – Similarity against Classes
Similarity against classes can be measured as follows:
[Equation and symbol definitions not preserved in this transcript]
Methodology - Relatedness
Using the similarity measure, a user graph can be generated in which nodes are users and edge weights are the similarities between them
Edge weights are normalized to represent the probability of moving to an adjacent node
Random walk with restart (RWR) is performed on the graph until convergence
Each node then holds the probability that a walk from the active user a passes through user u on the graph (relatedness)
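The steps above can be sketched as a power iteration. This is a minimal sketch, not the authors' implementation: the restart probability of 0.15, the convergence tolerance, the handling of nodes without outgoing edges, and the toy graph are all assumptions, and every node is assumed to appear as a key in `weights`.

```python
def random_walk_with_restart(weights, start, restart=0.15, tol=1e-10, max_iter=1000):
    """Relatedness of every node to `start`.

    weights: {u: {v: non-negative similarity}}
    Returns {node: steady-state visiting probability}.
    """
    nodes = list(weights)
    # Normalize outgoing edge weights into transition probabilities.
    trans = {}
    for u in nodes:
        total = sum(weights[u].values())
        trans[u] = {v: w / total for v, w in weights[u].items()} if total > 0 else {}

    p = {u: 1.0 if u == start else 0.0 for u in nodes}
    for _ in range(max_iter):
        nxt = {u: (restart if u == start else 0.0) for u in nodes}
        for u in nodes:
            if not trans[u]:
                # Assumed handling: a node with no edges returns to the start.
                nxt[start] += (1 - restart) * p[u]
                continue
            for v, t in trans[u].items():
                nxt[v] += (1 - restart) * p[u] * t
        delta = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Hypothetical user graph: "u3" is not directly connected to the active user "a"
graph = {
    "a": {"u1": 0.9, "u2": 0.1},
    "u1": {"a": 0.9, "u3": 0.5},
    "u2": {"a": 0.1},
    "u3": {"u1": 0.5},
}
relatedness = random_walk_with_restart(graph, "a")
```

In this toy graph, u3 has no direct edge to the active user yet ends up with a higher relatedness than the weakly connected direct neighbor u2, because the walk reaches it often through u1.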
Methodology – Rating Prediction
Using the relatedness scores obtained from the user graph,
the top-N nodes with the highest relatedness scores are selected
– Top 40 for the movie dataset, top 30 for the music datasets
Ratings of items are then recalculated
The relatedness score is used instead of [symbol not preserved in this transcript]
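The prediction formula itself is not preserved in the transcript. A minimal sketch follows, assuming the standard mean-centered weighted average used in user-based CF, with the relatedness score taking the place of the neighbor weight. The function name `predict_rating` and the toy data are hypothetical.

```python
def predict_rating(active, item, ratings, relatedness, top_n=30):
    """Predict the active user's rating of `item` from the top-N related users.

    ratings:     {user: {item: rating}}
    relatedness: {user: non-negative score relative to the active user}
    Returns None when no selected neighbor has rated the item.
    """
    def mean(user):
        values = ratings[user].values()
        return sum(values) / len(values)

    # Select the N users with the highest relatedness, excluding the active user.
    neighbors = sorted(
        (u for u in relatedness if u != active),
        key=relatedness.get,
        reverse=True,
    )[:top_n]

    num = den = 0.0
    for u in neighbors:
        if item in ratings.get(u, {}):
            # Assumed formula: relatedness-weighted deviation from the neighbor's mean.
            num += relatedness[u] * (ratings[u][item] - mean(u))
            den += relatedness[u]
    if den == 0:
        return None
    return mean(active) + num / den

# Hypothetical toy data
ratings = {
    "a": {"x": 4, "y": 2},
    "u1": {"x": 5, "z": 3},
    "u2": {"z": 5, "y": 1},
}
relatedness = {"a": 0.4, "u1": 0.3, "u2": 0.1}

pred = predict_rating("a", "z", ratings, relatedness)
```

Returning None for items that no selected neighbor has rated makes it easy to count how many predictions an algorithm can produce at all.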
Evaluation - Datasets
Several datasets are used in the experiments
Ratings of movies
– MovieLens dataset: 212,586 ratings from 943 users on 1,682 movies
Ratings of non-Japanese music artists
– Music dataset from Doblog: 48,695 ratings from 3,545 users on 21,214 artists
– Taxonomy provided by ListenJapan: 279 genres
Ratings of Japanese music artists
– Music dataset from Doblog: 58,104 ratings from 2,800 users on 7,421 artists
– Taxonomy provided by ListenJapan: 153 genres
Evaluation – Methodology
The dataset D is randomly divided into two parts:
Training dataset T
Prediction dataset P
This split yields users who have items in P whose classes do not appear in T
The previously described algorithms are run to predict user ratings while varying the ratio of T to D (T/D)
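The split described above can be sketched as follows. This assumes ratings are stored as (user, item, rating) tuples and that the cases of interest are prediction ratings whose item class the same user never rated in training. The function names, the seed, and the toy data are hypothetical.

```python
import random

def split_ratings(ratings, train_ratio, seed=0):
    """Randomly divide D into a training part T and a prediction part P."""
    rng = random.Random(seed)
    shuffled = ratings[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

def unseen_class_ratings(train, test, item_class):
    """Keep ratings in P whose item class the user never rated in T."""
    seen = {}
    for user, item, _ in train:
        seen.setdefault(user, set()).add(item_class[item])
    return [
        (user, item, rating)
        for user, item, rating in test
        if item_class[item] not in seen.get(user, set())
    ]

# Hypothetical toy data
item_class = {"r1": "Rock", "r2": "Rock", "c1": "Classical"}
D = [("a", "r1", 5), ("a", "r2", 4), ("a", "c1", 3), ("b", "r1", 2), ("b", "c1", 4)]

T, P = split_ratings(D, train_ratio=0.6)

# Deterministic illustration with a fixed split
fixed_T = [("a", "r1", 5), ("b", "c1", 4)]
fixed_P = [("a", "r2", 4), ("a", "c1", 3), ("b", "r1", 2)]
targets = unseen_class_ratings(fixed_T, fixed_P, item_class)
```

In the fixed split, user a's Classical rating and user b's Rock rating are kept as targets, because neither user rated that class in the training part.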
Evaluation - Measurement
To measure how accurate the rating prediction is,
MAE (Mean Absolute Error) is calculated
To measure the coverage of the algorithms, [formula not preserved in this transcript]
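Both measures can be sketched briefly. MAE is the standard mean of absolute differences between predicted and actual ratings. The coverage definition is not preserved in the transcript, so the version here is an assumption: the fraction of target ratings for which an algorithm could produce any prediction (with None marking a failed prediction).

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)

def coverage(predictions):
    """Assumed definition: share of targets that received a prediction."""
    if not predictions:
        return 0.0
    return sum(p is not None for p in predictions) / len(predictions)

# Hypothetical toy values
mae = mean_absolute_error([4.0, 3.0, 5.0], [5.0, 3.0, 3.0])
cov = coverage([4.0, None, 2.5, None])
```

A lower MAE means more accurate predictions, while a higher coverage means the algorithm can predict ratings for more of the target items.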
Evaluation – Compared similarity measures
Pearson correlation coefficient
Cosine-based approach
Method proposed by Ziegler (WWW ’05)
Taxonomy (Jaccard&Pearson)
Taxonomy (Jaccard)
Results - Accuracy
[Two slides; content not preserved in this transcript]
Results - Coverage
[Two slides; content not preserved in this transcript]
Conclusion & Discussion
This paper uses item ratings as well as the taxonomy of items to calculate the similarity between two users.
Using this similarity measure and RWR, users who are not directly similar to the active user but whom the walk frequently passes through can be extracted.
The items of such users are then used to identify items with high novelty and so expand the user’s interests.
Thank you