Josephine Griffith - Some Research Resources

Resources and Progress (as of June 2009)

Relevant Call for Papers and Conferences (2009/2010)

Conferences etc. to check for next year:

Dutch-Belgian Information Retrieval Workshop
International Workshop on Web Personalization
Recommender Systems and Intelligent User Interfaces
Information Retrieval in Context
ACM conference on Recommender Systems
International Conference on Knowledge Engineering and Knowledge Management
AICS
KES

Reviews, Papers etc.

Camera ready paper to AICS (August 10th, 2007)

(Slightly) modified version of paper from IJPRAI (special issue of Personalization Techniques and Recommender Systems) to be published as book chapter under Series in Machine Perception and Artificial Intelligence Submitted 4th July 2007

Submission to AICS (June 29th, 2007)

Quick summary of Weighting Schemes in Collaborative Filtering (June 2007)

Datasets

Netflicks Dataset

100 million ratings
480, 189 users (with non-consecutive IDs from 1 to 2649429)
17,770 movies (with consecutive IDs from 1 onwards)
Integer ratings (1-5)
Also have date of each rating and title and release date for each movie.

Jester Dataset

BookCrossing Dataset

Trawl of lastfm (konstas09 sigir paper)?

Some Recent Relevant Literature

Konstas, I. and Stathopoulos, V. and Jose, J.M. On social networks and collaborative recommendation In: The 32nd Annual ACM SIGIR Conference, 19-23 July 2009, Boston, Massachusetts.

Interesting from a few perspectives:

Social network from last.fm representing explicit (friendship) links between users, incorporating social tags and also implicit ratings, based on number of times tracks are listened to.
Use random walk with restarts algorithm for prediction and compare this to basic Pearson Correlation CF approach
Have dataset available for download/use (though not available at given url at moment. http://www.dcs.gla.ac.uk/~jj/data/lastfm_dataset.html)
some related literature to follow up:
- fouss, pirotte, renders, saerens, "Random walk computation of similarities between nodes of a graph with application to collaborative recommendation"
- yildirim et al. A random walk method for alleviating the sparsity problem in collaborative filtering

Temporal Collaborative Filtering With Adaptive Neighbourhoods, Neal Lathia, Stephen Hailes, Licia Capra state that ''a growing dataset leads to changing features of the ratings; both global and user statistics derived from CF data (and often used to predict ratings) change over time.''

Abhay S. Harpale,Yiming Yang, Personalized Active Learning for Collaborative Filtering, SIGIR, 2008

Active learning identifies the most informative set of training examples through minimum user interaction. This work uses an aspect model.

Bhaskar Mehta, Wolfgang Nejdl, Attack Resistant Collaborative Filtering, SIGIR, 2008

Approach based on SVD

Nathan N. Liu, Qiang Yang, EigenRank: A Ranking-Oriented Approach to Collaborative Filtering, SIGIR 2008

Two ranking algorithms: greedy and random walk models.
Two datasets: each movie and (subset of) Netflix
Evaluation: Compared to user-based, item-based standard CF algorithms and also Kendall Rank Correlation coefficient
Evaluation metric: normalized discounted cumulative gain

R-R. Liu and C-X. Jia and T. Zhou and D. sun and B-H. Wang, Personal recommendation via Modified Collaborative Filtering, EPL, arXiv:0801.1333v3, 2008

There have been a number of papers by these authors on a similar theme to this - weighting the similarity measure which is used to find simliar users by incorporating a factor based (inversely) on the number of ratings an item has received. The motivation being that ratings on more unique items give more information than ratings on very popular items. A tunable parameter, alpha, causes more or less credence to be given to this factor and the goal is to first show that including such a factor is an improvement over the base case, finding the optimal value of alpha. Their results show best performance at alpha = 1.85 I implemented the weighted similarity formula (though unsure of some parameters they used re. neighbourhood selection) and looked at average MAE results over 10 runs (my usual testing methodology of 10% test users). I could not see any statistically significant difference in results over different values of alpha or over the basic appraoch using Pearson Correlation similarity that I have been using up to now.?

T. Zhou and L.-L. Jiang and R.-Q. Su and Y.-C. Zhang, Effect of initial configuration on network-based recommendation, EPL, 81, 2008.

Uses an item-item similaity matrix $W$ to represent a network of connected objects where similarity is calculated based on the number of ratings an item receives

Less Recent Literature (to read again)

Piao et al. Research on Entropy-based Collaborative Filtering Algorithms. IEEE International Conference on e-Business Engineering. 2007.

Uses an entropy-based measure such that for the feature of number of ratings an item receives, give more ''weight'' to the ratings from users who have rated many items (a way to combine number of ratings an item receives and number of ratings a user gives). The actual CF approach tested seems slightly confused and not too convincing ... a basic item-item nearest neighbour collaborative filtering approach where item similarity is calculated based on item entropy and joint item entropy. Seems to me that the formula used is a mutual information formula ... in either case, not too sure what it means in terms of item similarity, e.g., what does joint entropy of 2 items X and Y mean? Results don't seem overly convincing when compared with a user-user and item-item based approach using cosine similarity. One interesting ideas is to use the entropy measure as a way to have a ''top 20'' of recommended items which can be recommended to users who have not rated any items or have no or few neighbours. However, there is very little difference between the top-20 ordering of items based on a raw count of the number of ratings and based on the entropy score: the top 11 positions have exactly the same items in the same order and there is only some small reordering of items from position 12 to 24.

Zhang et al. Efficient Bayesian Hierarchical User Modeling for Recommendation Systems. SIGIR 2007.

Mei et al. VideoReach: An Online Video Recommendation System. SIGIR 2007.

Ma et al. Effective Missing Data Prediction for Collaborative Filtering. SIGIR 2007 This paper concentrates mostly on missing data prediction algorithms, showing their approach to perform well against the nearest similar algorithm and also a set of bench mark approaches. They also use an extended version of Herlocker's significance weighting approach for both users and items to overcome the problems with the Pearson Correlation value being high even though the users may only have co-rated a few items. The steps in their approach are, for some missing value r by user u on item i:

Get neighbours of user u using Pearcon correlation with significance weighting, and choosing neighbours over a threshold n
Get neighbours of item i using Pearcon correlation with significance weighting, and choosing neighbours over a threshold theta
If there are no neighbours, do not predict any value - leave as zero
If there are only neighbours for users, then predict the missing value using user u's mean rating and the (weighted) rating that each user has given to the item
If there are only neighbours for items, then predict the missing value using item i's mean rating and the (weighted) rating that user u has given to each of the neighbour items
If there are both user and item neighbours, then use a weighted sum of 4 and 5: lambda(4)+(1-lambda)*(5)

Settings for parameters were testing empirically (using MovieLens and EachMovie) and the best ones chosen were:

for significance weighting, for users, the threshold was 30, e.g., if users have co-rated less than 30 items their correlation value is reduced by (num_co_rated_items/30)
for significance weighting, for items, the threshold was 25, e.g., if the item has been co-rated by less than 25 users then the correlation value between items is reduced by (num_co_rated_items/25)
for neighbourhood selection, the threshold chosen for both user-user and item-item neighbours was 0.4, i.e., users/items with correlation above 0.4 with the current user/item are chosen.
For prediction, more weight is given to the user-user prediction part of the equation when both user-user and item-item infomration is being used. Lambda is set to 0.7.

The testing methodology was to extract 500 users and from these 500 choose 300 as training (three sets: 100, 200, 500) and 200 for testing. For the active users, the number of rated items was varied from 5, 10 and 20 (Given 5, Given 10 and Given 20).

B.K. Mohan and B.J. Keller and N. Ramakrishnan, Scouts, Promoters, and Connectors: The Roles of Ratings in Nearest-Neighbor Collaborative Filtering, ACM Transactions on the Web, 1(2), 2007.

''To further recommendation system acceptance and deployment, we require new tools and methodolgies to manage an installed recommender and develop insights into the roles played by ratings''.
''We expect that users with good scouts and good neighbourhoods will be most satisfied with the system in the future"
''In the item-based algorithm we find a strong correlation between a user's number of ratings and his or her aggregated promoter value.''
''The obvious rule to follow when introducting a new movie is to have it rated directly by PROLIFIC users who posses high aggregated promoter values.''

Some SA literature of note: Ziegler et al. Spreading Activation Models for Trust Propagation discuss some useful implementation details of Spreading activation approaches. Han et al. (An Adaptive Spreading Activation Scheme for Performing More Effective Collaborative Recommendation) specifically apply a SA approach to collaborative filtering although they only use the SA approach to find neighbours, and this is aided by the existence of a set of ''rating similarity matrices'', one for each item, which essentially gives information on the items users have co-rated with the same values. These matrics must be pre-computed (I think?). Once neighbours are found, then a weighted sum of the ratings of the top 100 users for the test items are used to form predictions for test items. Kovacs et al. Recommending in Context: A spreading Activation Model that is Independent of the Type of Recommender System and Its Contents discuss the limitations of an SA approach to incorporating context and present an extended graph model which they claim can represent context. This extended graph model has 2 nodes associated with each link: a context node and a link type node.

E. Aimeur and F.S.M. Onana, Better control on recommender systems The 8th IEEE International Conference on E-Commerce Technology and The 3rd IEEE International Conference on Enterprise Computing, E-Commerce, and E-Services, 2006
Comments: Aimeur et al. present an approach which attempts to incorporate the notion of trust where each user manages own neighbourhood and can see statistics on each of its neighbours. Uses the idea of 'local' and 'global' recommendations: local based on users own neighbourhood and global based on the entire dataset. The local neighbourhood is formed by using a collaborative filtering technique to find an initial set of users who are presented to user and the user can choose which ones they want in their neighbourhood. Use dataset from epinions.com . Other work on trust in recommender systems: O'Donovan and Smyth: Trust in recommender Systems, IUI '05.

P. Symeonidis and A. Nanopoulos and A. Papadopoulos and Y. Manolopoulos, Collaborative Filtering: Fallacies and Insights in Measuring Similarity, Proceedings of Web Mining Workshop in conjunction with PKDD/ECML, 2006
Comments: Symeonidis et al. present an ''improved'' similarity measure that considers co-rated items in greater detail than that used in Herlocker (significance measure: Herlocker, Konstan and Riedl, An empirical analysis of design choices, 2002 \cite{herlocker02}). The motivation is that ''a desirable property of a similarity measure is to maximise, $x$, the number of items that are co-rated by the test user and each of his neighbors, relative to the number, $y$, of items rated by only one of them''. Similar to the TF/IDF measure used by Wang et. al. For example, two users may have the same co-rated value, where one user has rated many items and the second has only rated a few items. Symeonidis argue these users should not be seen as the same with respect to the co-rated similarity value. For example, taking three users, the current active user who has rated 4 items and 2 other users: user 1 and user 2. If user 1 has rated 7 items, 2 of these in common with the active user and user 2 has rated three items, 2 of which are in common with the active user then the co-rated value between the current active user and the two other users is the same-even though user 1 and the active user potentially differ on 7 items whereas user 2 and the active user differ only on 3 items. This difference is taken into account by the use of a ratio: x/(x+y) where x is the number of co-rated items and y is the number of items rated only by one of the users.

R. Jin, L. Si, C. Zhai, A study of mixture models for collaborative filtering, Information Retrieval, 9:357-382, 2006
Comments: Comparison of a number of probabilistic models, focusing on three properties of collaborative filtering techniques and ascertaining how the performance of a technique is related to whether the technique satisfies the properties. The properties are:

model allows for separate clustering of users and items
model allows users/items to be in multiple clusters, e.g., a user could have different interests
model distinguishes between intrinsic preferences and rating patterns, (i.e., “model captures variance in rating patterns among users with similar interests”)

The probabilistic models compared are: Bayesian Clustering, Aspect Models, Flexible Mixture Models, Joint Mixture Models and Decoupled Models. The performance of these is also compared to some memory-based techniques (Pearson Correlation Coefficient, Vector Similarity and Personality Diagnosis). Results show that the Decoupled model which satisfies all of the three properties outperforms the other probabilistic models. The decoupled probabilistic model is an extension of the flexible mixture model which has two hidden variables that “account for rating patterns and intrinsic preference of users” Experiments were also performed varying the sparseness of the dataset and results showed that "when the amount of training data is small, it is better to use memory-based approaches for collaborative filtering". Also, it was shown that smoothing methods improved the performance of collaborative filtering systems, particularly when the number of training users was small. (Also see related work by Pennock on "Analysis of the Axiomatic Foundations of Collaborative Filtering" but they do not measure the usefulness of the proposed axioms experimentally).

Jun Wang, Arjen P. de Vries, Marcel J.T. Reinders, A User-Item Relevance Model for Log-based Collaborative Filtering, European Conference on Information Retrieval (ECIR 2006).
Comments: Two main issues: 1) Use a weighting scheme similar to TF-IDF using co-rated information which allows ''smoothing''. With respect to item-item similarity the formula used is similar to Karypis (2000), "Evaluation of Item-based Top-N Recommender Algorithms": sim(item_1, item_2) = [c(item_1,item_2)/c(item_2)]/c(item_1)^alpha where c(item_1,item_2) is the number of users who have rated both items 1 and 2; c(item_i) is the number of users who have rated item i and alpha is a tuning parameter. They use a probabilistic relevance model for recommendation. 2) Use ''log-based'' ratings, i.e. implicitly gather ratings (from music playlists and web query logs to form a music dataset). The authors claim that using this type of dataset (binary values indicating, for the most part, that an item was liked or 'relavant') makes the collaborative filtering problem more similar to text retrieval than to traditional collaborative filtering based on ratings.

Also of note: ''Traditionally, collaborative filtering makes a distinction between user-based and item-based approaches. Our probabilistic user-item relevance model, derived with an information retrieval view on collaborative filtering, demonstrates that the user-based and item-based models are equivalent from the probabilistic point of view, since they have actually been derived from the same generative relevance model.''

J.D.M. Rennie and N. Srebro, Fast maximum margin matrix factorization for collaborative prediction , ICML 2005
Comments:

J. Wang, A. de Vries, M. Reinders, Unifying User-based and Item-based Collaborative Filtering Approaches by Similarity Fusion, pp.501-508, SIGIR, 2006
Comments: The paper describes an approach which combines three types of ratings, derived from the user-item matrix, using a generative probabilistic framework. The motivation is that normally, a small portion of the collaborative filtering matrix is used for prediction. For example, for user-user based collaborative filtering, the ratings that the top-N similar users (to the active user) have given to the test item are used (called SUR), and for item-item based collaborative filtering, only the top-N similar items (to the test item) by the active user are used (called SIR). The approach uses three types of ratings to generate predictions: user-user (SUR), item-item (SIR), both as described above and also the known ratings that similar users (to active users) have given items similar to test item (SUIR). A prediction is made by averaging the individual predictions weighted by their confidence where the confidence of a prediction "can be estimated by considering its similarity towards the test user and towards the test item". Experiments were performed using EachMovie, MovieLens and BookCrossing datasets. Results showed that the approach outperformed all other approaches tested. In general it was found that "clearly more similar ratings (using SUR and SIR) provide more accurate predictions". While SUIR ratings are in general less accurate that SURs and SIRs, these may indeed complement missing or unreliable SUR and SIR ratings."

X. Song, B. Tseng, C-Y. Lin, M-T. Sun, Personalized Recommendation Driven by Information Flow, pp. 509-516, SIGIR 2006.
Comments: Paper focuses on information propagation (specifically recommendation propagation) in a network: "Leverage users' access patterns to model information flow and generate effective personalized recommendations". View the collaborative filtering problem from the perspective of: "given that a user X acts on an item Y (e.g. rates and item Y), who will then be more likely to act on item Y?" To model the problem they use a weighted directed network (which they call "Early Adoption Based Information Flow - EABIF) using timestamp information to ascertain if for two users a and b, rating an item Y, a rated Y before b. Information flow between nodes is modelled using ergodic Markov chains (where all states are reachable from all other states and so convergence is guaranteed). In addition users' adoption patterns are clustered regarding the topics of the items/documents the accessed/rated.