Restaurant Recommendations Based On Review Content

From Cohen Courses

Revision as of 00:33, 29 September 2011

Basic idea

Many current recommendation systems rely on collaborative filtering. Suppose we want to recommend a product to John. One way is to look for users whose rating patterns are similar to John's, and use the ratings from these like-minded users to recommend a few products to John. Another way is to build an item-item matrix that records the similarity between pairs of items. From this matrix, together with John's data (ratings, etc.), we can try to infer his tastes and recommend similar items.

For our 11-763 project, we propose a recommendation system that looks at the text in user reviews.

Assumptions

  1. The text in a user's review will reflect his personal tastes and preferences in restaurants. For instance, he might mention his favorite food, be particular about the service or tend to talk about the ambiance of the restaurant.
  2. By looking at all the reviews for a specific restaurant, we can infer the strengths/weaknesses of the restaurant. For instance, if many reviews talk about the excellent service, we can use this knowledge in our recommendation system.
  3. The user has the same "tastes" regardless of which restaurants he goes to.

Brief summary of method

For our problem, we assume that we have a set of users, a set of items (things), and a set of reviews, where each review is written by some user about some thing and consists of a sequence of words.
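
As a minimal sketch of this setup, a review can be held as a (user, thing, words) record and indexed both by user and by thing, since the method needs both views of the same reviews. The names `Review` and `group_reviews` and the sample data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Review(NamedTuple):
    """One review: who wrote it, what it is about, and its words."""
    user: str
    thing: str
    words: Tuple[str, ...]


def group_reviews(reviews: List[Review]) -> Tuple[Dict[str, List[Review]], Dict[str, List[Review]]]:
    """Index the same reviews twice: once by user, once by thing."""
    by_user = defaultdict(list)
    by_thing = defaultdict(list)
    for review in reviews:
        by_user[review.user].append(review)
        by_thing[review.thing].append(review)
    return dict(by_user), dict(by_thing)


# Made-up sample data, not taken from any real dataset.
reviews = [
    Review("alice", "thai_place", ("great", "curry", "slow", "service")),
    Review("alice", "diner", ("friendly", "service")),
    Review("bob", "thai_place", ("spicy", "curry")),
]
by_user, by_thing = group_reviews(reviews)
```

Here `by_user["alice"]` collects the text that would inform Alice's tastes, while `by_thing["thai_place"]` collects the text that would inform that restaurant's characteristics.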

Topic models have been widely used in text modelling to learn the topics mentioned in a body of text. For our problem, we shall learn topic distributions over each restaurant's reviews as well as over each user's reviews. Reviews about a restaurant and reviews written by a user share the same set of topics. Hence, a user's "taste" is represented by a distribution over topics, and a restaurant's characteristics are likewise represented by a distribution over topics.

We can consider a model similar to the Author-Topic model where, for each review, we choose a topic distribution based on the user or the restaurant. Alternatively, we can consider a nonparametric Hierarchical Dirichlet process over two "groups" of reviews (one for restaurants, another for users).

[Figure: Hdp lda for review.png]

By measuring the similarity between a user's taste and a restaurant's characteristics (possibly using Jensen-Shannon divergence, where a lower divergence means a closer match), we hope to be able to recommend a few candidate restaurants for the user.
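
The matching step can be sketched as follows, assuming the topic distributions have already been inferred and are given as equal-length lists of probabilities over the same topics. The function `recommend` and the example numbers are hypothetical; the Jensen-Shannon divergence itself follows its standard definition (the average KL divergence of each distribution from their midpoint), computed here with base-2 logarithms so that it lies between 0 and 1.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: 0 for identical distributions, 1 for disjoint ones."""
    midpoint = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, midpoint) + 0.5 * kl_divergence(q, midpoint)


def recommend(user_topics, restaurant_topics, k=3):
    """Return the k restaurants whose topic distribution diverges least from the user's."""
    ranked = sorted(
        restaurant_topics.items(),
        key=lambda item: js_divergence(user_topics, item[1]),
    )
    return [name for name, _ in ranked[:k]]


# Made-up distributions over three topics, for illustration only.
user = [0.7, 0.2, 0.1]
restaurants = {
    "a": [0.6, 0.3, 0.1],
    "b": [0.1, 0.1, 0.8],
    "c": [0.7, 0.2, 0.1],
}
top_two = recommend(user, restaurants, k=2)
```

Because the midpoint is never zero where either input is positive, the divergence stays finite even when a topic has zero probability for the user or the restaurant, which is one reason it is a convenient choice over plain KL divergence.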

Dataset

Yelp academic dataset

We will probably focus on restaurants in a single city, possibly Pittsburgh or New York City.

Baseline

For our baseline, we are considering traditional collaborative filtering methods. We will find a set of users that are most similar to the current user (via their ratings of restaurants), and aggregate a set of positively rated restaurants, which we will use as a candidate set to recommend to the user.
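
A minimal sketch of such a baseline is given below. The details are assumptions rather than decisions made in this proposal: similarity is taken to be cosine similarity over star ratings, a rating of 4 or more counts as positive, and the names `cosine_similarity` and `baseline_candidates` and the sample ratings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two users' {restaurant: stars} rating dicts."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[r] * b[r] for r in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def baseline_candidates(ratings, target, n_neighbors=2, positive=4):
    """Collect restaurants rated positively by the target's most similar users."""
    others = [u for u in ratings if u != target]
    neighbors = sorted(
        others,
        key=lambda u: cosine_similarity(ratings[target], ratings[u]),
        reverse=True,
    )[:n_neighbors]
    already_rated = set(ratings[target])
    candidates = set()
    for neighbor in neighbors:
        for restaurant, stars in ratings[neighbor].items():
            # Keep only positively rated places the target has not rated yet.
            if stars >= positive and restaurant not in already_rated:
                candidates.add(restaurant)
    return candidates


# Made-up star ratings (1 to 5), for illustration only.
ratings = {
    "john": {"a": 5, "b": 4},
    "mary": {"a": 5, "b": 5, "c": 5},
    "sam": {"a": 1, "d": 5},
    "lee": {"b": 4, "e": 2},
}
candidates = baseline_candidates(ratings, "john")
```

In this toy data Mary and Lee are John's two nearest neighbours, and only Mary's restaurant "c" is both positively rated and new to John.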

Evaluation

When a user reviews a restaurant, we can assume that he has personally visited the place. Hence, we intend to identify a sample group of users and hold out their reviews as a test set. If a system recommends a restaurant that is in the test set and has been positively reviewed by the user, we would consider it to be a good recommendation (after all, the user went to the place and gave it a positive rating).
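
This check can be sketched as below, assuming each test user's held-out reviews are reduced to a {restaurant: stars} dict and that 4 stars or more counts as a positive review. The threshold, the function names, and the summary figure (the fraction of recommendations judged good) are illustrative assumptions, not choices fixed by this proposal.

```python
def good_recommendations(recommended, held_out, positive=4):
    """Recommended restaurants that the user visited and rated positively."""
    return [r for r in recommended if held_out.get(r, 0) >= positive]


def fraction_good(recommended, held_out, positive=4):
    """Share of the recommendations that count as good; 0.0 if none were made."""
    if not recommended:
        return 0.0
    return len(good_recommendations(recommended, held_out, positive)) / len(recommended)


# Made-up example: four recommendations against one user's held-out ratings.
recommended = ["a", "b", "c", "d"]
held_out = {"a": 5, "c": 2, "e": 4}
hits = good_recommendations(recommended, held_out)
score = fraction_good(recommended, held_out)
```

Note that a restaurant the user never reviewed is scored as a miss under this scheme, even though the user might have liked it, so the resulting figure is likely to understate recommendation quality.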