Project Proposal Second Draft:Daniel and Sherry

From Cohen Courses
Revision as of 16:01, 15 February 2011 by Ssahebi

Team Members

Daniel Mills

Shaghayegh (Sherry) Sahebi

Dataset

Russian movie social network data: This dataset, which may be used only for this project and will not be released publicly, consists of two files. The first contains directed connections between users, and the second contains user ratings of Russian movies.

Project Ideas

We propose to explore the properties of a large social network tied to movie ratings. Within this setting, we will investigate several topics using a mix of supervised and unsupervised machine learning methods.

Tasks

  • Predict the evolution of the social network using interest similarity.
  • Predict users' movie ratings from their ratings of other movies, their friends' ratings, and the social structure of the network.
  • Detect hidden communities.
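
As a rough illustration of the first task, interest similarity between two users could be measured as the cosine similarity of their rating vectors, and unconnected users could then be ranked by that score as candidate new links. This is only a sketch under stated assumptions: the proposal does not fix a similarity measure, and the data layout (`ratings` as a dict of per-user rating dicts, `edges` as a set of directed pairs) and the function names are hypothetical.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two users' rating dicts (unrated movies count as 0)."""
    shared = set(ratings_a) & set(ratings_b)
    dot = sum(ratings_a[m] * ratings_b[m] for m in shared)
    norm_a = math.sqrt(sum(r * r for r in ratings_a.values()))
    norm_b = math.sqrt(sum(r * r for r in ratings_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidate_links(user, ratings, edges):
    """Rank users that `user` is not yet connected to by interest similarity.

    ratings: {user_id: {movie_id: rating}}   (assumed layout)
    edges:   set of directed (source, target) pairs (assumed layout)
    """
    candidates = [
        other for other in ratings
        if other != user and (user, other) not in edges
    ]
    scored = [
        (other, cosine_similarity(ratings[user], ratings[other]))
        for other in candidates
    ]
    # Highest similarity first: the most likely new links under this heuristic.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A supervised variant could use this score as one feature among several, rather than as the ranking itself.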

Evaluation

Using cross-validation, we can compare predicted user ratings with actual ratings. Hidden communities are harder to evaluate directly, but they can potentially be used as features in the other tasks. Predicted changes in the network can be measured directly using the precision and recall of the predicted new links.
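
The link-prediction measure described above can be sketched as follows. The function name and the representation of links as (source, target) pairs are assumptions made for illustration; precision is the fraction of predicted links that actually appeared, and recall is the fraction of actual new links that were predicted.

```python
def link_precision_recall(predicted_links, actual_new_links):
    """Return (precision, recall) for a set of predicted new links.

    Both arguments are iterables of directed (source, target) pairs.
    """
    predicted = set(predicted_links)
    actual = set(actual_new_links)
    hits = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


predicted = [("a", "b"), ("a", "c"), ("b", "d")]
actual = [("a", "b"), ("c", "d")]
precision, recall = link_precision_recall(predicted, actual)
```

Because the connections are directed, ("a", "b") and ("b", "a") are treated as different links here.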

Potential Methods

  • Analyze the data for correlations before applying any other method
  • Linear classification for link prediction
  • Graph clustering, such as spectral clustering
  • Regression models
  • Collaborative filtering for rating prediction
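
As one possible starting point for rating prediction, a very simple neighborhood-style baseline could take a user's neighborhood from the social graph: predict the mean rating given to the movie by the users they follow, and fall back to the mean over all other users when no followed user has rated it. This is a sketch under assumptions, not the method the proposal commits to; the data layout (`ratings`, `follows`) and the function name are hypothetical.

```python
def predict_rating(user, movie, ratings, follows):
    """Predict `user`'s rating of `movie` from followed users' ratings.

    ratings: {user_id: {movie_id: rating}}      (assumed layout)
    follows: {user_id: set of followed user_ids} (assumed layout)
    Returns None when nobody else has rated the movie.
    """
    friend_ratings = [
        ratings[friend][movie]
        for friend in follows.get(user, ())
        if movie in ratings.get(friend, {})
    ]
    if friend_ratings:
        return sum(friend_ratings) / len(friend_ratings)

    # Fallback: mean rating of the movie over all other users.
    other_ratings = [
        user_ratings[movie]
        for other, user_ratings in ratings.items()
        if other != user and movie in user_ratings
    ]
    if other_ratings:
        return sum(other_ratings) / len(other_ratings)
    return None
```

Comparing this baseline against a predictor that ignores the social graph would give a first indication of whether friends' ratings carry useful signal.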