Analyzing User Tweets around Foursquare Check-ins
Comments
Note: A previous project for this course used 4-square data and location-based Twitter data, which might be available. --Wcohen 20:31, 10 October 2012 (UTC)
Project idea
Usage of location-sharing social networks has grown rapidly in recent years. Networks such as Foursquare have introduced a new form of social interaction in which a user checks in to a physical location (in categories such as Food, College & University, or Nightlife Spots). Foursquare allows users to publish their check-ins as tweets. We plan to analyze users' tweeting behavior after a Foursquare check-in.
Team
Data
We have one week of tweets from around 300,000 users worldwide. We expect these tweets to contain a significant number of Foursquare check-ins. We will start our analysis on this data and, once we have a proof of concept, gather more. The data was generously shared with us by Hazim Almuhimedi, a PhD student at the Institute for Software Research at CMU.
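As a first pass at finding check-ins in the tweet data, a simple pattern match may be enough. The sketch below assumes that check-in tweets carry a 4sq.com short link and often begin with "I'm at <venue>"; that format, the regular expressions, and the helper names `is_checkin` and `venue_name` are illustrative assumptions, not a description of the shared dataset.

```python
import re

# Assumption: a published check-in contains a 4sq.com short link.
CHECKIN_LINK = re.compile(r"https?://(?:www\.)?4sq\.com/\S+", re.IGNORECASE)

# Assumption: check-in tweets often read "I'm at <venue> (<city>) <link>".
# The venue name ends at " (", at " w/", at a link, or at the end of the text.
VENUE = re.compile(
    r"^I'm at (?P<venue>.+?)(?: \(|\s+w/|\s+https?://|$)",
    re.IGNORECASE,
)


def is_checkin(text):
    """Return True if the tweet text contains a 4sq.com link."""
    return CHECKIN_LINK.search(text) is not None


def venue_name(text):
    """Return the venue named in an "I'm at ..." tweet, or None."""
    match = VENUE.match(text)
    return match.group("venue").strip() if match else None
```

Tweets that match the link pattern but not the "I'm at" pattern would still count as check-ins; their venue would have to be recovered some other way, for example by resolving the link.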
Tasks
- For each user, analyze the tweets posted within a short interval after a Foursquare check-in, to see whether the user talks about things related to the place where he/she checked in.
- Analyze all the tweets that follow Foursquare check-ins to a particular place (or category), to see what percentage of users tweet about that place.
- Find the topics that users mostly talk about when they are at a particular place.
- Once we have all the tweets about a particular place, analyze the overall sentiment about that place (for example, whether a particular restaurant is liked by most people).
Note: We will be able to do the sentiment analysis task only if we find that a significant number of people actually tweet about a place after checking in there.
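The first two tasks can be sketched as follows. This is a minimal illustration under stated assumptions: each tweet is a dict with hypothetical keys `time`, `text`, and `venue` (`None` for ordinary tweets), the interval defaults to one hour, a follow-up window ends at the user's next check-in, and "tweets about the place" is approximated by a case-insensitive match on the venue name. None of these choices comes from the proposal itself.

```python
from datetime import datetime, timedelta


def followups(tweets, window=timedelta(hours=1)):
    """Pair each check-in of one user with the tweets that follow it.

    tweets: list of dicts with keys 'time' (datetime), 'text' (str),
    and 'venue' (str for a check-in, None otherwise).
    Returns a list of (checkin, [follow-up tweets]) tuples.
    """
    ordered = sorted(tweets, key=lambda t: t["time"])
    pairs = []
    for i, checkin in enumerate(ordered):
        if checkin["venue"] is None:
            continue
        later = []
        for tweet in ordered[i + 1:]:
            # Stop at the end of the window or at the next check-in.
            too_late = tweet["time"] - checkin["time"] > window
            if too_late or tweet["venue"] is not None:
                break
            later.append(tweet)
        pairs.append((checkin, later))
    return pairs


def mention_rate(pairs):
    """Fraction of check-ins followed by a tweet naming the venue."""
    if not pairs:
        return 0.0
    hits = sum(
        1
        for checkin, later in pairs
        if any(checkin["venue"].lower() in t["text"].lower() for t in later)
    )
    return hits / len(pairs)
```

Matching on the venue name alone will miss tweets that talk about the place without naming it, so this rate is best read as a lower bound that the annotated test set can help calibrate.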
Evaluation
- Quantitative: Build a small annotated test dataset to evaluate the accuracy of our predictions.
- Qualitative: For sentiment analysis on restaurant tweets, check whether the overall sentiment correlates with the ratings on other popular services such as Yelp.
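Both checks reduce to small computations, sketched below. The label values and the use of Pearson correlation are assumptions made for illustration; a rank correlation may suit star ratings better, and the proposal does not fix either choice.

```python
from math import sqrt


def accuracy(predicted, gold):
    """Fraction of predictions that match the annotated labels."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

Here `xs` could hold one aggregate sentiment score per restaurant and `ys` the corresponding external rating, with one entry per restaurant in the same order.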
Key Technical Challenges
- We might not have a sufficient amount of data if we narrow down to a single location (for example, a particular restaurant).
- Given the limited amount of data, we are not sure whether we can do topic modelling accurately, especially since tweets are inherently short.