|
|
Line 7: |
Line 7: |
| I am a second-year Master's student in the LTI. I have taken this course because the syllabus aligns well with my current research on Information Extraction. | | I am a second-year Master's student in the LTI. I have taken this course because the syllabus aligns well with my current research on Information Extraction. |
| | | |
− | [[Link to my wiki page for the "Analysis of Social Media" course taken in Spring, 2011.]] | + | [[Analysis of Social Media Spring, 2011|Link to my wiki page for the "Analysis of Social Media" course taken in Spring, 2011.]] |
− | | |
− | Intention behind taking the course:
| |
− | Interest in social media, which is a rapidly growing phenomenon on the web. Among other things, I was particularly interested in modeling and analyzing discussions between people in online forums, or in comments on blogs and news stories. Such comments are interesting in terms of their length and the kind of information they carry: some are short and loosely written, while others are long and well crafted. I want to do a project along similar lines, analyzing this domain of social interaction.
| |
− | Research Interests:
| |
− | Language Technologies, Semantic Analysis, Discourse Analysis and Text modeling.
| |
− | | |
− | ---
| |
− | ==Wiki Pages Created==
| |
− | ====[[Blog summarization: CIKM 2007]]====
| |
− | ====[[Identifying influential bloggers: WSDM 2008]]====
| |
− | ====[[Analyzing and Predicting Youtube Comments Rating: WWW2010]]====
| |
− | | |
− | == Project Proposal ==
| |
− | Finding bias-groups in discussions on blogs
| |
− | | |
− | == Team Members ==
| |
− | Subhodeep Moitra (smoitra@cs.cmu.edu)
| |
− | Srivastava (manajs@cs.cmu.edu)
| |
− | | |
− | == Data Set ==
| |
− | Yano and Smith dataset of blogs and comments from 40 blog-sites focused on American politics [http://www.ark.cs.cmu.edu/blog-data/]
| |
− | | |
− | == Goal of the Project ==
| |
− | | |
− | We aim to model and estimate the bias groups among the users who comment on blogs. For any blog, the users making comments either agree or disagree with the opinions of the author or of other commenters. These agreements and disagreements can also concern different sub-topics discussed within a single blog. We aim to estimate which users agree or disagree on which sub-topics of a given blog. We have gone through a few papers that tackle different aspects of this problem separately. Hu et al. [1] performed extraction-based summarization of sentences from blog posts based on the content of the comments. This is useful for us because it lets us relate the discussions in the comments to the sub-topics in the blog posts. Another interesting work, by Mishne and Glance [2], aims at detecting disputes in comments on weblogs, which again relates to what we attempt to do. A paper by Schuth et al. [3] aims at finding the comments that belong to one thread of discussion. This is particularly useful in cases where users cannot reply to other users’ comments explicitly. The techniques used in that paper could help us find the likely discussion threads among all the posts on a given blog.
| |
− | | |
− | == References ==
| |
− | [1] Hu M., Sun A., Lim E., “Comments-Oriented Blog Summarization by Sentence Extraction”, 16th ACM Conference on Information and Knowledge Management, 2007
| |
− | | |
− | [2] Mishne G., Glance N., “Leave a Reply: An Analysis of Weblog Comments”, Third Annual Workshop on the Weblogging Ecosystem, 2006
| |
− | [3] Schuth A., Marx M., de Rijke M., “Extracting the discussion structure in comments on news-articles”, Proceedings of the 9th Annual ACM Workshop on Web Information and Data Management, 2007
| |