Social Media Analysis 10-802 in Fall 2012
- William will be out of town (so no office hours) Fri 10/26 and Fri 11/2.
- The first class is Tuesday 9/11.
- William's office hours will be 1-2pm Friday (except when otherwise noted).
- William will be out of town Friday 9/21.
Instructor and Venue
- Instructor: William Cohen, Machine Learning Dept and LTI
- Course secretary: Sharon Cavlovich, firstname.lastname@example.org, 412-268-5196
- When/where: T/R 10:30-11:50, 4303 GHC, starting after the IC has finished. The first class will be Tuesday 9/11.
- Note: the room has been moved!
- Course Number: MLD 10-802, cross-listed as LTI 11-772
- Prerequisites: a machine learning course (e.g., 10-701 or 10-601) or consent of the instructor.
- TAs: Bhavana Dalvi and Aasish Pappu
- Syllabus: To be posted, but it will approximately follow the syllabus I used last time.
- Office hours:
- William: 1-2pm Fri
- Aasish: 2-3pm Thu [GHC 6223]
- Bhavana: 12-1pm Tue [GHC 5509]
- Assignment submissions and any course-related queries should be directed to email@example.com for a timely response.
- Course mailing list for announcements: firstname.lastname@example.org
Clarification/announcement: This will be a regular 12-credit course (despite some listings of it as a 6-credit course).
The most actively growing part of the web is "social media", e.g., wikis, blogs, bboards, and collaboratively-developed community sites like Flickr and YouTube. This course will review selected papers from the recent research literature that address the problem of analyzing and understanding social media. Topics that will be covered include:
- Text analysis techniques for sentiment analysis, analysis of figurative language, authorship attribution, and inference of demographic information about authors (e.g., age or sex).
- Community analysis techniques for detecting communities, predicting authority, assessing influence (e.g. in viral marketing), or detecting spam.
- Visualization techniques for understanding the interactions within and between communities.
- Learning techniques for modeling and predicting trends in social media, or predicting other properties of media (e.g., user-provided content tags).
Students should have a machine learning course (e.g., 10-601 or similar) or consent of the instructor. Readings will be based on research papers. Grades will be based on class participation, paper presentations, and a project. More specifically, students will be expected to:
- Prepare summaries of the papers discussed in class. Summaries will be posted on this wiki.
- Present and summarize one or more "optional" papers from the syllabus (or some other mutually agreeable paper) to the class.
- Do a course project in a group of 2-3 people. The end result of the project will be a written report, with format and length appropriate for a conference publication.
Here are some projects from previous years, so you can get some ideas of the scope.
- Exploration of Structure and Dynamics of Large Phone and SMS Networks, Leman Akoglu and Bhavana Dalvi.
- Modeling Microblogs using Topic Models, Kriti Puniyani.
- TED - Comments Worth Understanding, Aasish Pappu and Gopala Krishna Anumanchipalli.
- An analysis of perspectives in interactive settings, Dong Nguyen. (Actually, a workshop paper based on the project).
Syllabus and Readings
- Syllabus for Analysis of Social Media 10-802 in Spring 2011
- Syllabus for Analysis of Social Media 10-802 in Spring 2010
- Syllabus from the Fall 2007 version of the course
The first half of the course, roughly, will be presentations of background material. Luckily there are some very good recent surveys on this.
- Opinion mining and sentiment analysis, by Bo Pang and Lillian Lee, in Foundations and Trends in Information Retrieval 2(1-2), pp. 1–135, 2008. Also available as a book or e-book.
- Networks, Crowds, and Markets, by David Easley and Jon Kleinberg. This very readable text was recently published as a textbook for an undergrad class. A PDF is still available free on-line.
- A survey of statistical network models, Goldenberg, Zheng, Fienberg, and Airoldi. A survey article.
The second half of the course will be presentations of recent research papers.
- Important terms used in Analysis of Social Media
- Problems frequently addressed in Analysis of Social Media
- Computational methods frequently used in Analysis of Social Media
- Datasets studied in Analysis of Social Media
- Recent or influential technical papers in Analysis of Social Media
Assignment 1: Due 10:30 AM Tuesday, Sep 25
Send me (William Cohen), Aasish, and Bhavana an email listing 3 papers you plan to summarize, with the title, authors, and a link to an on-line version of each.
Any not-yet-summarized papers listed as "papers to review" on this year's or last year’s syllabus (I guess last year they were "papers to present") are pre-approved. Otherwise, it's fine to pick something new, and we will give you feedback if a paper is a problem. Also, don't propose papers I’ve presented in class, or papers that already have entries in the wiki.
Assignment 2: Due 10:30 AM Thurs, Sep 27
Send in your first summary by sending a link to the new wiki page to email@example.com, following these instructions: Submission Instructions for submitting paper reviews
Here are some example summaries:
Some examples without the new 'study guide' section are here:
- http://malt.ml.cmu.edu/mw/index.php/Turney,_ACL_2002 - from William
- http://malt.ml.cmu.edu/mw/index.php/Pang_et_al_EMNLP_2002 - from William
- http://malt.ml.cmu.edu/mw/index.php/Recent_or_influential_technical_papers_in_Analysis_of_Social_Media - from previous students
Note that you are required to link the summary back into concepts on the wiki via the links AddressesProblem, Category, RelatedPaper, UsesMethod, and UsesDataset. It's ok to link to a problem/paper/method/dataset that's not yet on the wiki, BUT please search first to check if it's there.
Assignment 3: Due 10:30 AM Tues, Oct 2
Submit the next two summaries by sending links of the new wikipages to firstname.lastname@example.org following instructions: Submission Instructions for submitting paper reviews
Assignment 4: Due 10:30 AM Tues, Oct 9
Submit your preliminary project proposal by sending links of the new wikipages to email@example.com following the submission instructions for project proposal.
You should add your page to the wiki by extending this list of project proposals for 10-802 in Fall 2012.
If you're looking for partners for your project, or looking for projects to join, then please add entries on the following wiki page: Hunt for Project partners
There are instructions for project proposals in the slides for the 10-2 lecture.
Assignment 5: Due 10:30 AM Tues, Oct 16
Submit your final project proposal by sending links of the new wikipages to firstname.lastname@example.org following the submission instructions for project proposal.
Assignment 6: Due 10:30 AM Tues, Nov 6
In this assignment we want you to summarize a non-wikified paper that is related to an already wikified paper, and give a comparative critique of this pair of papers. There is a list of pairs of papers that you can choose from. You need to assign yourself one pair of papers by writing your Andrew-ID in the 3rd column. Please do not select a pair in which one of the papers was summarized by you: we want you to read someone else's summary and understand that paper. Please do not overwrite anybody else's choice. Link to the list of papers: ToWikify
Pages to create: You can click on the name of the non-wikified paper (column 2) and fill the page in with your summary. Then create another page to compare the 2 papers and put that link in column 4.
Details to be put in your submission: For the summary of the non-wikified paper, follow the same instructions as in earlier assignments. On the comparison page, first list the two papers you are comparing along with wiki-links to their summaries. Then you can say, in your own words, whether the papers are similar, different, or whether one influenced the other, based on one or more of the following:
- dataset used
- big idea
Also answer the following 6 questions, in the same order, at the end of the wiki page you created. (Please note that answers to the following questions will help us understand the usefulness of creating such wiki pages and having study plans. There are points for answering these questions, but the values you fill in here won't affect your or any other student's grades.)
- How much time did you spend reading the (new, non-wikified) paper you summarized?
- How much time did you spend reading the old wikified paper?
- How much time did you spend reading the summary of the old paper?
- How much time did you spend reading background material?
- Was there a study plan for the old paper?
- If so, did you read any of the items suggested by the study plan, and how much time did you spend reading them?
- Give us any additional feedback you might have about this assignment.
You can email email@example.com with subject containing [review-4] once you are done with the assignment.
Assignment 7: Due 11:59 PM EST, 11th Dec 2012
Final project reports are due on 11th December at 11:59 PM EST. Please email your PDF files to firstname.lastname@example.org with a subject heading containing the string "[report Tuesday-slotID]" or "[report Thursday-slotID]". The day and slotID depend on your final project presentation schedule, posted at: Final_project_presentations_1 and Final_project_presentations_2
The report should be submitted in a conference paper format, 6-10 pages in length. The file format should be PDF. In the report you should explain your problem definition, motivation, datasets, methods, experiments, related work, conclusions, and future work (if any). Preferred template:
Grades are based on:
- The paper presentation
- The project (writeup and presentation).
- Class participation.
I use some discretion in assigning grades, but my guidelines for grading are announced in the overview talk.