Social Media Analysis 10-802 in Spring 2010

From Cohen Courses



  • (3/3) I've added a schedule for mid-term project reports, and also final project reports, to the syllabus. Also, a reminder for speakers:
    • After you're done with the pages (for your presentation), send me an email with links to the wikipages you've added, so I can grade them.
  • (2/12) Presentations: I've added presentation slots to the wiki.
    • Everyone should sign up for a presentation time by the end of next week (Friday, 2/19). You don't need to pick a paper now, just a time, but you should definitely pick a paper and clear it with me at least one week before your presentation.
    • Before your presentation, you should also
      1. Follow this example page and create a page for your presentation itself, linking it in to the appropriate spot in the syllabus as I did under 2/16.
      2. Follow one of these examples and create a page for the paper you're presenting: Knuth,_CACM_1984 Turney,_ACL_2002 Pang_et_al_EMNLP_2002
      3. Add at least four other pages to the wiki that correspond to methods, datasets, or problems addressed in the paper (or else related papers). You can find examples of these types of pages under the resources section of this page.
      • Note: When you add these pages, make sure you use the right "semantic links" for the properties UsesMethod, UsesDataset, RelatedPaper, etc., as in the examples. (If you do it right, they will show up at the bottom of the page under "Facts about....")
    • After you're done with the pages, send me an email with links to the wikipages you've added, so I can grade them.
  • (2/8) I added a list of proposal titles and teams to the page for 2/2, when the proposals were presented. Feel free to clarify the titles, some of which I invented, or link to descriptions, if you want to share them. Also make a note if you are interested in getting new team members; some people are still looking.
  • (1/28) Reminder, the project proposal is due Monday 2/1. You need to send William, by midnight:
    • A one-page description of what you plan to do.
    • At most two slides (in Powerpoint or PDF) summarizing the idea to present in class Tuesday.
    • If you're looking for a partner, Sky put together a wiki page to help: Project Brainstorming for 10-802 in Spring 2010
  • (1/26) William's office hours are Thursday 1:30-2:30.

Instructor and Venue

  • Instructor: William Cohen, Machine Learning Dept and LTI
  • Course secretary: Sharon Cavlovich, 412-268-5196
  • When/where: Tues-Thurs 10:30-11:50, 4102 GHC
  • Course Number: MLD 10-802, cross-listed as LTI 11-772
  • Prerequisites: a machine learning course (e.g., 10-701 or 10-601) or consent of the instructor.
  • TA: there is no TA for this course
  • Syllabus: Syllabus for Analysis of Social Media 10-802 in Spring 2010
  • Office hours: Thursday 1:30-2:30, or by appointment with Sharon.

Clarification/announcement: This will be a regular 12-credit course (despite the preliminary listing of it as a 6-credit course).


The most actively growing part of the web is "social media", e.g., wikis, blogs, bboards, and collaboratively-developed community sites like Flickr and YouTube. This course will review selected papers from the recent research literature that address the problem of analyzing and understanding social media. Topics that will be covered include:

  • Text analysis techniques for sentiment analysis, analysis of figurative language, authorship attribution, and inference of demographic information about authors (e.g., age or sex).
  • Community analysis techniques for detecting communities, predicting authority, assessing influence (e.g., in viral marketing), or detecting spam.
  • Visualization techniques for understanding the interactions within and between communities.
  • Learning techniques for modeling and predicting trends in social media, or predicting other properties of media (e.g., user-provided content tags).

Students should have taken a machine learning course (e.g., 10-601 or similar) or have the consent of the instructor. Readings will be based on research papers. Grades will be based on class participation, paper presentations, and a project. More specifically, students will be expected to:

  • Prepare summaries of the papers discussed in class. Summaries will be posted on this wiki.
  • Present and summarize one or more "optional" papers from the syllabus (or some other mutually agreeable paper) to the class.
  • Do a course project in a group of 2-3 people. The end result of the project will be a written report, with format and length appropriate for a conference publication.

Syllabus and Readings

Older syllabi:

The first half of the course, roughly, will be presentations of background material. Luckily there are some very good recent surveys on this.

The second half of the course will be presentations of recent research papers.

Other Resources


Grades are based on:

  • Presentation:
    • 15% background material on wiki
    • 10% talk
  • Project:
    • 25% background material on wiki
    • 10% talk
    • 20% conference-paper writeup, research
  • Class participation:
    • 20%: read the material in advance, come with some questions/comments in mind.
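
As a rough illustration of how the percentages above combine, here is a minimal Python sketch. The weights come from the list above; the component names and the 0-100 scale for each component are assumptions made for this example, not part of the official grading procedure.

```python
# Weights taken from the grading breakdown above, as fractions of the final grade.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHTS = {
    "presentation_wiki": 0.15,  # Presentation: background material on wiki
    "presentation_talk": 0.10,  # Presentation: talk
    "project_wiki": 0.25,       # Project: background material on wiki
    "project_talk": 0.10,       # Project: talk
    "project_writeup": 0.20,    # Project: conference-paper writeup, research
    "participation": 0.20,      # Class participation
}


def final_grade(scores):
    """Return the weighted total, assuming each component is scored from 0 to 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Sanity check: the listed percentages add up to 100%.
assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9
```

For example, calling final_grade with a score of 80 on every component gives a total of about 80, since the weights sum to 1.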

Course Project