Comparison: Collier et al., Journal of Biomedical Semantics 2011, and Sadilek et al., Sixth AAAI International Conference on Weblogs and Social Media (ICWSM)

From Cohen Courses

Papers

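The papers are:

  • Collier et al., Journal of Biomedical Semantics, 2011 (referred to below as the first paper)
  • Sadilek et al., Sixth AAAI International Conference on Weblogs and Social Media (ICWSM) (referred to below as the second paper)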

Comparison

Method

The first paper uses two supervised classification methods, Support Vector Machines and Naïve Bayes, to identify tweets that report self-protective behavior. The authors also used the Simple Rule Language toolkit, reasoning that custom-built regular expressions could achieve higher precision. The second paper uses a semi-supervised, cascade-based approach to learn a robust Support Vector Machine classifier that identifies tweets related to sickness.
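To make the supervised classification setting concrete, here is a minimal sketch of a multinomial Naïve Bayes tweet classifier with add-one smoothing, written with only the Python standard library. It is not the implementation from either paper: the toy tweets, the label names `self_protection` and `other`, and all function names are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def train_naive_bayes(labeled_tweets):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def classify(text, model):
    """Return the class with the highest log posterior for the tweet."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothed log likelihood of the word.
            numerator = word_counts[label][token] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from either paper.
TOY_TWEETS = [
    ("washing my hands all day because of the flu", "self_protection"),
    ("got my flu shot this morning", "self_protection"),
    ("wearing a mask on the subway to avoid swine flu", "self_protection"),
    ("news report says flu cases are rising", "other"),
    ("the flu game was the best basketball game ever", "other"),
    ("new article about h1n1 statistics", "other"),
]

model = train_naive_bayes(TOY_TWEETS)
prediction = classify("wearing a mask and washing hands", model)
print(prediction)
```

A Support Vector Machine would replace the probabilistic scoring with a learned separating hyperplane over the same kind of word features; the sketch sticks to Naïve Bayes because it fits in a few lines without external libraries.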

Dataset

  • Similarity: Both papers analyze tweets posted by users, although the tweets were collected over different time periods.
  • Difference: The first paper applies additional filters to refine the tweets it selects. It uses a set of 7 keywords to select tweets on topics related to influenza (e.g., flu, H1N1, swine flu). The second paper collects general tweets, without topic filtering, from users in New York City and then builds a classifier to identify tweets related to illness.
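The keyword filtering step used by the first paper can be sketched roughly as follows. This is an illustration only, not the authors' code: the list holds just the three keywords named above (the remaining keywords are not listed here), and the sample tweets and function names are hypothetical.

```python
import re

# Only the three keywords named in the text above; the full set of 7
# is not reproduced here.
KEYWORDS = ["flu", "H1N1", "swine flu"]


def build_keyword_pattern(keywords):
    """Compile one case-insensitive, whole-word pattern for all keywords."""
    # Put longer phrases first so "swine flu" is tried before "flu".
    ordered = sorted(keywords, key=len, reverse=True)
    alternatives = "|".join(re.escape(keyword) for keyword in ordered)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def filter_tweets(tweets, pattern):
    """Keep only tweets that mention at least one keyword."""
    return [tweet for tweet in tweets if pattern.search(tweet)]


# Hypothetical sample tweets for illustration.
SAMPLE_TWEETS = [
    "Stuck in bed with the flu again",
    "H1N1 vaccine clinic opens downtown",
    "Great weather in NYC today",
    "Smoke coming out of the flue",
]

pattern = build_keyword_pattern(KEYWORDS)
selected = filter_tweets(SAMPLE_TWEETS, pattern)
print(selected)
```

The word-boundary anchors keep unrelated words such as "flue" from matching. The second paper skips this kind of filter and leaves the separation of illness-related tweets entirely to its classifier.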

Problem and Big idea

  • Similarity: The two works are related in that both try, at a high level, to analyze public health from social media, specifically tweets.
  • Difference: The first paper aims to address the problem of epidemic disease spreading undetected by detecting patterns of self-protective behavior in tweets. The second work tries to model the influence of social interaction (physical proximity between friends, etc.) on the spread of disease, again by analyzing users' social media behavior (tweets).
  • Long-term goal / big idea: I think both works share the same long-term goal: to model the spread of epidemics through social media, and hence to detect an epidemic in its early stages so that sufficient precautions can be taken in advance.

Answers to the 6 questions

  • [1] How much time did you spend reading the (new, non-wikified) paper you summarized? (3 hours)
  • [2] How much time did you spend reading the old wikified paper? (1 hour, with the help of the existing wiki summary)
  • [3] How much time did you spend reading the summary of the old paper? (30 minutes)
  • [4] How much time did you spend reading background material? (2 hours)
  • [5] Was there a study plan for the old paper? (Yes. Fortunately, the study plan was similar to that of the new paper, so I had already done most of it while summarizing the new paper.)
  • [6] Give us any additional feedback you might have about this assignment. -- I think this was a nice idea. Thanks!