Comparison: Collier et al., Journal of Biomedical Semantics 2011, and Sadilek et al., Sixth AAAI International Conference on Weblogs and Social Media (ICWSM)

From Cohen Courses
Revision as of 02:07, 6 November 2012 by Rajarshd (talk | contribs)

Papers

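The papers are:

  • Collier et al., Journal of Biomedical Semantics, 2011
  • Sadilek et al., Sixth AAAI International Conference on Weblogs and Social Media (ICWSM)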

Comparison

Method

The first paper uses two supervised classification methods, namely Naive Bayes and SVM. The authors also used the Simple Rule Language toolkit, as they thought that custom-built regular expressions could achieve higher precision. The second paper uses a semi-supervised, cascade-based approach to learn a robust Support Vector Machine (SVM) classifier.
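To make the supervised setting concrete, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is not the implementation from either paper: the whitespace tokenizer, the add-one smoothing, the label names ("health", "other") and the toy training tweets are all assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()

def train_naive_bayes(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data, invented for this sketch only.
train = [
    ("i have a fever and a bad cough", "health"),
    ("washing hands to avoid the flu", "health"),
    ("stayed in bed with a sore throat", "health"),
    ("great game at the stadium tonight", "other"),
    ("new coffee shop opened downtown", "other"),
    ("traffic is terrible this morning", "other"),
]
model = train_naive_bayes(train)
health_guess = predict(model, "bad cough and fever today")
other_guess = predict(model, "great coffee at the stadium")
```

An SVM would replace the probabilistic scoring with a learned separating hyperplane over the same kind of word features. The papers' actual feature sets are not described here.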

Dataset

  • Similarity: Both papers use user tweets for their analysis, though the tweets were collected over different time periods.
  • Difference: The first paper applies an additional filter to the tweets it collects, using a bag of 7 keywords to select tweets on influenza-related topics (e.g. flu, H1N1, swine flu). The second paper collects unfiltered tweets from users in NYC and then builds a classifier to identify tweets related to illness.
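The keyword filter described for the first paper can be sketched as below. This is only an illustration: it uses just the three keywords named above (the paper's full list of seven is not reproduced), and the function names, the word-boundary matching and the sample tweets are assumptions, not details taken from the paper.

```python
import re

# Only the keywords named in the text; the remaining ones are not listed here.
KEYWORDS = ["flu", "h1n1", "swine flu"]

# Word boundaries keep "flu" from matching inside "fluent" or "influence".
KEYWORD_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def is_flu_related(tweet):
    """Return True if the tweet mentions at least one keyword."""
    return KEYWORD_PATTERN.search(tweet) is not None

def filter_tweets(tweets):
    """Keep only the tweets that mention a keyword (hypothetical helper)."""
    return [t for t in tweets if is_flu_related(t)]

# Sample tweets invented for this sketch.
sample = [
    "Staying home today, I think I caught the flu",
    "Fluent in three languages now!",
    "Worried about H1N1, bought hand sanitizer",
    "Great weather in NYC",
]
flu_tweets = filter_tweets(sample)
```

The second paper's approach corresponds to skipping this step and passing every collected tweet to a learned classifier instead.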

Problem and Big idea

  • Similarity: The works are related in that both try, at a high level, to analyze public health from social media (tweets specifically).
  • Difference: The first paper aims to address the problem of undetected epidemic disease spread by detecting patterns of self-protective behavior in tweets. The second work tries to analyze the influence of social interaction (physical proximity between friends) on the spread of disease, again by analyzing users' social media behavior (tweets).
  • Long-term goal / big idea: I think both works share the same long-term goal of modeling the spread of epidemics through social media, and hence detecting the outbreak of an epidemic in its early stages so that sufficient precautions can be taken.