Zheleva and Getoor, WWW2009, PinKDD2007 paper comparison

From Cohen Courses

Revision as of 10:01, 6 November 2012

The papers

Both papers were written by the same authors but deal with slightly different, albeit related, topics.

Problem

The two papers deal with two different aspects of privacy preservation in anonymized data. The first paper attempts to show how an adversary can predict private attributes of users in a social network by exploiting information about friendship links and group membership. The second paper shows how it is possible to infer sensitive/private relationships in graph data even when all of the sensitive edge data has been removed.

Methods

The first paper focuses mainly on adversary techniques for inferring sensitive information about the attributes of a user. The authors describe several link-based privacy attacks: the Friend-aggregate model (AGG), the Collective classification model (CC), the Flat-link model (LINK), and the Blockmodeling attack (BLOCK). The authors also describe some Group-based classification approaches that take advantage of group membership information.
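To make the simplest of these attacks concrete, here is a minimal sketch of the idea behind the Friend-aggregate model: guess a hidden attribute as the most common value among the user's friends whose profiles are public. The function name, the dictionary layout, and the toy data are assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter

def friend_aggregate_predict(user, friends, public_attr, default=None):
    """Guess a hidden attribute as the most common value among friends.

    friends:     dict mapping a user to a list of friend ids (hypothetical layout)
    public_attr: dict mapping users with public profiles to their attribute value
    """
    # Count attribute values over friends whose value is publicly visible.
    counts = Counter(
        public_attr[f] for f in friends.get(user, ()) if f in public_attr
    )
    if not counts:
        # No friend exposes the attribute, so fall back to the default guess.
        return default
    return counts.most_common(1)[0][0]

# Toy data, invented for this example.
friends = {"alice": ["bob", "carol", "dave", "erin"]}
public_attr = {"bob": "US", "carol": "US", "dave": "UK"}  # erin is private

guess = friend_aggregate_predict("alice", friends, public_attr)
```

In this toy graph two of Alice's three public friends list "US", so that is the adversary's guess for Alice.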

In the second paper, the authors describe five anonymization strategies (intact edges, partial-edge removal, cluster-edge anonymization, cluster-edge anonymization with constraints, and removed edges) and then attempt link re-identification on the anonymized datasets using the noisy-or model.
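As a rough illustration of the noisy-or model, the sketch below combines several independent pieces of evidence for a sensitive edge into one probability: the edge is predicted to exist unless every piece of evidence fails to "fire". The function name, the optional leak parameter, and the numbers are assumptions for illustration; the paper's actual parameters depend on the observations it uses.

```python
def noisy_or(probabilities, leak=0.0):
    """Noisy-or combination: P(edge) = 1 - (1 - leak) * prod(1 - p_i).

    probabilities: per-observation chance that the observation alone
                   implies the sensitive edge (hypothetical values)
    leak:          chance the edge exists with no supporting observation
    """
    none_fire = 1.0 - leak
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each probability must lie in [0, 1]")
        none_fire *= 1.0 - p
    return 1.0 - none_fire

# Two observations, each implying the edge with probability 0.5 (made up).
p_edge = noisy_or([0.5, 0.5])
```

With two observations of strength 0.5 each, the combined probability is 1 - 0.5 * 0.5 = 0.75, so additional weak evidence can only raise the estimate.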

Dataset

In the first paper, the authors test their methods on a large number of real-world datasets that include data from Flickr, Facebook, Dogster and BibSonomy. In the second paper, the authors generated synthetic graph data with varying statistical and structural assumptions to test how much link information can be disclosed under different anonymization strategies.

Questions

1. How much time did you spend reading the (new, non-wikified) paper you summarized?

2 hours

2. How much time did you spend reading the old wikified paper?

30 min

3. How much time did you spend reading the summary of the old paper?

10 min

4. How much time did you spend reading background material?

20 min

5. Was there a study plan for the old paper? If so, did you read any of the items suggested by the study plan, and how much time did you spend reading them?

There was no study plan.

6. Give us any additional feedback you might have about this assignment.