The Topic-Perspective Model for Social Tagging Systems

From Cohen Courses
Revision as of 23:51, 1 October 2012 by Mmahavee (talk | contribs)

Citation

The Topic-Perspective Model for Social Tagging Systems. Caimei Lu, Xiaohua Hu, Xin Chen, Jung-ran Park, TingTing He, and Zhoujun Li. KDD 2010.

Online version

http://www.pages.drexel.edu/~cl389/dataset/kdd10-lu.pdf

Summary

In this paper, the authors propose an LDA-type[1] generative model for social tag annotation. The tags attached to a particular URL usually reflect either the content of the URL or the tagger’s perspective on that content. In data mining applications, it is useful to separate the content-related tags from the perspective-related ones. The proposed generative model yields, for each tag, the probability that it was generated from the content versus from the tagger’s perspective. In the results section, the authors show that this model improves on previously proposed models for the same task. Tags associated with the user’s perspective can help improve personalized search.


Motivation for proposed model

1. A document is written before a tagger assigns tags to it, so the term generation process for each document should be separated from the tag generation process. The authors use the standard LDA[2] topic model for the term generation process.

2. When a user generates a tag for a document, the tag depends either on the topic distribution of the document or on the user’s perspective. The authors use a switch variable to decide whether the user’s perspective or the document topic is used to generate the tag.
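
The term generation step in item 1 can be sketched in a few lines of Python. This is a minimal illustration of standard LDA sampling, not the authors' implementation; the function names, the symmetric Dirichlet prior `alpha`, and the toy `topic_word` table are all assumptions made for the example.

```python
import random

def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def dirichlet(alpha, size, rng):
    """Sample from a symmetric Dirichlet by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(size)]
    total = sum(draws)
    return [d / total for d in draws]

def generate_document_terms(n_words, topic_word, alpha, rng):
    """Standard LDA: draw the document's topic distribution theta,
    then draw a topic and a word for every word position."""
    theta = dirichlet(alpha, len(topic_word), rng)
    words = []
    for _ in range(n_words):
        z = sample_index(theta, rng)          # topic for this position
        w = sample_index(topic_word[z], rng)  # word id drawn from that topic
        words.append(w)
    return theta, words

# Toy setup (hypothetical): 2 topics over a 4-word vocabulary.
rng = random.Random(0)
topic_word = [[0.9, 0.1, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]]
theta, words = generate_document_terms(20, topic_word, 0.5, rng)
```

Because the words are generated without reference to any tag or user, this step stays independent of the tagging process, which is the point of item 1.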

Model

Figure: graphical model of the topic-perspective model (SocialTagGM.png)

As shown in the figure, the model is divided into two parts by a dashed line. The right part shows the standard LDA[3] generative model for the document terms. The left part shows how tags are generated. To generate each tag, an indicator variable x is drawn first. If x equals 1, the tag is generated from the document’s topic distribution. If x equals 0, a user perspective p is first sampled from the user’s perspective distribution, and the tag t is then drawn from the tag distribution of perspective p (SAI_p).
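
The tag generation process on the left side of the figure can be sketched as follows. This is a hedged illustration under stated assumptions, not the paper's code: the switch probability `lam`, the table names `topic_tag` and `persp_tag`, and the two-step topic route (sample a topic from the document's topic distribution, then a tag from that topic's tag distribution) are assumptions made for the example. The second function applies Bayes' rule to estimate how likely a given tag is to have come from the content rather than from the user's perspective.

```python
import random

def sample_index(probs, rng):
    """Draw an index from a discrete distribution given as a list of weights."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_tag(theta_d, topic_tag, user_persp, persp_tag, lam, rng):
    """Generate one tag for a (document, user) pair.

    theta_d    : topic distribution of the document
    topic_tag  : topic_tag[z][t], tag distribution of topic z (assumed name)
    user_persp : perspective distribution of the user
    persp_tag  : persp_tag[p][t], tag distribution of perspective p
    lam        : assumed probability that the switch x equals 1
    """
    x = 1 if rng.random() < lam else 0
    if x == 1:
        z = sample_index(theta_d, rng)       # content route: pick a topic
        t = sample_index(topic_tag[z], rng)
    else:
        p = sample_index(user_persp, rng)    # perspective route
        t = sample_index(persp_tag[p], rng)
    return x, t

def prob_content_route(t, theta_d, topic_tag, user_persp, persp_tag, lam):
    """P(x = 1 | tag t) by Bayes' rule; assumes tag t has nonzero probability."""
    via_topic = lam * sum(th * topic_tag[z][t] for z, th in enumerate(theta_d))
    via_persp = (1 - lam) * sum(pi * persp_tag[p][t]
                                for p, pi in enumerate(user_persp))
    return via_topic / (via_topic + via_persp)

# Toy setup (hypothetical): tags 0-1 are content tags, tags 2-3 are perspective tags.
theta_d = [0.7, 0.3]
topic_tag = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
user_persp = [0.5, 0.5]
persp_tag = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
rng = random.Random(0)
samples = [generate_tag(theta_d, topic_tag, user_persp, persp_tag, 0.6, rng)
           for _ in range(50)]
```

In this toy setup the two routes emit disjoint tags, so the posterior is 1 for content tags and 0 for perspective tags. With overlapping tag distributions it falls strictly between the two.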

Related papers

There has been a lot of work on anomaly detection in graphs.

  • The paper by Moonesinghe and Tan (ICTAI 2006) finds clusters of outlier objects by performing a random walk on a weighted graph.
  • The paper by Aggarwal (SIGMOD 2001) proposes techniques for projecting high-dimensional data onto lower dimensions to detect outliers.

Study plan

  • Article: Bipartite graph [4]
  • Article: Anomaly detection [5]
  • Paper: Topic sensitive pagerank [6]
    • Paper: The PageRank Citation Ranking: Bringing Order to the Web [7]
  • Paper: Multilevel k-way Partitioning Scheme for Irregular Graphs [8]