NiLao LiuLiu YandongLiu project abstract

Team

Ni Lao, Liu Liu, Yandong Liu

Hidden Variable Detection for Social Network Analysis

Most social network analysis tasks are based on Entity Relation (ER) graphs. Examples include citation networks (e.g., CiteSeer, DBLP), movie databases (e.g., IMDB), music databases (Konstas et al., 2009), homeland security data (Lin & Chalupsky, 2008), and relations among people or nations (Kemp et al., 2006; Kok & Domingos, 2007). As shown by Minkov & Cohen (2007, 2008), particular paths in an ER graph can be highly indicative of entity attributes or of relations among entities. However, since the number of paths generally grows exponentially with the maximum path length, models based on these paths can be complex and inefficient for learning and inference.
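
To make the growth in the number of paths concrete, here is a minimal sketch that counts type-consistent relation sequences (path types) of each length in a toy citation schema. The schema, the relation names, and the function count_path_types are hypothetical illustrations, not taken from any of the cited datasets.

```python
from collections import defaultdict

# Hypothetical toy schema: (source entity type, relation, target entity type).
SCHEMA = [
    ("author", "writes", "paper"),
    ("paper", "written_by", "author"),
    ("paper", "cites", "paper"),
    ("paper", "cited_by", "paper"),
    ("paper", "published_in", "venue"),
    ("venue", "publishes", "paper"),
]


def count_path_types(start_type, max_length):
    """Return {length: number of distinct relation sequences of that length}.

    A relation sequence is counted if each relation's source type matches
    the target type of the previous relation, starting from start_type.
    """
    # One entry per relation, so two relations with the same endpoints
    # (e.g. cites and cited_by) are counted as two distinct steps.
    outgoing = defaultdict(list)
    for src, _relation, dst in SCHEMA:
        outgoing[src].append(dst)

    # ends[t] = number of sequences of the current length that end at type t.
    ends = {start_type: 1}
    counts = {}
    for length in range(1, max_length + 1):
        next_ends = defaultdict(int)
        for entity_type, n in ends.items():
            for dst in outgoing[entity_type]:
                next_ends[dst] += n
        ends = dict(next_ends)
        counts[length] = sum(ends.values())
    return counts


if __name__ == "__main__":
    for length, n in count_path_types("author", 6).items():
        print(length, n)
```

Starting from "author", this toy schema yields 1, 4, 10, and 28 path types for lengths 1 through 4. The count of concrete paths between entity instances grows faster still, since each path type can match many instance-level paths.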

Introducing hidden variables into graphical models is believed to simplify model structure and improve prediction quality. The goal of this project is therefore to explore the benefit of hidden variable detection for relational learning. The task we will work on is social network analysis (Kok & Domingos, 2007).

An earlier approach based on signature detection is inefficient and does not guarantee improvement of the objective function (Elidan et al., 2000). A later approach based on multi-clustering suffers from a similar problem (Kok & Domingos, 2007). In this proposal, we plan to introduce hidden variables directly based on the gradient of the objective function (regularized data likelihood), which guarantees improvement of the objective function during training.


What you plan to do with what data

Hidden variable detection in CRFs, applied to the Social Network Analysis data (Kok & Domingos, 2007).

Why you think it’s interesting

Hidden variable detection is a fundamental goal of knowledge discovery and data modeling, and social network analysis seems to be the right kind of task to benefit from it.

Any relevant superpowers you might have

I have worked on CRF feature selection on the Social Network Analysis data, and on link prediction tasks based on path features of entity relation graphs.

How you plan to evaluate your work

We will follow the same evaluation as Kok & Domingos (2007).

What techniques you plan to use

Introduce hidden variables directly based on the gradient of the objective function (regularized data likelihood).
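
As a rough illustration of what a gradient-based criterion could look like, the sketch below writes out a regularized conditional log-likelihood and its gradient with respect to the weight of a candidate feature that is not yet in the model. The notation (training pairs (x_i, y_i), weights \theta, candidate feature f_h with weight w_h, regularizer R) is assumed here and does not come from the proposal. It covers only the standard log-linear case for an observed candidate feature; how candidates involving hidden variables would be constructed and scored is not specified above.

```latex
% Assumed notation: training pairs (x_i, y_i), weights \theta,
% candidate feature f_h with weight w_h, regularizer R.
O(\theta) = \sum_{i} \log p_\theta(y_i \mid x_i) - R(\theta)

% Gradient for a candidate weight currently at zero, assuming a
% log-linear model and a regularizer whose gradient vanishes at
% w_h = 0 (e.g. an L2 penalty):
\left. \frac{\partial O}{\partial w_h} \right|_{w_h = 0}
  = \sum_{i} \Big( f_h(x_i, y_i)
      - \mathbb{E}_{p_\theta(y \mid x_i)}\big[ f_h(x_i, y) \big] \Big)
```

Under these assumptions, a candidate with a nonzero gradient can increase the objective when its weight moves a small step in the gradient direction, which is the sense in which a gradient-based choice ties directly to improving the objective.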

What question you want to answer

Does the introduction of hidden variables improve prediction quality?

References

Elidan, G., Lotner, N., Friedman, N., & Koller, D. (2000). Discovering Hidden Variables: A Structure-Based Approach. Neural Information Processing Systems (NIPS), 2000.

Kemp, C., Tenenbaum, J. B., Griffiths, T. L., Yamada, T., & Ueda, N. (2006). Learning Systems of Concepts with an Infinite Relational Model. AAAI 2006.

Kok, S., & Domingos, P. (2007). Statistical Predicate Invention. Proceedings of the Twenty-Fourth International Conference on Machine Learning (pp. 433-440). Corvallis, Oregon: ACM Press.

Minkov, E., & Cohen, W. W. (2008). Learning Graph Walk Based Similarity Measures for Parsed Text. EMNLP 2008.

Minkov, E., & Cohen, W. W. (2007). Learning to Rank Typed Graph Walks: Local and Global Approaches. WebKDD and SNA-KDD joint workshop.

Lin, S.-D., & Chalupsky, H. (2008). Discovering and Explaining Abnormal Nodes in Semantic Graphs. IEEE Transactions on Knowledge and Data Engineering, 20(8), 1039-1052.