Difference between revisions of "Document representation and query expansion models for blog recommendation"

Revision as of 00:33, 5 November 2012

This is a summary of a research paper, written as part of Social Media Analysis 10-802, Fall 2012.

Citation

J. Arguello, J. L. Elsas, J. Callan, and J. G. Carbonell. Document representation and query expansion models for blog recommendation. In Proc. of the 2nd Intl. Conf. on Weblogs and Social Media (ICWSM), 2008.

Online Version

Direct PDF link

Abstract from the paper

We explore several different document representation models and two query expansion models for the task of recommending blogs to a user in response to a query. Blog relevance ranking differs from traditional document ranking in ad-hoc information retrieval in several ways: (1) the unit of output (the blog) is composed of a collection of documents (the blog posts) rather than a single document, (2) the query represents an ongoing – and typically multifaceted – interest in the topic rather than a passing ad-hoc information need and (3) due to the propensity of spam, splogs, and tangential comments, the blogosphere is particularly challenging to use as a source for high-quality query expansion terms. We address these differences at the document representation level, by comparing retrieval models that view either the blog or its constituent posts as the atomic units of retrieval, and at the query expansion level, by making novel use of the links and anchor text in Wikipedia to expand a user’s initial query. We develop two complementary models of blog retrieval that perform at comparable levels of precision and recall. We also show consistent and significant improvement across all models using our Wikipedia expansion strategy.

Summary

Overview

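This paper studies blog recommendation, that is, ranking whole blogs in response to a user's query. It compares document representation models that treat either the blog or its individual posts as the unit of retrieval, and it evaluates two query expansion models, one of which draws expansion terms from the links and anchor text in Wikipedia.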
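
To make the two document representation views from the abstract concrete, here is a minimal toy sketch. It is not the paper's retrieval model: the term-frequency `score` function, the averaging of post scores, and all names and sample data are assumptions chosen only for illustration.

```python
from collections import Counter


def score(text, query):
    """Toy relevance score (an assumption): share of tokens that match query terms."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[term] for term in query.lower().split()) / len(tokens)


def rank_blog_as_unit(blogs, query):
    """View 1: concatenate all posts of a blog into one document and score that."""
    scores = {name: score(" ".join(posts), query) for name, posts in blogs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def rank_post_as_unit(blogs, query):
    """View 2: score each post on its own, then average the post scores per blog.

    Averaging is a hypothetical aggregation rule used only for this sketch.
    """
    scores = {
        name: (sum(score(post, query) for post in posts) / len(posts)) if posts else 0.0
        for name, posts in blogs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Hypothetical sample data: blog name -> list of post texts.
    sample_blogs = {
        "cooking": ["pasta recipe tonight", "baking bread at home"],
        "travel": ["flight delays again", "a pasta dinner in rome"],
    }
    print(rank_blog_as_unit(sample_blogs, "pasta recipe"))
    print(rank_post_as_unit(sample_blogs, "pasta recipe"))
```

Both functions return blog names ordered from most to least relevant; the only difference is whether scoring happens before or after the posts are combined, which is the distinction the abstract draws between the two kinds of retrieval unit.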

Proposed Techniques

Evaluation

Discussion

Related Papers

Study Plan

Resources useful for understanding this paper