Z. Kou and W. Cohen. SDM 2007

From Cohen Courses

Revision as of 16:30, 23 October 2010

Citation

Zhenzhen Kou and William W. Cohen. Stacked Graphical Models for Efficient Inference in Markov Random Fields. In SDM 2007.

Online version

Stacked Graphical Models

Summary

This paper is an extension of Stacked Sequential Learning and shows how stacking can be used in non-sequential tasks, such as text region detection and document classification.

The paper reports that inference with the method is 40 to 80 times faster than Gibbs sampling. In the model, relational information is captured by augmenting each instance's features with the predictions of related instances. The authors apply the method to several problems using different datasets: the WebKB dataset, the Cora network, and the GENIA dataset.
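The augmentation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid learner `train_centroid`, the helper names, the per-label neighbor counts used as the aggregate, and the toy data are all hypothetical. It follows the usual stacking recipe, in which predictions for training instances come from cross-validation so that the second-stage learner sees realistic (imperfect) predictions.

```python
from typing import Callable, List

Vector = List[float]
Model = Callable[[Vector], int]                       # feature vector -> label
Trainer = Callable[[List[Vector], List[int]], Model]  # training data -> model


def train_centroid(X: List[Vector], y: List[int]) -> Model:
    """Toy base learner (an assumption): predict the nearest class centroid."""
    labels = sorted(set(y))
    centroids = {
        c: [sum(col) / len(col) for col in zip(*(x for x, t in zip(X, y) if t == c))]
        for c in labels
    }

    def model(x: Vector) -> int:
        return min(labels, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

    return model


def cross_val_predict(train: Trainer, X: List[Vector], y: List[int], k: int = 2) -> List[int]:
    """Predict every training instance with a model that was not trained on it."""
    preds = [0] * len(X)
    for fold in range(k):
        kept = [i for i in range(len(X)) if i % k != fold]
        model = train([X[i] for i in kept], [y[i] for i in kept])
        for i in range(fold, len(X), k):  # the held-out instances of this fold
            preds[i] = model(X[i])
    return preds


def augment(X: List[Vector], preds: List[int], neighbors: List[List[int]],
            labels: List[int]) -> List[Vector]:
    """Append, for each label, how many related instances were predicted as that label."""
    out = []
    for i, x in enumerate(X):
        counts = [sum(1 for j in neighbors[i] if preds[j] == c) for c in labels]
        out.append(list(x) + [float(v) for v in counts])
    return out


def fit_stacked(train: Trainer, X, y, neighbors, labels, k: int = 2):
    """Train a base model and a second model on the augmented features."""
    cv_preds = cross_val_predict(train, X, y, k)
    base = train(X, y)
    stacked = train(augment(X, cv_preds, neighbors, labels), y)

    def predict(X_new, neighbors_new):
        first_pass = [base(x) for x in X_new]  # pass 1: local predictions
        return [stacked(x) for x in augment(X_new, first_pass, neighbors_new, labels)]  # pass 2

    return predict


# Toy data: four instances on a line, each related to its adjacent instances.
X = [[0.0], [0.1], [0.9], [1.0]]
y = [0, 0, 1, 1]
neighbors = [[1], [0, 2], [1, 3], [2]]
predict = fit_stacked(train_centroid, X, y, neighbors, labels=[0, 1])
print(predict([[0.05], [0.95]], [[], []]))
```

In this sketch, inference needs only two passes over the data (one per model) rather than an iterative sampling procedure, which is consistent with the speed advantage over Gibbs sampling described above.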

Example Stacked Graphical Models Usage

text region detection

The task is to find the text regions in figures, where each figure may contain several panels. Relational information comes from the predictions for other candidate regions within the same panel and the predictions for the four neighboring panels (up, down, right, and left). The predictions for the candidate region and its neighboring regions are encoded as a vector of binary features indicating whether a character is found in those regions.
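One way to picture the binary relational features is the sketch below. It is illustrative only: it assumes panels sit on a regular grid addressed by `(row, col)`, that each panel holds a list of 0/1 first-stage predictions (1 meaning a character was found), and that the function name and the exact feature layout are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Tuple

Panel = Tuple[int, int]  # (row, col) in an assumed grid layout of panels


def relational_features(panel: Panel, candidate: int,
                        predictions: Dict[Panel, List[int]]) -> List[int]:
    """Return five binary features for one candidate region.

    Feature 0: any *other* candidate in the same panel was predicted as text.
    Features 1-4: any candidate in the up, down, right, left panel was predicted as text.
    """
    row, col = panel
    own = predictions.get(panel, [])
    others = [p for i, p in enumerate(own) if i != candidate]
    features = [int(any(others))]
    for nb in ((row - 1, col), (row + 1, col), (row, col + 1), (row, col - 1)):
        # A missing neighbor panel (figure border) contributes 0.
        features.append(int(any(predictions.get(nb, []))))
    return features


# Toy figure with three panels; panel (0, 0) has two candidate regions.
predictions = {(0, 0): [1, 0], (0, 1): [0], (1, 0): [1]}
print(relational_features((0, 0), 1, predictions))
```

These features would be appended to the candidate's own local features before training the second-stage classifier.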

webpage classification

The task is to classify web pages into different categories such as . Relational information comes from the number of incoming and outgoing links in each category.
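The link-based features can be sketched as follows. This is a rough illustration, assuming links are given as `(source, target)` pairs and that first-stage predicted categories are available for linked pages; the function name and the placeholder category labels `"A"` and `"B"` are hypothetical and do not come from the paper.

```python
from collections import Counter
from typing import Dict, List, Tuple


def link_category_features(page: str, links: List[Tuple[str, str]],
                           predicted: Dict[str, str],
                           categories: List[str]) -> List[int]:
    """Count incoming and outgoing links of `page` per predicted category.

    Returns [incoming count per category..., outgoing count per category...].
    Linked pages without a prediction are ignored.
    """
    incoming = Counter(predicted[s] for s, t in links if t == page and s in predicted)
    outgoing = Counter(predicted[t] for s, t in links if s == page and t in predicted)
    return [incoming[c] for c in categories] + [outgoing[c] for c in categories]


# Toy link graph with placeholder categories "A" and "B".
links = [("p1", "p2"), ("p3", "p2"), ("p2", "p4")]
predicted = {"p1": "A", "p2": "B", "p3": "B", "p4": "A"}
print(link_category_features("p2", links, predicted, ["A", "B"]))
```

The resulting counts are appended to the page's own features (for example, its words) as input to the second-stage classifier.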

NER

Relational information comes from nearby words. In this case, the Stacked Graphical Model is the same as Stacked Sequential Learning (SSL).
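For the sequential case, "nearby words" amounts to a sliding window over first-stage predictions. The sketch below is an assumption-laden illustration: the window size `w`, the inclusion of the token's own prediction, the padding tag `"O"`, and the BIO-style tags in the toy data are choices made here, not details stated in this summary.

```python
from typing import List


def window_features(preds: List[str], i: int, w: int, pad: str = "O") -> List[str]:
    """Return the predictions for tokens i-w .. i+w, padding past the sentence ends."""
    return [preds[j] if 0 <= j < len(preds) else pad for j in range(i - w, i + w + 1)]


# Toy first-stage predictions for a four-token sentence.
preds = ["O", "B", "I", "O"]
print(window_features(preds, 2, 1))
```

Each token's window is appended to its local features before the second-stage tagger is trained.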