A Clustering Approach for the Nearly Unsupervised Recognition of Nonliteral Language, EACL-2006

From Cohen Courses
Revision as of 16:48, 7 November 2012

Citation

Birke, J. and A. Sarkar. 2006. A clustering approach for the nearly unsupervised recognition of nonliteral language. In Proceedings of EACL-06, pages 329–336.

Online Version

pdf link to the paper

Method Summary

  • TroFi (TropeFinder) System
  1. Task: Classifying literal and nonliteral usages of verbs
  2. Approach: Use nearly unsupervised word-sense disambiguation and clustering techniques
  • Processing Steps
  1. KE Algorithm: Similarity-based word-sense disambiguation algorithm
    • Similarities are calculated between:
      1. Sentences containing the word we wish to disambiguate (the target word)
      2. Collections of seed sentences (feedback sets)
  2. Clean the Feedback Sets
    • In order to remove false attraction
    • Four Principles of Scrubbing
      1. Human annotations (in DoKMIE) are reliable
      2. Phrasal and expression verbs are often indicative of nonliteral uses
      3. Content words appearing in both feedback sets should be avoided
      4. Learning & voting: Use four learners (A, B, C, D) to vote on the best scrubbing action
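
The attraction step above can be sketched in simplified form. The snippet below is NOT the authors' exact KE algorithm; it is a minimal stand-in that uses word-overlap (Jaccard) similarity to assign a target sentence to whichever feedback set attracts it more strongly. The seed sentences for the verb "grasp" are hypothetical examples.

```python
# Simplified sketch of TroFi-style attraction (assumption: Jaccard word
# overlap as a stand-in for the similarity measure in the KE algorithm).

def similarity(sentence_a, sentence_b):
    """Word-overlap (Jaccard) similarity between two sentences."""
    a, b = set(sentence_a.lower().split()), set(sentence_b.lower().split())
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def attraction(target, feedback_set):
    """Total attraction of a target sentence to one feedback set."""
    return sum(similarity(target, seed) for seed in feedback_set)

def classify(target, literal_seeds, nonliteral_seeds):
    """Assign the target to whichever feedback set attracts it more."""
    lit = attraction(target, literal_seeds)
    nonlit = attraction(target, nonliteral_seeds)
    return "literal" if lit >= nonlit else "nonliteral"

# Hypothetical seed sentences for the verb "grasp"
literal_seeds = ["she grasped the rope tightly", "he grasped the handle"]
nonliteral_seeds = ["she grasped the concept quickly",
                    "he quickly grasped the theory"]

print(classify("the student grasped the theory at once",
               literal_seeds, nonliteral_seeds))  # -> nonliteral
```

In the real system, scrubbing the feedback sets (the four principles above) matters precisely because any content word shared by both sets creates false attraction in a similarity measure like this one.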

Result

  1. The TroFi algorithm achieved an F1-score of 0.538, outperforming the baseline by 24.4% (on human-labeled data)
  2. Built the TroFi Example Base, a freely available metaphor-annotated resource

Discussion and Thought

  1. This work explores an approach to metaphor identification that is relatively rarely discussed. Compared with selectional-restriction modeling or lexicon-based methods, this method requires less human involvement and adopts techniques borrowed from word sense disambiguation