Difference between revisions of "Class meeting for 10-605 Rocchio and Hadoop Workflows"

Latest revision as of 16:16, 14 October 2015

Joachims, Thorsten, A Probabilistic Analysis of the Rocchio Algorithm with TFIDF for Text Categorization. Proceedings of the International Conference on Machine Learning (ICML), 1997.
Rocchio, J. J., Relevance Feedback in Information Retrieval, in The SMART Retrieval System: Experiments in Automatic Document Processing, Prentice Hall Inc., 1971.
Schapire et al, Boosting and Rocchio applied to text filtering, SIGIR 98.
Littlestone, Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm, MLJ 1988. Includes the mistake-bound theory.

@@ Line 32: / Line 32: @@
 * Schapire et al, [http://dl.acm.org/citation.cfm?id=290996 Boosting and Rocchio applied to text filtering], SIGIR 98.
 * Littlestone, [http://www.springerlink.com/index/X1022977778L1777.pdf Learning quickly when irrelevant attributes abound: A new linear-threshold algorithm], MLJ 1988. Includes the mistake-bound theory.
+=== Things to Remember ===
+* The TFIDF representation for documents.
+* The Rocchio algorithm.
+* Why Rocchio is easy to parallelize.
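
The three points above can be tied together with a minimal sketch. It is not taken from the lecture materials: the TFIDF weighting used here (log(tf+1) * log(N/df), scaled to unit length) is one common variant, the prototype is a simplified centroid-only form of Rocchio that omits the negative-example term, and all function names are hypothetical. The point it illustrates is that per-class sums of document vectors are additive, so each data shard can be summed independently and the partial results merged, which is what a Hadoop mapper/combiner/reducer pipeline would exploit.

```python
import math
from collections import Counter, defaultdict

def tfidf_vectors(docs):
    """docs: list of token lists. Returns unit-length sparse TFIDF dicts.
    Assumed weighting: log(tf + 1) * log(N / df)."""
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each word once per document
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vec = {w: math.log(c + 1) * math.log(n / df[w]) for w, c in tf.items()}
        norm = math.sqrt(sum(x * x for x in vec.values()))
        vectors.append({w: x / norm for w, x in vec.items()} if norm > 0 else {})
    return vectors

def partial_sums(vectors, labels):
    """Per-class vector sums and document counts for one shard of data."""
    sums = defaultdict(lambda: defaultdict(float))
    counts = Counter()
    for vec, y in zip(vectors, labels):
        counts[y] += 1
        for w, x in vec.items():
            sums[y][w] += x
    return sums, counts

def merge(a, b):
    """Combine two shards' partial results; addition makes this order-free."""
    sums_a, counts_a = a
    sums_b, counts_b = b
    for y, vec in sums_b.items():
        for w, x in vec.items():
            sums_a[y][w] += x
    counts_a.update(counts_b)  # Counter.update adds counts
    return sums_a, counts_a

def prototypes(sums, counts):
    """Simplified Rocchio prototype: the mean vector of each class."""
    return {y: {w: x / counts[y] for w, x in vec.items()}
            for y, vec in sums.items()}

def classify(vec, protos):
    """Pick the class whose prototype has the largest dot product with vec."""
    def dot(u, v):
        return sum(x * v.get(w, 0.0) for w, x in u.items())
    return max(protos, key=lambda y: dot(vec, protos[y]))

docs = [["cat", "purr"], ["cat", "meow"], ["dog", "bark"], ["dog", "woof"]]
labels = ["cat", "cat", "dog", "dog"]
vecs = tfidf_vectors(docs)

# Two shards processed independently, then merged.
merged = merge(partial_sums(vecs[:2], labels[:2]),
               partial_sums(vecs[2:], labels[2:]))
protos = prototypes(*merged)
prediction = classify(vecs[0], protos)
```

Note that the document frequencies are computed over the whole corpus before sharding; in a distributed setting that would be its own counting pass.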