Difference between revisions of "Class meeting for 10-605 in Fall 2016 Streaming Naive Bayes"

Latest revision as of 11:26, 11 August 2017

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.

Slides

Slides in PowerPoint and in PDF: the stream-and-sort pattern and large-vocabulary Naive Bayes
Today's quiz

Readings for the Class

Required: my notes on streaming and Naive Bayes
Optional: If you're interested in reading more about smoothing for naive Bayes, I recommend this paper: Peng, Fuchun, Dale Schuurmans, and Shaojun Wang. "Augmenting naive Bayes classifiers with statistical language models." Information Retrieval 7.3 (2004): 317-345.

Things to Remember

Zipf's law and the prevalence of rare features/words
Communication complexity
Stream and sort
- Complexity of merge sort
- How pipes implement parallel processing
- How buffering output before a sort can improve performance
- How stream-and-sort can implement event-counting for naive Bayes
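
The last two items above can be illustrated with a minimal Python sketch. It is not taken from the slides or notes. It assumes training examples arrive as (label, words) pairs, and the event names ("Y=ANY", "Y=label", "Y=label,W=word"), the function names, and the buffer_size parameter are all hypothetical choices made for illustration. The in-memory sorted() call stands in for an external sort between two streaming processes, as in a pipeline of the form "emit | sort | sum".

```python
from collections import Counter
from itertools import groupby


def emit_events(examples, buffer_size=1000):
    """Stream over examples and yield (event, count) messages.

    Counts are pre-aggregated in a small bounded buffer so that fewer
    messages reach the sort step. Event names are illustrative only.
    """
    buffer = Counter()
    for label, words in examples:
        buffer["Y=ANY"] += 1
        buffer[f"Y={label}"] += 1
        for word in words:
            buffer[f"Y={label},W={word}"] += 1
        # Flush once the buffer holds enough distinct events.
        if len(buffer) >= buffer_size:
            yield from buffer.items()
            buffer.clear()
    # Flush whatever is left at the end of the stream.
    yield from buffer.items()


def sum_sorted(messages):
    """Sum counts of adjacent equal keys in a key-sorted message stream.

    Only one key's running total is held at a time, so memory use does
    not grow with the vocabulary size.
    """
    for key, group in groupby(messages, key=lambda message: message[0]):
        yield key, sum(count for _, count in group)


def count_events(examples, buffer_size=1000):
    """Emit, sort, then sum: the stream-and-sort pattern in one process."""
    # Assumption: sorted() stands in for an external sort such as Unix `sort`.
    messages = sorted(emit_events(examples, buffer_size))
    return dict(sum_sorted(messages))


if __name__ == "__main__":
    toy = [("sports", ["ball", "game", "ball"]), ("politics", ["vote", "game"])]
    for event, count in sorted(count_events(toy).items()):
        print(event, count)
```

A larger buffer_size produces fewer messages for the sort to handle, at the cost of more memory in the emitting process. The final counts are the same for any buffer size.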

@@ Line 1: / Line 1: @@
-This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2013|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Spring_2013]].
+This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall 2016]].
 === Slides ===
-* [http://www.cs.cmu.edu/~wcohen/10-605/stream-nb.pptx Slides 1 - streaming Naive Bayes]
+* [http://www.cs.cmu.edu/~wcohen/10-605/2016/stream-and-sort.pptx Slides in Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/2016/stream-and-sort.pdf in PDF] - the stream-and-sort pattern, and large-vocabulary Naive Bayes
-* [http://www.cs.cmu.edu/~wcohen/10-605/stream-and-sort.pptx Slides 2 - the stream-and-sort pattern, and large-vocabulary Naive Bayes]
+* [https://qna-app.appspot.com/edit_new.html#/pages/view/aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgIDQy6a0CAw Today's quiz]
+=== Readings for the Class ===
-=== Readings for the Class ===
+* Required: [http://www.cs.cmu.edu/~wcohen/10-605/notes/scalable-nb-notes.pdf my notes on streaming and Naive Bayes]
+* Optional:  If you're interested in reading more about smoothing for naive Bayes, I recommend this paper:  Peng, Fuchun, Dale Schuurmans, and Shaojun Wang. "Augmenting naive Bayes classifiers with statistical language models." Information Retrieval 7.3 (2004): 317-345.
+=== Things to Remember ===
-* None required.  If you're interested in reading more about smoothing for naive Bayes, I recommend this paper:  Peng, Fuchun, Dale Schuurmans, and Shaojun Wang. "Augmenting naive Bayes classifiers with statistical language models." Information Retrieval 7.3 (2004): 317-345.
+* Zipf's law and the prevalence of rare features/words
+* Communication complexity
+* Stream and sort
+** Complexity of merge sort
+** How pipes implement parallel processing
+** How buffering output before a sort can improve performance
+** How stream-and-sort can implement event-counting for naive Bayes
