Class meeting for 10-605 in Fall 2016 Streaming Naive Bayes
From Cohen Courses
Latest revision as of 11:26, 11 August 2017
This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.
Slides
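- Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-605/2016/stream-and-sort.pptx), in PDF (http://www.cs.cmu.edu/~wcohen/10-605/2016/stream-and-sort.pdf): the stream-and-sort pattern and large-vocabulary Naive Bayes
- Today's quiz (https://qna-app.appspot.com/edit_new.html#/pages/view/aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgIDQy6a0CAw)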
Readings for the Class
- Required: my notes on streaming and Naive Bayes (http://www.cs.cmu.edu/~wcohen/10-605/notes/scalable-nb-notes.pdf)
- Optional: If you're interested in reading more about smoothing for naive Bayes, I recommend this paper: Peng, Fuchun, Dale Schuurmans, and Shaojun Wang. "Augmenting naive Bayes classifiers with statistical language models." Information Retrieval 7.3 (2004): 317-345.
Things to Remember
- Zipf's law and the prevalence of rare features/words
- Communication complexity
- Stream and sort
  - Complexity of merge sort
  - How pipes implement parallel processing
  - How buffering output before a sort can improve performance
  - How stream-and-sort can implement event-counting for naive Bayes
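The last two points can be illustrated with a minimal Python sketch. It is not taken from the slides or notes: the event-key format ("Y=ANY", "Y=label,W=word"), the function names, and the buffer_size parameter are all assumptions made for illustration, and an in-memory sorted() call stands in for an external sort between two piped processes. The first stage emits (event, count) messages, pre-summing them in a small buffer so that fewer messages reach the sort; the second stage sums adjacent messages with equal keys, which needs only constant memory because the input is sorted.

```python
from collections import Counter
from itertools import groupby


def emit_events(docs, buffer_size=1000):
    """Yield (event, count) messages for naive Bayes training.

    docs is an iterable of (label, words) pairs. Counts are pre-summed in a
    small in-memory buffer that is flushed once it holds buffer_size distinct
    events, so fewer messages are sent to the sort.
    """
    buffer = Counter()
    for label, words in docs:
        buffer["Y=ANY"] += 1                     # total number of documents
        buffer[f"Y={label}"] += 1                # documents with this label
        for w in words:
            buffer[f"Y={label},W={w}"] += 1      # word occurrences per label
            buffer[f"Y={label},W=ANY"] += 1      # total words per label
        if len(buffer) >= buffer_size:
            yield from buffer.items()
            buffer.clear()
    yield from buffer.items()                    # flush whatever remains


def sum_sorted(messages):
    """Sum the counts of adjacent messages that share an event key."""
    for event, group in groupby(messages, key=lambda m: m[0]):
        yield event, sum(count for _, count in group)


def stream_and_sort_counts(docs, buffer_size=1000):
    """Run emit -> sort -> sum; sorted() is a stand-in for an external sort."""
    messages = sorted(emit_events(docs, buffer_size))
    return dict(sum_sorted(messages))
```

For example, calling stream_and_sort_counts on a list such as [("sports", ["ball", "game", "ball"]), ("news", ["game"])] returns a dictionary of event counts. The buffer size changes how many messages are emitted, but not the final counts.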