Class meeting for 10-605 in Fall 2016 Overview

This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall 2016]].

=== Slides ===

* [http://www.cs.cmu.edu/~wcohen/10-605/overview.pptx Slides in Powerpoint]
* [http://www.cs.cmu.edu/~wcohen/10-605/overview.pdf Slides in PDF]

=== Homework ===

* Before the next class: review your probabilities! You should be familiar with the material in these lectures:
** [https://mediatech-stream.andrew.cmu.edu/Mediasite/Play/9e04feebd4bb4900a8c828388be620d91d?catalog=81e613d0-fda8-47a4-8340-86b96d5a3cbb my overview lecture from 10-601] (lecture from 1-13-2016)
** [https://mediatech-stream.andrew.cmu.edu/Mediasite/Play/e99b074dadb24a11a68b6dae418ac9a91d?catalog=81e613d0-fda8-47a4-8340-86b96d5a3cbb the first 20 minutes of the second overview lecture for 10-601] (lecture from 1-16-2016, up to the 'joint distribution' section)
The slides used in these lectures are [[10-601_Introduction_to_Probability|posted here]], along with some review notes for what is covered.

After each lecture in this class there will be a quiz.
* [https://qna-app.appspot.com/edit_new.html#/pages/view/aglzfnFuYS1hcHByGQsSDFF1ZXN0aW9uTGlzdBiAgIDQqdaqCQw Today's quiz]
+ | |||
+ | === Readings for the Class === | ||
+ | |||
+ | * [http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/35179.pdf The Unreasonable Effectiveness of Data] - Halevy, Pereira, Norvig | ||
+ | |||
+ | === Also discussed === | ||
+ | * [http://www-2.cs.cmu.edu/~wcohen/postscript/ijcai-93.ps William W. Cohen (1993): Efficient pruning methods for separate-and-conquer rule learning systems in IJCAI 1993: 988-994] | ||
+ | * [http://www-2.cs.cmu.edu/~wcohen/postscript/ml-95-ripper.ps William W. Cohen (1995): Fast effective rule induction in ICML 1995: 115-123.] | ||
+ | * [http://dl.acm.org/citation.cfm?id=1073017&bnc=1 Scaling to very very large corpora for natural language disambiguation], Banko & Brill, ACL 2001 | ||
+ | |||
+ | === Things to remember === | ||
+ | |||
+ | * Why use big data? | ||
** Simple learning methods with large datasets can outperform complex learners trained on smaller datasets
** The ordering of learning methods, best-to-worst, can be different for small datasets than for large datasets (see the learning-curve sketch after this list)
** The best way to improve performance for a learning system is often to collect more data
** Large datasets often imply large classifiers

* Asymptotic analysis
** Measures the number of operations as a function of problem size
** Different operations (e.g., disk seeks, sequential scans, memory accesses) can have very different costs (see the timing sketch after this list)
** Disk access is cheapest when you scan sequentially
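
To make the last two points concrete, here is a minimal timing sketch (an assumption, not course code) that reads the same file once sequentially and once in random seek order. On a spinning disk with a cold cache the random pass is dramatically slower; note that the OS page cache can mask the gap for small files:

<syntaxhighlight lang="python">
# Time sequential vs. random reads of the same file: the operation count is
# identical, but the costs are not -- seeks dominate the random pass.
import os
import random
import time

PATH = "scratch.bin"   # hypothetical scratch file, created below
BLOCK = 4096           # read in 4 KB blocks
N_BLOCKS = 25_000      # about 100 MB of data

# Create the test file once.
if not os.path.exists(PATH):
    with open(PATH, "wb") as f:
        f.write(os.urandom(BLOCK * N_BLOCKS))

# Sequential scan: read the blocks in file order.
t0 = time.time()
with open(PATH, "rb") as f:
    for _ in range(N_BLOCKS):
        f.read(BLOCK)
print(f"sequential scan: {time.time() - t0:.2f}s")

# Random access: seek to the same blocks in shuffled order.
order = list(range(N_BLOCKS))
random.shuffle(order)
t0 = time.time()
with open(PATH, "rb") as f:
    for i in order:
        f.seek(i * BLOCK)
        f.read(BLOCK)
print(f"random access:   {time.time() - t0:.2f}s")
</syntaxhighlight>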