Class meeting for 10-605 in Fall 2016 Overview

From Cohen Courses
Revision as of 16:33, 1 August 2017

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.

Slides

  • Slides in Powerpoint: http://www.cs.cmu.edu/~wcohen/10-605/2016/overview.pptx
  • Slides in PDF: http://www.cs.cmu.edu/~wcohen/10-605/2016/overview.pdf

Homework

  • Today's quiz: [1]

Readings for the Class

Also discussed

Things to remember

  • Why use big data?
    • Simple learning methods with large data sets can outperform complex learners with smaller datasets
    • The ordering of learning methods, best-to-worst, can be different for small datasets than for large datasets
    • The best way to improve performance for a learning system is often to collect more data
    • Large datasets often imply large classifiers
  • Asymptotic analysis
    • It measures the number of operations as a function of problem size
    • Different operations (e.g., disk seeking, scanning, memory access) can have very different costs
    • Disk access is cheapest when you scan sequentially
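
The points on asymptotic analysis can be illustrated with a toy cost model. The sketch below is only an illustration: the constants `SEEK_COST_MS` and `SCAN_COST_MS_PER_MB` and both function names are hypothetical and are not taken from the course materials or from any measurement. Both access patterns touch the same number of records, so their operation counts grow the same way with problem size, yet the assumed per-operation costs make the totals very different.

```python
# Toy cost model. All constants are illustrative assumptions, not measurements.
SEEK_COST_MS = 10.0          # hypothetical cost of one disk seek
SCAN_COST_MS_PER_MB = 0.01   # hypothetical cost of reading 1 MB sequentially


def random_access_cost_ms(num_records: int) -> float:
    """Total cost if every record lookup requires its own seek."""
    return num_records * SEEK_COST_MS


def sequential_scan_cost_ms(num_records: int, record_size_mb: float) -> float:
    """Total cost of one seek followed by a single sequential pass."""
    return SEEK_COST_MS + num_records * record_size_mb * SCAN_COST_MS_PER_MB


if __name__ == "__main__":
    # Compare the two access patterns at two problem sizes.
    for n in (1_000, 1_000_000):
        print(n, random_access_cost_ms(n), sequential_scan_cost_ms(n, 0.001))
```

Under these assumed constants the sequential scan is cheaper at every problem size, and the gap widens as the number of records grows.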