Class meeting for 10-405 Deep Learning

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-405 in Spring 2018.

Slides
 
* Lecture 1: [http://www.cs.cmu.edu/~wcohen/10-405/deep-1.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-405/deep-1.pdf PDF]

* Lecture 2: [http://www.cs.cmu.edu/~wcohen/10-405/deep-2.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-405/deep-2.pdf PDF]

* Lecture 3: [http://www.cs.cmu.edu/~wcohen/10-405/deep-3.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-405/deep-3.pdf PDF] (draft)


Quizzes

These are not updated yet. --Wcohen (talk) 14:58, 19 March 2018 (EDT)

Sample code

Readings

Things to remember

  • The underlying reasons deep networks are hard to train (see the first sketch after this list)
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks:
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax (see the second sketch after this list)
  • How backprop can be generalized to a sequence of assignment operations (autodiff)
    • Wengert lists
    • How to evaluate and differentiate a Wengert list (see the third sketch after this list)
  • Common architectures (see the last sketch after this list)
    • Multi-layer perceptrons (MLPs)
    • Recurrent neural networks (RNNs) and long short-term memory networks (LSTMs)
    • Convolutional neural networks (CNNs)
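
To make the first bullet concrete, here is a minimal numpy sketch (my own illustration, not course code; the depth, width, and initialization are arbitrary assumptions) that backpropagates a unit gradient through a stack of sigmoid layers and prints how small it has become, which is the vanishing-gradient and saturation problem in miniature:

<pre>
# Sketch: gradient magnitude through a deep sigmoid network (illustrative only)
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

depth, width = 20, 50   # assumed sizes, chosen only for the demo
Ws = [rng.normal(0, 1.0 / np.sqrt(width), (width, width)) for _ in range(depth)]

# Forward pass, saving each layer's output for backprop.
h = rng.normal(size=width)
hs = [h]
for W in Ws:
    h = sigmoid(W @ h)
    hs.append(h)

# Backward pass, starting from an arbitrary unit gradient at the top.
g = np.ones(width)
for W, h_out in zip(reversed(Ws), reversed(hs[1:])):
    g = W.T @ (g * h_out * (1.0 - h_out))   # sigmoid'(z) = s(z)(1 - s(z)) <= 0.25
print("gradient norm after", depth, "layers:", np.linalg.norm(g))
</pre>

Because each factor h_out * (1 - h_out) is at most 0.25, the norm typically collapses toward zero; with much larger weights the same loop can instead blow up, which is the exploding-gradient side of the problem.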
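
For the ReLU, cross-entropy, and softmax bullet, a minimal numpy sketch of the three definitions (the max-subtraction trick is standard practice, but this is my paraphrase, not the course's sample code):

<pre>
import numpy as np

def relu(x):
    # elementwise max(0, x); does not saturate for positive inputs
    return np.maximum(0.0, x)

def softmax(z):
    # subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, y):
    # negative log-probability of the true class index y
    return -np.log(p[y])

z = np.array([2.0, -1.0, 0.5])
p = softmax(relu(z))
print(p, cross_entropy(p, y=0))
</pre>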
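
For Wengert lists, the sketch below shows one plausible encoding of f(x1, x2) = (x1*x2 + x2)^2 as a sequence of assignments, how to evaluate it in order, and how to differentiate it by walking the list in reverse (reverse-mode autodiff). The tape format, operation names, and helpers are illustrative assumptions, not the course's notation:

<pre>
# A Wengert list: each step assigns a fresh variable from earlier ones.
tape = [
    ("z1", "mul", ("x1", "x2")),   # z1 = x1 * x2
    ("z2", "add", ("z1", "x2")),   # z2 = z1 + x2
    ("y",  "sq",  ("z2",)),        # y  = z2 ** 2
]

def evaluate(tape, env):
    # run the assignments in order, storing every intermediate in env
    ops = {"mul": lambda a, b: a * b,
           "add": lambda a, b: a + b,
           "sq":  lambda a: a * a}
    for out, op, args in tape:
        env[out] = ops[op](*[env[a] for a in args])
    return env

def backward(tape, env):
    # walk the list in reverse, accumulating d(final output)/d(variable)
    grad = {name: 0.0 for name in env}
    grad[tape[-1][0]] = 1.0
    for out, op, args in reversed(tape):
        g = grad[out]
        if op == "mul":
            a, b = args
            grad[a] += g * env[b]
            grad[b] += g * env[a]
        elif op == "add":
            for a in args:
                grad[a] += g
        elif op == "sq":
            (a,) = args
            grad[a] += g * 2.0 * env[a]
    return grad

env = evaluate(tape, {"x1": 3.0, "x2": 2.0})
grads = backward(tape, env)
print(env["y"], grads["x1"], grads["x2"])   # 64.0 32.0 64.0
</pre>

Evaluating the list costs one forward pass, and the reverse pass reuses the stored intermediates; that is the sense in which backprop generalizes to any sequence of assignment operations.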
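
Finally, a sketch of the simplest architecture in the last bullet group: a one-hidden-layer MLP written purely as matrix operations, the form that maps directly onto batched GPU matrix multiplies (all shapes and names are assumptions for illustration):

<pre>
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hidden, n_out, batch = 4, 8, 3, 5   # assumed sizes

W1 = rng.normal(0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)

X = rng.normal(size=(batch, n_in))      # a batch of inputs, one example per row
H = np.maximum(0.0, X @ W1 + b1)        # hidden layer with ReLU
logits = H @ W2 + b2
P = np.exp(logits - logits.max(axis=1, keepdims=True))
P /= P.sum(axis=1, keepdims=True)       # row-wise softmax over classes
print(P.shape)                          # (5, 3): class probabilities per example
</pre>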