10-601 Deep Learning 1


This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.

Slides

  • Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pdf)
  • Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pdf)

Readings

This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:

  • Neural Networks and Deep Learning, an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with exercises and online sample programs in Python.

For more detail, look at the MIT Press book (in preparation) by Goodfellow, Bengio, and Courville - it's very complete but also fairly technical.

Things to remember

  • The underlying reasons deep networks are hard to train (see the first sketch after this list)
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks (second sketch below):
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax
  • Convolutional networks (third sketch below)
    • 2-d convolution
    • How to construct a convolution layer
    • Architecture of a CNN: convolution/downsampling pairs
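
Why the list starts with vanishing gradients and saturation: backprop through a sigmoid layer multiplies the gradient by sigmoid'(z), which is at most 0.25 and nearly zero once the unit saturates, so the product over many layers shrinks geometrically. A minimal numpy sketch of this arithmetic (illustrative only, not taken from the lecture materials):

```python
# Sketch: why deep sigmoid networks suffer from vanishing gradients.
# Backprop multiplies in one sigmoid derivative per layer, and
# sigmoid'(z) <= 0.25 everywhere, so the product shrinks with depth.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# The derivative peaks at z = 0 (0.25) and is tiny once the unit saturates.
for z in [0.0, 2.0, 5.0]:
    print(f"sigmoid'({z}) = {sigmoid_prime(z):.4f}")

# Gradient scale after backpropagating through `depth` sigmoid layers,
# assuming (optimistically) every unit sits at the z = 0 sweet spot.
for depth in [2, 10, 30]:
    print(f"depth {depth:2d}: gradient factor ~ {0.25 ** depth:.2e}")
```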
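The advances in the second bullet compose naturally: write the forward pass over a whole minibatch as matrix multiplies (the form GPUs accelerate), use ReLU hidden units, and score classes with a softmax plus cross-entropy loss. The sketch below is one illustrative way to combine them; all shapes, weights, and names are invented for the example, not course code:

```python
# Sketch: a minibatch forward pass as matrix operations, with a ReLU
# hidden layer, softmax output, and cross-entropy loss.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # max(0, z) elementwise; its derivative is 0 or 1, so it does not
    # saturate for positive inputs the way sigmoid does.
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability of the correct class.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

# Toy minibatch: 32 examples, 20 features, 5 hidden units, 3 classes.
X = rng.normal(size=(32, 20))
y = rng.integers(0, 3, size=32)
W1, b1 = rng.normal(size=(20, 5)) * 0.1, np.zeros(5)
W2, b2 = rng.normal(size=(5, 3)) * 0.1, np.zeros(3)

H = relu(X @ W1 + b1)        # hidden layer: one matrix multiply
P = softmax(H @ W2 + b2)     # class probabilities
print("loss:", cross_entropy(P, y))  # ~ log(3) for random weights
```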
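Finally, a naive sketch of 2-d convolution, the core operation in a convolution layer: slide a small filter across the image and take an elementwise product-and-sum at each offset. Real layers apply many filters at once and alternate with downsampling, as the last bullet notes; the function and example below are illustrative assumptions, not course code:

```python
# Sketch: valid-mode 2-d convolution (strictly, cross-correlation,
# which is what CNN layers actually compute) via an explicit loop.
import numpy as np

def conv2d(image, kernel):
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product of the filter with one image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 vertical-edge filter on a toy 6x6 image whose left half is
# bright: the response peaks at the left/right boundary.
image = np.zeros((6, 6))
image[:, :3] = 1.0
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)
print(conv2d(image, edge_filter))
```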