Difference between revisions of "10-601 Deep Learning 1"

Revision as of 14:06, 5 April 2016

This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.

Slides

Readings

This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:

I also used some on-line visualizations in the materials for the lecture, especially the part on ConvNets.

For more detail, look at the MIT Press book (in preparation) from Bengio: http://www.deeplearningbook.org/


Summary

  • The underlying reasons deep networks are hard to train
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks:
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax
  • Convolutional networks
    • 2-d convolution
    • How to construct a convolution layer
    • Architecture of CNN: convolution/downsampling pairs
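
As a rough illustration of the last group of topics, the sketch below applies one convolution/downsampling pair to a toy input using NumPy. It is not taken from the lecture slides: the function names (conv2d_valid, relu, max_pool2d), the 6x6 input, and the 2x2 kernel are all assumptions chosen for illustration. Two further assumptions are that "2-d convolution" is implemented as the unflipped sliding-window product (cross-correlation) common in CNN libraries, and that downsampling is 2x2 max-pooling.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding ('valid' mode).

    Assumption: the kernel is not flipped, i.e. this is cross-correlation,
    which is what CNN libraries usually call convolution.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Elementwise product of the kernel with one image patch, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(x, 0.0)


def max_pool2d(x, size=2):
    """Downsample by taking the max over non-overlapping size x size blocks.

    Rows and columns that do not fill a complete block are dropped.
    """
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


# Hypothetical toy input: a 6x6 "image" whose value grows along both axes.
image = np.arange(36, dtype=float).reshape(6, 6)

# Hypothetical 2x2 kernel that responds to increase along the diagonal.
kernel = np.array([[-1.0, 0.0],
                   [0.0, 1.0]])

feature_map = relu(conv2d_valid(image, kernel))  # shape (5, 5)
pooled = max_pool2d(feature_map)                 # shape (2, 2)
```

In a trained network the kernel entries would be learned by gradient descent rather than fixed by hand, and a layer would typically hold many kernels, each producing its own feature map.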