10-601 Deep Learning 1


This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.

Slides

  • Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pdf)
  • Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pdf)

Readings

This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:

  • Neural Networks and Deep Learning, an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with exercises and online sample programs in Python.

For more detail, look at the MIT Press book (in preparation) by Goodfellow, Bengio, and Courville - it's very complete but also fairly technical.

Things to remember

  • The underlying reasons deep networks are hard to train (see the first sketch after this list)
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks (second sketch below):
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax
  • Convolutional networks (third sketch below)
    • 2-d convolution
    • How to construct a convolution layer
    • Architecture of a CNN: convolution/downsampling pairs
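
Why the list starts with vanishing gradients and saturation: backprop through a sigmoid layer multiplies the gradient by sigmoid'(z), which is at most 0.25 and nearly zero once the unit saturates, so the product over many layers shrinks geometrically. A minimal numpy sketch of this arithmetic (illustrative only, not taken from the lecture materials):

```python
# Sketch: why deep sigmoid networks suffer from vanishing gradients.
# Backprop multiplies in one sigmoid derivative per layer, and
# sigmoid'(z) <= 0.25 everywhere, so the product shrinks with depth.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# The derivative peaks at z = 0 (0.25) and is tiny once the unit saturates.
for z in [0.0, 2.0, 5.0]:
    print(f"sigmoid'({z}) = {sigmoid_prime(z):.4f}")

# Gradient scale after backpropagating through `depth` sigmoid layers,
# assuming (optimistically) every unit sits at the z = 0 sweet spot.
for depth in [2, 10, 30]:
    print(f"depth {depth:2d}: gradient factor ~ {0.25 ** depth:.2e}")
```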
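The advances in the second bullet compose naturally: write the forward pass over a whole minibatch as matrix multiplies (the form GPUs accelerate), use ReLU hidden units, and score classes with a softmax plus cross-entropy loss. The sketch below is one illustrative way to combine them; all shapes, weights, and names are invented for the example, not course code:

```python
# Sketch: a minibatch forward pass as matrix operations, with a ReLU
# hidden layer, softmax output, and cross-entropy loss.
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    # max(0, z) elementwise; its derivative is 0 or 1, so it does not
    # saturate for positive inputs the way sigmoid does.
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row-wise max for numerical stability.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability of the correct class.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

# Toy minibatch: 32 examples, 20 features, 5 hidden units, 3 classes.
X = rng.normal(size=(32, 20))
y = rng.integers(0, 3, size=32)
W1, b1 = rng.normal(size=(20, 5)) * 0.1, np.zeros(5)
W2, b2 = rng.normal(size=(5, 3)) * 0.1, np.zeros(3)

H = relu(X @ W1 + b1)        # hidden layer: one matrix multiply
P = softmax(H @ W2 + b2)     # class probabilities
print("loss:", cross_entropy(P, y))  # ~ log(3) for random weights
```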
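Finally, a naive sketch of 2-d convolution, the core operation in a convolution layer: slide a small filter across the image and take an elementwise product-and-sum at each offset. Real layers apply many filters at once and alternate with downsampling, as the last bullet notes; the function and example below are illustrative assumptions, not course code:

```python
# Sketch: valid-mode 2-d convolution (strictly, cross-correlation,
# which is what CNN layers actually compute) via an explicit loop.
import numpy as np

def conv2d(image, kernel):
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Dot product of the filter with one image patch.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 vertical-edge filter on a toy 6x6 image whose left half is
# bright: the response peaks at the left/right boundary.
image = np.zeros((6, 6))
image[:, :3] = 1.0
edge_filter = np.array([[1.0, 0.0, -1.0]] * 3)
print(conv2d(image, edge_filter))
```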