10-601 Deep Learning 1
This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.
Slides
Readings
This area is moving very fast and the textbooks are not up to date. Some recommended readings:
- Neural Networks and Deep Learning, an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with exercises and on-line sample programs in Python.
- Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition has nice on-line notes.
I also used some on-line visualizations in the lecture materials, especially in the part on ConvNets.
- The Wikipedia page for convolutions has nice animations of 1-D convolutions (see the 1-D convolution sketch after this list).
- [http://matlabtricks.com/post-5/3x3-convolution-kernels-with-online-demo On-line demo of 2-D convolutions for image processing].
- There's an on-line demo of CNNs which are trained in your browser (!)
- 3D visualization of a trained net.
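To complement the animations and demos linked above, here is a minimal 1-D convolution sketch in Python (an illustration of the standard definition, not code from the course materials; numpy is assumed, and the signal and kernel are arbitrary choices):

<pre>
import numpy as np

# 1-D convolution: flip the kernel, slide it over the signal, and take a
# dot product at each offset. numpy.convolve implements the same
# definition, so both printed lines should agree.

def conv1d(signal, kernel):
    k = kernel[::-1]                      # convolution flips the kernel
    n = len(signal) - len(k) + 1
    return np.array([np.dot(signal[i:i + len(k)], k) for i in range(n)])

signal = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
kernel = np.array([0.25, 0.5, 0.25])      # simple smoothing kernel
print(conv1d(signal, kernel))
print(np.convolve(signal, kernel, mode="valid"))
</pre>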
For more detail, look at the [http://www.deeplearningbook.org/ MIT Press book (in preparation) from Goodfellow, Bengio, and Courville].
Things to remember
- The underlying reasons deep networks are hard to train:
  - Exploding/vanishing gradients (see the gradient sketch after this list)
  - Saturation
- The importance of key recent advances in neural networks:
  - Matrix operations and GPU training
  - ReLU, cross-entropy, softmax (defined in the sketch after this list)
- Convolutional networks:
  - 2-D convolution
  - How to construct a convolution layer
  - Architecture of a CNN: convolution/downsampling pairs (see the conv/pool sketch after this list)
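To make the vanishing-gradient point concrete, here is a small numerical sketch (my own illustration, not from the slides; the depth, width, and initialization scale are arbitrary choices). It backpropagates through a stack of randomly initialized sigmoid layers and prints the shrinking gradient norm:

<pre>
import numpy as np

# Why gradients vanish in deep sigmoid networks: each backward step
# multiplies the gradient by sigma'(z) = a*(1 - a) <= 0.25, and by ~0
# wherever units saturate, so the signal shrinks geometrically with depth.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

depth, width = 20, 50
x = rng.normal(size=width)
Ws = [rng.normal(scale=1.0 / np.sqrt(width), size=(width, width))
      for _ in range(depth)]

# Forward pass, caching activations for the backward pass.
acts = [x]
for W in Ws:
    acts.append(sigmoid(W @ acts[-1]))

# Backward pass from an arbitrary all-ones gradient at the output.
grad = np.ones(width)
for layer, (W, a) in enumerate(zip(reversed(Ws), reversed(acts[1:])), 1):
    grad = W.T @ (grad * a * (1.0 - a))   # chain rule through sigmoid
    if layer % 5 == 0:
        print(f"after {layer} layers back: |grad| = {np.linalg.norm(grad):.3e}")
</pre>

Repeating the experiment with ReLU activations keeps the norms orders of magnitude larger, which is one reason ReLU makes deep networks easier to train.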
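The three ingredients named under "key recent advances" have short standard definitions; this sketch gives textbook versions (again an illustration, not the course code; the epsilon in the loss is a numerical-safety choice):

<pre>
import numpy as np

def relu(z):
    # max(0, z) elementwise; unlike sigmoid, it does not saturate for z > 0.
    return np.maximum(0.0, z)

def softmax(z):
    # Shift by the max for numerical stability, then exponentiate and
    # normalize to get a probability distribution over classes.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(p, y):
    # Negative log-likelihood of the true class y under predictions p;
    # the small epsilon guards against log(0).
    return -np.log(p[y] + 1e-12)

z = np.array([2.0, 1.0, -1.0])   # example pre-activations (logits)
p = softmax(relu(z))
print(p, cross_entropy(p, y=0))
</pre>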
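Finally, a minimal sketch of one convolution/downsampling pair, the building block of the CNN architectures above (illustrative loops for clarity; as in most CNN libraries this computes cross-correlation, i.e. the kernel is not flipped; the image size and edge-detector kernel are arbitrary):

<pre>
import numpy as np

def conv2d(image, kernel):
    # "Valid" 2-D convolution: dot the kernel with every image patch.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    # Downsample by taking the max over non-overlapping size x size blocks.
    oh, ow = fmap.shape[0] // size, fmap.shape[1] // size
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = fmap[i * size:(i + 1) * size,
                             j * size:(j + 1) * size].max()
    return out

image = np.random.rand(9, 9)
edge = np.array([[1.0, 0.0, -1.0]] * 3)   # crude vertical-edge detector
fmap = conv2d(image, edge)                 # 9x9 -> 7x7 feature map
pooled = max_pool(fmap)                    # 7x7 -> 3x3 after 2x2 pooling
print(fmap.shape, pooled.shape)
</pre>

Stacking several such conv/downsampling pairs, each shrinking the spatial extent while (in practice) growing the number of feature maps, gives the standard CNN architecture referred to in the list.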