Difference between revisions of "10-601 Deep Learning 1"

From Cohen Courses
 
I also used some on-line visualizations in the materials for the lecture, especially the part on ConvNets.
* [https://en.wikipedia.org/wiki/Convolution the Wikipedia page for convolutions] has nice animations of 1-D convolutions.
* [http://matlabtricks.com/post-5/3x3-convolution-kernels-with-online-demo On-line demo of 2-D convolutions] for image processing.
* [https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html There's an on-line demo of CNNs] which are trained in your browser (!)
* [http://scs.ryerson.ca/~aharley/vis/conv/ 3D visualization of a trained net.]

Revision as of 15:16, 11 April 2016

This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.

Slides

Readings

This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:

I also used some on-line visualizations in the materials for the lecture, especially the part on ConvNets.

For more detail, look at the MIT Press book (in preparation) from Bengio. It's very complete but also fairly technical.

Things to remember

  • The underlying reasons deep networks are hard to train
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks:
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax
  • Convolutional networks
    • 2-d convolution
    • How to construct a convolution layer
    • Architecture of CNN: convolution/downsampling pairs
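
The points above can be made concrete with a small sketch. This is only an illustration in plain Python, not code from the lecture: the function names (sigmoid_grad, conv2d, max_pool, conv_layer) and the toy image and kernel are all assumptions chosen for the example. It shows (1) how a sigmoid's gradient shrinks when the unit saturates, which is one source of vanishing gradients, and (2) one convolution/downsampling pair: a "valid" 2-d convolution, a ReLU, then 2x2 max pooling. As in most CNN libraries, the "convolution" here does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    # Derivative of the sigmoid: at most 0.25 (at z = 0), near 0 when saturated.
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    # ReLU has gradient 1 for every positive input, so it does not saturate there.
    return max(0.0, z)

def conv2d(image, kernel):
    # "Valid" 2-d convolution without kernel flip (cross-correlation):
    # slide the kernel over the image and take a weighted sum at each position.
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def max_pool(fmap, size=2):
    # Non-overlapping max pooling; trailing rows/columns that do not
    # fill a whole window are dropped.
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

def conv_layer(image, kernel, bias=0.0):
    # One convolution/downsampling pair: convolve, add bias, ReLU, then pool.
    fmap = conv2d(image, kernel)
    activated = [[relu(v + bias) for v in row] for row in fmap]
    return max_pool(activated)

# Toy 5x5 image with a vertical edge between columns 1 and 2 (hypothetical data).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a left-to-right increase in intensity.
kernel = [[-1, 1],
          [-1, 1]]

pooled = conv_layer(image, kernel)   # 5x5 -> 4x4 feature map -> 2x2 after pooling

# Saturation: the gradient at z = 10 is tiny compared with the gradient at z = 0.
grad_centre = sigmoid_grad(0.0)
grad_saturated = sigmoid_grad(10.0)
# Vanishing gradients: even in the best case, ten sigmoid layers scale the
# gradient by at most 0.25 ** 10 (ignoring the weights).
best_case_scale = grad_centre ** 10
```

In this toy case the pooled map is nonzero only in the left column, where the edge sits, and the best-case gradient scale through ten sigmoid layers is already below one in a million. Real layers use many kernels per layer and learn the kernel weights by backpropagation.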