10-601 Deep Learning 2

This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.
=== Slides ===

* [http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pptx Slides in PowerPoint], [http://www.cs.cmu.edu/~wcohen/10-601/deep-1.pdf Slides in PDF].
* [http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pptx Slides in PowerPoint], [http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pdf Slides in PDF].

Wrapup from next lecture:

* [http://www.cs.cmu.edu/~wcohen/10-601/deep-wrapup.pptx Slides in PowerPoint], [http://www.cs.cmu.edu/~wcohen/10-601/deep-wrapup.pdf Slides in PDF].
=== Readings ===

This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:

I also used some on-line visualizations in the materials for the lecture, especially the part on ConvNets:

* [https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html There's an on-line demo of CNNs] which are trained in your browser (!)
* [http://scs.ryerson.ca/~aharley/vis/conv/ 3D visualization of a trained net.]

The LSTM figures and examples I used are mostly from:

* [http://colah.github.io/posts/2015-08-Understanding-LSTMs/ Christopher Olah's blog]
* [http://karpathy.github.io/2015/05/21/rnn-effectiveness/ The unreasonable effectiveness of RNNs]
* For a great counterpoint, see [http://nbviewer.jupyter.org/gist/yoavg/d76121dfde2618422139 Yoav Goldberg's response]
 
=== Things to remember ===

* How backprop can be generalized to a sequence of assignment operations (see the first sketch after this list)
* Convolutional networks
** 2-d convolution (see the second sketch after this list)
** How to construct a convolution layer
** Architecture of CNN: convolution/downsampling pairs
* Recurrent neural networks
** When they are useful
** Why they are hard to train (if trained naively)
** The basic ideas used in an LSTM: forget, insert, and output gates (see the third sketch after this list)
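The first bullet can be made concrete with a short sketch. This is not code from the lecture; it is a minimal reverse-mode example for a hypothetical two-layer network, showing that once the forward pass is written as a sequence of assignments, the backward pass is the same assignments revisited in reverse order, each contributing one gradient assignment via the chain rule.

<pre>
import numpy as np

# A tiny (hypothetical) network written as a sequence of assignments:
#   a = W x, h = tanh(a), y = V h, L = 0.5 * ||y - t||^2
np.random.seed(0)
x, t = np.random.randn(3), np.random.randn(2)
W, V = np.random.randn(4, 3), np.random.randn(2, 4)

# Forward pass: one assignment per line.
a = W @ x
h = np.tanh(a)
y = V @ h
L = 0.5 * np.sum((y - t) ** 2)

# Backward pass: walk the assignments in reverse order; each one
# produces a matching gradient assignment via the chain rule.
dL_dy = y - t                   # from the squared-error loss
dL_dV = np.outer(dL_dy, h)      # gradient for the output weights
dL_dh = V.T @ dL_dy             # push the gradient back through y = Vh
dL_da = dL_dh * (1 - h ** 2)    # tanh'(a) = 1 - tanh(a)^2
dL_dW = np.outer(dL_da, x)      # gradient for the first-layer weights

print("dL/dW shape:", dL_dW.shape, " dL/dV shape:", dL_dV.shape)
</pre>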
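Next, a naive sketch of 2-d convolution and a convolution/downsampling pair, assuming a single input channel, stride 1, and no padding; the loop form is for clarity rather than speed, and the edge-detecting kernel and sizes are made up for illustration.

<pre>
import numpy as np

def conv2d(image, kernel):
    """Naive 'valid' 2-d convolution (strictly, cross-correlation, as in most
    deep-learning libraries): slide the kernel over the image and take
    elementwise-product sums. Single channel, stride 1, no padding."""
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
    return out

# A convolution layer is a bank of such kernels followed by a nonlinearity;
# a downsampling (pooling) step typically comes next.
image = np.random.randn(8, 8)
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])
feature_map = np.maximum(conv2d(image, edge_kernel), 0)    # ReLU(conv), shape (6, 6)
pooled = feature_map.reshape(3, 2, 3, 2).max(axis=(1, 3))  # 2x2 max-pool, shape (3, 3)
print(feature_map.shape, pooled.shape)
</pre>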
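Finally, a sketch of a single LSTM cell step (the "insert" gate above is usually written as the input gate, as in Olah's post). The parameter layout and dimensions here are hypothetical, not the lecture's notation.

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step; each gate is a small logistic layer over [h_prev, x]."""
    Wf, Wi, Wc, Wo, bf, bi, bc, bo = params
    z = np.concatenate([h_prev, x])
    f = sigmoid(Wf @ z + bf)        # forget gate: what to erase from the cell
    i = sigmoid(Wi @ z + bi)        # input ("insert") gate: what to write
    c_tilde = np.tanh(Wc @ z + bc)  # candidate values to write
    c = f * c_prev + i * c_tilde    # new cell state
    o = sigmoid(Wo @ z + bo)        # output gate: what to expose
    h = o * np.tanh(c)              # new hidden state
    return h, c

# Hypothetical sizes: 3-dimensional input, 4-dimensional hidden/cell state.
n_in, n_hid = 3, 4
rng = np.random.default_rng(0)
params = [rng.standard_normal((n_hid, n_hid + n_in)) for _ in range(4)] + \
         [np.zeros(n_hid) for _ in range(4)]
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.standard_normal((5, n_in)):  # run the cell over a length-5 sequence
    h, c = lstm_step(x, h, c, params)
print(h.shape, c.shape)
</pre>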
