Difference between revisions of "10-601 Deep Learning 2"
From Cohen Courses
Latest revision as of 09:17, 13 April 2016
This is a lecture used in the Syllabus for Machine Learning 10-601B in Spring 2016.
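Slides
- Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-2.pdf).
Wrapup from next lecture:
- Slides in PowerPoint (http://www.cs.cmu.edu/~wcohen/10-601/deep-wrapup.pptx), Slides in PDF (http://www.cs.cmu.edu/~wcohen/10-601/deep-wrapup.pdf).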
Readings
This area is moving very fast and the textbooks are not up-to-date. Some recommended readings:
- Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition has nice on-line notes.
I also used some on-line visualizations in the materials for the lecture, especially the part on ConvNets.
- The Wikipedia page for convolutions has nice animations of 1-D convolutions.
- On-line demo of 2-D convolutions for image processing.
- There's an on-line demo of CNNs (https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html) which are trained in your browser (!)
- 3D visualization of a trained net (http://scs.ryerson.ca/~aharley/vis/conv/).
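As a companion to the convolution demos above, here is a minimal sketch of a 2-D convolution in plain Python. The function name `conv2d_valid`, the tiny 3x3 image, and the 2x2 kernel are all made up for illustration; they are not taken from the lecture slides. The sketch assumes "valid" borders (no padding) and a stride of 1.

```python
def conv2d_valid(image, kernel):
    """'Valid' 2-D convolution of a list-of-lists image with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # A true convolution flips the kernel in both directions. Many deep
    # learning libraries skip the flip (cross-correlation); for learned
    # kernels this only changes how the weights are indexed.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            total = 0
            # Weighted sum over the window whose top-left corner is (r, c).
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * flipped[i][j]
            out_row.append(total)
        out.append(out_row)
    return out


# Hypothetical example data: a 3x3 "image" and a 2x2 diagonal-difference kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_valid(image, kernel)
print(result)  # [[4, 4], [4, 4]]
```

The output is one row and one column smaller than the input because a 2x2 window fits in only 2x2 positions of a 3x3 image. A convolution layer applies several such kernels, with learned weights, to the same input.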
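The LSTM figures and examples I used are mostly from:
- Christopher Olah's blog (http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
- The unreasonable effectiveness of RNNs (http://karpathy.github.io/2015/05/21/rnn-effectiveness/)
- For a great counterpoint, see Yoav Goldberg's response (http://nbviewer.jupyter.org/gist/yoavg/d76121dfde2618422139)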
Things to remember
- How backprop can be generalized to a sequence of assignment operations
- Convolutional networks
  - 2-D convolution
  - How to construct a convolution layer
  - Architecture of CNN: convolution/downsampling pairs
- Recurrent neural networks
  - When they are useful
  - Why they are hard to train (if trained naively)
  - The basic ideas used in an LSTM: forget, insert, and output gates
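To make the last item concrete, here is a rough sketch of one LSTM step with a scalar state, using the standard gate equations as presented in Christopher Olah's post. The function `lstm_step`, the weight dictionary `w`, and its keys are hypothetical names chosen for this illustration, not the notation of the slides. The "insert" gate is the one most other sources call the input gate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One step of a scalar LSTM cell.

    `w` maps a gate name to a (w_x, w_h, bias) triple (hypothetical layout).
    """
    def pre(name):
        w_x, w_h, b = w[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("insert"))        # how much of the new candidate to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to insert

    c = f * c_prev + i * g            # additive update of the cell state
    h = o * math.tanh(c)              # hidden state passed to the next step
    return h, c


# Hypothetical weights: all zeros, so every gate sits at 0.5 and g is 0.
zero_w = {name: (0.0, 0.0, 0.0)
          for name in ("forget", "insert", "output", "candidate")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, w=zero_w)
```

With all-zero weights the step halves the previous cell state (c goes from 2.0 to 1.0) and inserts nothing new. The additive form of the cell-state update is the part usually credited with making LSTMs easier to train than a naive recurrent network.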