Difference between revisions of "Class meeting for 10-405 Deep Learning"
From Cohen Courses
Revision as of 14:58, 19 March 2018
This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-405 in Spring 2018.
Slides
- Lecture 1: Powerpoint, PDF.
- Lecture 2: Powerpoint, PDF.
- Lecture 3: Powerpoint, PDF.
Quizzes
These are not updated yet --Wcohen (talk) 14:58, 19 March 2018 (EDT)
Sample code
Readings
- Automatic differentiation:
  - William's notes on automatic differentiation, and the Python code for a simple Wengert list generator and a sample use of one.
  - Domke's blog post (clear, but without much detail) and another nice blog post.
  - The clearest paper I've found is Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator.
- More general neural networks:
  - Neural Networks and Deep Learning: an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with a bunch of exercises and online sample programs in Python.
  - For much, much more detail, look at the MIT Press book (in preparation) from Bengio. It's very complete but also fairly technical.
Things to remember
- The underlying reasons deep networks are hard to train
  - Exploding/vanishing gradients
  - Saturation
- The importance of key recent advances in neural networks:
  - Matrix operations and GPU training
  - ReLU, cross-entropy, softmax
- How backprop can be generalized to a sequence of assignment operations (autodiff)
  - Wengert lists
  - How to evaluate and differentiate a Wengert list
- Common architectures
  - Multi-layer perceptron
  - Recurrent NNs (RNNs) and long short-term memory networks (LSTMs)
  - Convolutional networks (CNNs)
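As a reminder of what "evaluate and differentiate a Wengert list" means in practice, here is a minimal sketch in plain Python. It is not the course's sample code: the names `OPS`, `evaluate`, and `differentiate`, the tuple layout `(target, op, args)`, and the example function are all assumptions made for illustration. A Wengert list is treated here as an ordered sequence of assignments; evaluation runs them forward, and differentiation walks them in reverse while applying the chain rule.

```python
import math

# Hypothetical operation table: name -> (forward function, partial derivatives),
# with one partial-derivative function per argument.
OPS = {
    "add": (lambda a, b: a + b, [lambda a, b: 1.0, lambda a, b: 1.0]),
    "mul": (lambda a, b: a * b, [lambda a, b: b, lambda a, b: a]),
    "square": (lambda a: a * a, [lambda a: 2.0 * a]),
    "sigmoid": (
        lambda a: 1.0 / (1.0 + math.exp(-a)),
        [lambda a: math.exp(-a) / (1.0 + math.exp(-a)) ** 2],
    ),
}


def evaluate(wengert_list, inputs):
    """Run the assignments in order and return the value of every variable."""
    values = dict(inputs)
    for target, op, args in wengert_list:
        forward, _ = OPS[op]
        values[target] = forward(*(values[a] for a in args))
    return values


def differentiate(wengert_list, values, output):
    """Walk the list backwards, accumulating d(output)/d(variable)."""
    grads = {name: 0.0 for name in values}
    grads[output] = 1.0
    for target, op, args in reversed(wengert_list):
        _, partials = OPS[op]
        arg_values = [values[a] for a in args]
        for arg, partial in zip(args, partials):
            # Chain rule: add this assignment's contribution to the argument.
            grads[arg] += grads[target] * partial(*arg_values)
    return grads


# Example (assumed for illustration): f(x, y) = sigmoid(x * y + x^2)
wengert_list = [
    ("z1", "mul", ["x", "y"]),
    ("z2", "square", ["x"]),
    ("z3", "add", ["z1", "z2"]),
    ("f", "sigmoid", ["z3"]),
]

values = evaluate(wengert_list, {"x": 1.0, "y": 2.0})
grads = differentiate(wengert_list, values, "f")
print(values["f"], grads["x"], grads["y"])
```

Because gradients are accumulated with `+=`, a variable that feeds several assignments (like `x` above) collects the contribution from each use, which is the step plain per-layer backprop hides.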