Difference between revisions of "Class meeting for 10-605 Deep Learning"
From Cohen Courses
Jump to navigationJump to searchLine 12: | Line 12: | ||
** [https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/ Domke's blog post] - clear but not much detail - and [http://colah.github.io/posts/2015-08-Backprop/ another nice blog post]. | ** [https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/ Domke's blog post] - clear but not much detail - and [http://colah.github.io/posts/2015-08-Backprop/ another nice blog post]. | ||
** The clearest paper I've found is [http://www.bcl.hamilton.ie/~barak/papers/toplas-reverse.pdf Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator] | ** The clearest paper I've found is [http://www.bcl.hamilton.ie/~barak/papers/toplas-reverse.pdf Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator] | ||
+ | |||
+ | * More general neural networks: | ||
+ | ** [http://neuralnetworksanddeeplearning.com/index.html Neural Networks and Deep Learning] An online book by Michael Nielsen, pitched at an appropriate level for 10-601, which has a bunch of exercises and on-line sample programs in Python. | ||
+ | For more detail, look at [http://www.deeplearningbook.org/ the MIT Press book (in preparation) from Bengio] - it's very complete but also fairly technical. | ||
+ | |||
+ | |||
+ | === Things to remember === | ||
+ | * The underlying reasons deep networks are hard to train | ||
+ | * Exploding/vanishing gradients | ||
+ | * Saturation | ||
+ | * The importance of key recent advances in neural networks: | ||
+ | * Matrix operations and GPU training | ||
+ | * ReLU, cross-entropy, softmax |
Revision as of 17:16, 17 October 2016
This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall_2016.
Slides
- TBD
Readings
- Automatic differentiation:
- William's notes on automatic differentiation.
- Domke's blog post - clear but not much detail - and another nice blog post.
- The clearest paper I've found is Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator
- More general neural networks:
- Neural Networks and Deep Learning An online book by Michael Nielsen, pitched at an appropriate level for 10-601, which has a bunch of exercises and on-line sample programs in Python.
For more detail, look at the MIT Press book (in preparation) from Bengio - it's very complete but also fairly technical.
Things to remember
- The underlying reasons deep networks are hard to train
- Exploding/vanishing gradients
- Saturation
- The importance of key recent advances in neural networks:
- Matrix operations and GPU training
- ReLU, cross-entropy, softmax