Class meeting for 10-605 Deep Learning

From Cohen Courses
This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2017|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall_2017]].
  
 
=== Slides ===

* Lecture 1: [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pdf PDF].
  
* Lecture 2: [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pdf PDF].
* Lecture 3: [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pdf PDF].

=== Quizzes ===

* [https://qna.cs.cmu.edu/#/pages/view/75 Quiz for lecture 1]
* [https://qna.cs.cmu.edu/#/pages/view/79 Quiz for lecture 2]
* [https://qna.cs.cmu.edu/#/pages/view/212 Quiz for lecture 3]

=== Sample code ===

* [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Expression manager]
* [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py Sample use of the expression manager]
  
 
=== Readings ===
* Automatic differentiation:
** William's notes on [http://www.cs.cmu.edu/~wcohen/10-605/notes/autodiff.pdf automatic differentiation], and the Python code for a simple [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Wengert list generator] and a [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py sample use of it].
** [https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/ Domke's blog post] (clear, but not much detail) and [http://colah.github.io/posts/2015-08-Backprop/ another nice blog post].
** The clearest paper I've found is [http://www.bcl.hamilton.ie/~barak/papers/toplas-reverse.pdf Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator].
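To make the readings concrete, here is a minimal hand-written sketch of the idea (not the course's xman.py code): a Wengert list is just a sequence of single-operation assignments, evaluated forward and then differentiated by a single backward sweep over the same list.

```python
import math

# Hand-written Wengert list for f(x1, x2) = x1*x2 + sin(x1),
# evaluated at (x1, x2) = (2, 3). Each step is one assignment
# whose right-hand side is a single primitive operation.
x1, x2 = 2.0, 3.0
w1 = x1 * x2         # step 1
w2 = math.sin(x1)    # step 2
w3 = w1 + w2         # step 3: w3 = f(x1, x2)

# Reverse sweep: walk the list backwards, accumulating the
# adjoint df/dw of every variable, seeded with df/dw3 = 1.
dw3 = 1.0
dw1 = dw3 * 1.0                       # from w3 = w1 + w2
dw2 = dw3 * 1.0
dx1 = dw1 * x2 + dw2 * math.cos(x1)   # from w1 = x1*x2 and w2 = sin(x1)
dx2 = dw1 * x1

print(w3, dx1, dx2)  # f, df/dx1 = x2 + cos(x1), df/dx2 = x1
```

Note that one backward sweep yields the gradient with respect to every input, which is why reverse mode is the natural choice when a network has many parameters but a single scalar loss.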
* More general neural networks:
** [http://neuralnetworksanddeeplearning.com/index.html Neural Networks and Deep Learning], an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with exercises and online sample programs in Python.
** For much more detail, look at [http://www.deeplearningbook.org/ the MIT Press book from Goodfellow, Bengio, and Courville]; it's very complete but also fairly technical.
=== Things to remember ===
* The underlying reasons deep networks are hard to train
* Exploding/vanishing gradients
* Saturation
* The importance of key recent advances in neural networks:
* Matrix operations and GPU training
* ReLU, cross-entropy, softmax
* How backprop can be generalized to a sequence of assignment operations (autodiff)
** Wengert lists
** How to evaluate and differentiate a Wengert list
* Common architectures
** Multi-layer perceptrons
** Recurrent NNs (RNNs) and long short-term memory networks (LSTMs)
** Convolutional NNs (CNNs)
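To make the vanishing-gradient and saturation points above concrete, here is a small illustrative sketch (not course code): backpropagating through a stack of sigmoid units multiplies together local derivatives that are at most 0.25, while ReLU's local derivative is exactly 1 wherever the unit is active.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# The derivative of sigmoid is s(z) * (1 - s(z)), which peaks at
# 0.25 (at z = 0) and approaches 0 when the unit saturates. Chaining
# 20 layers multiplies 20 such factors, so even in the best case
# the gradient shrinks like 0.25**20.
sig_grad = 1.0
for _ in range(20):
    s = sigmoid(0.0)           # best case: unit not saturated at all
    sig_grad *= s * (1.0 - s)  # local derivative = 0.25 here

# ReLU's derivative is 1 for any active unit (z > 0), so the
# chained product does not decay.
relu_grad = 1.0
for _ in range(20):
    relu_grad *= 1.0           # d/dz max(0, z) = 1 for z > 0

print(sig_grad, relu_grad)     # roughly 9.1e-13 vs. 1.0
```

This is one of the underlying reasons deep sigmoid networks were hard to train, and why ReLU was such an important practical advance.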

Latest revision as of 13:38, 31 October 2017
