Class meeting for 10-605 Deep Learning

This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2017|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall_2017]].
  
 
=== Slides ===

* Lecture 1: [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pdf PDF].
* Lecture 2: [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pdf PDF].
* Lecture 3: [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pdf PDF].

=== Quizzes ===

* [https://qna.cs.cmu.edu/#/pages/view/75 Quiz for lecture 1]
* [https://qna.cs.cmu.edu/#/pages/view/79 Quiz for lecture 2]
* [https://qna.cs.cmu.edu/#/pages/view/212 Quiz for lecture 3]

=== Sample code ===

* [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Expression manager]
* [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py Sample use of the expression manager] (a toy sketch of the underlying idea follows below)
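
Neither file is reproduced on this page; as a rough, hypothetical illustration of what an expression manager does, the Python sketch below records a function as a Wengert list - a sequence of simple assignments - and evaluates it by running the assignments in order. The names <code>Step</code>, <code>OPS</code>, and <code>wengert_eval</code> are made up for this sketch and are not the API of <code>xman.py</code>.

<pre>
# Toy "expression manager": a function is recorded as a Wengert list, i.e. a
# sequence of simple assignments (output, op, inputs), then evaluated by
# running the assignments in order. Everything here (Step, OPS, wengert_eval)
# is invented for this sketch and is not the API of xman.py.

from collections import namedtuple

Step = namedtuple("Step", ["output", "op", "inputs"])

# The primitive operations the list is allowed to use.
OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "square": lambda a: a * a,
}

def wengert_eval(steps, inputs):
    """Evaluate a Wengert list: execute each assignment in order."""
    env = dict(inputs)  # variable name -> current value
    for s in steps:
        env[s.output] = OPS[s.op](*(env[v] for v in s.inputs))
    return env

# f(x1, x2) = (x1 * x2 + x1)^2, written as a sequence of assignments:
steps = [
    Step("z1", "mul", ("x1", "x2")),
    Step("z2", "add", ("z1", "x1")),
    Step("y", "square", ("z2",)),
]

env = wengert_eval(steps, {"x1": 2.0, "x2": 3.0})
print(env["y"])  # (2*3 + 2)^2 = 64.0
</pre>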
  
 
=== Readings ===

* Automatic differentiation:
** William's notes on [http://www.cs.cmu.edu/~wcohen/10-605/notes/autodiff.pdf automatic differentiation], and the Python code for a simple [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Wengert list generator] and a [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py sample use of it] (a self-contained toy version of the reverse sweep is sketched after this list).
** [https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/ Domke's blog post] - clear but not very detailed - and [http://colah.github.io/posts/2015-08-Backprop/ another nice blog post].
** The clearest paper I've found is [http://www.bcl.hamilton.ie/~barak/papers/toplas-reverse.pdf Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator].
* More general neural networks:
** [http://neuralnetworksanddeeplearning.com/index.html Neural Networks and Deep Learning], an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with a number of exercises and online sample programs in Python.
** For much more detail, look at [http://www.deeplearningbook.org/ the MIT Press book by Goodfellow, Bengio, and Courville] - it's very complete but also fairly technical.
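
To differentiate a Wengert list, reverse-mode automatic differentiation evaluates the assignments forward once, then sweeps them in reverse, accumulating each variable's adjoint via the chain rule. The sketch below is a self-contained toy version using the same invented names as the Sample code sketch above; the <code>GRADS</code> table of per-op derivative rules is likewise illustrative and not part of <code>xman.py</code>.

<pre>
# Reverse-mode differentiation of a Wengert list: evaluate forward once, then
# sweep the steps in reverse, pushing each variable's adjoint back to its
# inputs via the chain rule. All names are invented for this sketch.
# (The toy definitions from the Sample code sketch are repeated so this
# block runs standalone.)

from collections import namedtuple

Step = namedtuple("Step", ["output", "op", "inputs"])

OPS = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "square": lambda a: a * a,
}

# Partial derivative of each op with respect to each of its inputs.
GRADS = {
    "add": [lambda a, b: 1.0, lambda a, b: 1.0],
    "mul": [lambda a, b: b, lambda a, b: a],
    "square": [lambda a: 2.0 * a],
}

def wengert_eval(steps, inputs):
    env = dict(inputs)
    for s in steps:
        env[s.output] = OPS[s.op](*(env[v] for v in s.inputs))
    return env

def wengert_grad(steps, env, wrt):
    """Return d(wrt)/d(var) for every variable, by a reverse sweep."""
    adjoint = {name: 0.0 for name in env}
    adjoint[wrt] = 1.0
    for s in reversed(steps):
        args = [env[v] for v in s.inputs]
        for v, dfun in zip(s.inputs, GRADS[s.op]):
            adjoint[v] += adjoint[s.output] * dfun(*args)
    return adjoint

# f(x1, x2) = (x1 * x2 + x1)^2 as a Wengert list:
steps = [
    Step("z1", "mul", ("x1", "x2")),
    Step("z2", "add", ("z1", "x1")),
    Step("y", "square", ("z2",)),
]
env = wengert_eval(steps, {"x1": 2.0, "x2": 3.0})
grads = wengert_grad(steps, env, wrt="y")
print(env["y"], grads["x1"], grads["x2"])  # 64.0 64.0 32.0
</pre>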
 
 
  
 
=== Things to remember ===

* The underlying reasons deep networks are hard to train
** Exploding/vanishing gradients
** Saturation
* The importance of key recent advances in neural networks:
** Matrix operations and GPU training
** ReLU, cross-entropy, softmax (tied together in the sketch after this list)
* How backprop can be generalized to a sequence of assignment operations (autodiff)
** Wengert lists
** How to evaluate and differentiate a Wengert list
* Common architectures
** Multi-layer perceptron
** Recurrent neural networks (RNNs) and long short-term memory networks (LSTMs)
** Convolutional networks (CNNs)
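
As a concrete tie-in for the list above, here is a minimal numpy sketch of a one-hidden-layer network's forward pass using ReLU, a numerically stable softmax, and cross-entropy loss, written entirely as matrix operations; all shapes, weights, and names are invented for the example.

<pre>
# Minimal one-hidden-layer MLP forward pass in numpy, tying together ReLU,
# softmax, and cross-entropy. Shapes and values are made up for illustration.

import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the row max first so exp() cannot overflow.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # Mean negative log-probability of the correct class.
    n = probs.shape[0]
    return -np.log(probs[np.arange(n), labels] + 1e-12).mean()

# A batch of 4 examples, 10 input features, 32 hidden units, 3 classes.
X = rng.normal(size=(4, 10))
y = np.array([0, 2, 1, 2])
W1, b1 = rng.normal(size=(10, 32)) * 0.1, np.zeros(32)
W2, b2 = rng.normal(size=(32, 3)) * 0.1, np.zeros(3)

# The forward pass is just matrix operations - exactly the kind of
# computation that maps well onto GPUs.
hidden = relu(X @ W1 + b1)
probs = softmax(hidden @ W2 + b2)
print("loss:", cross_entropy(probs, y))
</pre>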
