Class meeting for 10-605 Deep Learning

This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2017|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall_2017]].
  
 
=== Slides ===

* Lecture 1: [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-1.pdf PDF].
* Lecture 2: [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-2.pdf PDF].
* Lecture 3: [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pptx Powerpoint], [http://www.cs.cmu.edu/~wcohen/10-605/deep-3.pdf PDF].

=== Quizzes ===

* [https://qna.cs.cmu.edu/#/pages/view/75 Quiz for lecture 1]
* [https://qna.cs.cmu.edu/#/pages/view/79 Quiz for lecture 2]
* [https://qna.cs.cmu.edu/#/pages/view/212 Quiz for lecture 3]
  
=== Sample code ===

* [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Expression manager]
* [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py Sample use of the expression manager]
  
=== Readings ===

* Automatic differentiation:
** William's notes on [http://www.cs.cmu.edu/~wcohen/10-605/notes/autodiff.pdf automatic differentiation], plus the Python code for a simple [http://www.cs.cmu.edu/~wcohen/10-605/code/xman.py Wengert list generator] and a [http://www.cs.cmu.edu/~wcohen/10-605/code/sample-use-of-xman.py sample use of one]. A minimal reverse-mode sketch also appears at the end of this section.
** [https://justindomke.wordpress.com/2009/03/24/a-simple-explanation-of-reverse-mode-automatic-differentiation/ Domke's blog post] - clear but light on detail - and [http://colah.github.io/posts/2015-08-Backprop/ another nice blog post].
** The clearest paper I've found is [http://www.bcl.hamilton.ie/~barak/papers/toplas-reverse.pdf Reverse-Mode AD in a Functional Framework: Lambda the Ultimate Backpropagator].
  
* More general neural networks:
** [http://neuralnetworksanddeeplearning.com/index.html Neural Networks and Deep Learning], an online book by Michael Nielsen, pitched at an appropriate level for 10-601, with exercises and online sample programs in Python.
** For much more detail, look at [http://www.deeplearningbook.org/ the MIT Press book by Goodfellow, Bengio, and Courville] - it's very complete but also fairly technical.
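To make the Wengert-list idea concrete, here is a minimal, self-contained sketch of reverse-mode automatic differentiation over a hand-written Wengert list. It is illustrative only - it is not the xman.py expression manager linked above, whose interface differs - and the function f(x1, x2) = x1*x2 + sin(x1) is just a standard textbook example.

<pre>
# Minimal sketch of reverse-mode autodiff over a Wengert list (illustrative only;
# the course's xman.py expression manager has its own, different interface).
import math

# Each step assigns one new variable from an operation on earlier variables:
# (output_name, op_name, input_names).  Target: f(x1, x2) = x1*x2 + sin(x1)
wengert_list = [
    ("z1", "mul", ["x1", "x2"]),
    ("z2", "sin", ["x1"]),
    ("f",  "add", ["z1", "z2"]),
]

# Forward rules, and the local partial derivatives of each op w.r.t. its inputs.
EVAL = {
    "mul": lambda a, b: a * b,
    "add": lambda a, b: a + b,
    "sin": lambda a: math.sin(a),
}
GRAD = {
    "mul": lambda a, b: [b, a],
    "add": lambda a, b: [1.0, 1.0],
    "sin": lambda a: [math.cos(a)],
}

def evaluate(steps, inputs):
    """Forward sweep: run each assignment in order, remembering every value."""
    env = dict(inputs)
    for out, op, args in steps:
        env[out] = EVAL[op](*(env[a] for a in args))
    return env

def backprop(steps, env, output):
    """Reverse sweep: accumulate d(output)/d(variable) via the chain rule."""
    delta = {name: 0.0 for name in env}
    delta[output] = 1.0
    for out, op, args in reversed(steps):
        local = GRAD[op](*(env[a] for a in args))
        for arg, d in zip(args, local):
            delta[arg] += delta[out] * d  # add this path's contribution
    return delta

env = evaluate(wengert_list, {"x1": 2.0, "x2": 3.0})
grads = backprop(wengert_list, env, "f")
print(env["f"], grads["x1"], grads["x2"])
</pre>

Running it prints f(2, 3) together with df/dx1 = x2 + cos(x1) and df/dx2 = x1, matching the hand derivation - the same bookkeeping the readings describe at scale.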
  
=== Things to remember ===

* The underlying reasons deep networks are hard to train:
** Exploding/vanishing gradients
** Saturation (see the first sketch after this list)
* The importance of key recent advances in neural networks:
** Matrix operations and GPU training
** ReLU, cross-entropy, softmax (see the second sketch after this list)
* How backprop can be generalized to a sequence of assignment operations (autodiff)
** Wengert lists
** How to evaluate and differentiate a Wengert list (see the sketch under Readings)
* Common architectures
** Multi-layer perceptron
** Recurrent NNs (RNNs) and long short-term memory networks (LSTMs)
** Convolutional Networks (CNNs)
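
As a concrete reminder of the saturation and vanishing-gradient bullets above, here is a small illustrative calculation (a simple numerical sketch, not lecture code). The local gradient of a sigmoid is s(1-s), which is at most 0.25 and nearly zero for large inputs, so the product of many such factors shrinks rapidly; a ReLU's local gradient is exactly 1 on its active side.

<pre>
# Saturation and vanishing gradients, illustrated numerically (not lecture code).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])

sig_grad = sigmoid(x) * (1.0 - sigmoid(x))  # near zero at the tails: saturation
relu_grad = (x > 0).astype(float)           # exactly 0 or 1, never a tiny fraction

print("sigmoid local gradients:", sig_grad)
print("ReLU local gradients:   ", relu_grad)

# Backprop multiplies one such local factor per layer.  Even in the sigmoid's
# best case (0.25 at x = 0), twenty layers shrink the signal by ~1e-12:
print("20 sigmoid layers, best case:", 0.25 ** 20)
print("20 ReLU layers, active units:", 1.0 ** 20)
</pre>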
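And as a reminder of how ReLU, softmax, and cross-entropy fit together, here is a minimal sketch (again, not lecture code) of the numerically stable softmax, the cross-entropy loss, and the convenient gradient - softmax output minus the one-hot target - that the combination yields at the output layer.

<pre>
# ReLU, softmax, and cross-entropy in a few lines (illustrative sketch only).
import numpy as np

def relu(x):
    # Identity for positive inputs, zero otherwise.
    return np.maximum(0.0, x)

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    exps = np.exp(scores - np.max(scores))
    return exps / np.sum(exps)

def cross_entropy(probs, target_index):
    # Negative log-likelihood of the correct class.
    return -np.log(probs[target_index])

scores = relu(np.array([2.0, -1.0, 0.5]))  # pretend these are one layer's outputs
probs = softmax(scores)
loss = cross_entropy(probs, target_index=0)

# For softmax followed by cross-entropy, the gradient w.r.t. the scores is
# simply (probs - one_hot(target)) - one reason the pairing is so standard.
grad = probs.copy()
grad[0] -= 1.0

print(probs, loss, grad)
</pre>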
