Class meeting for 10-605 Deep Learning

From Cohen Courses
Revision as of 17:57, 28 October 2016

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.

Slides

Quizzes

Readings

Things to remember

  • The underlying reasons deep networks are hard to train:
    • Exploding/vanishing gradients
    • Saturation
  • The importance of key recent advances in neural networks:
    • Matrix operations and GPU training
    • ReLU, cross-entropy, softmax
  • How backprop can be generalized to a sequence of assignment operations (autodiff)
    • Wengert lists
    • How to evaluate and differentiate a Wengert list
  • Common architectures
    • Multi-layer perceptron
    • Recurrent NNs (RNNs) and long short-term memory networks (LSTMs)
    • Convolutional networks (CNNs)
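The first point above (exploding/vanishing gradients and saturation) can be illustrated numerically. A minimal sketch, assuming a sigmoid activation; the specific depths and inputs below are illustrative, not from the course:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

# Saturation: for large |x| the sigmoid is nearly flat, so its
# gradient is close to zero and learning stalls.
print(sigmoid_grad(0.0))   # 0.25 (best case)
print(sigmoid_grad(5.0))   # ~0.0066 (saturated)

# Vanishing gradients: backprop through a deep stack multiplies one
# such factor per layer, so even the best case shrinks geometrically.
depth = 20
print(0.25 ** depth)       # ~9.1e-13
```

This shrinking product is one reason ReLU (whose gradient is 1 on the active side) trains more easily in deep stacks.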
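The ReLU, cross-entropy, and softmax bullet can likewise be made concrete. A minimal sketch in plain Python; the function names and example logits are my own, not from the slides:

```python
import math

def relu(xs):
    # ReLU keeps positive inputs and zeroes the rest.
    return [max(0.0, x) for x in xs]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Negative log-probability of the correct class.
    return -math.log(probs[label])

hidden = relu([-1.5, 0.3, 2.0])   # [0.0, 0.3, 2.0]
probs = softmax([2.0, 1.0, 0.1])  # sums to 1.0
loss = cross_entropy(probs, 0)
# A well-known convenience: the gradient of cross_entropy(softmax(z), y)
# with respect to the logits z is simply probs - onehot(y).
```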
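The autodiff bullets can be sketched as code: a Wengert list is just a sequence of assignment operations, evaluated with one forward sweep and differentiated with one reverse sweep. A minimal sketch; the tape format and op table here are illustrative assumptions, not the course's notation:

```python
import math

# Each op: (forward function, list of partial-derivative functions,
# one per argument).
OPS = {
    "add": (lambda a, b: a + b, [lambda a, b: 1.0, lambda a, b: 1.0]),
    "mul": (lambda a, b: a * b, [lambda a, b: b,   lambda a, b: a]),
    "sin": (math.sin,           [math.cos]),
}

def evaluate(tape, inputs):
    """Forward sweep: execute each assignment in order."""
    env = dict(inputs)
    for out, op, args in tape:
        fwd, _ = OPS[op]
        env[out] = fwd(*(env[a] for a in args))
    return env

def differentiate(tape, env, output):
    """Reverse sweep: accumulate d(output)/d(var) for every variable."""
    grad = {v: 0.0 for v in env}
    grad[output] = 1.0
    for out, op, args in reversed(tape):
        _, partials = OPS[op]
        vals = [env[a] for a in args]
        for a, dpartial in zip(args, partials):
            grad[a] += grad[out] * dpartial(*vals)
    return grad

# Wengert list for y = sin(x1 * x2):
tape = [("z1", "mul", ("x1", "x2")),
        ("y",  "sin", ("z1",))]
env = evaluate(tape, {"x1": 2.0, "x2": 3.0})
grads = differentiate(tape, env, "y")
# Analytically: dy/dx1 = x2*cos(x1*x2), dy/dx2 = x1*cos(x1*x2)
```

Reverse-mode frameworks generalize this same tape idea from scalars to matrix operations, which is what makes GPU training of the architectures listed above practical.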