Difference between revisions of "Class meeting for 10-605 in Fall 2016 Hadoop Overview"

From Cohen Courses
(Created page with "This is one of the class meetings on the schedule for the course Machine Learning with Large Data...")
 
 
(18 intermediate revisions by the same user not shown)
Line 1: Line 1:
This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Spring 2014|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Spring_2014]].
+
This is one of the class meetings on the [[Syllabus for Machine Learning with Large Datasets 10-605 in Fall 2016|schedule]] for the course [[Machine Learning with Large Datasets 10-605 in Fall 2016]].
  
 
=== Slides ===
 
  
 +
Map-reduce overview:
  
* [http://www.cs.cmu.edu/~wcohen/10-605/beyond-hadoop.pptx Workflows in PIG]
+
* [http://www.cs.cmu.edu/~wcohen/10-605/2016/map-reduce.pptx Map-Reduce overview - ppt], [http://www.cs.cmu.edu/~wcohen/10-605/2016/map-reduce.pdf pdf]  
  
=== Readings ===
+
Other:
  
* None required.  A nice on-line resource for PIG is the on-line version of the O'Reilly Book [http://chimera.labs.oreilly.com/books/1234000001811/index.html Programming Pig].
+
* [http://www.cs.cmu.edu/~wcohen/10-605/annotated-hadoop-log.txt A log of me interacting with Hadoop] (streaming Hadoop only).
 +
 
 +
=== Quiz ===
 +
 
 +
* To be posted
 +
 
 +
=== Readings for the Class ===
 +
 
 +
* There are lots of on-line tutorials for Hadoop.  The [http://shop.oreilly.com/product/0636920010388.do O'Reilly Book] is also quite good.
 +
 
 +
=== Things to Remember ===
 +
 
 +
* Hadoop terminology: HDFS, shards, job tracker, combiner, mapper, reducer, ...

Latest revision as of 11:37, 11 August 2017

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Fall 2016.

Slides

Map-reduce overview:

  • Map-Reduce overview - ppt, pdf

Other:

  • A log of me interacting with Hadoop (streaming Hadoop only).

Quiz

  • To be posted

Readings for the Class

  • There are lots of on-line tutorials for Hadoop. The O'Reilly Book is also quite good.

Things to Remember

  • Hadoop terminology: HDFS, shards, job tracker, combiner, mapper, reducer, ...
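
As a rough illustration of how the mapper, combiner, and reducer roles fit together, here is a minimal word-count sketch that simulates the stages in a single Python process. It does not use Hadoop or HDFS. The function names (`mapper`, `combiner`, `reducer`, `run_job`), the toy input, and the choice of word count as the task are all assumptions made for this example; each inner list stands in for one shard of the input.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-sum counts within one shard so fewer pairs reach the shuffle."""
    partial = defaultdict(int)
    for word, count in pairs:
        partial[word] += count
    return sorted(partial.items())


def reducer(word, counts):
    """Sum all partial counts that were grouped under the same key."""
    return word, sum(counts)


def run_job(shards):
    """Simulate map -> combine -> shuffle/sort -> reduce in one process."""
    combined = []
    for shard in shards:
        # Map every line of the shard, then combine locally per shard.
        mapped = (pair for line in shard for pair in mapper(line))
        combined.extend(combiner(mapped))
    # The sort stands in for the shuffle: it brings equal keys together.
    combined.sort(key=itemgetter(0))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(combined, key=itemgetter(0))
    )


# Hypothetical toy input: two shards, each a list of text lines.
shards = [["a rose is a rose"], ["is a rose"]]
word_counts = run_job(shards)
```

In this sketch the combiner performs the same summation as the reducer but only over one shard's output, which is why it can shrink the data before the sort step. On a real cluster the framework, not a local loop, would schedule these stages across machines.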