Difference between revisions of "Class meeting for 10-605 in Fall 2016 Hadoop Overview"

From Cohen Courses
Line 3: Line 3:
 
=== Slides ===
 
  
 +
Map-reduce overview - probably will do this Thursday:
  
* [http://www.cs.cmu.edu/~wcohen/10-605/beyond-hadoop.pptx Workflows for Hadoop]
+
* [http://www.cs.cmu.edu/~wcohen/10-605/d_mapreduce.ppt Map-Reduce overview - ppt], [http://www.cs.cmu.edu/~wcohen/10-605/d_mapreduce.pdf pdf]  
* The phrases example:
 
** [http://www.cs.cmu.edu/~wcohen/10-605/pig-example/phrases.pig PIG source code]
 
** [http://www.cs.cmu.edu/~wcohen/10-605/pig-example/SmoothedPKL.java Java source code]
 
* Some other examples:
 
** [http://www.cs.cmu.edu/~wcohen/10-605/pig-example/phirl-naive.pig Naive Similarity Join]
 
** [http://www.cs.cmu.edu/~wcohen/10-605/pig-example/phirl.pig Optimized Similarity Join]
 
  
Also:
+
Other:
* [http://www.cs.cmu.edu/~wcohen/10-605/pig-example/tips-for-debugging-pig.txt My comments on debugging PIG.]
 
  
=== Readings ===
+
* [http://www.cs.cmu.edu/~wcohen/10-605/annotated-hadoop-log.txt  A log of me interacting with Hadoop] (streaming Hadoop only).
  
* None required.  A nice on-line resource for PIG is the on-line version of the O'Reilly Book [http://chimera.labs.oreilly.com/books/1234000001811/index.html Programming Pig].
+
=== Readings for the Class ===
 +
 
 +
* There are lots of on-line tutorials for Hadoop.  The [http://shop.oreilly.com/product/0636920010388.do O'Reilly Book] is also quite good.

Revision as of 14:18, 17 September 2015

This is one of the class meetings on the schedule for the course Machine Learning with Large Datasets 10-605 in Spring 2015.

Slides

Map-reduce overview - probably will do this Thursday:

  • Map-Reduce overview - ppt, pdf

Other:

  • A log of me interacting with Hadoop (streaming Hadoop only).

Readings for the Class

  • There are lots of on-line tutorials for Hadoop. The O'Reilly Book is also quite good.