Difference between revisions of "10-601 Big Data"

Revision as of 17:39, 21 July 2014

You should know:

Why locality is important in working with large data
What the relative costs of operations are for accessing disk, network, and memory
What the Hadoop Distributed File System (HDFS) is
What the stages of Map-Reduce are: map, shuffle, and reduce
Why combiners are often important in Map-Reduce
Which sorts of tasks Map-Reduce is well suited for, and which it is not
In outline, how Naive Bayes, or some other counting task, could be implemented on Map-Reduce
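
As a rough illustration of the last few points, the sketch below simulates the map, combine, shuffle, and reduce stages in a single Python process, collecting the event counts a Naive Bayes classifier needs. It is only a sketch under stated assumptions: it does not use the Hadoop API, the function names (map_phase, combine, shuffle, reduce_phase, run_job) and the toy documents are hypothetical, and each document stands in for one mapper's input split.

```python
from collections import Counter, defaultdict

def map_phase(doc):
    """Map: emit one (key, 1) pair per event that Naive Bayes needs to count."""
    label, text = doc
    yield ("label", label), 1               # class count, used for the prior
    for word in text.split():
        yield ("word", label, word), 1      # word-in-class count, used for the likelihood

def combine(pairs):
    """Combiner: pre-sum counts locally so fewer pairs cross the network."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return list(local.items())

def shuffle(pairs):
    """Shuffle: group all values that share a key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: sum the partial counts for one key."""
    return key, sum(values)

def run_job(docs):
    """Run the simulated job; each doc plays the role of one mapper's split."""
    combined = []
    for doc in docs:
        combined.extend(combine(map_phase(doc)))
    return dict(reduce_phase(k, v) for k, v in shuffle(combined).items())

# Hypothetical toy data: (label, text) pairs.
docs = [
    ("spam", "buy cheap pills buy"),
    ("ham", "see you at lunch"),
    ("spam", "cheap pills"),
]
counts = run_job(docs)
```

In this toy run the combiner turns the two ("word", "spam", "buy") pairs from the first document into a single pair with count 2 before the shuffle, which is the saving combiners provide at scale. The resulting counts are the sufficient statistics from which Naive Bayes probabilities can be estimated.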

Revision as of 17:54, 3 December 2013, by Wcohen (Created page with "This is a lecture used in the Syllabus for Machine Learning 10-601 === Slides === * [http://www.cs.cmu.edu/~wcohen/10-601/bigdata-nb.pptx Slides in PowerPoint]. === Readin...")
Revision as of 17:39, 21 July 2014, by Wcohen
Line 1:
− This is a lecture used in the [[Syllabus for Machine Learning 10-601]]
+ This is a lecture used in the [[Syllabus for Machine Learning 10-601 in Fall 2014]]

=== Slides ===