Saved by mike@mbowles.com
on March 16, 2011 at 12:13:05 pm
 

Machine Learning on Big Data with MapReduce

Course objectives:  
Participants will learn to adapt and execute machine learning algorithms in the MapReduce framework.  Participants should finish the class able to write their own machine learning algorithms for MapReduce and to run them on Amazon Web Services (AWS).  Amazon is providing AWS credits for class participants.

Participants will learn to write mappers and reducers in Python for “hadoop-streaming”.  For most of the class we will use “mrjob”, an open-source framework developed at Yelp.  With mrjob, the same Python mapper and reducer code can run locally without Hadoop, on Amazon Web Services, or on a private Hadoop cluster.  This will simplify the programming tasks.
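As a rough illustration of the mapper and reducer pattern that hadoop-streaming expects, here is a minimal word-count sketch in plain Python. It is not taken from the course materials. The function names (`mapper`, `reducer`, `run_locally`) and the sample input are hypothetical, and the in-process sort only stands in for the shuffle step that Hadoop performs between the two phases. It uses only the standard library, so it can be tried without Hadoop or mrjob installed.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit one 'word<TAB>1' record per word, as a streaming mapper would write to stdout."""
    for line in lines:
        for word in line.strip().lower().split():
            yield "%s\t%d" % (word, 1)


def reducer(sorted_records):
    """Sum the counts per word. Records must arrive sorted by key, as they do after the shuffle."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=itemgetter(0)):
        total = sum(int(count) for _, count in group)
        yield "%s\t%d" % (word, total)


def run_locally(lines):
    """Hypothetical helper: simulate map, sort, and reduce in a single process."""
    mapped = sorted(mapper(lines))  # stands in for Hadoop's shuffle/sort
    return list(reducer(mapped))


if __name__ == "__main__":
    sample = ["big data big clusters", "data beats clusters"]
    for record in run_locally(sample):
        print(record)
```

In a real streaming job the mapper and reducer would be separate scripts reading from standard input. A framework such as mrjob wraps the same two steps in a single job class and takes care of submitting them.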

Schedule: Here's a tentative schedule to give a rough idea of what we intend to cover.  This may change somewhat to meet the interests of the class participants. 

 

Week/Date   Topic                                                             Notes
Week 1      Implementing Algorithms on Big Data
April 13    MapReduce, Hadoop Streaming, Mahout, Amazon (AWS, EMR)
April 14    mrjob
Week 2      Clustering
April 20    Canopy Clustering
April 21    K-means, EM
Week 3      Supervised Learning
April 27    Regularized Regression - glmnet algo for elasticnet
April 28    SVM - Pegasos algo for two-class and one-class, extensions
Week 4      Recommender systems
May 4       Background and simple recommender system
May 5       SVD methods, SVD on mapReduce, Lanczos algo
Week 5      Frequent ItemSet Implementations
May 11      tbd
May 12      tbd

 



Prerequisites:
-Facility with undergraduate-level math and statistics (vector calculus, density functions, etc.)
-Comfort with basic Python programming (version 2.6 or 2.7)
-mrjob and boto installed (both are Python packages)
-Familiarity with basic machine learning
