Machine Learning on Big Data with MapReduce
Course objectives:
Participants will learn to adapt and execute machine learning algorithms in the MapReduce framework. By the end of the class, participants should be able to write their own machine learning algorithms for MapReduce and run them on Amazon Web Services (AWS). Amazon is providing AWS credits for class participants.
Participants will learn to write mappers and reducers in Python for Hadoop Streaming (“hadoop-streaming”). For most of the class we will use “mrjob”, an open-source framework developed at Yelp that lets class members program mappers and reducers in Python. The mrjob framework can then run the same mapper and reducer code locally without Hadoop, on Amazon Web Services, or on a private Hadoop cluster. This simplifies the programming tasks.
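As a rough illustration of the mapper and reducer pattern described above, here is a minimal word-count sketch written in the Hadoop Streaming style, where records are tab-separated key/value lines. It is not course material: the function names (mapper, reducer, run_locally) are hypothetical, and run_locally only imitates the sort-by-key step that Hadoop performs between the map and reduce phases. It uses the standard library only, so it does not show the mrjob API itself.

```python
from itertools import groupby


def mapper(lines):
    """Emit one tab-separated (word, 1) record for every word in the input."""
    for line in lines:
        for word in line.strip().lower().split():
            yield "%s\t%d" % (word, 1)


def reducer(sorted_records):
    """Sum the counts for each word. Records must already be sorted by key."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield "%s\t%d" % (word, total)


def run_locally(lines):
    """Hypothetical local stand-in for Hadoop: map, sort by key, then reduce."""
    return list(reducer(sorted(mapper(lines))))
```

Calling run_locally on a list of text lines returns one "word<TAB>count" string per distinct word. On a real cluster the mapper and reducer would run as separate processes reading standard input, with Hadoop handling the sort between them.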
Schedule:
Week 1 - Intro to MapReduce, AWS (Amazon Web Services), Mahout, and mrjob
Week 2 - Clustering
Week 3 - Supervised Learning
Week 4 - Recommendation Engines
Week 5 - Frequent Item Sets
Prerequisites:
-Facility with undergraduate-level math and statistics (vector calculus, density functions, etc.)
-Comfort with basic Python programming (Python 2.6 or 2.7)
-mrjob and boto installed (both are Python packages)
-Familiarity with basic machine learning