 

mrjob installation

Page history last edited by mike@mbowles.com 10 years ago

Basic Installation

To install mrjob, you first need to install Python. Your choices for Python version are 2.5, 2.6, or 2.7. Much of the software we'll be using was written for 2.5, so you'll have the easiest time with 2.5, but the others should also work. I'm currently using 2.6 without major problems.

 

You'll need to install two packages: mrjob and boto. I'm told that boto should get installed automatically when you install mrjob, but it didn't work that way for me (FYI, I'm on Ubuntu Linux). You can download mrjob at: https://github.com/Yelp/mrjob . On the Yelp GitHub page, you'll find instructions for installing from source or for installing from PyPI using pip. Use whichever you prefer.

 

You can download boto at:  http://code.google.com/p/boto/

 

I installed both of these as simple Python installations. I downloaded each package to its own folder, changed directory on the command line to the sub-folder containing "setup.py", and then ran: python setup.py install

 

One of the students emailed that she was running into errors installing mrjob (FYI, she was using Python 2.7 on Windows). Jimmy Retzlaff (from Yelp) suggested that there might be a problem with the mrjob setup script unless you've got setuptools installed. Here's a link for setuptools: http://pypi.python.org/pypi/setuptools
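
After running the installs described above, a quick way to see whether mrjob, boto, and setuptools actually landed in your Python installation is to try importing each one. The following is only a minimal sketch: the function name check_installed is made up for this example and is not part of mrjob or boto.

```python
def check_installed(module_names):
    """Return a dict mapping each module name to True if it can be imported.

    check_installed is a hypothetical helper written for this page;
    it only attempts an import and records whether it succeeded.
    """
    results = {}
    for name in module_names:
        try:
            __import__(name)
            results[name] = True
        except ImportError:
            results[name] = False
    return results


if __name__ == "__main__":
    # The three packages mentioned in the installation notes above.
    report = check_installed(["mrjob", "boto", "setuptools"])
    for name in sorted(report):
        print("%s: %s" % (name, "installed" if report[name] else "MISSING"))
```

If boto shows up as MISSING after you installed mrjob, install it separately as described above and run the check again.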

 

Use this page to record notes you have regarding installation, so that we can get everybody up and running ASAP.

 

Nag's Screen Shots

Nag (one of our students) has made his screen shots available for:

1.  Setting up AWS account

2.  Installing setuptools

3.  Running simple map reduce job directly on AWS

 

http://nagarun.wordpress.com/2011/04/14/how-i-got-started-with-mapreduce-week-1-handson/

 

Thanks to Nag for making these available to us.

 

 

mrjob on Windows Issues:

Here are excerpts from an email chain about running mrjob on Windows:

 

-From Peter Harrington

Laura,
I have not seen a boto error before, in fact I never installed boto.  I just looked into it and something called botoemr came with the mrjob download. 

Not sure if that is helpful.  Here are the instructions for setting your AWS environment variables so you don't have to mess with the .conf file.  
To set these in Windows, open your command prompt and enter the following (with no spaces around the equals sign):

set AWS_ACCESS_KEY_ID=xyxyxyxyxyxyxyxyxy
To verify that it worked, type in:  echo %AWS_ACCESS_KEY_ID%
Make sure you set AWS_SECRET_ACCESS_KEY also.

 

-From Laura
Brilliant!!
I got rid of my mrjob.config and manually set my environment variables, and it works. mrjob created the buckets in S3.
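
The environment-variable approach from the emails above can be checked from Python before you launch a job. This is a rough sketch, not something mrjob provides: the helper names missing_aws_vars and padded_names are invented for this example. The second helper looks for a common Windows slip: typing spaces around the equals sign in a set command creates a variable whose name ends in a space, so the real name stays unset.

```python
import os

# The two variable names mentioned in the email chain above.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(environ=None):
    """Return the required variable names that are unset or blank."""
    if environ is None:
        environ = os.environ
    return [name for name in REQUIRED_VARS
            if not environ.get(name, "").strip()]


def padded_names(environ=None):
    """Return variable names that match a required name only after
    stripping whitespace (e.g. 'AWS_ACCESS_KEY_ID ' with a trailing space)."""
    if environ is None:
        environ = os.environ
    return [key for key in environ
            if key != key.strip() and key.strip() in REQUIRED_VARS]


if __name__ == "__main__":
    missing = missing_aws_vars()
    if missing:
        print("Not set: " + ", ".join(missing))
    else:
        print("Both AWS credential variables are set.")
    for key in padded_names():
        print("Suspicious variable name with extra whitespace: %r" % key)
```

Run it in the same command prompt where you set the variables, since variables set with the set command only last for that session.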

 

Join MrJob Google Group for quick answers to MrJob Questions

Just google "mrjob google group" to see posts or sign up to ask questions and post responses to questions from others.


 



Comments (1)

Jeff McCarrell said

at 3:46 pm on Apr 13, 2011

For OSX, I just cloned the git repository, then python setup.py install which installed (couldn't find libyaml, which is Ok IMO) and boto both. I was able to run the word count example locally.
