How to Install Sqoop on Amazon Elastic MapReduce (EMR)
You can install Sqoop on Amazon EMR and use it to import and export data between Hadoop and a relational database such as MySQL. Here's how I did it with MySQL; if you're using a different database, you'll need that database's JDBC connector instead.
I'm using Amazon's Hadoop version 0.20.205, which I think was the default. You can see all supported versions of Amazon's Hadoop in the EMR documentation.
I downloaded sqoop-1.4.1-incubating__hadoop-0.20.tar.gz from here: http://www.apache.org/dyn/closer.cgi/sqoop/
I downloaded mysql-connector-java-5.1.19.tar.gz from here: http://www.mysql.com/downloads/connector/j/
Once I downloaded these two tar.gz files, I uploaded them to an S3 bucket, along with the install script shown below. Make sure to replace <BUCKET_NAME> with your own bucket name.
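How you get the files into S3 is up to you; as a sketch, assuming you have s3cmd installed and configured (the S3 console or any other S3 client works just as well), the uploads might look like this:
# Hypothetical uploads with s3cmd; substitute your own bucket name
s3cmd put sqoop-1.4.1-incubating__hadoop-0.20.tar.gz s3://<BUCKET_NAME>/
s3cmd put mysql-connector-java-5.1.19.tar.gz s3://<BUCKET_NAME>/
s3cmd put install_sqoop.sh s3://<BUCKET_NAME>/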
#!/bin/bash
# Install Sqoop - s3://<BUCKET_NAME>/install_sqoop.sh
# Pull the Sqoop tarball down from S3 and unpack it
hadoop fs -copyToLocal s3://<BUCKET_NAME>/sqoop-1.4.1-incubating__hadoop-0.20.tar.gz sqoop-1.4.1-incubating__hadoop-0.20.tar.gz
tar -xzf sqoop-1.4.1-incubating__hadoop-0.20.tar.gz
# Pull the MySQL JDBC connector down from S3 and unpack it
hadoop fs -copyToLocal s3://<BUCKET_NAME>/mysql-connector-java-5.1.19.tar.gz mysql-connector-java-5.1.19.tar.gz
tar -xzf mysql-connector-java-5.1.19.tar.gz
# Drop the connector jar into Sqoop's lib directory so Sqoop can talk to MySQL
cp mysql-connector-java-5.1.19/mysql-connector-java-5.1.19-bin.jar sqoop-1.4.1-incubating__hadoop-0.20/lib/
After I started a job flow, I added this script to it as a step. You can do this via the API, or via the CLI like this:
./elastic-mapreduce -j <JOBFLOW_ID> --jar s3://elasticmapreduce/libs/script-runner/script-runner.jar --arg s3://<BUCKET_NAME>/install_sqoop.sh
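If you'd rather bake the install into a new job flow instead of adding it to a running one, you should be able to pass the same step at creation time. I haven't run it this way, so treat this as a sketch and check the flags against your version of the elastic-mapreduce CLI:
./elastic-mapreduce --create --alive --name "Sqoop job flow" --hadoop-version 0.20.205 --jar s3://elasticmapreduce/libs/script-runner/script-runner.jar --arg s3://<BUCKET_NAME>/install_sqoop.sh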
Once the step completes, you can run Sqoop imports and exports. Here's an example of a Sqoop export:
./sqoop-1.4.1-incubating__hadoop-0.20/bin/sqoop export --connect jdbc:mysql://<MYSQL_HOST>/<DATABASE_NAME> --table <TABLE_NAME> --export-dir <HDFS_PATH> --fields-terminated-by , --input-null-non-string '\\N' --username <USERNAME> --password <PASSWORD>
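An import looks much the same, just going the other direction. Here's a sketch; the <HDFS_TARGET_DIR> placeholder is mine (pick any HDFS path), and I've added -m 1 so the import runs with a single mapper and doesn't need a split column on tables without a primary key:
./sqoop-1.4.1-incubating__hadoop-0.20/bin/sqoop import --connect jdbc:mysql://<MYSQL_HOST>/<DATABASE_NAME> --table <TABLE_NAME> --target-dir <HDFS_TARGET_DIR> --fields-terminated-by , -m 1 --username <USERNAME> --password <PASSWORD>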
Hope that helped. Let me know if you have any questions.