Driven Agent Guide: Advanced Installation

version 2.2.6

1. Prerequisites
   1.1. Check your team’s demands and expectations
   1.2. Check your system privileges and network access
2. Installing the Driven Agent
   2.1. Downloading the Driven Agent
   2.2. Installing the Driven Agent
   2.3. Confirming Installation
3. Configuring the Driven Agent
   3.1. Testing the Agent
   3.2. Agent Common Options
   3.3. Agent Advanced Options
4. Driven Agent for MapReduce
   4.1. Running on Hadoop or YARN
   4.2. MapReduce Versions
5. Driven Agent for Hive
   5.1. Hive Version Requirements
   5.2. Metadata Support
   5.3. Using the Hive Agent Artifact
   5.4. What Cluster Work Is Monitored
6. Driven Agent for Apache Spark
   6.1. Spark Version Requirements
   6.2. Supported APIs
   6.3. Spark Runtimes
   6.4. Supported Runtimes
7. Using Driven Agent with Apache Oozie
   7.1. Driven Agent Configuration
8. Advanced Installation
   8.1. Scripted Installation
   8.2. Amazon Elastic MapReduce
9. Troubleshooting the Driven Agent
   9.1. Applications cannot send data to the Driven Server

Advanced Installation

Advanced users may wish to script the Driven Agent installation, or use the Driven Agent with Amazon Elastic MapReduce.

Scripted Installation

For advanced users, the Driven Agent can be installed with the following script:

# to download the script
$ wget http://files.concurrentinc.com/driven/2.2/driven-plugin/install-driven-plugin.sh

# for Hive installation
$ bash install-driven-plugin.sh --hive

# for MapReduce installation
$ bash install-driven-plugin.sh --mapreduce

# for Spark installation
$ bash install-driven-plugin.sh --spark

Alternatively, as a one-liner:

# for Hive installation
$ export AGENT=hive; curl http://files.concurrentinc.com/driven/2.2/driven-plugin/install-driven-plugin.sh | sh

# for MapReduce installation
$ export AGENT=mr; curl http://files.concurrentinc.com/driven/2.2/driven-plugin/install-driven-plugin.sh | sh

# for Spark installation
$ export AGENT=spark; curl http://files.concurrentinc.com/driven/2.2/driven-plugin/install-driven-plugin.sh | sh

This script creates a .driven-plugin directory in the current user’s home directory, downloads the latest Driven Agent JAR, and creates a symbolic link named driven-agent-[framework].jar that points to the latest downloaded version.

Re-run the script at any time to safely upgrade the agent.

Note: `driven-agent-[framework].jar` is a Unix symbolic link to the latest downloaded version of the agent JAR file. The install script creates or updates this link.
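
As a quick sanity check after running the install script, the symbolic link can be inspected from the shell. The following is a minimal sketch, not part of the Driven distribution: the helper name check_driven_link and the DRIVEN_DIR variable are hypothetical, and the sketch assumes the files live in ~/.driven-plugin and that [framework] is hive, mr, or spark, as described in this section.

```shell
# Assumption: the install script placed its files in ~/.driven-plugin.
# DRIVEN_DIR is a hypothetical override, useful when the directory differs.
DRIVEN_DIR="${DRIVEN_DIR:-$HOME/.driven-plugin}"

# check_driven_link is a hypothetical helper, not shipped with Driven.
# It succeeds only if driven-agent-<framework>.jar is a symbolic link
# whose target exists.
check_driven_link() {
  framework="$1"   # hive, mr, or spark
  link="$DRIVEN_DIR/driven-agent-$framework.jar"
  if [ -L "$link" ] && [ -e "$link" ]; then
    echo "OK: $link -> $(readlink "$link")"
  else
    echo "MISSING: $link" >&2
    return 1
  fi
}
```

For example, `check_driven_link hive` prints the link and its target if the Hive agent is installed, and returns a non-zero status otherwise.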

Amazon Elastic MapReduce

For Amazon Elastic MapReduce users, the install-driven-plugin.sh script introduced above doubles as a bootstrap action.

Use this script as you would any AWS EMR bootstrap action.

EMR 4.x

In EMR version 4.0, Amazon introduced a set of changes that directly affect how the Driven Agent is installed.

One important change is that bootstrap actions can no longer modify the Hadoop installation, because Hadoop is deployed only after all bootstrap actions have run. Instead, user-defined changes to the Hadoop installation are made through the application configuration feature.

Driven provides a set of configurations for use with the different agents. Apply them in addition to configuring the install-driven-plugin.sh bootstrap action described above. On the command line, add the --configurations switch for your framework:

--configurations http://files.concurrentinc.com/driven/2.2/hosted/driven-plugin/configurations-[framework].json

Replace [framework] with hive for Hive, mr for MapReduce, and spark for Spark.
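
To illustrate the substitution, the sketch below builds the configurations URL for a given framework and rejects anything other than hive, mr, or spark. The helper name driven_config_url is hypothetical. The commented AWS CLI invocation is only an outline: the bucket name, release label, and instance settings are placeholders, and both copying the install script to your own S3 bucket and passing --hive as a bootstrap argument are assumptions rather than documented Driven steps.

```shell
# Hypothetical helper: print the Driven configurations URL for a framework.
driven_config_url() {
  case "$1" in
    hive|mr|spark)
      echo "http://files.concurrentinc.com/driven/2.2/hosted/driven-plugin/configurations-$1.json"
      ;;
    *)
      echo "unknown framework: $1 (expected hive, mr, or spark)" >&2
      return 1
      ;;
  esac
}

# Outline of use with the AWS CLI (not executed here; every value other
# than the configurations URL is a placeholder or an assumption):
#
#   aws emr create-cluster \
#     --release-label emr-4.0.0 \
#     --applications Name=Hive \
#     --use-default-roles \
#     --instance-type m3.xlarge --instance-count 3 \
#     --bootstrap-actions Path=s3://my-bucket/install-driven-plugin.sh,Args=[--hive] \
#     --configurations "$(driven_config_url hive)"
```

Keeping the framework name in one place avoids pairing the bootstrap action for one framework with the configurations file for another.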

AWS EMR Before 4.x

For EMR versions prior to 4.x, the only additional step is to also use the Amazon-provided configure-daemons bootstrap action (s3://elasticmapreduce/bootstrap-actions/configure-daemons) with the following argument:

--client-opts=-javaagent:/home/hadoop/.driven-plugin/driven-agent-[framework].jar

Replace [framework] with hive for Hive, mr for MapReduce, and spark for Spark.
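
The same substitution can be sketched for the configure-daemons argument. The helper name driven_client_opts is hypothetical, and the commented fragment is only an outline of how the two bootstrap actions might be combined with the AWS CLI: the bucket name is a placeholder, the remaining cluster options are omitted, and the argument may need additional quoting depending on your shell and CLI version.

```shell
# Hypothetical helper: print the configure-daemons argument for a framework.
driven_client_opts() {
  case "$1" in
    hive|mr|spark)
      echo "--client-opts=-javaagent:/home/hadoop/.driven-plugin/driven-agent-$1.jar"
      ;;
    *)
      echo "unknown framework: $1 (expected hive, mr, or spark)" >&2
      return 1
      ;;
  esac
}

# Outline only (not executed here): the install script runs first, then
# configure-daemons adds the agent to the client JVM options.
#
#   --bootstrap-actions \
#     Path=s3://my-bucket/install-driven-plugin.sh,Args=[--hive] \
#     Path=s3://elasticmapreduce/bootstrap-actions/configure-daemons,Args=["$(driven_client_opts hive)"]
```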