Driven Agent Guide: Prerequisites

version 2.2.6


Check your team’s demands and expectations

The Driven Agent is a collection of JVM libraries that enables monitoring of the following Hadoop applications in Driven:

  • Apache Hive

  • MapReduce

  • Apache Spark

The Agent works with any job scheduler that can launch an Apache YARN or Apache Spark application. Some schedulers have scheduler-specific instructions, for example:

  • Apache Oozie

There is a separate agent JAR file for each of Hive, MapReduce, and Spark. The appropriate Agent must be downloaded to the host machine from which the above applications are launched, and the launch scripts must be modified to force the JVM binary ($JAVA_HOME/bin/java) to load the Agent. See Installing the Agent for details.
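As an illustration only, the kind of launch-script change described above might look like the following sketch. The agent file name and install path are assumptions, not taken from this guide; see Installing the Agent for the actual instructions for your application.

```shell
# Hypothetical sketch: make the Hive client JVM load the Agent at startup.
# The path /opt/driven/driven-agent-hive.jar is an assumed install location.
AGENT_JAR=/opt/driven/driven-agent-hive.jar
export HADOOP_OPTS="${HADOOP_OPTS:-} -javaagent:${AGENT_JAR}"
# The Hive CLI passes HADOOP_OPTS to $JAVA_HOME/bin/java when it starts.
echo "$HADOOP_OPTS"
```

The same pattern applies to the MapReduce and Spark agents, each using its own JAR file.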

Driven defines an application context as the JVM instance driving and orchestrating the client side of Hadoop applications. Each Hive query or MapReduce job appears as a single Unit of Work in that application, and a single application context can contain thousands of queries or Units of Work. Each application instance begins and ends with the startup and shutdown of its JVM.

To monitor only Cascading applications with Driven, the Driven Agent cannot be used; an installation of the Driven Plugin JAR is required instead. See Driven Plugin for details.

Check your system privileges and network access

To download the necessary Driven Agent, you will need Internet access from the host machine on which it will be installed, or sufficient privileges to upload the Agent JAR file to the host machine from which the applications to be monitored will be launched.

If you use a hosted version of Driven Server, rather than one downloaded and installed on the local network, the Driven Agent will need direct Internet access to the remote Driven Server installation.
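One way to verify that the launch host can reach the Driven Server is a simple connectivity check such as the sketch below. The host name driven.example.com is a placeholder; substitute your own Driven Server address.

```shell
# Hypothetical connectivity check from the host that will run the Agent.
# driven.example.com stands in for your actual Driven Server address.
if curl -sSf --max-time 5 https://driven.example.com/ > /dev/null 2>&1; then
  echo "Driven Server reachable"
else
  echo "Driven Server not reachable; check firewall rules or consider a local install"
fi
```

If the check fails from behind a firewall, a local Driven Server installation may be the appropriate remedy, as described below.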

If your applications run behind a secure firewall, it may be necessary to download and install a local copy of the Driven Server. See the Driven Administrator Guide for details.