Driven Agent for Hive

version 1.3.8

The Driven Agent for Hive is a JVM-level agent library that enables monitoring of Apache Hive queries within Driven. The Agent runs alongside the main execution of Hive and sends telemetry to a remote Driven server. It supports any kind of Hive deployment: the fat client, HiveServer2, or Oozie workflows containing Hive queries. Queries issued by any application via JDBC or ODBC can be monitored as well.

The Agent has to be enabled via a JVM level command line switch. Putting it on the runtime classpath will have no effect.

The Driven Agent for Hive comes in two forms: the regular artifact, a fat jar consisting of the Agent and all its dependencies, and the bundled artifact, which contains all of the former plus the latest stable version of the Driven plugin.

Hive version requirements

The Driven Agent for Hive can be used with any version of Hive newer than 0.13. It supports MapReduce and Apache Tez as execution engines.

Note
Hive on Apache Spark is currently not supported.

Metadata Support

Driven allows applications to send metadata such as a name, version number, or tags alongside their telemetry. The Driven Agent for Hive supports this. The following table shows the properties supported by the Agent:

Table 1. Properties for sending metadata to Driven

Name                 Example                                           Explanation
driven.app.name      driven.app.name=tps-report                        Name of the application
driven.app.version   driven.app.version=1.1.5                          Version of the application
driven.app.tags      driven.app.tags=cluster:prod,tps,dept:marketing   Comma-separated list of tags

The properties can be set within a given Hive QL script via set commands, given on the command line, or added to the hive-site.xml file.
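The first two options can be sketched as follows. This is a minimal illustration, not a definitive recipe: the script location and the run_report helper are hypothetical, the property values are the examples from Table 1, and it is an assumption that "on the command line" refers to Hive's --hiveconf switch.

```shell
# Hypothetical script location; adjust as needed.
script="${TMPDIR:-/tmp}/tps-report.hql"

# Option 1: set the metadata inside the Hive QL script via set commands.
cat > "$script" <<'EOF'
set driven.app.name=tps-report;
set driven.app.tags=cluster:prod,tps,dept:marketing;
-- the actual queries follow here
EOF

# Option 2: pass a property on the command line.
# Assumption: the property is handed to Hive with the --hiveconf switch.
# run_report is a hypothetical helper and is only defined here, not called.
run_report() {
  hive --hiveconf driven.app.version=1.1.5 -f "$script"
}
```

Calling run_report on a host where the Agent is enabled would run the script with the name and tags taken from the script and the version taken from the command line.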

Tags are a simple, yet powerful way to categorize applications in Driven. They are searchable and allow for quick categorization of applications of a certain kind. The Driven user guide has more information about tags.

Using the Hive agent artifact

The latest version of the regular agent release can be downloaded from the release site:

> wget -i http://{artifactsurl}/driven-agent/{agentVersion}/latest-driven-agent-hive.txt

To enable the Agent, extend the HADOOP_OPTS environment variable as follows before starting the Hive fat client (hive) or HiveServer2:

> export HADOOP_OPTS="-javaagent:/path/to/driven-agent-hive-<version>.jar"

Note
You have to set the HADOOP_OPTS variable. Setting the YARN_OPTS variable has no effect, even on a YARN-based cluster.
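If HADOOP_OPTS already carries other options, a plain export replaces them. The following sketch appends the switch instead. The with_driven_agent helper is hypothetical and not part of the Agent, and the jar path is the placeholder used above.

```shell
# Hypothetical helper: appends the -javaagent switch to an option string.
# $1 = current options (may be empty), $2 = path to the agent jar.
with_driven_agent() {
  printf '%s' "${1:+$1 }-javaagent:$2"
}

# Placeholder path from this guide; substitute the real jar location.
HADOOP_OPTS="$(with_driven_agent "${HADOOP_OPTS:-}" "/path/to/driven-agent-hive-<version>.jar")"
export HADOOP_OPTS
```

Options that were already present stay in front of the new switch, separated by a single space.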

The agent has to be installed and configured on the host on which the Hive queries are executed. In the case of the fat client, it is sufficient to set the environment variable in the shell from which hive is launched. The same applies to the newer beeline client when it is used without a HiveServer2.

In the case of a HiveServer2 deployment, the agent has to be installed on the machine where the server is running. For the agent to work, the HADOOP_OPTS variable has to be set in the environment in which the server runs. Typically this involves modifying the start-up script of the HiveServer2. Some distributions ship with graphical cluster administration tools that allow setting a custom hive-env.sh, which is picked up by the server. The HiveServer2 will show up in Driven as a long-running app as soon as the first query is executed on the server.
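A custom hive-env.sh for this purpose might contain a fragment like the one below. It is a sketch under assumptions: DRIVEN_AGENT_JAR and enable_driven_agent are hypothetical names, the default jar path is the placeholder used in this guide, and the duplicate check is a defensive choice in case the file is sourced more than once.

```shell
# Hypothetical variable naming the agent jar; the default is a placeholder path.
DRIVEN_AGENT_JAR="${DRIVEN_AGENT_JAR:-/path/to/driven-agent-hive-<version>.jar}"

# Hypothetical helper: adds the -javaagent switch to HADOOP_OPTS exactly once.
enable_driven_agent() {
  case " ${HADOOP_OPTS:-} " in
    *" -javaagent:${DRIVEN_AGENT_JAR} "*)
      # Switch already present, leave HADOOP_OPTS untouched.
      ;;
    *)
      HADOOP_OPTS="${HADOOP_OPTS:+$HADOOP_OPTS }-javaagent:${DRIVEN_AGENT_JAR}"
      ;;
  esac
  export HADOOP_OPTS
}

enable_driven_agent
```

After a change of this kind, the HiveServer2 has to be restarted so that the new JVM options take effect.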

As explained above, the regular artifact contains only the Agent, not the Driven plugin. For the Agent to function properly, the Driven plugin has to be installed alongside it. This is covered in the Driven plugin installation guide and the Driven plugin configuration guide.

Using the Hive Agent Bundle artifact

The Hive Agent Bundle has exactly the same functionality as the regular Agent, but it simplifies the installation and configuration of the Agent in certain deployment scenarios. Users of Apache Oozie should always use the Hive Agent Bundle, as explained in the next section.

The latest version of the bundled agent release can be downloaded from the release site:

> wget -i http://{artifactsurl}/driven-agent/{agentVersion}/latest-driven-agent-hive-bundle.txt