Driven Agent Guide: Driven Agent for Apache Spark

version 2.2.6

Driven Agent for Apache Spark

The agent for Apache Spark enables Driven to perform real-time monitoring of any application written with the Spark API.

For instructions on downloading and installing the Driven Agent, see the sections on downloading and installing the agent.

Note
The current release is part of our Early Access Program (EAP), so the supported versions, APIs, and runtimes are subject to change before the final release.

Spark Version Requirements

The Driven Agent can be used with the following versions of Spark:

Spark Version
-------------
1.6.x
1.5.x
1.4.x
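
As a quick pre-flight check, the version list above can be turned into a small shell test. This is a minimal sketch: the function name is_supported_spark_version is hypothetical and not part of the agent, and it only compares a version string you pass in against the 1.4.x, 1.5.x, and 1.6.x release lines.

```shell
#!/bin/sh
# Returns 0 when the given Spark version string belongs to one of the
# release lines listed above (1.4.x, 1.5.x, 1.6.x), and 1 otherwise.
# Hypothetical helper for illustration; not shipped with the agent.
is_supported_spark_version() {
  case "$1" in
    1.4.*|1.5.*|1.6.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: pass the version your cluster runs.
if is_supported_spark_version "1.6.2"; then
  echo "supported"
else
  echo "not supported"
fi
```

Because the list may change before the final release, treat the pattern as something to update alongside the table.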

Supported APIs

API              Context           Supported  Comments
---------------  ----------------  ---------  ---------------------
Spark Batch      SparkContext      yes
Spark Streaming  StreamingContext  yes        partial test coverage
DataFrames       N/A               no         planned
SQL              SQLContext        no         based on DataFrames
Hive             HiveContext       no         based on DataFrames

Spark Runtimes

Supported Runtimes

Runtime              Master Param     Supported  Comments
-------------------  ---------------  ---------  --------
Hadoop YARN Client   yarn-client      yes
Hadoop YARN Cluster  yarn-cluster     yes
Spark Standalone     spark://IP:PORT  yes
Apache Mesos         mesos://IP:PORT  untested

Apache YARN

The agent can be used with an existing Apache YARN cluster to deploy Spark applications via the spark-submit shell script in $SPARK_HOME/bin.

If you submit a Spark application with the --master yarn-client switch, also set the following switch:

--driver-java-options "-javaagent:/path/to/driven-agent-spark-<version>.jar=drivenHosts=<driven host>;drivenAPIkey=<driven api key>"

For example:

spark-submit \
  --master yarn-client \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --driver-java-options "-javaagent:/path/to/driven-agent-spark-<version>.jar=drivenHosts=<driven host>;drivenAPIkey=<driven api key>" \
  --class org.apache.spark.examples.SparkPi \
  "${SPARK_HOME}"/lib/spark-examples*.jar 100

The option --master yarn-client runs the main function of the Spark application locally.
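
The agent option string is easy to mistype, so it can help to assemble it in one place. The following is a minimal sketch under stated assumptions: the function name driven_agent_opts and the variable names are hypothetical, and only the option format itself comes from the switch shown above.

```shell
#!/bin/sh
# Builds the -javaagent value passed to --driver-java-options.
# Arguments: <agent jar path> <driven host> <driven api key>
# Hypothetical helper; only the option format is taken from this guide.
driven_agent_opts() {
  printf '%s' "-javaagent:$1=drivenHosts=$2;drivenAPIkey=$3"
}

# Placeholder values; replace them with your own.
AGENT_JAR="/path/to/driven-agent-spark-<version>.jar"
DRIVEN_HOST="<driven host>"
DRIVEN_API_KEY="<driven api key>"

# Print the switch. The value stays inside double quotes so that the
# shell does not treat ';' as a command separator.
echo "--driver-java-options \"$(driven_agent_opts "$AGENT_JAR" "$DRIVEN_HOST" "$DRIVEN_API_KEY")\""
```

When passing the value to spark-submit directly, keep the command substitution quoted for the same reason: --driver-java-options "$(driven_agent_opts ...)".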

If you submit a Spark application with the --master yarn-cluster switch, also set the --jars switch so that the agent jar is uploaded to the YARN cluster:

--driver-java-options "-javaagent:driven-agent-spark-<version>.jar=drivenHosts=<driven host>;drivenAPIkey=<driven api key>" \
--jars "/path/to/driven-agent-spark-<version>.jar"

For example:

spark-submit \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  --driver-java-options "-javaagent:driven-agent-spark-<version>.jar=drivenHosts=<driven host>;drivenAPIkey=<driven api key>" \
  --jars "/path/to/driven-agent-spark-<version>.jar" \
  --class org.apache.spark.examples.SparkPi \
  "${SPARK_HOME}"/lib/spark-examples*.jar 100

Note
When using a properties file to configure the agent, you must also set the --files switch referencing the properties file so that it is uploaded to the YARN cluster.
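
The two upload switches for a yarn-cluster submission could be generated together, as in the sketch below. The function name yarn_cluster_upload_switches and the properties file name are hypothetical. How the agent is told where to read the properties file is not covered in this section, so the sketch only shows the upload side.

```shell
#!/bin/sh
# Prints the upload switches for a yarn-cluster submission, one per line.
# Arguments: <agent jar path> <properties file path>
# Hypothetical helper; not part of the agent or of Spark.
yarn_cluster_upload_switches() {
  printf '%s\n' "--jars \"$1\" \\"
  printf '%s\n' "--files \"$2\""
}

# Placeholder paths; the properties file name is an assumption.
yarn_cluster_upload_switches \
  "/path/to/driven-agent-spark-<version>.jar" \
  "/path/to/driven-agent.properties"
```

The printed lines can be pasted into the spark-submit command in place of the single --jars line.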