
Driven User Guide

version 2.0.5

Using the App Details Page

The app details page shows all the units of work and steps that are part of an application execution. (You can confirm that you are in the app details page if units of work are listed in the Details Table.) Driven aggregates the performance of individual mappers and reducers and frames them as metrics in the context of the overall application execution. This insight can facilitate easier application optimization and monitoring on the Hadoop cluster.

A search box near the top of the page lets you filter which units of work are displayed. If you are inspecting a complex or long-running application and know the value of one of the unit of work parameters listed in Figure 1, the search can help pinpoint the runtime factors that matter to you. For a coarser filter, use the Status drop-down menu to narrow the displayed results.

Tip
The unit of work search is especially helpful for parsing performance data from a high-volume Hive Server. The Driven agent that transforms Hive Server telemetry data for the app details page treats the Hive Server as a single application, regardless of the number of Hive queries coming from the server.
Figure 1. Search field with Units of Work filter expanded

Understanding the Directed Acyclic Graph

Each application instance is represented as a directed acyclic graph (DAG). The graph is an interactive diagram of the units of work and steps, and it can reveal underlying slice performance issues. Unit of work, step, and slice information is particularly useful for monitoring application executions over time as the application grows in complexity and size. In addition, the unit of work and step details on the DAG and in the table below the graph can be used to:

  • Understand real-time dependencies between steps and units of work

  • Visualize your application, tracking steps in the graph to line numbers in your code

  • Investigate log error messages and stack exceptions

  • Tune application logic

When you execute your application, the underlying framework builds a state model to optimally execute the unit of work on the Hadoop cluster. The Driven Plugin transmits the state model to the Driven application, which represents the execution plan as a DAG.

Figure 2. Sample DAG of an application

The DAG representation of each application execution can be useful to stakeholders, such as a documentation analyst, who are responsible for documenting how an application was developed and how it has performed. Over a period of time, the analyst might not be able to track and record relevant application details. Because Driven has a persistence layer to store application execution data, past application performance can be recreated by generating a DAG on demand. Without such an interface, it can be difficult to map business needs to technical implementation, especially in work environments that involve large teams spread across different regions.

In the graph, each node corresponds to a step or a processing function in your application code. You can refer to the specific code for a step by clicking on the node link. The pop-up window with detailed information is an annotation. See Using Annotations for more information.

Figure 3. Revealing details about a node in a single-unit-of-work DAG

Viewing the Graph

The DAG on the app details page can be viewed in three different ways:

Contracted View - The Contracted View is useful for complex and large applications.

Logical View - The Logical View (default) shows all the steps and resources (excluding implicit resources) and built-in functions.

Physical View - The Physical View shows all the steps, including the implicit resources and built-in functions. Where such details exist, the Physical View shows more than the Logical View.

Driven represents Cascading’s pipes metaphor as lines connecting steps in the DAG. A step is dependent on another step only if it relies on the execution of the previous step. The Driven Plugin dynamically determines the dependencies between the units of work. If the output (sink) of one unit of work is consumed by another unit of work (as a source), Driven denotes that dependency by connecting the two units of work.
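The sink-to-source rule described above can be illustrated with a minimal Python sketch. It is not Driven's implementation. The `UnitOfWork` class, its fields, the `infer_dependencies` function, and the sample resource paths are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class UnitOfWork:
    """Hypothetical stand-in for a unit of work; not a Driven type."""
    name: str
    sources: set = field(default_factory=set)  # resources the unit reads
    sinks: set = field(default_factory=set)    # resources the unit writes


def infer_dependencies(units):
    """Return (upstream, downstream) name pairs.

    A unit is treated as depending on another unit when at least one
    of its sources is also a sink of that other unit.
    """
    edges = []
    for upstream in units:
        for downstream in units:
            if upstream is downstream:
                continue
            if upstream.sinks & downstream.sources:
                edges.append((upstream.name, downstream.name))
    return edges


# Illustrative data only; the paths and names are made up.
units = [
    UnitOfWork("parse-logs", {"/data/raw"}, {"/data/parsed"}),
    UnitOfWork("count-errors", {"/data/parsed"}, {"/data/errors"}),
    UnitOfWork("build-report", {"/data/parsed", "/data/errors"}, {"/data/report"}),
]
edges = infer_dependencies(units)
```

In this example, `build-report` is connected to both other units because it reads resources that each of them writes.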

Visualizing your end-to-end application as a DAG along with operational data, such as the read/write data processed at each step, can provide important insights into improving the performance of the application. For example, reviewing the DAG can expose opportunities to introduce Filter functions upstream in your code to reduce the volume of data being processed by the pipes, or to make the Join functions more efficient.

Real-Time Visibility into Your Application

Driven can refresh the displayed information as updates stream in from the plugin. The display shows the real-time progress of your application: the steps currently being executed are highlighted, and the number of completed steps and the read/write data are kept up to date.

Getting the most current information can be very useful. You might discover in Driven that a long-running job is not executing properly, which could be a signal to terminate the application. Similarly, if you see a sudden slowdown in the progress of your application, you may want to start investigating the reason immediately (for example, a network storm or a rogue job submitted to the cluster).

One of the most useful capabilities is tracking, in real time, the percentage of an application that has completed. For long-running applications, it is often useful to spot-check the behavior to ensure that there are no anomalies.
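As a rough illustration of a progress figure of this kind, the following sketch computes the share of steps that have finished from a list of step statuses. It is an assumption about how such a percentage could be derived, not a description of how Driven calculates it. The function name, the uppercase status strings, and the choice of which statuses count as finished are hypothetical.

```python
def percent_complete(step_statuses):
    """Return the percentage of steps in a finished state (0.0 if empty)."""
    # Assumption: only these statuses count as finished.
    finished = {"SUCCESSFUL", "SKIPPED"}
    if not step_statuses:
        return 0.0
    done = sum(1 for status in step_statuses if status in finished)
    return 100.0 * done / len(step_statuses)


# Illustrative data only.
statuses = ["SUCCESSFUL", "SUCCESSFUL", "RUNNING", "PENDING"]
progress = percent_complete(statuses)
```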

Ensure that the Auto Update slider in the top right corner is enabled to allow the displayed Driven data to auto-refresh in real time.

Figure 4. Auto Update slider

Status State of the Application

Driven displays the status of each unit of work in an executing application in the rows of the Details Table. The status of an application instance is indicated by the icon in the top right corner.

The following is a list of the statuses:

Pending_Started_State - Icon for both the Started status and the Pending status

Submitted_State - Submitted status

Running_State - Running status

Successful_State - Successful status

Stopped_State - Stopped status

Skipped_State - Skipped status

Failed_State - Failed status

Stack Trace

For an application in FAILED status, you can view the stack trace to investigate the error further. Click the Show stack trace error icon in the upper right corner to display the stack trace information.

Details Table

The table under the DAG provides a detailed breakdown of each unit of work in the application run. Key monitoring capabilities of the table include the following:

  • Click on a hyperlinked unit of work name to focus on component slice performance, JobTrackers, and node statistics

  • Segment unit of work data with correlated JobTracker, step, and customized counter details

Uncovering Bottlenecks with the Timeline Column

Driven helps you visualize instrumentation counters in context so that you can tune your applications. The Timelines of the Details Table provide a detailed view of the units of work that make up the application, helping you quickly identify which part of your application needs attention (assuming that you will first attempt to tune the more resource-draining parts of the application).

Tip
Hover over a segment of the Timeline bar graph to see what status is represented by the color.
Figure 5. Timelines in the right column help you scan application units of work to uncover possible bottlenecks
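The same kind of scan can be sketched outside the UI. The example below ranks units of work by elapsed time, longest first, as one simple way to decide where to look for a bottleneck. It assumes you already have a start and end time for each unit of work. The tuple layout, the function name, and the sample values are hypothetical and do not reflect a Driven export format.

```python
def rank_by_duration(units):
    """Sort (name, start_seconds, end_seconds) tuples, longest first.

    Returns a list of (name, duration_seconds) pairs.
    """
    durations = [(name, end - start) for name, start, end in units]
    return sorted(durations, key=lambda item: item[1], reverse=True)


# Illustrative data only; names and timings are made up.
timeline = [
    ("parse-logs", 0, 120),
    ("count-errors", 120, 150),
    ("build-report", 150, 600),
]
ranking = rank_by_duration(timeline)
slowest_name, slowest_seconds = ranking[0]
```

Elapsed time alone does not explain why a unit of work is slow, so a ranking like this is only a starting point for inspecting the counters of the slowest units.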

Importing Counter Data and Other Metrics

Driven lets you customize most of the information that the table displays. Click the Select table columns icon to show or hide metric columns. The Status and Name columns cannot be hidden.

The columnar metrics are categorized in the column chooser. Each category can be collapsed or expanded.

See Counter Data and Other Metrics in Tables for more information.
