Driven User Guide

version 2.1.4

Understanding the Unit of Work Details Page

The Unit of Work details page helps answer many questions about an application run. Typical questions include:

How does the application decompose to specific tasks?
Is there a particular cause for performance degradation, such as data skew, a network storm, poor application logic, or inadequate cluster resource provisioning?

Viewing Unit-of-Work Details

The Unit of Work details section contains panels with overall Unit of Work information. In its title bar, this section displays the name of the Unit of Work, a copyable URL to the current view, and a status icon (see Status State of the Application for a list of icons).

Figure 1. Unit-of-work section showing the Status panel

The Status Panel

The Status panel displays a color-coded timeline bar that shows how much time the Unit of Work spent in each state it entered, including the current one. Each state on the bar is labeled with its begin and end times (time entered and time exited), or with dates for long-running units of work. For detailed information on these states, see the Driven State Model.

Tip	If the timeline does not show a timestamp for a state, you can view state times in the Unit of Work Details Table. Use the column chooser to add the state-times columns if they are not already included in the table.
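The arithmetic behind such a timeline can be sketched in a few lines of Python. This is only an illustration, not Driven's implementation: the state names, the timestamps, and the state_durations helper are all assumptions made for the example.

```python
from datetime import datetime

def state_durations(transitions, now):
    """Return seconds spent in each state.

    transitions: list of (state, entered_at) pairs in chronological order.
    now: timestamp used as the exit time of the current (last) state.
    """
    durations = {}
    for i, (state, entered) in enumerate(transitions):
        # A state is exited when the next one is entered; the last state is still open.
        exited = transitions[i + 1][1] if i + 1 < len(transitions) else now
        durations[state] = durations.get(state, 0.0) + (exited - entered).total_seconds()
    return durations

# Hypothetical state names and times, for illustration only.
transitions = [
    ("PENDING", datetime(2016, 1, 1, 10, 0, 0)),
    ("STARTED", datetime(2016, 1, 1, 10, 0, 5)),
    ("RUNNING", datetime(2016, 1, 1, 10, 0, 20)),
]
durations = state_durations(transitions, now=datetime(2016, 1, 1, 10, 5, 20))
print(durations)
```

Each segment of the bar would then be drawn in proportion to its share of the total.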

Below the timeline is the Progress section, which displays the total number of steps the unit of work has, along with tables that categorize the unit of work’s steps by state. The Active table’s columns are named for the active states that a step could be in, while the Completed table’s columns are named for the end states that a step could be in. The values in these columns are simple counters that show how many steps are in the specified state.
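As a rough sketch of how such per-state counters could be derived, the Python below tallies a list of step states into an active table and a completed table. The state names and the progress_tables helper are hypothetical; see the Driven State Model for the actual states.

```python
from collections import Counter

# Hypothetical state names, chosen for illustration only.
ACTIVE_STATES = ("PENDING", "STARTED", "SUBMITTED", "RUNNING")
END_STATES = ("SUCCESSFUL", "FAILED", "STOPPED", "SKIPPED")

def progress_tables(step_states):
    """Return (total steps, active-state counts, end-state counts)."""
    counts = Counter(step_states)
    active = {state: counts.get(state, 0) for state in ACTIVE_STATES}
    completed = {state: counts.get(state, 0) for state in END_STATES}
    return len(step_states), active, completed

step_states = ["RUNNING", "SUCCESSFUL", "SUCCESSFUL", "PENDING", "FAILED"]
total, active, completed = progress_tables(step_states)
print(total, active, completed)
```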

If a unit of work is still active, a Slice Rate graph appears below the Progress section, along with the current number of active slices and the time of the last update. Hovering over a point on the graph updates the displayed time and number of active slices to the values at that point. For more information, see Counters.

The Counters Panel

The Counters Panel displays unit-of-work-level counters. These same counters (from each unit of work) are aggregated to provide the values for the equivalent application-level counters. For more information on Driven counters, see Counters.
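The roll-up from unit-of-work counters to application counters can be pictured with the sketch below. It assumes that aggregation means summing counters that share a name, which the guide does not state explicitly; the counter names and the aggregate_counters helper are hypothetical.

```python
from collections import Counter

def aggregate_counters(units_of_work):
    """Sum same-named counters across units of work (assumed aggregation rule)."""
    totals = Counter()
    for counters in units_of_work:
        totals.update(counters)  # Counter.update adds values rather than replacing them
    return dict(totals)

# Hypothetical counter names and values, for illustration only.
units_of_work = [
    {"Tuples Read": 1200, "Tuples Written": 800},
    {"Tuples Read": 300, "Tuples Written": 300, "Bytes Read": 4096},
]
application_counters = aggregate_counters(units_of_work)
print(application_counters)
```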

The Properties Panel

The Properties Panel reports on the last time data was received about the unit of work.

If the unit of work is part of a Hive-based application, the panel may also display a Statement property containing the SQL executed by the unit of work.

The Environment Panel

The Environment Panel shows the platform on which the unit of work is being executed.

The Directed Acyclic Graph

All Units of Work can be represented by a Directed Acyclic Graph.

Different platforms provide different vertex and edge data, so the DAG representations vary by platform.

All platforms will show data source and sink resource vertices with sanitized information about the resource URI and fields. They will also show at least one processing vertex, with appropriate details for that operation.

Hive Queries and the DAG

A Hive query DAG is drawn with a query icon between the source table(s) and output table(s). Click the source and sink resource nodes to see more detailed information about the tables touched by the query. Hive units of work have only one processing vertex, which represents the HQL query. The details of the query are listed in the Properties tab of the status panel.

DAG example for a Hive query


Native Map Reduce and the DAG

Native map reduce DAGs will show map, shuffle, and reduce vertices between the source and sink resource vertices.

DAG example for MR Unit of Work


The Cascading Query Planner and the DAG

A key component of a Cascading application is the query planner. When the Cascading application executes, the query planner compiles all the data-processing steps, analyzes dependencies of the steps, and develops the DAG for the application.
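To make the idea of dependency analysis concrete, the sketch below orders a set of steps so that each one runs only after the steps it depends on. It is a generic topological sort (Kahn's algorithm), not the Cascading planner's own algorithm, and the step names and the execution_order helper are hypothetical.

```python
from collections import deque

def execution_order(dependencies):
    """Order steps so each comes after its dependencies.

    dependencies maps a step name to the names of the steps it depends on.
    Raises ValueError if the dependencies contain a cycle (not a DAG).
    """
    remaining = {step: set(deps) for step, deps in dependencies.items()}
    # Include steps that appear only as a dependency of another step.
    for deps in dependencies.values():
        for dep in deps:
            remaining.setdefault(dep, set())

    ready = deque(sorted(step for step, deps in remaining.items() if not deps))
    order = []
    while ready:
        step = ready.popleft()
        order.append(step)
        for other in sorted(remaining):
            deps = remaining[other]
            if step in deps:
                deps.remove(step)
                if not deps:  # all dependencies satisfied
                    ready.append(other)

    if len(order) != len(remaining):
        raise ValueError("cycle detected: the dependencies do not form a DAG")
    return order

# Hypothetical steps, for illustration only.
steps = {
    "read": [],
    "lookup": [],
    "filter": ["read"],
    "count": ["filter"],
    "join": ["filter", "lookup"],
    "write": ["count", "join"],
}
order = execution_order(steps)
print(order)
```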

DAG rendering as compiled by the query planner


The Cascading query planner iterates through the DAG, breaking it into smaller and smaller graphs (called expression graphs) until the graph matches a pattern associated with a unit of work, such as a mapper or a reducer.

Steps associated with their mappers and reducers, as well as their expression graphs

Step Table and Slice Histograms

You can add further granular metrics to the slice level of your application by adding counters. Click the Add counters button to display the available counters. Select the desired counter by clicking the checkbox.

Adding counters to the slice performance dashboard


Understanding Bottlenecks in Your Application

In the slice performance dashboard, you can view information about slices (a slice is a unit of work such as a map or a reduce task) at the individual level or at an aggregate level.

This example shows skewed data at the slice level


Check whether any of your slices are skewed. In a MapReduce application, the data is divided and processed in equal-sized chunks. If certain slices take longer to finish a similar type of task on (presumably) similarly sized data, that is an anomaly and could indicate application execution problems.

Often, these skews indicate that applications are processing a large number of small files, which usually means that you need to optimize the environment. In other cases, depending on the skew dimension, they could indicate a network issue, which can delay the shuffle-sort operations in MapReduce.
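If you export slice durations, a simple check like the one sketched below can flag the slices that stand out. It is a heuristic for illustration only, not how Driven detects skew: the slice ids, the durations, the skewed_slices helper, and the threshold of twice the median are all assumptions.

```python
from statistics import median

def skewed_slices(durations, factor=2.0):
    """Return ids of slices whose duration exceeds factor times the median duration."""
    if not durations:
        return []
    typical = median(durations.values())
    return sorted(sid for sid, secs in durations.items() if secs > factor * typical)

# Hypothetical slice ids and durations in seconds, for illustration only.
durations = {"map-0": 41.0, "map-1": 39.5, "map-2": 44.0, "map-3": 118.0, "map-4": 40.2}
outliers = skewed_slices(durations)
print(outliers)
```

The median is used instead of the mean so that a single very slow slice does not raise the baseline it is compared against.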

Viewing the Hadoop Dashboard

If there is a Hadoop dashboard for a step, the row for the step has a Job Tracker hyperlink.

Link to Hadoop dashboard

