
Driven User Guide

version 1.3.8

Focusing on Relevant Data with Searches and Saved Views

The Status Timeline and Status Frequency graphs provide an overall profile of application processing. But you will likely also want to explore applications in more depth, and with a sharper focus, than the aggregate display of all runtime information allows.

The Driven Plugin collects a rich set of operational data with each application run, which is indexed in the persistence layer of the Driven server. The search functionality and filtering capabilities let you query for these insights.

You can save searches with particular filters and search terms as views so that you (or other users, if you so designate) can retrieve the search criteria and apply them to data later with one click. When you save a view, you can choose to save it as a Status View or to focus more granularly by saving it as an Application View. When you drill deeper into the components and runtimes of particular applications, the main area of the window redraws in a different format (see Application View documentation).

The circled elements in Figure 1 show the areas of the Status View that let you filter the data that populates the metrics on the page, either to adjust the scope of reported performance data or to see different metrics. The Application View has the same filtering and search handles. Searching, filtering, and using saved views help you focus on the dimensions of application runs that are of most interest to you.

Figure 1. Searching and filtering controls of application data, including saved views

Searches that you initiate at the top of the Driven window support several application-, process-, and tap-level query attributes. This is the starting point for mining Driven data if you do not have access to any saved views that can help you.

Note
You can use the asterisk (*) as a wildcard character. The search feature does not support other special characters ("", @, #, $, <, >, ?, etc.) in search strings and does not support spaces.
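
The restrictions in this note can be checked before a term is typed into the search field. The following Python sketch is illustrative only: check_search_term is a hypothetical helper, not part of Driven, and it rejects only the characters that the note names explicitly (the "etc." in the note implies that other characters are also unsupported).

```python
# Hypothetical pre-check for a Driven search term; not part of Driven itself.
# Only the characters named in the note are rejected; the real list is longer.
UNSUPPORTED = set('"@#$<>?')

def check_search_term(term: str) -> list:
    """Return a list of problems found in the term (empty if none)."""
    problems = []
    if " " in term:
        problems.append("contains a space")
    bad = sorted(set(term) & UNSUPPORTED)
    if bad:
        problems.append("unsupported characters: " + " ".join(bad))
    return problems

print(check_search_term("Cost_Analysis*"))   # the asterisk wildcard is allowed
print(check_search_term("cost analysis?"))   # the space and '?' are flagged
```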

There are multiple parameters that you can invoke to query the application runs:

  • Application, process, and tap metadata - You can use a filter to specify a category for your search term. Click the All drop-down menu to select a parameter. The available search filter parameters are shown in Figure 2.

  • Date - Click the All dates drop-down menu to select a predefined date range or to customize the date or dates.

  • Status - Click the Status drop-down menu to filter on an application-run parameter that is listed in Figure 3. The predefined Active states filter queries for applications that are in Pending, Started, Submitted, or Running state. The predefined Finished states filter queries for applications that are in Successful, Failed, or Stopped state.

  • Teams - Click the All teams drop-down menu to filter application data based on association with a Driven team. See the documentation about teams for more information.
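
The two predefined Status filters group the individual application states as described in the Status item above. A minimal Python sketch of that grouping follows. The set and function names are hypothetical, and the state names are written as they appear in this guide, which may differ from the values stored in the index.

```python
# State names as they appear in the guide; indexed values may be spelled differently.
ACTIVE_STATES = {"Pending", "Started", "Submitted", "Running"}
FINISHED_STATES = {"Successful", "Failed", "Stopped"}

def status_group(state: str) -> str:
    """Map a single application state to its predefined filter group."""
    if state in ACTIVE_STATES:
        return "Active"
    if state in FINISHED_STATES:
        return "Finished"
    raise ValueError(f"unknown state: {state}")

print(status_group("Running"))  # Active
print(status_group("Failed"))   # Finished
```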

Figure 2. Search filter parameters
Figure 3. Status filter parameters
Tip
If the application ID, tag, or owner is not displayed in the table rows of the search results, use the column chooser to bring these dimensions in view. Figure 4 shows how to access column-chooser attributes.
Figure 4. Column chooser icon and excerpt of selectable attributes

Statement and Process ID Filters

You can search by filters that focus on components that are more granular than an application. Searches that are filtered by Statement and Process ID return application instances that have matching criteria on the step level. Information about the step level is on the Flow Details page.

The Statement search filter is useful for finding applications that contain particular select statements at the flow level. Applications can contain SQL statements; for example, Hive-based applications use Hive Query Language (HQL). Consider the select statement for the Hive Flow -CalculateAverageQuantity flow, which contains:

HiveFlow_Select

Tip
You can use the asterisk symbol (*) as a wildcard at the beginning and end of the select-statement text that you want to search for.

The Process ID search filter is helpful for finding applications with Cascading objects that are executed at the step level. The process ID is correlated with the job ID of Hadoop’s JobTracker.

Note
In a search using the Process ID filter, the wildcard character (*) is NOT supported. Such a search would potentially return all components from every slice in all applications on the system, which can be a very expensive operation.

Tap Filters

The result set of a search filtered by Tap Identifier or Tap Field displays applications that run with Taps matching the search criteria. Each application in the results has a Tap identifier or field name (depending on which filter is selected) that matches the string that is typed in the search field.

Tap identifier refers to a uniform resource identifier (URI) for the Tap. The Tap Identifier search filter returns applications that have processed data from a resource matching the identifier string entered in the search field. A use case for a tap-identifier search is to discover which applications are accessing particular data resources. For example, a database administrator (DBA) who plans to bring down the system for maintenance might want to schedule the outage in consultation with the people who run or use the affected database resources. A list of applications helps the DBA identify which users and groups access the data resources so that they can be contacted.

A Tap field is usually a descriptor for a field of sourced or sinked data. Searching with the Tap Field filter can help you detect whether or not applications are accessing particular types of data sets. Because a field is a grouping of data by a criterion, the field name in many cases reflects the nature of the data. For example, a field with the name "CC_number" could be used for a field containing credit card numbers.

The ability to easily diagnose application processing on the tap-field level can be useful in environments where a field is known to contain sensitive or confidential data that should not be exposed by running applications. By searching for the field name with the Tap Field filter selected, Driven parses runtime information to focus on whether or not applications process data from the field. Such a search can be a practical use case in an environment that needs to demonstrate or audit for data-security compliance. In this type of use case, the absence of search hits for an application with the given tap field indicates separation from the protected data set.

Saving Search Queries as Views

After searching for applications based on the criteria and filters you have defined, you can save the query as a view. You can then return to the view to retrieve all applications that match the search criteria and filters. Search results of a saved view are dynamic, displaying applications with matching search parameters at the time that the view is opened.

When you save the view, select what information about the matched applications to display:

  • Status View highlights data about application states, graphically displaying Status Timeline and Status Frequency.

  • Application View provides canned quality-of-service statistics and heat maps by selectable timelines, which can serve as a gateway to discovering relevant details about repeated executions of specific applications (see Application Views documentation).

Links to saved views appear in the side panel.

Figure 5. Saving a view

Sharing Saved Views

After you save a view, you can share it with other Driven users. When you share a view, you must either select some or all of your teams or grant access to all other users of the Driven deployment.

To share a view:

  1. Hover over a view link in the My Views list of the sidebar.

  2. Click the sharing icon.

  3. Select how you want to share the view:

    • If you want to share the view with teammates only, select Teams and one or more of the entries in the list.

    • If you want to share with all users who can log in to Driven, select Public.

    • If you prefer to share the view more selectively, select Link.

  4. Click SHARE.

  5. If you selected Teams or Public sharing, the saved view appears in the Team Views or Public Views list of the other users. If you selected Link sharing, then copy the URL that is generated in the window and send it to other users of your Driven deployment who need access to the view. (Selecting Link sharing does not populate the saved views that appear in the side bars of other users.)

After sharing a view, you can either change who else can access the view or revoke access from all other users. To modify the share settings, click the sharing icon to configure and submit changes to view access.

Figure 6 shows how saved views are categorized and arranged in the side panel of your Driven window.

Figure 6. Saved views in the side panel

Alternatively, you can hover over your user name in the top-right corner and click Saved Views to open a separate window listing all saved views that you can access. The Saved Views window also displays who created each view (Owner), when it was created, and when user access last changed.

My Teams Views

Each link in the My Teams area of the side panel is an entry point to a Status View that is associated with a team of which you are a member. A link in the My Teams area of the side panel opens to a Status View that has the same parameters as the Show All view, except that the data is filtered to show only information that is correlated with the selected team.

The purpose of this type of view is to isolate metrics of application activity that is monitored by a single team. By opening a link under My Teams, you get a broad view of a specific team’s application performance. The view can then be used as a gateway to further refine filter parameters or to start focused searches without extraneous data from applications that are associated with other teams.

Note
My Teams links are generated automatically when you belong to a team; they are not actively "saved" by a user.

Case Examples of Searches, Views, and Teams

Joanne is a member of the ServerStar and EasternIT teams. She sees links for both of the teams under My Teams. Other members of the ServerStar team see the same ServerStar link in their My Teams area of the sidebar and access the same information if they click the view. The same applies to the EasternIT team.

Saving a Status View
  1. After Joanne opens the ServerStar link under My Teams, she finds the information useful in general but she wants to refine the filter parameters to focus on application status for the past five days.

  2. She clicks the drop-down menu for filtering dates and selects Custom Dates to set the date range of the last five days.

  3. After she names and saves the search, the view link appears in the Status Views > My Views section of the side panel.

Saving and Sharing an Application View
  1. Joanne wants to troubleshoot why instances of an application named Cost_Analysis, which is administered by the EasternIT team, fail intermittently. She clicks the EasternIT link under My Teams.

  2. She searches the EasternIT Status View with the following parameters:

    • App Name: Cost_Analysis

    • Status: Failed

  3. The timeline and frequency graphs display failed application runs for the whole history stored by Driven.

  4. As a way to gather the most recent root causes of application failure, Joanne further filters the view to 1 week.

  5. To help herself and others find the relevant application execution details in Driven, Joanne saves the view as an Application View called Recent_CAFail.

  6. She wants other people on the EasternIT team to see the data to help troubleshoot the problem. After navigating to Application Views > My Views > Recent_CAFail in the side panel to share the view, she selects only the EasternIT team because no other users should access the application data. When EasternIT teammates view Driven, they see a Recent_CAFail link in the Application Views > Team Views section of the side panel.

Customizing Searches

You can create custom searches for application runs that have specific attributes, such as having a certain time range for processing or populating a counter with a defined value range. Custom searches are based on Lucene query syntax, which is entered in the search field of a Status View or Application View.

As with searches that do not use Lucene query syntax, custom searches return applications that match your search-parameter values and that are associated with your Driven teams.

Table 1 shows some examples of the types of information that can be retrieved with a custom search and the query statements to obtain the application search results. Refer to the statement syntax examples in Table 1 for guidance on how to construct some types of Lucene queries. For detailed information about the required syntax in search queries, see Apache Lucene - Query Parser Syntax documentation.

Note
The letters in the query statement syntax are case-sensitive.
Table 1. Examples of Custom Search Goals and Statement Syntax
| Application Attributes | Sample Values for Parameters | Statement Syntax |
|---|---|---|
| Processing duration; Tag identifier | Duration greater than or equal to 5 minutes; Tag identifier = production | duration:[300000 TO *] AND tags:production |
| Pending status time; Runtime | Pending time does not equal 0; Runtime > 0 | NOT pendingTime:0 AND (NOT runTime:0) |
| Application name; Processing duration | Name = Cascading-Hive; Duration greater than or equal to 5 minutes | name:Cascading-Hive* AND duration:[300000 TO *] |
| Application name containing spaces | Name = sales region | sales_region* |
| Counter with a particular value | The BYTES_WRITTEN counter equals 814186673 | counters.org\:apache\:hadoop\:mapreduce\:lib\:output\:FileOutputFormatCounter.BYTES_WRITTEN:814186673 |
| Counters with a value in a particular range | The BYTES_WRITTEN counter is greater than or equal to 814186000 | counters.org\:apache\:hadoop\:mapreduce\:lib\:output\:FileOutputFormatCounter.BYTES_WRITTEN:[814186000 TO *] |
| Path to user directory | Path is /Users/smith/company/code/project | userDir:\/Users\/smith\/company\/code\/project |
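
The duration values in Table 1 are expressed in milliseconds, so 5 minutes becomes 300000. The following Python sketch shows that arithmetic. The helper names are hypothetical, and the printed string is the same query that appears in the first row of the table.

```python
def minutes_to_ms(minutes: float) -> int:
    """Convert minutes to whole milliseconds."""
    return int(minutes * 60 * 1000)

def duration_at_least(minutes: float, tag: str = "") -> str:
    """Build an open-ended Lucene range query on duration, optionally ANDed with a tag."""
    query = f"duration:[{minutes_to_ms(minutes)} TO *]"
    if tag:
        query += f" AND tags:{tag}"
    return query

print(duration_at_least(5, "production"))
# duration:[300000 TO *] AND tags:production
```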

Note
You must prefix a custom search query for counter values with "counters." (including the period). As the counter examples also show, colons in the paths must be escaped with backslashes so that they are not parsed as Lucene syntax.
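
The escaping shown in the counter and user-directory examples can be done mechanically. The following Python sketch is an illustration under stated assumptions: escape_lucene and counter_query are hypothetical helpers, not part of Driven, and the colon-separated group string is copied from the Table 1 example. Only colons and forward slashes are escaped here because those are the characters that appear in the examples.

```python
def escape_lucene(text: str, chars: str = ":/") -> str:
    """Prefix each listed character with a backslash so Lucene treats it literally."""
    return "".join("\\" + c if c in chars else c for c in text)

def counter_query(group: str, counter: str, value: str) -> str:
    """Build 'counters.' + escaped group + '.' + counter name + ':' + value."""
    return f"counters.{escape_lucene(group)}.{counter}:{value}"

# Group path as written in the Table 1 counter examples.
group = "org:apache:hadoop:mapreduce:lib:output:FileOutputFormatCounter"

print(counter_query(group, "BYTES_WRITTEN", "[814186000 TO *]"))
print("userDir:" + escape_lucene("/Users/smith/company/code/project"))
```

The unescaped colon that separates the counter name from its value is added by counter_query itself, so only the path portion passes through escape_lucene.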

Table 2 lists Cascading application attributes that are most relevant to searches in Driven. The most commonly used search targets, such as app name, are integrated in the Driven GUI so that you do not need to run a Lucene query to find matches. When an attribute can be located by a search filter or column chooser, Table 2 lists the part of the user interface that can be used to track the information.

Although an application attribute might be searchable with the GUI controls, you might prefer to query for the attribute and value in a Lucene query. This is particularly true when you want to find matches based on a range of values or if you want to run a complex query.

Elasticsearch truncates field values that exceed 16,000 characters. If a search parameter value includes characters that appear only in a string beyond the character limit (such as a very long classpath), Driven does not return a match because it is unable to search for possible matches after 16,000 characters of a string.
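
The effect of the 16,000-character limit can be illustrated with a short Python sketch. The helper name matches_within_limit is hypothetical, and the function only models the behavior described above as a plain substring check on the retained prefix; it does not call Elasticsearch or Driven.

```python
FIELD_LIMIT = 16_000  # character limit stated in this guide

def matches_within_limit(field_value: str, needle: str, limit: int = FIELD_LIMIT) -> bool:
    """Return True if needle occurs entirely within the first `limit` characters."""
    return needle in field_value[:limit]

# A made-up classpath: 3000 repeated entries (18,000 characters) and then one unique entry.
classpath = "a.jar:" * 3000 + "special-lib.jar"

print(matches_within_limit(classpath, "a.jar"))            # True
print(matches_within_limit(classpath, "special-lib.jar"))  # False
```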

Table 2. Searchable Cascading Application Attributes
(Duration and other time-based attributes are listed at the bottom of the table.)

| Attribute | Displayed in GUI? |
|---|---|
| cascadingVersion | Application Details page |
| classpath | No |
| command | Searchable with App Command filter; displayed on Application Details page |
| counter (Note: "counters." must be prepended to the path in the query) | Application Details and Flow Details tables |
| finished | Searchable with Status filter |
| frameworks | Application Details page |
| id (application ID) | Searchable with the App ID filter; displayed as the last node of the URL for the Application Details page |
| jarName | Application Details page |
| jarPath | Application Details page (click on JAR Information link to display the path) |
| javaClassVersion | No |
| javaCompiler | No |
| javaHome | No |
| javaInterpreterVersion | No |
| javaIoTmpdir | No |
| javaVendor | No |
| javaVendorUrl | No |
| jvmMaxMemory | No |
| localeCountry | No |
| messagingProtocolVersion | No |
| name | Searchable with App Name filter; displayed on Application View and Application Details pages |
| osArch | No |
| osName | No |
| osVersion | No |
| owner | Searchable with the App Owner filter; displayed on the Application Details page |
| pid | No |
| pluginVersion | Displayed on the Application Details page when you hover over the information icon |
| status | Searchable with one of the Status filters; displayed in Status View and Application View |
| tags | Searchable with the App Tags filter; displayed on the Application Details page |
| type | Types that are identified in the Driven user interface are applications, flows, and steps |
| userDir | No |
| userHome | No |
| userLanguage | No |
| userRegion | No |
| userTimezone | No |
| version | No |

Time Attributes

All time attributes with corresponding values can be displayed in the Application Details and Flow Details tables, except for the lastCounterFetchTime and statusTime attributes.

Absolute-time attributes
Values for these attributes are in Unix time:

finishedTime
lastCounterFetchTime
pendingTime
runTime
startTime
statusTime
submitTime

Duration-time attributes
Values for these attributes are in milliseconds:

duration
(duration = Amount of time from Started to Finished status.
Equivalent to the startTime:finishedTime attribute.)
pendingTime:finishedTime
pendingTime:runTime
pendingTime:startTime
pendingTime:submitTime
runTime:finishedTime
startTime:finishedTime
startTime:runTime
startTime:submitTime
