Driven Administrator Guide
version 2.0.5
Extracting Data with the scope Command
In addition to performing backups with the Driven CLI client, you can use the client to extract data from Driven for integration with third-party monitoring applications. The scope command is designed for this kind of integration.
$ driven scope --help
java driven.management.scope.Scope [options...]
Optional:
env vars: DRIVEN_CLUSTER, DRIVEN_HOSTS
Option Description
------ -----------
--between <natural language date/time>
--by-parent
--cause, --with-cause [cause or filter with * or ?] all unique failure causes, or only those match filter
--child-id, --with-child-id [id or partial id]
--cluster driven cluster name (default: driven)
--counter <group and counter name, eg. 'foo:bar.counter'>
--debug [Boolean] enable debugging (default: false)
--display-width <Integer> width of display (default: 80)
--duration [[pending, started, submitted, running, finished]] interval to calculate duration from (default: [started, finished])
--duration-interval time period to filter values, eg. 5min:25min
--duration-period time period to bucket values (default: PT15M)
--entity entity IDs to constrain results
--fields output field names, '*' denotes defaults (default: [type, id, name, status, duration])
--from <Integer: offset from which to begin returning results> (default: 0)
--help
--hosts driven server host(s) (default: localhost)
--id, --with-id [id or partial id]
--jmx
--json [Options$JsonOpts] output data as json (default: values)
--limit <Integer: limit the number of results> (default: 1000000)
--name, --with-name [name or filter with * or ?] all unique names of type, or only those match filter
--no-header
--owner, --with-owner [owner or filter with * or ?] all unique owners of type, or only those match filter
--parent-id, --with-parent-id [id or partial id]
--parent-name, --with-parent-name <name of parent>
--parent-status, --with-parent-status <Invertible:
[pending, skipped, started, submitted, running,
successful, stopped, failed, engaged, finished, all]>
--parent-type, --with-parent-type <ProcessType: [cluster,
app, cascade, flow, step, slice, undefined]>
--print print query parameters
--since <natural language date/time, default 2 days from
'till'>
--sort sort field names - default is none
--status, --with-status [Invertible: [pending, skipped,
started, submitted, running, successful, stopped,
failed, engaged, finished, all]]
--status-time [[pending, started, submitted, running, finished]] date/time field to filter against. one of: [pending, started, submitted, running, finished] (default: started)
--tag, --with-tag [tag name] unique tags of type, or only those that match
--text-search full search of pre-defined text fields - currently: ID,
name, owner
--till <natural language date/time, default is now>
--type [ProcessType: [cluster, app, cascade, flow, step, slice, undefined]] the process type (default: app)
--verbose logging level (default: info)
--version
With the scope command, you can query to retrieve information about current and historical processes, where a process can be an application, cascade, flow, step, or slice (a generalization of a Hadoop task).
The command serves two roles: discovery and monitoring. Discovery is finding specific process instances based on any metadata. Monitoring is observing changes in the metadata of specific process instances (for example, a flow changing from RUNNING to FAILED status). The command also lets you report on a target process type while refining the results based on parent and target metadata.
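As a rough illustration of the monitoring-integration use case, the sketch below writes the failed flows to a file that an external agent could read. It uses only options listed in the help output above (--hosts, --type, --status, --json). The function name export_failed_flows and the DRIVEN variable are hypothetical helpers introduced here, and the sketch assumes that --json can be given without a value. The structure of the JSON output is not described in this guide, so inspect it before writing a parser.

```shell
# DRIVEN names the client executable (hypothetical convenience variable).
# Set DRIVEN=echo to print the arguments instead of contacting a server.
DRIVEN="${DRIVEN:-driven}"

# Write all failed flows to a file for a monitoring agent to pick up.
# $1 = Driven server host, $2 = output file
export_failed_flows() {
    "$DRIVEN" scope --hosts "$1" --type flow --status failed --json > "$2"
}
```

For example, export_failed_flows localhost /tmp/failed-flows.json could be run on a schedule, with the monitoring application reading the resulting file.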
Examples of Command Usage
List all skipped flows:
$ driven scope --type flow --status skipped
List all skipped flows in a running application:
$ driven scope --type flow --status skipped --parent-type app --parent-status running
List the current statuses of all flows in all Running applications:
$ driven scope --type flow --status --parent-type app --parent-status running
Or, more specifically, for each RUNNING application, list the statuses of its child flows, grouped by application:
$ driven scope --type flow --status --parent-type app --parent-status running --by-parent
Common Command-Line Options
Many CLI options begin with the prefix with; for example, --with-name. These can be abbreviated by dropping the prefix, so you can use --name in place of --with-name.
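The abbreviation rule can be pictured as a simple prefix rewrite. The sketch below is only an illustration of the naming convention, not part of the Driven client, which accepts both spellings on its own. The function name short_option is hypothetical.

```shell
# Map a long option to its abbreviated form by dropping the "with-" prefix.
# Options without the prefix pass through unchanged.
short_option() {
    case "$1" in
        --with-*) printf '%s\n' "--${1#--with-}" ;;
        *)        printf '%s\n' "$1" ;;
    esac
}

short_option --with-name        # prints --name
short_option --with-parent-id   # prints --parent-id
short_option --type             # prints --type
```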
Filters
Use the following command-line options as filters:
--type = app, cascade, flow, step, slice
--with-tag = user-defined data for filtering
--with-status = one or more of the following values: PENDING, STARTED, SUBMITTED, RUNNING, SUCCESSFUL, FAILED, STOPPED, SKIPPED. If no value is given, all status values are displayed as a chart.
The ^ (caret) before the option parameter means “not”. For example, ^running sets the filter condition to any state other than RUNNING.
--with-id = filter for an identifier
--with-name = name or name filter
--with-parent-name = used in tandem with --parent-type
--with-parent-status = used in tandem with --parent-type
--with-parent-id = lists children of the given --type that have the given parent ID; --parent-type is ignored
--status-time = which status time to filter against: pending, started, submitted, running, finished
--till = filter results to date/time
--since = filter results from date/time
--between = filter results between dates/times
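The filters above can be combined on one command line. The following sketch shows two combinations using only the options listed in this section. The function names and the DRIVEN variable are hypothetical helpers. The caret filter is quoted as a precaution because some shells treat ^ as a special character, and the name filter is quoted so that the shell does not expand the * wildcard before the client sees it.

```shell
# DRIVEN names the client executable (hypothetical convenience variable).
# Set DRIVEN=echo to print the arguments instead of contacting a server.
DRIVEN="${DRIVEN:-driven}"

# List flows that are in any state other than RUNNING.
flows_not_running() {
    "$DRIVEN" scope --type flow --status '^running'
}

# List the flows under one parent whose names match a wildcard filter.
# $1 = parent ID, $2 = name filter such as 'import*'
child_flows_by_name() {
    "$DRIVEN" scope --type flow --parent-id "$1" --name "$2"
}
```

Because --parent-type is ignored when a parent ID is given, child_flows_by_name passes only --type for the children.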
Status
Most processes can be in one of the following states:
--pending - when the process is created
--started - when the process has been notified it may start work
--submitted - when the process, or child process, has been submitted to a cluster
--running - when a process is actually executing the data pipeline
--successful - when a process has completed
--failed - when a process has failed
--stopped - when a process, or child process, received a stop notification
--skipped - when a flow was not executed, usually because the sinks were not stale
--status - shows summary of all status values
Duration
To show a timeline of all durations, grouped by period, use the following options:
--duration = start:finished
--duration-period = the time in which to bucket the results. For example, 10sec, 15min, 2hrs, 1wk
--duration-interval = the range of time to display. For example, 15min:30min
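A minimal sketch of how the three duration options might be combined is shown below. It assumes that --duration can be given without a value, in which case the help output indicates a default of started to finished. The period and interval values are the examples given above. The function name duration_timeline and the DRIVEN variable are hypothetical helpers.

```shell
# DRIVEN names the client executable (hypothetical convenience variable).
# Set DRIVEN=echo to print the arguments instead of contacting a server.
DRIVEN="${DRIVEN:-driven}"

# Show a duration timeline for one process type.
# $1 = process type (default flow)
# $2 = bucket period (default 15min)
# $3 = interval to display (default 15min:30min)
duration_timeline() {
    "$DRIVEN" scope --type "${1:-flow}" --duration \
        --duration-period "${2:-15min}" \
        --duration-interval "${3:-15min:30min}"
}
```

Calling duration_timeline step 10sec 5min:25min would bucket step durations into 10-second periods and display only those between 5 and 25 minutes.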
How-To Tips
How do I monitor a job in progress?
If you have already identified a step or a flow that you wish to monitor, enter:
$ driven scope --type slice --parent-type step --parent-id _000_ --status
This command summarizes all the slice statuses for the requested step.
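To keep watching a step rather than checking it once, the same command can be repeated on a timer. The sketch below is one way to do that in a shell loop. The function name watch_step, its parameters, and the DRIVEN variable are hypothetical, and the polling interval and count are arbitrary defaults rather than recommended values.

```shell
# DRIVEN names the client executable (hypothetical convenience variable).
# Set DRIVEN=echo to print the arguments instead of contacting a server.
DRIVEN="${DRIVEN:-driven}"

# Print the slice status summary for a step at a fixed interval.
# $1 = step ID
# $2 = seconds between polls (default 30)
# $3 = number of polls (default 10)
watch_step() {
    step_id=$1
    interval=${2:-30}
    count=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        "$DRIVEN" scope --type slice --parent-type step --parent-id "$step_id" --status
        i=$((i + 1))
        # Sleep between polls, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```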
How do I list all users currently running applications?
To list all known users or process owners, enter:
$ driven scope --owner
To filter the list to include owners with running apps:
$ driven scope --owner --status running
Where in the code did the job fail?
If you have the app instance parent ID, you can list all the causes for the failure by entering:
$ driven scope --parent-id _000_ --type slice --cause
This command returns a list of all the exceptions and messages thrown.
For additional detailed information, enter the command:
$ driven scope --type slice --status failed \
    --fields id,failedBranchName,failedPipeLocation,failedCause,failedMethodLocation
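The two failure queries can be run together for a single application. The sketch below wraps them in one function. Adding --parent-id to the detailed query is an assumption made here to limit the output to one application; the guide shows that query without a parent filter. The function name failure_report and the DRIVEN variable are hypothetical helpers.

```shell
# DRIVEN names the client executable (hypothetical convenience variable).
# Set DRIVEN=echo to print the arguments instead of contacting a server.
DRIVEN="${DRIVEN:-driven}"

# Print the unique failure causes, then the per-slice failure details,
# for the slices under one parent.
# $1 = app instance parent ID
failure_report() {
    app_id=$1
    "$DRIVEN" scope --parent-id "$app_id" --type slice --cause
    "$DRIVEN" scope --parent-id "$app_id" --type slice --status failed \
        --fields id,failedBranchName,failedPipeLocation,failedCause,failedMethodLocation
}
```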