NAME
mjob - manage manta jobs
SYNOPSIS
mjob
[OPTION...] command [command-specific arguments]
DESCRIPTION
mjob allows you to interact with jobs in Manta. Jobs allow you to specify arbitrary compute that operates on manta objects, with Map/Reduce supported as a first-class citizen. Using mjob, you can create, read, monitor and cancel jobs.
The primary reference for a job is its UUID. Most commands operate on jobs by UUID.
COMMON OPTIONS
The following options are supported in all commands:
-a, --account login
Authenticate as account (login name).
-h, --help
Print a help message and exit.
-i, --insecure
This option explicitly allows "insecure" SSL connections and transfers.
Without it, every SSL connection is verified against the CA certificate
bundle installed by default.
-k, --key fingerprint
Authenticate using the SSH key described by fingerprint. The key must
either be in ~/.ssh or loaded in the SSH agent via ssh-add.
-p, --parallel NUM
Limit concurrent operations to NUM. The default varies by command. This
applies to operations issued by mjob itself (e.g., to add inputs or poll on
the job). It has no effect on the concurrency of the job.
--role=ROLE,ROLE,...
Specify which roles to assume for the request.
--user user
Authenticate as user under account.
-u, --url url
Manta base URL (such as https://us-central.manta.mnx.io).
-v, --verbose
Print debug output to stderr. Repeat option to increase verbosity.
COMMANDS
The following commands and options are supported:
create [OPTIONS...] expression
Creates a job that executes the commands against keys that will be specified
via addinputs. The expression can specify an arbitrary UNIX pipeline, with
map/reduce phases separated by the ^ or ^^ character(s), respectively.
For example, to specify a simple grep | sort | uniq job in Manta, use an
invocation like the following (note the \| to escape the | character):
mjob create grep foo ^^ sort \| uniq -c
This is the fastest and most common form of creating jobs, and runs with default compute container sizes.
Alternatively, jobs can be specified by using a combination of -m and -r
flags; the same pipeline could be specified with:
mjob create -m 'grep foo' -r 'sort | uniq -c'
The above form is useful for specifying options to each phase. For example:
mjob create --memory 2048 -m 'grep foo' \
    --memory 8192 -r 'sort | uniq -c'
This example overrides the amount of RAM available in each phase (the memory,
disk, init, image, and count options affect the next phase).
Jobs can also be specified using a JSON manifest file, as below (see Manta API documentation for the full JSON schema):
$ cat job.json
{
  "phases": [{
    "exec": "grep ..."
  }, {
    "exec": "maggr sum | sort",
    "type": "reduce"
  }]
}
$ mjob create -f job.json
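A manifest like the one above can also be generated from the shell. The
following is a minimal sketch, not part of mjob: write_manifest is a
hypothetical helper that emits only the phases, exec, and type fields shown
above. It performs no JSON escaping, so it assumes the commands contain no
double quotes or backslashes.

```shell
# write_manifest MAP_CMD REDUCE_CMD
# Hypothetical helper: prints a two-phase job manifest on stdout.
# Assumption: neither command contains a double quote or a backslash,
# because no JSON escaping is done here.
write_manifest() {
    map_cmd=$1
    reduce_cmd=$2
    cat <<EOF
{
  "phases": [
    { "exec": "$map_cmd" },
    { "exec": "$reduce_cmd", "type": "reduce" }
  ]
}
EOF
}
```

Redirecting the output of write_manifest 'grep foo' 'sort | uniq -c' to
job.json yields a file that can then be passed to mjob create -f job.json.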
Lastly, mjob create can "one line" the use of create, addinputs, watch and
get like the example below; this would print no diagnostics, and would wait
for the job to complete, then dump the output to stdout (as if you had run
find | grep | sort | uniq locally):
mfind ~~/stor |
mjob create -q -o grep foo ^^ sort \| uniq -c
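To reason about what such a job computes, it can help to run the same
pipeline locally on a few files first. The sketch below is only an
approximation under stated assumptions: it assumes the map phase runs once per
input object and the reduce phase runs once over the combined map output.
map_reduce_local is a hypothetical helper and does not contact Manta.

```shell
# map_reduce_local PATTERN FILE...
# Hypothetical local stand-in for: mjob create grep PATTERN ^^ sort \| uniq -c
map_reduce_local() {
    pattern=$1
    shift
    for f in "$@"; do
        # "Map": one grep per input file. grep exits non-zero when nothing
        # matches, so "|| true" keeps the loop going for such inputs.
        grep -- "$pattern" "$f" || true
    done | sort | uniq -c    # "Reduce": a single pipeline over all map output
}
```

For example, map_reduce_local foo a.txt b.txt prints each matching line once,
prefixed by the number of times it occurred across both files.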
The following options are supported on create:
-b, --batch size
When adding inputs, add them in batches of size.
--close
End the input stream once the job is created.
--count num_reducers
Use num_reducers in the reduce phase.
--disk disk
Override the OS quota, and use the specified amount of disk in the next phase.
This option is specified in gigabytes.
--dry-run
Print the job configuration and exit, instead of creating the job.
--memory memory
Override the OS size, and use the specified amount of DRAM in the next phase.
This option is specified in megabytes.
-f, --file file
Read job description from file.
--image version
Specifies the image version to use in the next job phase. Must be
specified as a semver string. The default is server-provided and changes
over time.
--init command
Specifies a command to execute in the compute zone for the next map or
reduce phase. This command will be executed once per zone, and will
run before the exec command for the phase. This is useful for setup, etc.
-m, --map command
Specifies a map phase.
-o, --cat-outputs
Wait for job to complete, then fetch and concatenate outputs.
--open
When adding inputs, do not close input, but leave job open.
-q, --quiet
Do not output any informative messages.
-r, --reduce command
Specifies a reduce phase.
-s, --assets path
Specifies an asset to make available in the compute zone that runs in
the next map or reduce phase.
-w, --watch
Wait for job to finish (only use when adding inputs at create time).
addinputs [-b batch] [-o] JOB...
The addinputs command feeds input names from stdin to a list of JobIDs, and by default closes input when done. For example:
$ cat inputs.txt
~~/stor/foo
~~/stor/bar
$ cat inputs.txt | mjob addinputs $job
-b, --batch size
When adding inputs, add them in batches of size.
-o, --open
When adding inputs, do not close input, but leave job open.
close JOB
Closes input for a given job.
mjob close 3ec32136-b125-11e2-8487-1b418dd6974b
get JOB...
Returns the status JSON document for a job.
mjob get 3ec32136-b125-11e2-8487-1b418dd6974b
watch JOB
Waits for a given job to reach the done state.
mjob watch 3ec32136-b125-11e2-8487-1b418dd6974b
cancel JOB...
Cancels a currently running job.
mjob cancel 3ec32136-b125-11e2-8487-1b418dd6974b
outputs JOB...
Returns the list of outputs for a job, as \n-separated names. Note that until
a job is archived, the list of names is not guaranteed to be complete or
consistent between calls (in particular when there are a large number of
outputs). Once a job is archived, the entire set of names is read back in a
contiguous stream.
mjob outputs 3ec32136-b125-11e2-8487-1b418dd6974b
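The create, addinputs, watch, and outputs commands can be chained in a script.
The following is a sketch under one assumption that this page does not state
explicitly: that mjob create prints the new job's UUID on stdout. run_job is a
hypothetical wrapper, not an mjob command.

```shell
# run_job INPUTS_FILE [mjob create arguments...]
# Hypothetical wrapper. Assumption: "mjob create" prints the job UUID on stdout.
run_job() {
    inputs_file=$1
    shift
    job=$(mjob create "$@") || return 1
    # addinputs reads object names from stdin and closes input by default.
    mjob addinputs "$job" < "$inputs_file" || return 1
    # Block until the job reaches the done state; discard anything watch prints
    # so that only output object names reach stdout.
    mjob watch "$job" > /dev/null || return 1
    mjob outputs "$job"
}
```

A call might look like run_job inputs.txt -m 'grep foo' -r 'sort | uniq -c'.
As noted above, the list of outputs may be incomplete until the job is
archived.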
inputs JOB...
Returns the list of inputs for a job, as \n-separated names. Note that until
a job is archived, the list of names is not guaranteed to be complete or
consistent between calls (in particular when there are a large number of
inputs). Once a job is archived, the entire set of names is read back in a
contiguous stream.
mjob inputs 3ec32136-b125-11e2-8487-1b418dd6974b
errors JOB...
Returns the list of errors for a job, as \n-separated JSON objects. Note that
until a job is archived, the list of errors is not guaranteed to be complete
or consistent between calls (in particular when there are a large number of
errors). Once a job is archived, the entire set of errors is read back in a
contiguous stream.
mjob errors 3ec32136-b125-11e2-8487-1b418dd6974b
failures JOB...
Returns the list of failed inputs for a job, as \n-separated names. Note that
until a job is archived, the list of names is not guaranteed to be complete
or consistent between calls (in particular when there are a large number of
failures). Once a job is archived, the entire set of names is read back in a
contiguous stream.
mjob failures 3ec32136-b125-11e2-8487-1b418dd6974b
share JOB
Generates and uploads a self-contained HTML page that describes the job, including its phases, the list of input and output objects, the contents of input and output objects, error details, and so on.
By default, this HTML page is uploaded to ~~/public/jobshares, meaning that it will be publicly accessible. This includes the contents of input and output objects. If you just want to generate the HTML content without uploading it, use the "-s" option and save the output to a file.
mjob share 3ec32136-b125-11e2-8487-1b418dd6974b
-r, --readme README_FILE
Insert the rendered contents of README_FILE (a Markdown file) directly into
the generated HTML page.
-s, --stdout
Emit the HTML output to stdout and do not upload it to Manta.
list
Lists all jobs for a user (note, this can also be done with a normal mls
call). Optionally takes filters, such as -s for state, that can be used
to show only certain jobs.
mjob list -s running
-n, --name name
Only list jobs with the given name.
-l, --long
Use a long listing format.
-s, --state state
Only list jobs in the given state.
cost JOB
Estimates the cost in USD of a job by creating a Manta job and adding as inputs compute usage reports from /:login/reports/usage/compute. Assets are pulled from /manta/public/jobs/mjob-cost. Note that usage reports are generated asynchronously, so mjob cost may fail when estimating the cost of jobs that were running recently.
$ mjob cost 3ec32136-b125-11e2-8487-1b418dd6974b
-q, --quiet
Do not output any informative messages.
ENVIRONMENT
MANTA_USER
In place of -a, --account.
MANTA_SUBUSER
In place of --user.
MANTA_KEY_ID
In place of -k, --key.
MANTA_ROLE
In place of --role.
MANTA_URL
In place of -u, --url.
MANTA_TLS_INSECURE
In place of -i, --insecure.
The shortcut ~~ is equivalent to /:login where :login is the account login name.
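The ~~ shortcut can be illustrated with a small shell function. This is only a
sketch of the substitution described above, not how mjob implements it:
expand_manta_path is a hypothetical helper, and it assumes the account login
name is available in MANTA_USER.

```shell
# expand_manta_path PATH
# Hypothetical helper: rewrites a leading "~~" to "/$MANTA_USER".
# Assumption: MANTA_USER holds the account login name.
expand_manta_path() {
    case $1 in
        '~~' | '~~/'*)
            # ${1#??} drops the first two characters, i.e. the "~~" prefix.
            printf '/%s%s\n' "$MANTA_USER" "${1#??}"
            ;;
        *)
            # Paths that do not start with the shortcut are left unchanged.
            printf '%s\n' "$1"
            ;;
    esac
}
```

With MANTA_USER set to a login such as jdoe (a made-up name), the path
~~/stor/foo becomes /jdoe/stor/foo.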
DIAGNOSTICS
When using the -v option, diagnostics will be sent to stderr in bunyan
output format. As an example of tracing all information about a request,
try:
mjob -vv get 3ec32136-b125-11e2-8487-1b418dd6974b 2>&1 | bunyan
BUGS
DSA keys do not work when loaded via the SSH agent.
Report bugs at GitHub.