mpipe - advanced output pipe for the current task


mpipe [-p] [-r rIdx] [-H header: value ...] [manta path]


Each invocation of mpipe reads data from stdin, potentially buffers it to local disk, and saves it as task output. If a Manta path is given, the output is saved to that path. Otherwise, the object is stored with a unique name in the job's directory. If -p is given, required parent directories are automatically created (like "mkdir -p").

If you use mpipe in a task, the task's stdout will not be captured and saved as it is by default.

As a simple example,

wc | mpipe

is exactly equivalent to just:

wc

since both capture the stdout of wc and emit it as a single output object. However, mpipe is useful for several reasons:

* Named outputs: You can give mpipe a Manta path to control where the output is stored. A job that creates thumbnails from images might use MANTA_INPUT_OBJECT to infer the desired path for the thumbnail (e.g., ${MANTA_INPUT_OBJECT}-thumb.png) and then use mpipe to store the output there. In Manta paths, the shortcut ~~ is equivalent to /:login, where :login is the account login name.
* Multiple outputs: You can invoke mpipe as many times as you want from a single task to emit more than one object for the next phase (or as a final job output). A job that chunks up daily log files into hourly ones for subsequent per-hour processing would use this to emit 24 outputs for each input.
* Special headers: You can specify headers to be set on output objects using the "-H" option to mpipe, which behaves exactly like the same option on the Manta CLI tool mput.
* Reducer routing: Finally, in jobs with multiple reducers in a single phase, you can specify which reducer a given output object should be routed to using the "-r" option to mpipe. See "Multiple reducers" below.
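
The "multiple outputs" case above can be sketched in shell. This is a minimal illustration, not part of mpipe itself: split_by_hour and emit_hourly are hypothetical helper names, and the sketch assumes every log line starts with an ISO 8601 timestamp (for example 2013-05-01T14:03:22Z), so that the hour occupies characters 12-13 of the first field.

```shell
# Hypothetical helpers for illustration; only mpipe and its -f option
# come from this page.

# split_by_hour DIR
# Read log lines on stdin and append each one to DIR/HH.log, where HH is
# the hour taken from a leading ISO 8601 timestamp (assumed format).
split_by_hour() {
    awk -v dir="$1" '{
        hour = substr($1, 12, 2)          # "2013-05-01T14:..." -> "14"
        print > (dir "/" hour ".log")     # awk keeps each file open and appends
    }'
}

# emit_hourly DIR
# Emit every DIR/*.log file as a separate task output, one mpipe call each.
emit_hourly() {
    for hourly_file in "$1"/*.log; do
        [ -e "$hourly_file" ] || continue   # nothing matched the glob
        mpipe -f "$hourly_file"
    done
}
```

With the day's log on stdin, a task might run split_by_hour on an existing scratch directory and then call emit_hourly on the same directory. Because the task invokes mpipe, its own stdout is no longer captured as output.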


wc | mpipe -H 'Access-Control-Allow-Origin: *' ~~/public/wc.txt
.. | mpipe -r 2
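
As a further sketch of the -r option, a task could derive the reducer index from a key so that equal keys always reach the same reducer. reducer_for is a hypothetical helper, not a Manta command, and the sketch assumes reducer indexes start at 0, which this page does not state.

```shell
# reducer_for KEY NREDUCERS
# Print an index in the range 0..NREDUCERS-1 derived from a CRC of KEY.
# Hypothetical helper; zero-based reducer indexes are an assumption.
reducer_for() {
    key_sum=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( key_sum % $2 ))
}

# Possible use inside a task whose next phase has 4 reducers:
#   .. | mpipe -r "$(reducer_for "$key" 4)"
```

The checksum only needs to be stable, not cryptographic, so cksum is enough for spreading keys across reducers.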


-f [file name] Treats the contents of file name as stdin.

-H '[http-header]: [value]' Headers to set on the resulting PUT request to Manta. For example, Access-Control-Allow-Origin: *.

-p Creates any required parent directories automatically (like "mkdir -p").

-r [rIdx] Sends the output to the reducer with index rIdx.


Report bugs at GitHub.