Node.js SDK for Manta

This is the reference documentation for the Manta Node.js SDK. Manta is Triton's Object Storage Service, which enables you to store data in the cloud and process that data using a built-in compute facility.

This document explains the Node.js API interface and describes the various operations, structures and error codes.

Conventions

Any content formatted like this:

curl -is https://us-central.mnx.io

is a command-line example that you can run from a shell. All other examples and information are formatted like this:

client.ls('/jill/stor/foo', function (err, res) {
    assert.ifError(err);
    ...
});

Installation

First, install the SDK as usual via npm; the package name in npm is manta. You may optionally want to install the package globally with the -g flag to npm, as this should place the node-manta CLI in your $PATH.

npm install manta

Once you've installed the npm package, there are a few environment variables that are useful to set if you plan to work with the CLI. They are not strictly necessary, but they save you from passing command-line options on each invocation. The variables are your SmartDataCenter login name, your SSH public key fingerprint (Manta uses the same credentials), and the URL of the Manta endpoint you wish to interact with. The commands below assume that your SSH public key is the default id_rsa.pub key, located in your $HOME/.ssh directory (on Mac OS X and UNIX environments). The first command simply parses the SSH fingerprint and sets it in the requisite environment variable.

export MANTA_KEY_ID=`ssh-keygen -l -f ~/.ssh/id_rsa.pub | awk '{print $2}' | tr -d '\n'`
export MANTA_URL=https://us-central.mnx.io/
export MANTA_USER=jill
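
A variable that was never exported tends to surface later as a confusing authentication failure, so it can help to check the environment up front. The sketch below is one way to do that. The helper name missingMantaEnv is hypothetical and not part of node-manta; only the three variable names come from the text above.

```javascript
// Names of the environment variables described above.
var REQUIRED_VARS = ['MANTA_KEY_ID', 'MANTA_URL', 'MANTA_USER'];

// Hypothetical helper (not part of node-manta): returns the names of
// any required variables that are unset or empty in `env`.
function missingMantaEnv(env) {
    return REQUIRED_VARS.filter(function (name) {
        return !env[name];
    });
}

var missing = missingMantaEnv(process.env);
if (missing.length > 0) {
    console.error('missing environment variables: %s', missing.join(', '));
}
```

Passing the environment in as an argument, rather than reading process.env inside the helper, keeps the check easy to exercise with a plain object.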

Creating a Client

In order to create a client, use the createClient API available on the top-level of the SDK. The example below assumes that you are using the environment variables you set above.

var assert = require('assert');
var fs = require('fs');
var manta = require('manta');

var client = manta.createClient({
    sign: manta.privateKeySigner({
        key: fs.readFileSync(process.env.HOME + '/.ssh/id_rsa', 'utf8'),
        keyId: process.env.MANTA_KEY_ID,
        user: process.env.MANTA_USER
    }),
    user: process.env.MANTA_USER,
    url: process.env.MANTA_URL
});
assert.ok(client);

console.log('client setup: %s', client.toString());

The options you can pass into createClient are:

Name	JS Type	Description
connectTimeout	Number	(optional): number of milliseconds to wait for acquiring a socket to Manta; defaults to `0` (infinity)
log	Object	(optional): `bunyan` logger; default is at level `fatal` and writes to `stderr`
headers	Object	(optional): HTTP headers to send on all requests
sign	Function	(required): see `authenticating requests` below
url	String	(required): URL to interact with Manta on
user	String	(optional): `login` name to use when interacting with the `jobs` API

Authenticating Requests

When creating a Manta client, you'll need to pass in a callback function for the sign parameter. node-manta ships with two functions that will likely suit your needs: privateKeySigner and sshAgentSigner. Both of these callbacks automatically do the correct crypto for authenticating Manta requests. The difference is that privateKeySigner expects a (non-passphrase-protected) private key to be passed in directly, whereas sshAgentSigner loads your credentials from the SSH agent (if available) on each request. Both callbacks require you to set the Manta user (login) and keyId (SSH key fingerprint).

Note that the sshAgentSigner is not suitable for server applications, or any other system where the performance degradation necessary to interact with SSH is not acceptable; put another way, you should only use it for interactive tooling, such as the CLI that ships with node-manta.

Should you wish to write a custom plugin, the expected implementation of the sign callback is a function of the form function (string, callback). string is generated by node-manta (typically the value of the Date header), and callback is of the form function (err, object), where object has the following properties:

{
    algorithm: 'rsa-sha256',   // the signing algorithm used
    keyId: '7b:c0:5c:d6:9e:11:0c:76:04:4b:03:c9:11:f2:72:7f', // key fingerprint
    signature: $base64_encoded_signature,  // the actual signature
    user: 'mark'   // the user to issue the call as.
}

Use-cases where you would need to write your own signer are things like signing with a smart-card or other HSM, making remote calls to a central system, etc.
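
To make the callback contract concrete, here is a minimal sketch of a custom signer built on Node's crypto module. It is not how node-manta's own signers are implemented. The factory name createInMemorySigner, the throwaway key pair, and the sample string to sign are all assumptions for illustration; only the function (string, callback) shape and the four result properties come from the description above.

```javascript
var crypto = require('crypto');

// Hypothetical factory (not part of node-manta): returns a sign
// callback of the form function (string, callback).
function createInMemorySigner(opts) {
    return function sign(string, callback) {
        var signature;
        try {
            signature = crypto.createSign('RSA-SHA256')
                .update(string)
                .sign(opts.privateKey, 'base64');
        } catch (e) {
            callback(e);
            return;
        }
        callback(null, {
            algorithm: 'rsa-sha256',
            keyId: opts.keyId,
            signature: signature,
            user: opts.user
        });
    };
}

// Throwaway key pair for illustration only; a real signer would use
// the key registered for your account (or a smart card, HSM, etc.).
var pair = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });
var sign = createInMemorySigner({
    privateKey: pair.privateKey,
    keyId: '7b:c0:5c:d6:9e:11:0c:76:04:4b:03:c9:11:f2:72:7f', // placeholder
    user: 'mark'
});

// Made-up sample input; node-manta generates the real string.
var stringToSign = 'Thu, 01 Jan 2015 00:00:00 GMT';
var signed = null;
sign(stringToSign, function (err, obj) {
    if (err) {
        throw err;
    }
    signed = obj;
    console.log('%s signature issued for %s', obj.algorithm, obj.user);
});
```

This sketch invokes the callback synchronously for simplicity. A signer that talks to a smart card or a remote service would invoke it later, which the callback form allows.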

Presigned URLs

In some cases you may want your app to be able to generate a full URL, suitable for giving out to others as a link. In these cases, you can use the presigned URL approach, and set an expires parameter. node-manta has a simple API for this that utilizes the same sign callback as all other APIs, but simply does the correct canonicalization for a URL:

    var assert = require('assert');
    var fs = require('fs');
    var manta = require('manta');

    var opts = {
        algorithm: 'RSA-SHA256',
        expires: Math.floor(Date.now() / 1000) + 3600, // epoch time, one hour from now
        host: 'us-central.mnx.io',
        keyId: process.env.MANTA_KEY_ID,
        path: '/mark/stor/my_image.png',
        sign: manta.privateKeySigner({
            // pass the key contents, as in the createClient example
            key: fs.readFileSync(process.env.HOME + '/.ssh/id_rsa', 'utf8'),
            keyId: process.env.MANTA_KEY_ID,
            user: process.env.MANTA_USER
        }),
        user: process.env.MANTA_USER
    };

    manta.signUrl(opts, function (err, resource) {
        assert.ifError(err);

        console.log('https://us-central.mnx.io' + resource);
    });
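
The two pieces of arithmetic in the example above, computing expires and joining the host with the signed resource, can be pulled into small helpers. This is only a sketch: expiresIn and fullUrl are hypothetical names, not node-manta APIs, and the resource string passed in at the end is a made-up placeholder rather than real signUrl output.

```javascript
// Hypothetical helpers (not part of node-manta).

// Convert a time-to-live in seconds into the epoch-seconds value used
// for `expires`. `nowMs` defaults to the current time in milliseconds.
function expiresIn(ttlSeconds, nowMs) {
    var now = (nowMs === undefined) ? Date.now() : nowMs;
    return Math.floor(now / 1000) + ttlSeconds;
}

// Join the host with the signed resource (path plus query string)
// that the signUrl callback receives.
function fullUrl(host, resource) {
    return 'https://' + host + resource;
}

console.log(expiresIn(3600)); // one hour from now, in epoch seconds
console.log(fullUrl('us-central.mnx.io', '/mark/stor/my_image.png?placeholder=1'));
```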

Common API options

All APIs in node-manta take options and callback as their last two parameters, where options is (usually) optional. For example, these two calls to info are identical:

var opts = {};
client.info('/jill/stor/foo', opts, function (err, info) {
    assert.ifError(err);
    ...
});

client.info('/jill/stor/foo', function (err, info) {
    assert.ifError(err);
    ...
});

If you are not passing in explicit options, the second form is always there for convenience. All API operations allow you to pass in a standard set of options, which are:

Name	JS Type	Description
headers	Object	Any HTTP headers to be included in this request
req_id	String	A unique identifier for this request (SHOULD be a uuid)
query	Object	A key/value set of parameters to be encoded on the URL's query string

You can always override any node-manta behavior by passing in explicit HTTP headers, but in most cases, you should just use the "higher-level" parameters available in the specific API you are interested in.
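
As an illustration of the standard options, the sketch below builds an options object with a fresh UUID for req_id. The helper name requestOptions and the header shown are hypothetical; only the three property names come from the table above.

```javascript
var crypto = require('crypto');

// Hypothetical helper (not part of node-manta): builds the standard
// per-request options described in the table above.
function requestOptions(headers, query) {
    return {
        headers: headers || {},
        req_id: crypto.randomUUID(), // requires Node.js 14.17 or later
        query: query || {}
    };
}

// 'x-example-header' is a placeholder, not a header Manta defines.
var opts = requestOptions({ 'x-example-header': 'demo' });
console.log(opts);
```

Generating req_id yourself means you know the identifier before the request is sent, so you can record it in your own logs alongside the call.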

Common Callback Parameters

In almost all cases (the exception being the "streaming" APIs like ls), callbacks will be of the form function (err, result), where err is either a JavaScript Error object or null, and result is a standard node http.ClientResponse object, which gives you access to HTTP headers, response codes, etc. Note that if the HTTP response code is >= 400, err will be present and filled in with the Manta error code and message (see Errors).

Errors

All callback functions may return a JavaScript Error object. In most cases, you can simply switch on err.name, which will be correctly filled in from the error codes sent back by the server. The only cases where you cannot are lower-level errors, such as ECONNREFUSED, that are generated by the Node.js runtime. The complete list of Manta error names is:

AuthSchemeError
AuthorizationError
BadRequestError
ChecksumError
ConcurrentRequestError
ContentLengthError
InvalidArgumentError
InvalidAuthTokenError
InvalidCredentialsError
InvalidDurabilityLevelError
InvalidKeyIdError
InvalidJobError
InvalidLinkError
InvalidSignatureError
DirectoryDoesNotExistError
DirectoryExistsError
DirectoryNotEmptyError
DirectoryOperationError
JobNotFoundError
JobStateError
KeyDoesNotExistError
NotAcceptableError
NotEnoughSpaceError
LinkNotFoundError
LinkNotObjectError
LinkRequiredError
ParentNotDirectoryError
PreSignedRequestError
RequestEntityTooLargeError
ResourceNotFoundError
RootDirectoryError
ServiceUnavailableError
SSLRequiredError
UploadTimeoutError
UserDoesNotExistError
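
A minimal sketch of switching on err.name follows. The helper classifyError and the way the names are grouped into actions are illustrative choices, not behavior defined by node-manta; the error names themselves are taken from the list above.

```javascript
// Hypothetical helper (not part of node-manta): maps an error to a
// coarse action. The grouping below is an illustrative choice.
function classifyError(err) {
    if (!err) {
        return 'ok';
    }
    switch (err.name) {
    case 'ResourceNotFoundError':
    case 'DirectoryDoesNotExistError':
    case 'JobNotFoundError':
        return 'not-found';
    case 'ServiceUnavailableError':
    case 'UploadTimeoutError':
        return 'retry';
    case 'AuthorizationError':
    case 'InvalidCredentialsError':
    case 'InvalidSignatureError':
        return 'auth';
    default:
        // Lower-level runtime errors (e.g. ECONNREFUSED) carry a
        // `code` property rather than a Manta error name.
        return (err.code === 'ECONNREFUSED') ? 'retry' : 'fatal';
    }
}

console.log(classifyError({ name: 'ResourceNotFoundError' }));
```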

Directories

client.mkdir(path, [options], callback)

Create or overwrite a directory at path. mkdir is really a PUT operation, so its semantics differ slightly from mkdir(2) in POSIX (meaning, you can call mkdir on the same path twice). There is no return value besides a potential error.

    client.mkdir('/jill/stor/foo', function (err) {
        assert.ifError(err);
        ...
    });

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to create
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

client.mkdirp(path, [options], callback)

Same as mkdir, except, mkdirp creates intermediate directories as required.

    client.mkdirp('/jill/stor/foo/bar/baz', function (err) {
        assert.ifError(err);
        ...
    });

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to create
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

client.ls(path, [options], callback)

Lists directory contents. This API returns an EventEmitter that emits a stream of entries as they are returned from the server. You can listen for two distinct event types, object and directory; records of type object carry slightly more information. Both kinds of record have a type field; the full schema is described below. Optional pagination parameters (offset and limit) can be included in the options block, and act as you would expect. There is a server-enforced limit of 1000 entries per list request, which is also the default; you can request a smaller size if need be. You can also choose to receive only entries of a certain type.

var opts = {
    offset: 0,
    limit: 256,
    type: 'object'
};
client.ls('/', opts, function (err, res) {
    assert.ifError(err);

    res.on('object', function (obj) {
        console.log(obj);
    });

    res.on('directory', function (dir) {
        console.log(dir);
    });

    res.once('error', function (err) {
        console.error(err.stack);
        process.exit(1);
    });

    res.once('end', function () {
        console.log('all done');
    });
});

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to list
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`
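
To page through a large directory, you can derive offset and limit from a page number. The sketch below shows one way to do the arithmetic. pageOptions is a hypothetical helper, not part of node-manta; the cap of 1000 is the server-enforced limit mentioned above.

```javascript
var MAX_LIMIT = 1000; // server-enforced maximum per list request

// Hypothetical helper (not part of node-manta): builds ls options for
// the zero-based page number `page`.
function pageOptions(page, pageSize, type) {
    var limit = Math.min(Math.max(1, pageSize), MAX_LIMIT);
    var opts = {
        offset: page * limit,
        limit: limit
    };
    if (type) {
        opts.type = type; // 'object' or 'directory'
    }
    return opts;
}

console.log(pageOptions(2, 256, 'object'));
```

The resulting object can be passed as the options argument to ls.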

Output Objects

Each output object will be of this schema:

{
    name: 'foo',                            // basename of the entry
    etag: 'AABBCC',                         // only set on objects
    size: 1234,                             // only set on objects; valueOf(content-length)
    type: 'object',                         // one of directory || object
    mtime: '2012-11-09T12:34:56Z'           // ISO8601 timestamp of the last update time
}
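
The sketch below tallies entries of this shape as they arrive. summarizeListing is a hypothetical helper, and a plain EventEmitter stands in for the emitter that ls provides so that the example runs without a Manta endpoint; the entry fields follow the schema above, and the directory entry is made up.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of node-manta): counts entries and
// sums object sizes from an ls-style emitter.
function summarizeListing(res, callback) {
    var summary = { objects: 0, directories: 0, bytes: 0, names: [] };

    res.on('object', function (obj) {
        summary.objects += 1;
        summary.bytes += obj.size; // size is only set on objects
        summary.names.push(obj.name);
    });
    res.on('directory', function (dir) {
        summary.directories += 1;
        summary.names.push(dir.name);
    });
    res.once('error', callback);
    res.once('end', function () {
        callback(null, summary);
    });
}

// Stand-in for the emitter passed to the ls callback.
var fake = new EventEmitter();
var listing = null;
summarizeListing(fake, function (err, summary) {
    if (err) {
        throw err;
    }
    listing = summary;
    console.log(summary);
});

fake.emit('directory', { name: 'logs', type: 'directory',
    mtime: '2012-11-09T12:34:56Z' });
fake.emit('object', { name: 'foo', etag: 'AABBCC', size: 1234,
    type: 'object', mtime: '2012-11-09T12:34:56Z' });
fake.emit('end');
```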

client.createListStream(path, [options])

List directory contents. This is the ReadableStream version of ls(). Each object read from the stream has the same format as objects emitted from the EventEmitter returned by ls().

var dir = client.createListStream('/', opts);

dir.on('error', function (err) {
    console.error(err.stack);
    process.exit(1);
});

dir.on('readable', function () {
    var ent;
    while ((ent = dir.read()) !== null) {
        console.log(ent);
    }
});

dir.once('end', function () {
    console.log('all done');
});

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to list
options	Object	(optional) optional overrides for this request

Objects

client.put(path, stream, options, callback)

Creates or overwrites an (object) key. You pass it in a ReadableStream (note that stream must support pause/resume), and upon receiving a 100-continue from manta, the bytes get blasted up.

In this API, you can pass in an actual 'size' attribute in the options object. If you set it, it becomes the content-length header for the request. If you don't, the request will be "streaming" (transfer-encoding=chunked), in which case your object either needs to fit into the "default" object size (5Gb currently), OR you need to pass in a max-content-length header, which will be the _maximum_ size of your data. Additionally, you can/should pass in an 'md5' attribute, and you can pass a 'type' attribute, which is really the content-type. If you don't pass in 'type', this API will try to guess it based on the name of the object (using the extension). Lastly, you can pass in a 'copies' attribute, which sets the number of full object copies to make server side (default is 2).

However, like the other APIs, you can additionally pass in extra headers, etc. in the options object as well. In the case of objects this is particularly useful for setting CORS headers, for example.

There is no return value besides error reporting.

Note: The example below uses the memorystream module from NPM.

var crypto = require('crypto');
var MemoryStream = require('memorystream');

var message = 'Hello World';
var opts = {
    copies: 3,
    headers: {
        'access-control-allow-origin': '*',
        'access-control-allow-methods': 'GET'
    },
    md5: crypto.createHash('md5').update(message).digest('base64'),
    size: Buffer.byteLength(message),
    type: 'text/plain'
};
var stream = new MemoryStream();

client.put('/jill/stor/hello_world.txt', stream, opts, function (err) {
    assert.ifError(err);
    ...
});

stream.end(message);

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to write to
stream	Stream	(required) An instance of a `ReadableStream`
options	Object	(required) overrides for this request; must include `size`
callback	Function	(required) callback of the form `fn(err, res)`
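
When the data is already in memory, the md5 and size attributes can be derived from it in one place. The following is a small sketch under that assumption; putOptions is a hypothetical helper, not part of node-manta, and it mirrors the way the example above computes both values.

```javascript
var crypto = require('crypto');

// Hypothetical helper (not part of node-manta): derives the md5, size
// and type attributes for put from in-memory data.
function putOptions(data, type) {
    var buf = Buffer.isBuffer(data) ? data : Buffer.from(data, 'utf8');
    return {
        md5: crypto.createHash('md5').update(buf).digest('base64'),
        size: buf.length, // byte length, not character count
        type: type
    };
}

console.log(putOptions('Hello World', 'text/plain'));
```

Converting to a Buffer first means size is measured in bytes, which matters for strings containing multi-byte characters.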

client.createWriteStream(path, options)

Essentially the same API/logic as client.put, but idiomatic to the node streams model. path and options are the same as put, but this API takes no callback, and instead returns an instance of stream.Writable.

Note that standard node stream semantics don't line up to when Manta has actually committed data, so the stream returned by this API emits a close event that also has the http.Response object.

Note: The example below uses the memorystream module from NPM.

var crypto = require('crypto');
var MemoryStream = require('memorystream');

var message = 'Hello World';
var opts = {
    copies: 3,
    headers: {
        'access-control-allow-origin': '*',
        'access-control-allow-methods': 'GET'
    },
    md5: crypto.createHash('md5').update(message).digest('base64'),
    size: Buffer.byteLength(message),
    type: 'text/plain'
};
var stream = new MemoryStream();
var w = client.createWriteStream('/jill/stor/hello_world.txt', opts);

stream.pipe(w);

w.once('close', function (res) {
    console.log('all done');
});

stream.end(message);

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to write to
options	Object	(required) overrides for this request; must include `size`

client.get(path, [options], callback)

Fetches an object back from Manta, and gives you a (standard) ReadableStream.

Note this API will validate ContentMD5, and so if the downloaded object does not match, the stream will emit an error.

client.get('/jill/stor/hello_world.txt', function (err, stream) {
    assert.ifError(err);

    stream.setEncoding('utf8');
    stream.on('data', function (chunk) {
        console.log(chunk);
    });
    stream.on('end', function () {
        ...
    });
});

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to fetch
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, stream)`

client.createReadStream(path, [options])

Fetches an object as a ReadableStream; this API is basically identical to get, except it's idiomatic to node streaming. Additionally, the returned stream will emit close at the end of request data along with the HTTP Response object.

var stream = client.createReadStream('/jill/stor/hello_world.txt');
stream.pipe(process.stdout);
stream.once('close', function (res) {
    console.error(res.statusCode);
});

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to read
options	Object	(optional) overrides for this request

Links

client.ln(source, path, [options], callback)

Creates a new link (key) to the source object.

Inputs

Name	JS Type	Description
source	String	(required) Full path to the original object.
path	String	(required) Full path to the new link.
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

There is no return value besides a possible error.

client.ln('/jill/stor/hello_world.txt', '/jill/stor/hola_mundo.txt', function (err) {
    assert.ifError(err);

    ...
});

client.unlink(path, [options], callback)

Deletes an object or directory from Manta. If path points to a directory, the directory must be empty.

There is no return value besides a possible error.

client.unlink('/jill/stor/hello_world.txt', function (err) {
    assert.ifError(err);

    ...
});

Inputs

Name	JS Type	Description
path	String	(required) A full Manta path to delete
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Jobs

client.createJob(job, [options], callback)

Creates a new compute job in Manta.

This API is fairly flexible about what it takes, but really the best thing is for callers to just fully spec out the JSON object, like so:

{
  name: "word count",
  phases: [ {
    exec: "wc"
  }, {
    type: "reduce",
    exec: "awk '{ l += $1; w += $2; c += $3 } END { print l, w, c }'"
  } ]
}

job should be a JSON object that specifies at minimum an Array of phases. As described elsewhere, phases should be a set of objects that define your map/reduce tasks.

That being said, for simple jobs this API allows you to 'cheat' a little bit to get started by just taking in simple strings:

client.createJob("grep foo", function (err, jobId) {
    assert.ifError(err);
    ...
});

client.createJob(["grep foo", "grep bar"], function (err, jobId) {
    assert.ifError(err);
    ...
});

Note that this form is only useful for map-only jobs; you cannot specify reduce tasks in this way.
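
To see how the shorthand relates to the full JSON form, the sketch below expands a string or an array of strings into a job object with one phase per command. normalizeJob is a hypothetical helper written for illustration; it is not the library's actual conversion code.

```javascript
// Hypothetical helper (not node-manta's implementation): expands the
// string shorthand into a fully specified job object.
function normalizeJob(job) {
    if (typeof (job) === 'string') {
        job = [job];
    }
    if (Array.isArray(job)) {
        return {
            phases: job.map(function (phase) {
                // Strings become phases with only `exec` set; phase
                // objects are passed through unchanged.
                return (typeof (phase) === 'string') ?
                    { exec: phase } : phase;
            })
        };
    }
    return job; // already a job object
}

console.log(JSON.stringify(normalizeJob(['grep foo', 'grep bar'])));
```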

options allows you to set arbitrary headers (as usual), and callback is of the form function (err, jobId). jobId will be the server-created id for this job, which you can pass into the other job related APIs.

Inputs

Name	JS Type	Description
job	Object	(required) A job definition object, as described below
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, jobId)`

The full set of allowed options for job:

Name	JS Type	Description
name	String	(optional) An arbitrary name for this job
input	String	(optional) An arbitrary jobId to pipe from
phases	Array	(required) tasks to execute as part of this job

phases must be an Array of Object, where objects have the following properties:

Name	JS Type	Description
type	String	(optional) one of: `map` or `reduce`
assets	Array[String]	(optional) an array of manta keys to be placed in your compute zones
exec	String	(required) the actual (shell) statement to execute
count	Number	(optional) an optional number of reducers for this phase (reduce-only): default is `1`
memory	Number	(optional) an optional amount of DRAM to give to your compute zone

count has a minimum of 1 (default), and a maximum of 1024.
memory must be one of the following: 128, 256, 512, 1024, 2048, 4096, 8192, 16384
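
These constraints can be checked on the client before a job is submitted. The sketch below is one possible validator: validateJob and its messages are hypothetical, not part of node-manta, and the limits are copied from the tables above.

```javascript
var VALID_MEMORY = [128, 256, 512, 1024, 2048, 4096, 8192, 16384];

// Hypothetical helper (not part of node-manta): returns a list of
// problems found in a job definition; an empty list means none found.
function validateJob(job) {
    if (!job || !Array.isArray(job.phases)) {
        return ['phases must be an Array'];
    }
    var problems = [];
    job.phases.forEach(function (phase, i) {
        var where = 'phases[' + i + ']';
        if (typeof (phase.exec) !== 'string' || phase.exec.length === 0) {
            problems.push(where + '.exec is required');
        }
        if (phase.type !== undefined &&
            phase.type !== 'map' && phase.type !== 'reduce') {
            problems.push(where + '.type must be map or reduce');
        }
        if (phase.count !== undefined) {
            if (phase.type !== 'reduce') {
                problems.push(where + '.count is reduce-only');
            } else if (!Number.isInteger(phase.count) ||
                phase.count < 1 || phase.count > 1024) {
                problems.push(where + '.count must be between 1 and 1024');
            }
        }
        if (phase.memory !== undefined &&
            VALID_MEMORY.indexOf(phase.memory) === -1) {
            problems.push(where + '.memory is not an allowed size');
        }
    });
    return problems;
}

console.log(validateJob({ phases: [{ type: 'reduce', exec: 'wc', count: 2000 }] }));
```

Returning a list of problems rather than throwing on the first one lets you report everything wrong with a job definition at once.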

Output

Output is simply a String job id.

client.job(jobId, [options], callback)

Retrieves a job from Manta. This is the "overall" object, and will not contain input/output keys or failures.

client.job('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, job) {
    assert.ifError(err);
    ...
});

options allows you to set arbitrary headers (as usual), and callback is of the form function (err, job). job will be the job object you used in createJob with a few additional fields:

state: one of queued, running, or done.
cancelled: boolean - whether the user cancelled this job or not.
inputDone: boolean - whether the user "closed" the input for this job.
timeCreated: ISO8601 Timestamp of when the job was created.
timeDone: ISO8601 Timestamp of when the job was completed.

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, job)`

Output

Output is a job object, that has the additional properties described above.
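
As a sketch of how those additional fields might be consumed, the helper below reduces a job object to a few derived values. summarizeJob is hypothetical and not part of node-manta, and the sample job passed to it is made up; the field names are the ones listed above.

```javascript
// Hypothetical helper (not part of node-manta): derives a few
// convenience values from a job object.
function summarizeJob(job) {
    var finished = (job.state === 'done');
    var elapsedMs = null;
    if (finished && job.timeCreated && job.timeDone) {
        // Both timestamps are ISO8601 strings.
        elapsedMs = Date.parse(job.timeDone) - Date.parse(job.timeCreated);
    }
    return {
        finished: finished,
        cancelled: !!job.cancelled,
        acceptingInput: !finished && !job.inputDone,
        elapsedMs: elapsedMs
    };
}

// Made-up sample job for illustration.
console.log(summarizeJob({
    state: 'done',
    cancelled: false,
    inputDone: true,
    timeCreated: '2012-11-09T12:34:56Z',
    timeDone: '2012-11-09T12:35:26Z'
}));
```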

client.jobs([options], callback)

Lists all jobs for a user. This will stream back the full set of all jobs for a user. Currently you can filter on state by passing state into options.

client.jobs({state: 'running'}, function (err, res) {
    assert.ifError(err);

    res.on('job', function (j) {
        console.log('%j', j);
    });
});

Inputs

Name	JS Type	Description
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Output

Output is an EventEmitter; listen for job, error and end.

client.addJobKey(jobId, key, [options], callback)

Submits job key(s) to an existing job in Manta. key can be either a single key or an array of keys.

The keys should be fully specified paths to manta objects:

var keys = [
    '/mark/stor/foo',
    '/dave/stor/bar'
];
client.addJobKey('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', keys, function (err) {
    assert.ifError(err);
});

In the options block, in addition to the usual stuff, you can pass end: true to close input for this job (so you can avoid calling endJob).

There is no return object besides a possible error.

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
key	String or Array[String]	(required) A key or list of keys to submit to the job
options	Object	(optional) overrides for this request
callback	Function	(required) callback of the form `fn(err)`

client.endJob(jobId, [options], callback)

Closes input for a job, and allows a job to either finish or transition to reduce phases (and then finish).

There is no return object besides a possible error.

client.endJob('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err) {
    assert.ifError(err);
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err)`

client.cancelJob(jobId, [options], callback)

Cancels a job, which means input will be closed, and all processing will be cancelled. You should not expect output from a job that has been cancelled.

There is no return object besides a possible error.

client.cancelJob('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err) {
    assert.ifError(err);
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err)`

client.jobInput(jobId, [options], callback)

Retrieves all successfully submitted input keys for a job as a stream.

client.jobInput('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.log('Input key: %s', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Output

Output is an EventEmitter; listen for key, error and end.

client.jobOutput(jobId, [options], callback)

Retrieves all successfully written output keys for a job as a stream.

client.jobOutput('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.log('Output key: %s', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Output

Output is an EventEmitter; listen for key, error and end.

client.jobFailures(jobId, [options], callback)

Retrieves all input keys that had failures, as a stream.

client.jobFailures('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('key', function (k) {
        console.error('Input key %s failed', k);
    });

    res.once('end', function () {
        console.log('done');
    });
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Output

Output is an EventEmitter; listen for key, error and end.

client.jobErrors(jobId, [options], callback)

Retrieves all errors for a job:

client.jobErrors('d095fd4a-3a3d-11e2-b5f1-7be876f9c2b5', function (err, res) {
    assert.ifError(err);

    res.on('err', function (e) {
        console.error('%j', e);
    });

    res.once('end', function () {

    });
});

Inputs

Name	JS Type	Description
jobId	String	(required) A job id
options	Object	(optional) optional overrides for this request
callback	Function	(required) callback of the form `fn(err, res)`

Output

Output is an EventEmitter; listen for err, error and end.

The err object has the following properties:

Name	JS Type	Description
id	String	job id
phase	Number	phase number of the failure
what	String	a human readable summary of what failed
code	String	programmatic error code
message	String	human readable error message
stderr	String	(optional) a manta key that saved the stderr for the given command
key	String	(optional) the input key being processed when the task failed (if manta can determine it)
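
When a job produces many errors, grouping them by phase can make the output easier to read. The sketch below does that with a hypothetical helper, collectErrorsByPhase, which is not part of node-manta. A plain EventEmitter stands in for the emitter that jobErrors provides, and the sample error records (including the code ExampleError) are made up.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper (not part of node-manta): groups err records
// from a jobErrors-style emitter by phase number.
function collectErrorsByPhase(res, callback) {
    var byPhase = {};

    res.on('err', function (e) {
        if (!byPhase[e.phase]) {
            byPhase[e.phase] = [];
        }
        // `key` is optional, so only append it when present.
        var line = e.code + ': ' + e.message;
        if (e.key) {
            line += ' (' + e.key + ')';
        }
        byPhase[e.phase].push(line);
    });
    res.once('error', callback);
    res.once('end', function () {
        callback(null, byPhase);
    });
}

// Stand-in emitter with made-up error records.
var fakeErrors = new EventEmitter();
var grouped = null;
collectErrorsByPhase(fakeErrors, function (err, byPhase) {
    if (err) {
        throw err;
    }
    grouped = byPhase;
    console.log(byPhase);
});

fakeErrors.emit('err', { id: 'd095fd4a-3a3d-11e2-b5f1-7be876f9c2b5',
    phase: 0, what: 'phase 0: map', code: 'ExampleError',
    message: 'command failed', key: '/mark/stor/foo' });
fakeErrors.emit('err', { id: 'd095fd4a-3a3d-11e2-b5f1-7be876f9c2b5',
    phase: 1, what: 'phase 1: reduce', code: 'ExampleError',
    message: 'command failed' });
fakeErrors.emit('end');
```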
