Workflow Execution Service (1.1.0)


Run standard workflows on workflow execution platforms in a platform-agnostic way.

Executive Summary

The Workflow Execution Service (WES) API provides a standard way for users to submit workflow requests to workflow execution systems, and to monitor their execution. This API lets users run a single workflow (currently CWL- or WDL-formatted workflows; other types may be supported in the future) on multiple different platforms, clouds, and environments. Key features of the API:

  • can request that a workflow be run
  • can pass parameters to that workflow (e.g. input files, cmdline arguments)
  • can get information about running workflows (e.g. status, errors, output file locations)
  • can cancel a running workflow

Introduction

This document describes the WES API and provides details on the specific endpoints, request formats, and responses. It is intended to provide key information for developers of WES-compatible services as well as clients that will call these WES services.

Use cases include:

  • "Bring your code to the data": a researcher who has built their own custom analysis can submit it to run on a dataset owned by an external organization, instead of having to make a copy of the data
  • Best-practices pipelines: a researcher who maintains their own controlled data environment can find useful workflows in a shared directory (e.g. Dockstore.org), and run them over their data

Standards

The WES API specification is written in OpenAPI and embodies a RESTful service philosophy. It uses JSON in requests and responses and standard HTTP/HTTPS for information transport.

Authorization & Authentication

Users must supply credentials that establish their identity and authorization in order to use a WES endpoint. We recommend that WES implementations use an OAuth2 bearer token, although they can choose other mechanisms if appropriate. WES callers can use the auth_instructions_url from the service-info endpoint to learn how to obtain and use a bearer token for a particular implementation.

The WES implementation is responsible for checking that a user is authorized to submit workflow run requests. The particular authorization policy is up to the WES implementer.

Systems like WES also need to address the ability to pass credentials with jobs for input and output access. In the current version of WES, the passing of credentials to authenticate and authorize access to inputs and outputs, as well as mandates about necessary file transfer protocols to support, are out of scope. However, parallel work on the Data Object Service (DOS) is addressing ways to pass around access credentials with data object references, opening up the possibility that a future version of WES will provide concrete mechanisms for workflow runs to access data using credentials different from those used for WES. This is a work in progress, and support for DOS in WES will be added in a future release.
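
As a rough illustration of the recommended OAuth2 bearer token mechanism, the sketch below attaches a token to a request aimed at the service-info endpoint. The URL and token value are hypothetical placeholders, and a given implementation may use a different mechanism entirely; its auth_instructions_url is the authoritative guide.

```python
import urllib.request


def build_authorized_request(url: str, bearer_token: str) -> urllib.request.Request:
    """Return a GET request carrying an OAuth2 bearer token.

    Assumes the WES implementation accepts bearer tokens; the specification
    only recommends this, so other mechanisms are possible.
    """
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Bearer {bearer_token}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical endpoint URL and token, for illustration only.
req = build_authorized_request("https://wes.example.org/service-info", "my-token")
print(req.get_header("Authorization"))
```

The request object is only constructed here, not sent; passing it to urllib.request.urlopen would perform the call against a real service.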

Service Info

Get information about the workflow execution service

GetServiceInfo

May include information related to (but not limited to) the workflow descriptor formats, versions supported, the WES API versions supported, and information about general service availability.

Responses

Response samples

Content type
application/json
{
  "id": "org.ga4gh.myservice",
  "name": "My project",
  "type": {},
  "description": "This service provides...",
  "organization": {},
  "contactUrl": "mailto:support@example.com",
  "documentationUrl": "https://docs.myservice.example.com",
  "createdAt": "2019-06-04T12:58:19Z",
  "updatedAt": "2019-06-04T12:58:19Z",
  "environment": "test",
  "version": "1.0.0",
  "workflow_type_versions": {},
  "supported_wes_versions": [],
  "supported_filesystem_protocols": [],
  "workflow_engine_versions": {},
  "default_workflow_engine_parameters": [],
  "system_state_counts": {},
  "auth_instructions_url": "string",
  "tags": {}
}
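
A client will typically consult the service-info response before submitting a run. The sketch below checks whether a workflow type and version are advertised. It assumes that workflow_type_versions maps each workflow type (e.g. "CWL") to an object shaped like the WorkflowTypeVersion schema; the versions in the example fragment are made up.

```python
def supports_workflow(service_info: dict, workflow_type: str, version: str) -> bool:
    """Check a service-info document for a workflow type and version.

    Assumes workflow_type_versions maps each type (e.g. "CWL") to an object
    with a workflow_type_version array, as in the WorkflowTypeVersion schema.
    """
    by_type = service_info.get("workflow_type_versions", {})
    versions = by_type.get(workflow_type, {}).get("workflow_type_version", [])
    return version in versions


# Hypothetical service-info fragment; the versions listed are made up.
example_info = {
    "workflow_type_versions": {
        "CWL": {"workflow_type_version": ["v1.0", "v1.2"]},
        "WDL": {"workflow_type_version": ["1.0"]},
    }
}
print(supports_workflow(example_info, "CWL", "v1.2"))
print(supports_workflow(example_info, "WDL", "draft-2"))
```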

Workflow Runs

Submit and monitor workflows in a WES environment

ListRuns

This list should be provided in a stable ordering. (The actual ordering is implementation dependent.) When paging through the list, the client should not make assumptions about live updates, but should assume the contents of the list reflect the workflow list at the moment that the first page is requested. To monitor a specific workflow run, use GetRunStatus or GetRunLog.

query Parameters
page_size
integer <int64>

OPTIONAL The preferred number of workflow runs to return in a page. If not provided, the implementation should use a default page size. The implementation must not return more items than page_size, but it may return fewer. Clients should not assume that if fewer than page_size items are returned that all items have been returned. The availability of additional pages is indicated by the value of next_page_token in the response.

page_token
string

OPTIONAL Token to use to indicate where to start getting results. If unspecified, return the first page of results.

Responses

Response samples

Content type
application/json
{
  "runs": [],
  "next_page_token": "string"
}
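
The paging rules above (a page may be short without being the last one, and an empty next_page_token signals the end) can be sketched as a small generator. The fetch_page callable is a hypothetical stand-in for whatever code performs the actual ListRuns request; here it is backed by canned pages so the example runs without a server.

```python
from typing import Callable, Iterator, Optional


def iter_runs(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every run by following next_page_token until it is empty.

    fetch_page is a hypothetical callable that performs the ListRuns request
    for the given page_token (None for the first page) and returns the
    decoded JSON response.
    """
    page_token = None
    while True:
        page = fetch_page(page_token)
        yield from page.get("runs", [])
        page_token = page.get("next_page_token")
        if not page_token:  # empty or missing token: no further pages
            break


# Stand-in for a real HTTP call, so the sketch runs without a server.
_fake_pages = {
    None: {"runs": [{"run_id": "a"}], "next_page_token": "t1"},
    "t1": {"runs": [], "next_page_token": "t2"},  # a short page is not the end
    "t2": {"runs": [{"run_id": "b"}], "next_page_token": ""},
}
run_ids = [run["run_id"] for run in iter_runs(lambda token: _fake_pages[token])]
print(run_ids)
```

Note that the loop stops on the token alone and never on the number of items returned, in line with the page_size description.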

RunWorkflow

This endpoint creates a new workflow run and returns a RunId to monitor its progress.

The workflow_attachment array may be used to upload files that are required to execute the workflow, including the primary workflow, tools imported by the workflow, other files referenced by the workflow, or files which are part of the input. The implementation should stage these files to a temporary directory and execute the workflow from there. These parts must have a Content-Disposition header with a "filename" provided for each part. Filenames may include subdirectories, but must not include references to parent directories with '..' -- implementations should guard against maliciously constructed filenames.

The workflow_url is either an absolute URL to a workflow file that is accessible by the WES endpoint, or a relative URL corresponding to one of the files attached using workflow_attachment.

The workflow_params JSON object specifies input parameters, such as input files. The exact format of the JSON object depends on the conventions of the workflow language being used. Input files should either be absolute URLs, or relative URLs corresponding to files uploaded using workflow_attachment. The WES endpoint must understand and be able to access URLs supplied in the input. This is implementation specific.

The workflow_type is the type of workflow language and must be "CWL" or "WDL" currently (or another alternative supported by this WES instance).

The workflow_type_version is the version of the workflow language submitted and must be one supported by this WES instance.

The workflow_engine is the engine that supports the workflow_type and must be supported by this WES instance.

The workflow_engine_version is the version of workflow engine and must be supported by this WES instance.

See the RunRequest documentation for details about other fields.
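
One way an implementation might guard against maliciously constructed attachment filenames is sketched below. It allows subdirectories and rejects any '..' path component. Rejecting absolute paths and normalizing backslashes are extra precautions assumed here; the specification does not require them.

```python
def is_safe_attachment_name(filename: str) -> bool:
    """Return True if an attachment filename stays inside the staging directory.

    Subdirectories are allowed; empty names, absolute paths, and any '..'
    component are rejected. The backslash handling and the absolute-path
    check are assumptions of this sketch, not requirements of the spec.
    """
    normalized = filename.replace("\\", "/")
    if not normalized or normalized.startswith("/"):
        return False
    parts = normalized.split("/")
    return ".." not in parts


for name in ["workflow.cwl", "tools/align.cwl", "../etc/passwd", "a/../../b", "/abs.cwl"]:
    print(name, is_safe_attachment_name(name))
```

The check works on path components rather than substrings, so a name such as "my..file.cwl" is still accepted.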

Request Body schema: multipart/form-data
workflow_params
string
workflow_type
string
workflow_type_version
string
tags
string
workflow_engine
string
workflow_engine_version
string
workflow_engine_parameters
string
workflow_url
string
workflow_attachment
Array of strings <binary>
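
Because every text field in this multipart/form-data body is a string, structured values have to be serialized before they are sent. The sketch below builds the text fields for a submission; the helper name and the example values are hypothetical, and JSON-encoding tags is an assumption of the sketch.

```python
import json


def build_run_form(workflow_url, workflow_type, workflow_type_version,
                   workflow_params=None, tags=None):
    """Build the text fields of a RunWorkflow multipart/form-data request.

    Every field in the request body schema is a string, so structured values
    are JSON-encoded here. Encoding tags as JSON is an assumption of this sketch.
    """
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        "workflow_params": json.dumps(workflow_params or {}),
    }
    if tags:
        fields["tags"] = json.dumps(tags)
    return fields


# Hypothetical workflow file and inputs.
form = build_run_form(
    "hello.cwl", "CWL", "v1.0",
    workflow_params={"message": "hi"},
    tags={"project": "demo"},
)
print(form["workflow_params"])
```

The resulting dictionary would be combined with the workflow_attachment file parts by whatever HTTP client performs the upload.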

Responses

Response samples

Content type
application/json
{
  "run_id": "string"
}

GetRunLog

This endpoint provides detailed information about a given workflow run. The returned result has information about the outputs produced by this workflow (if available), a log object which allows the stderr and stdout to be retrieved, a log array so stderr/stdout for individual tasks can be retrieved, and the overall state of the workflow run (e.g. RUNNING, see the State section).

path Parameters
run_id
required
string

Responses

Response samples

Content type
application/json
{
  "run_id": "string",
  "request": {},
  "state": "UNKNOWN",
  "run_log": {},
  "task_logs_url": "string",
  "task_logs": [],
  "outputs": {}
}

GetRunStatus

This provides an abbreviated (and likely fast depending on implementation) status of the running workflow, returning a simple result with the overall state of the workflow run (e.g. RUNNING, see the State section).

path Parameters
run_id
required
string

Responses

Response samples

Content type
application/json
{
  "run_id": "string",
  "state": "UNKNOWN"
}
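
Since this endpoint is meant to be cheap, it is the natural target for polling. The sketch below polls until the run reaches a state that the example treats as final. Which states count as terminal is an assumption of the sketch, as the specification lists the states without classifying them; get_status is a hypothetical callable wrapping the actual request.

```python
import time
from typing import Callable

# Assumption: states treated as final here. The specification lists the
# states but does not itself say which ones are terminal.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED", "PREEMPTED"}


def wait_for_run(get_status: Callable[[], dict], interval_s: float = 5.0,
                 max_polls: int = 100, sleep=time.sleep) -> str:
    """Poll GetRunStatus until a terminal state or max_polls is reached.

    get_status is a hypothetical callable that performs the request and
    returns the decoded JSON body. A missing state is treated as UNKNOWN.
    """
    state = "UNKNOWN"
    for _ in range(max_polls):
        state = get_status().get("state", "UNKNOWN")
        if state in TERMINAL_STATES:
            break
        sleep(interval_s)
    return state


# Canned responses stand in for a live service.
_responses = iter([{"run_id": "r1", "state": "QUEUED"},
                   {"run_id": "r1", "state": "RUNNING"},
                   {"run_id": "r1", "state": "COMPLETE"}])
final_state = wait_for_run(lambda: next(_responses), sleep=lambda s: None)
print(final_state)
```

The sleep function is injectable so the loop can be exercised without real delays; a production client would likely add backoff and error handling.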

ListTasks

This endpoint provides a paginated list of tasks that were executed as part of a given workflow run. Task ordering should be the same as what would be returned in a RunLog response body.

path Parameters
run_id
required
string
query Parameters
page_size
integer <int64>

OPTIONAL The preferred number of task logs to return in a page. If not provided, the implementation should use a default page size. The implementation must not return more items than page_size, but it may return fewer. Clients should not assume that if fewer than page_size items are returned that all items have been returned. The availability of additional pages is indicated by the value of next_page_token in the response.

page_token
string

OPTIONAL Token to use to indicate where to start getting results. If unspecified, return the first page of results.

Responses

Response samples

Content type
application/json
{
  "task_logs": [],
  "next_page_token": "string"
}

GetTask

This endpoint provides a mechanism to retrieve information on a specific task, if it exists.

path Parameters
run_id
required
string
task_id
required
string

Responses

Response samples

Content type
application/json
{
  "name": "string",
  "cmd": [],
  "start_time": "string",
  "end_time": "string",
  "stdout": "string",
  "stderr": "string",
  "exit_code": 0,
  "system_logs": [],
  "id": "string",
  "tes_uri": "string"
}

CancelRun

Cancel a running workflow.

path Parameters
run_id
required
string

Responses

Response samples

Content type
application/json
{
  "run_id": "string"
}

ServiceInfo

id
required
string

Unique ID of this service. Reverse domain name notation is recommended, though not required. The identifier should attempt to be globally unique so it can be used in downstream aggregator services e.g. Service Registry.

name
required
string

Name of this service. Should be human readable.

type
required
object (ServiceType)

Type of a GA4GH service

description
string

Description of the service. Should be human readable and provide information about the service.

organization
required
object

Organization providing the service

contactUrl
string <uri>

URL of the contact for the provider of this service, e.g. a link to a contact form (RFC 3986 format), or an email (RFC 2368 format).

documentationUrl
string <uri>

URL of the documentation of this service (RFC 3986 format). This should help someone learn how to use your service, including any specifics required to access data, e.g. authentication.

createdAt
string <date-time>

Timestamp describing when the service was first deployed and available (RFC 3339 format)

updatedAt
string <date-time>

Timestamp describing when the service was last updated (RFC 3339 format)

environment
string

Environment the service is running in. Use this to distinguish between production, development and testing/staging deployments. Suggested values are prod, test, dev, staging. However this is advised and not enforced.

version
required
string

Version of the service being described. Semantic versioning is recommended, but other identifiers, such as dates or commit hashes, are also allowed. The version should be changed whenever the service is updated.

workflow_type_versions
required
object
supported_wes_versions
required
Array of strings

The version(s) of the WES schema supported by this service

supported_filesystem_protocols
required
Array of strings

The filesystem protocols supported by this service. Currently these may include common protocols using the terms 'http', 'https', 'sftp', 's3', 'gs', 'file', or 'synapse', but others are possible and the terms beyond these core protocols are currently not fixed. This section reports those protocols (either common or not) supported by this WES service.

workflow_engine_versions
required
object
default_workflow_engine_parameters
required
Array of objects (DefaultWorkflowEngineParameter)

Each workflow engine can present additional parameters that can be sent to the workflow engine. This message will list the default values, and their types for each workflow engine.

system_state_counts
required
object
auth_instructions_url
required
string

A web page URL with human-readable instructions on how to get an authorization token for use with a specific WES endpoint.

tags
required
object
{
  "id": "org.ga4gh.myservice",
  "name": "My project",
  "type": {},
  "description": "This service provides...",
  "organization": {},
  "contactUrl": "mailto:support@example.com",
  "documentationUrl": "https://docs.myservice.example.com",
  "createdAt": "2019-06-04T12:58:19Z",
  "updatedAt": "2019-06-04T12:58:19Z",
  "environment": "test",
  "version": "1.0.0",
  "workflow_type_versions": {},
  "supported_wes_versions": [],
  "supported_filesystem_protocols": [],
  "workflow_engine_versions": {},
  "default_workflow_engine_parameters": [],
  "system_state_counts": {},
  "auth_instructions_url": "string",
  "tags": {}
}

RunListResponse

runs
Array of RunStatus (object) or RunSummary (object)

A list of workflow runs that the service has executed or is executing. The list is filtered to only include runs that the caller has permission to see.

next_page_token
string

A token which may be supplied as page_token in a workflow run list request to get the next page of results. An empty string indicates there are no more items to return.

{
  "runs": [],
  "next_page_token": "string"
}

RunId

run_id
string

workflow run ID

{
  "run_id": "string"
}

State

string (State)
Enum: "UNKNOWN" "QUEUED" "INITIALIZING" "RUNNING" "PAUSED" "COMPLETE" "EXECUTOR_ERROR" "SYSTEM_ERROR" "CANCELED" "CANCELING" "PREEMPTED"

State can take any of the following values:

  • UNKNOWN: The state of the task is unknown. This provides a safe default for messages where this field is missing, for example, so that a missing field does not accidentally imply that the state is QUEUED.

  • QUEUED: The task is queued.

  • INITIALIZING: The task has been assigned to a worker and is currently preparing to run. For example, the worker may be turning on, downloading input files, etc.

  • RUNNING: The task is running. Input files are downloaded and the first Executor has been started.

  • PAUSED: The task is paused. An implementation may have the ability to pause a task, but this is not required.

  • COMPLETE: The task has completed running. Executors have exited without error and output files have been successfully uploaded.

  • EXECUTOR_ERROR: The task encountered an error in one of the Executor processes. Generally, this means that an Executor exited with a non-zero exit code.

  • SYSTEM_ERROR: The task was stopped due to a system error, but not from an Executor; for example, an upload failed due to network issues, the worker ran out of disk space, etc.

  • CANCELED: The task was canceled by the user.

  • CANCELING: The task was canceled by the user, and is in the process of stopping.

  • PREEMPTED: The task is stopped (preempted) by the system. The reasons for this would be tied to the specific system running the job. Generally, this means that the system reclaimed the compute capacity for reallocation.
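
The role of UNKNOWN as a safe default can be sketched on the client side: when the state field is missing from a message, the parsed value falls back to UNKNOWN instead of implying some other state. Mapping unrecognized strings to UNKNOWN as well is an assumption of this sketch.

```python
from enum import Enum


class State(Enum):
    UNKNOWN = "UNKNOWN"
    QUEUED = "QUEUED"
    INITIALIZING = "INITIALIZING"
    RUNNING = "RUNNING"
    PAUSED = "PAUSED"
    COMPLETE = "COMPLETE"
    EXECUTOR_ERROR = "EXECUTOR_ERROR"
    SYSTEM_ERROR = "SYSTEM_ERROR"
    CANCELED = "CANCELED"
    CANCELING = "CANCELING"
    PREEMPTED = "PREEMPTED"


def parse_state(payload: dict) -> State:
    """Read the state field, falling back to UNKNOWN when it is absent.

    Mapping unrecognized values to UNKNOWN as well is an assumption of
    this sketch.
    """
    try:
        return State(payload.get("state"))
    except ValueError:
        return State.UNKNOWN


print(parse_state({"run_id": "r1", "state": "RUNNING"}))
print(parse_state({"run_id": "r1"}))
```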

"UNKNOWN"

RunStatus

run_id
required
string
state
string (State)
Enum: "UNKNOWN" "QUEUED" "INITIALIZING" "RUNNING" "PAUSED" "COMPLETE" "EXECUTOR_ERROR" "SYSTEM_ERROR" "CANCELED" "CANCELING" "PREEMPTED"

State can take any of the following values:

  • UNKNOWN: The state of the task is unknown. This provides a safe default for messages where this field is missing, for example, so that a missing field does not accidentally imply that the state is QUEUED.

  • QUEUED: The task is queued.

  • INITIALIZING: The task has been assigned to a worker and is currently preparing to run. For example, the worker may be turning on, downloading input files, etc.

  • RUNNING: The task is running. Input files are downloaded and the first Executor has been started.

  • PAUSED: The task is paused. An implementation may have the ability to pause a task, but this is not required.

  • COMPLETE: The task has completed running. Executors have exited without error and output files have been successfully uploaded.

  • EXECUTOR_ERROR: The task encountered an error in one of the Executor processes. Generally, this means that an Executor exited with a non-zero exit code.

  • SYSTEM_ERROR: The task was stopped due to a system error, but not from an Executor; for example, an upload failed due to network issues, the worker ran out of disk space, etc.

  • CANCELED: The task was canceled by the user.

  • CANCELING: The task was canceled by the user, and is in the process of stopping.

  • PREEMPTED: The task is stopped (preempted) by the system. The reasons for this would be tied to the specific system running the job. Generally, this means that the system reclaimed the compute capacity for reallocation.

{
  "run_id": "string",
  "state": "UNKNOWN"
}

RunSummary

run_id
required
string
state
string (State)
Enum: "UNKNOWN" "QUEUED" "INITIALIZING" "RUNNING" "PAUSED" "COMPLETE" "EXECUTOR_ERROR" "SYSTEM_ERROR" "CANCELED" "CANCELING" "PREEMPTED"

State can take any of the following values:

  • UNKNOWN: The state of the task is unknown. This provides a safe default for messages where this field is missing, for example, so that a missing field does not accidentally imply that the state is QUEUED.

  • QUEUED: The task is queued.

  • INITIALIZING: The task has been assigned to a worker and is currently preparing to run. For example, the worker may be turning on, downloading input files, etc.

  • RUNNING: The task is running. Input files are downloaded and the first Executor has been started.

  • PAUSED: The task is paused. An implementation may have the ability to pause a task, but this is not required.

  • COMPLETE: The task has completed running. Executors have exited without error and output files have been successfully uploaded.

  • EXECUTOR_ERROR: The task encountered an error in one of the Executor processes. Generally, this means that an Executor exited with a non-zero exit code.

  • SYSTEM_ERROR: The task was stopped due to a system error, but not from an Executor; for example, an upload failed due to network issues, the worker ran out of disk space, etc.

  • CANCELED: The task was canceled by the user.

  • CANCELING: The task was canceled by the user, and is in the process of stopping.

  • PREEMPTED: The task is stopped (preempted) by the system. The reasons for this would be tied to the specific system running the job. Generally, this means that the system reclaimed the compute capacity for reallocation.

start_time
string

When the run started executing, in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

end_time
string

When the run stopped executing (completed, failed, or cancelled), in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

tags
required
object

Arbitrary key/value tags added by the client during run creation

{
  "run_id": "string",
  "state": "UNKNOWN",
  "start_time": "string",
  "end_time": "string",
  "tags": {}
}
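
The start_time and end_time strings can be parsed with the documented format string directly. The sketch below computes how long a run took; the summary values are hypothetical, and the trailing "Z" is interpreted as UTC.

```python
from datetime import datetime, timezone

WES_TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_wes_time(value: str) -> datetime:
    """Parse a timestamp in the documented format as a UTC datetime."""
    return datetime.strptime(value, WES_TIME_FORMAT).replace(tzinfo=timezone.utc)


def run_duration_seconds(summary: dict):
    """Return elapsed seconds, or None unless both timestamps are present."""
    start, end = summary.get("start_time"), summary.get("end_time")
    if not start or not end:
        return None
    return (parse_wes_time(end) - parse_wes_time(start)).total_seconds()


# Hypothetical run summary.
summary = {"run_id": "r1", "state": "COMPLETE",
           "start_time": "2019-06-04T12:58:19Z", "end_time": "2019-06-04T13:00:49Z"}
print(run_duration_seconds(summary))
```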

RunRequest

workflow_params
object

REQUIRED The workflow run parameterizations (JSON encoded), including input and output file locations

workflow_type
required
string

REQUIRED The workflow descriptor type, must be "CWL" or "WDL" currently (or another alternative supported by this WES instance)

workflow_type_version
required
string

REQUIRED The workflow descriptor type version, must be one supported by this WES instance

tags
object
workflow_engine_parameters
object
workflow_engine
string

The workflow engine, must be one supported by this WES instance. Required if workflow_engine_version is provided.

workflow_engine_version
string

The workflow engine version, must be one supported by this WES instance. If workflow_engine is provided, but workflow_engine_version is not, servers can make no assumptions with regard to the engine version the WES instance uses to process the request if that WES instance supports multiple versions of the requested engine.

workflow_url
required
string

REQUIRED The workflow CWL or WDL document. When workflow_attachment is used to attach files, the workflow_url may be a relative path to one of the attachments.

{
  "workflow_params": {},
  "workflow_type": "string",
  "workflow_type_version": "string",
  "tags": {},
  "workflow_engine_parameters": {},
  "workflow_engine": "string",
  "workflow_engine_version": "string",
  "workflow_url": "string"
}
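
The field rules above lend themselves to a small pre-submission check. The sketch below covers only what the descriptions state: the fields described as REQUIRED, and the rule that workflow_engine must accompany workflow_engine_version. The function name is hypothetical, and treating workflow_params as mandatory follows its description rather than the schema marker.

```python
def validate_run_request(request: dict) -> list:
    """Return a list of problems found in a RunRequest-like dict.

    Checks only the rules stated in the field descriptions; a real service
    would also check the values against its service-info.
    """
    problems = []
    # workflow_params is included because its description says REQUIRED.
    for field in ("workflow_params", "workflow_type",
                  "workflow_type_version", "workflow_url"):
        if field not in request:
            problems.append(f"missing required field: {field}")
    if request.get("workflow_engine_version") and not request.get("workflow_engine"):
        problems.append("workflow_engine is required when workflow_engine_version is set")
    return problems


# Hypothetical request that names an engine version without an engine.
candidate = {
    "workflow_params": {},
    "workflow_type": "CWL",
    "workflow_type_version": "v1.0",
    "workflow_url": "hello.cwl",
    "workflow_engine_version": "3.1",
}
print(validate_run_request(candidate))
```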

RunLog

run_id
string

workflow run ID

request
object (RunRequest)

To execute a workflow, send a run request including all the details needed to begin downloading and executing a given workflow. If workflow_engine and workflow_engine_version are not provided, servers can use the most recent workflow_engine_version of a workflow_engine that the WES instance supports for the requested workflow_type.

state
string (State)
Enum: "UNKNOWN" "QUEUED" "INITIALIZING" "RUNNING" "PAUSED" "COMPLETE" "EXECUTOR_ERROR" "SYSTEM_ERROR" "CANCELED" "CANCELING" "PREEMPTED"

State can take any of the following values:

  • UNKNOWN: The state of the task is unknown. This provides a safe default for messages where this field is missing, for example, so that a missing field does not accidentally imply that the state is QUEUED.

  • QUEUED: The task is queued.

  • INITIALIZING: The task has been assigned to a worker and is currently preparing to run. For example, the worker may be turning on, downloading input files, etc.

  • RUNNING: The task is running. Input files are downloaded and the first Executor has been started.

  • PAUSED: The task is paused. An implementation may have the ability to pause a task, but this is not required.

  • COMPLETE: The task has completed running. Executors have exited without error and output files have been successfully uploaded.

  • EXECUTOR_ERROR: The task encountered an error in one of the Executor processes. Generally, this means that an Executor exited with a non-zero exit code.

  • SYSTEM_ERROR: The task was stopped due to a system error, but not from an Executor; for example, an upload failed due to network issues, the worker ran out of disk space, etc.

  • CANCELED: The task was canceled by the user.

  • CANCELING: The task was canceled by the user, and is in the process of stopping.

  • PREEMPTED: The task is stopped (preempted) by the system. The reasons for this would be tied to the specific system running the job. Generally, this means that the system reclaimed the compute capacity for reallocation.

run_log
object (Log)

Log and other info

task_logs_url
string

A reference to the complete URL which may be used to obtain a paginated list of task logs for this workflow.

task_logs
Array of Log (object) or TaskLog (object) or null
Deprecated

The logs, and other key info like timing and exit code, for each step in the workflow run. This field is deprecated; use task_logs_url to retrieve a paginated list of steps from the workflow run. This field will be removed in the next major version of the specification (2.0.0).

outputs
object

The outputs from the workflow run.

{
  "run_id": "string",
  "request": {},
  "state": "UNKNOWN",
  "run_log": {},
  "task_logs_url": "string",
  "task_logs": [],
  "outputs": {}
}

Log

name
string

The task or workflow name

cmd
Array of strings

The command line that was executed

start_time
string

When the command started executing, in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

end_time
string

When the command stopped executing (completed, failed, or cancelled), in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

stdout
string

A URL to retrieve standard output logs of the workflow run or task. This URL may change between status requests, or may not be available until the task or workflow has finished execution. Should be available using the same credentials used to access the WES endpoint.

stderr
string

A URL to retrieve standard error logs of the workflow run or task. This URL may change between status requests, or may not be available until the task or workflow has finished execution. Should be available using the same credentials used to access the WES endpoint.

exit_code
integer <int32>

Exit code of the program

system_logs
Array of strings

System logs are any logs the system decides are relevant, which are not tied directly to a workflow. Content is implementation specific: format, size, etc.

System logs may be collected here to provide convenient access.

For example, the system may include an error message that caused a SYSTEM_ERROR state (e.g. disk is full), etc.

{
  "name": "string",
  "cmd": [],
  "start_time": "string",
  "end_time": "string",
  "stdout": "string",
  "stderr": "string",
  "exit_code": 0,
  "system_logs": []
}

TaskLog

name
required
string

The task or workflow name

cmd
Array of strings

The command line that was executed

start_time
string

When the command started executing, in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

end_time
string

When the command stopped executing (completed, failed, or cancelled), in ISO 8601 format "%Y-%m-%dT%H:%M:%SZ"

stdout
string

A URL to retrieve standard output logs of the workflow run or task. This URL may change between status requests, or may not be available until the task or workflow has finished execution. Should be available using the same credentials used to access the WES endpoint.

stderr
string

A URL to retrieve standard error logs of the workflow run or task. This URL may change between status requests, or may not be available until the task or workflow has finished execution. Should be available using the same credentials used to access the WES endpoint.

exit_code
integer <int32>

Exit code of the program

system_logs
Array of strings

System logs are any logs the system decides are relevant, which are not tied directly to a workflow. Content is implementation specific: format, size, etc.

System logs may be collected here to provide convenient access.

For example, the system may include an error message that caused a SYSTEM_ERROR state (e.g. disk is full), etc.

id
required
string

A unique identifier which may be used to reference the task

tes_uri
string

An optional URL pointing to an extended task definition defined by a TES API

{
  "name": "string",
  "cmd": [],
  "start_time": "string",
  "end_time": "string",
  "stdout": "string",
  "stderr": "string",
  "exit_code": 0,
  "system_logs": [],
  "id": "string",
  "tes_uri": "string"
}

TaskListResponse

task_logs
Array of objects (TaskLog)

The logs, and other key info like timing and exit code, for each step in the workflow run.

next_page_token
string

A token which may be supplied as page_token in a workflow run task list request to get the next page of results. An empty string indicates there are no more items to return.

{
  "task_logs": [],
  "next_page_token": "string"
}

DefaultWorkflowEngineParameter

name
string

The name of the parameter

type
string

Describes the type of the parameter, e.g. float.

default_value
string

The stringified version of the default parameter. e.g. "2.45".

{
  "name": "string",
  "type": "string",
  "default_value": "string"
}
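
Because default_value is always delivered as a string, a client that wants a typed value has to convert it using the type field. The sketch below shows one possible conversion table. The type names other than "float" are assumptions, since the specification does not fix a vocabulary of types, and the parameter name in the example is hypothetical.

```python
# Assumption: the type names below are illustrative; the specification only
# gives "float" as an example and does not fix a vocabulary of types.
_CONVERTERS = {
    "float": float,
    "int": int,
    "string": str,
    "boolean": lambda text: text.strip().lower() == "true",
}


def parse_default(parameter: dict):
    """Convert default_value from its string form, if the type is recognized."""
    raw = parameter.get("default_value")
    convert = _CONVERTERS.get(parameter.get("type", ""))
    if raw is None or convert is None:
        return raw  # unknown type or no default: leave the value untouched
    return convert(raw)


print(parse_default({"name": "ratio", "type": "float", "default_value": "2.45"}))
```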

WorkflowTypeVersion

workflow_type_version
Array of strings

An array of one or more acceptable versions for the workflow_type

{
  "workflow_type_version": []
}

WorkflowEngineVersion

workflow_engine_version
Array of strings

An array of one or more acceptable engine versions for the workflow_engine

{
  "workflow_engine_version": []
}