Data Model
Currently, the data model is built around two main models, the Tool and the ToolVersion. A Tool describes one pairing of a tool and one Docker image. A ToolVersion describes a particular iteration of a tool as fixed by a reference to a specific image id, a Dockerfile (which was used to build an image) as fixed by a URL, and a descriptor for the tool in either CWL or WDL.
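As a rough illustration of how the two models relate, the following Python sketch expresses them as dataclasses. The field names (image_id, dockerfile_url, descriptor_type and so on) are assumptions chosen for readability; they are not the field names of the actual TRS schema.

```python
from dataclasses import dataclass, field
from typing import List, Literal


@dataclass
class ToolVersion:
    """One iteration of a tool, pinned to concrete artifacts (hypothetical fields)."""
    id: str
    image_id: str        # reference to a specific Docker image id
    dockerfile_url: str  # URL of the Dockerfile that was used to build the image
    descriptor_type: Literal["CWL", "WDL"]
    descriptor: str      # content of the tool descriptor


@dataclass
class Tool:
    """One pairing of a tool and one Docker image (hypothetical fields)."""
    id: str
    docker_image: str
    versions: List[ToolVersion] = field(default_factory=list)
```

A Tool then simply carries a list of ToolVersion entries, each of which fixes the image, the Dockerfile and the descriptor for that iteration.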
Design Decisions
We are starting with a read-only API due to potentially different views and approaches to registration/security.
Multiple formats for descriptors such as CWL and WDL are permitted.
TRS Tool and TRS Tool Version IDs
Each implementation of TRS can choose its own identifier scheme, as long as it follows these guidelines:
- TRS Tool and TRS Tool Version IDs are strings made up of uppercase and lowercase letters, decimal digits, hyphen, period, underscore and tilde ([A-Za-z0-9._~-]). See RFC 3986 § 2.3. TRS Tool and TRS Tool Version IDs MAY further contain percent characters (%) whenever they are used in API calls, but only if they were introduced through percent-encoding of any non-valid characters (see next bullet point).
- TRS Tool and TRS Tool Version IDs can contain other characters, but they MUST be percent-encoded (see RFC 3986 § 2.4) into valid TRS Tool and TRS Tool Version IDs as per the previous rule whenever they are used in API calls. This is because non-encoded IDs may interfere with the interpretation of routes, e.g., for the /tools/{id}/versions/{version_id} endpoint.
- Any given TRS Tool or TRS Tool Version ID MUST always identify the same resource (tool or tool version, respectively) on a given TRS implementation. This constraint aids with reproducibility.
- TRS implementations MAY have more than one TRS Tool or TRS Tool Version ID mapping to the same resource (tool or tool version, respectively).
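The character rules in the guidelines above can be checked mechanically. The following is a minimal Python sketch, not part of the specification: the helper names encode_trs_id and is_api_safe_trs_id are hypothetical, and the sketch assumes UTF-8 percent-encoding as performed by the standard library's urllib.parse.quote.

```python
import re
from urllib.parse import quote

# Matches only unreserved characters (RFC 3986 section 2.3) or percent-encoded
# octets. The hyphen is placed last in the character class so that it is read
# literally rather than as a range.
_ENCODED_ID = re.compile(r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})+")


def encode_trs_id(raw_id: str) -> str:
    """Percent-encode every character outside the unreserved set."""
    # safe="" makes quote() encode "/" as well, since an unencoded slash
    # would be interpreted as a route separator.
    return quote(raw_id, safe="")


def is_api_safe_trs_id(candidate: str) -> bool:
    """Return True if the ID can be placed in an API route as-is."""
    return _ENCODED_ID.fullmatch(candidate) is not None
```

For instance, under these assumptions encode_trs_id("tool#1") yields tool%231, which is_api_safe_trs_id accepts, whereas the raw tool#1 is rejected.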
TRS URIs
To conveniently pass content references to TRS resources, e.g., to advise a WES or TES service which tool (version) to use, we define a URI syntax for TRS-accessible content. Strings of the form trs://<server>/<id> (unversioned) and trs://<server>/<id>/<version_id> (versioned) mean “you can fetch the content with TRS Tool ID <id> and, if provided, TRS Tool Version ID <version_id> from the TRS server at <server>”.
For example, if a TRS client was asked to process trs://trs.example.org/tool_ABC/v1.2.3, it would know that it could, e.g., issue GET requests to https://trs.example.org/api/ga4gh/trs/v2/tools/tool_ABC and https://trs.example.org/api/ga4gh/trs/v2/tools/tool_ABC/versions/v1.2.3 to fetch information about tool tool_ABC and its version v1.2.3 from a v2 TRS API hosted at https://trs.example.org/, respectively.
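The resolution step in this example could be sketched in Python as follows. The function name trs_uri_to_urls is hypothetical, and both the https scheme and the /api/ga4gh/trs/v2 path prefix are assumptions taken from the example above; an actual deployment may expose its API under a different base path.

```python
from typing import List
from urllib.parse import urlsplit

# Assumed API base path, copied from the example; not mandated by the URI syntax.
API_PREFIX = "/api/ga4gh/trs/v2"


def trs_uri_to_urls(trs_uri: str, api_prefix: str = API_PREFIX) -> List[str]:
    """Map a TRS URI to the tool URL and, if versioned, the version URL."""
    parts = urlsplit(trs_uri)
    if parts.scheme != "trs" or not parts.netloc:
        raise ValueError(f"not a TRS URI: {trs_uri!r}")

    # IDs stay percent-encoded: urlsplit() does not decode the path.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) not in (1, 2):
        raise ValueError(f"expected <id> or <id>/<version_id>: {trs_uri!r}")

    tool_url = f"https://{parts.netloc}{api_prefix}/tools/{segments[0]}"
    urls = [tool_url]
    if len(segments) == 2:
        urls.append(f"{tool_url}/versions/{segments[1]}")
    return urls
```

Called with trs://trs.example.org/tool_ABC/v1.2.3, this sketch returns the two URLs named in the example; for an unversioned URI it returns only the tool URL.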
Note that clients issuing requests to TRS services MUST NOT percent-encode the forward slashes separating the trs, <server>, <id> and <version_id> components of TRS URIs. However, TRS Tool IDs and TRS Tool Version IDs containing non-valid characters MUST be encoded individually before constructing TRS URIs. For example, for a TRS Tool ID "tool#1" and a TRS Tool Version ID "(1)", the correct TRS URI for server trs.example.org would be trs://trs.example.org/tool%231/%281%29, where tool%231 and %281%29 are the percent-encoded TRS Tool and TRS Tool Version IDs, respectively.
Also note that to ensure reproducibility, servers implementing multiple versions of the TRS API specification MUST ensure that, within the limits of schema differences across different API versions, corresponding endpoints return consistent responses.
This recommendation is intended to mirror the discussion that went into the DRS URI scheme, with the necessary additions to account for versioned TRS URIs.
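The rule that IDs are encoded individually while the separating slashes are left untouched can be sketched as below. This is an illustrative Python helper under stated assumptions, not a reference implementation: build_trs_uri is a hypothetical name, and the server argument is assumed to be a plain host name that needs no encoding.

```python
from typing import Optional
from urllib.parse import quote


def build_trs_uri(server: str, tool_id: str, version_id: Optional[str] = None) -> str:
    """Assemble a TRS URI from raw (unencoded) IDs."""
    # Encode each ID on its own; safe="" ensures a "/" inside an ID is encoded
    # and cannot be mistaken for a component separator.
    components = [quote(tool_id, safe="")]
    if version_id is not None:
        components.append(quote(version_id, safe=""))
    # The slashes joining the components are added afterwards and stay literal.
    return f"trs://{server}/" + "/".join(components)
```

With the raw IDs "tool#1" and "(1)" and the server trs.example.org, the sketch produces trs://trs.example.org/tool%231/%281%29, matching the URI given in the text.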
Misc
The entire schema is shown below, but a more convenient way to view our schema in progress is the Swagger editor.
Note that the Swagger editor itself can kickstart a project by generating servers and clients in a variety of languages.
Central GA4GH Service Registry
For informational purposes, we recommend that TRS implementations add themselves to https://github.com/ga4gh/tool-registry-service-schemas/blob/develop/registry.json to provide for the possibility of creating a global indexing service and to allow others to more easily discover a TRS implementation.
Outstanding Questions
Authentication and Authorization
GA4GH recommends the use of the OAuth 2.0 framework (RFC 6749) for authentication and authorization. It is also recommended that implementations of this standard implement and follow the GA4GH Authentication and Authorization Infrastructure (AAI) standard.
The TRS standard itself does not define any behaviour specific to authorization, given that it hosts and shares publicly available workflows. For future expansion, we recommend that authorization, if it is needed, follow the OAuth 2.0 recommendations described above.
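Should an implementation choose to protect some endpoints, a client would typically present an OAuth 2.0 access token with each request. The Python sketch below assumes the common bearer-token scheme (RFC 6750), which the TRS standard does not itself mandate; the function name authorized_request and the way the token is obtained are hypothetical.

```python
import urllib.request


def authorized_request(url: str, access_token: str) -> urllib.request.Request:
    """Prepare a GET request that carries an OAuth 2.0 bearer token."""
    return urllib.request.Request(
        url,
        headers={
            # Assumption: the server accepts bearer tokens in this header.
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
        },
    )
```

The returned object can be passed to urllib.request.urlopen; for publicly available tools no such header should be required.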
Other
- How do we track authorship? Should we track authorship of the tool metadata, the Docker image, the underlying algorithm, or all of the above?
- What do we need to provide to allow for indexing and for external services, such as an external SPARQL service?