A collection of questions (and hopefully useful answers).


Background

Why Brokers?

We have found that there are widely used Identity Providers (IdP). These authentication mechanisms provide no authorization information (custom claims or scopes) but are ubiquitous for authentication at the institution level. The use of a Broker and Clearinghouse enables attaching information to the usual OIDC flow so that IdPs can be used with customized claims and scopes.

Here is a diagram of a single broker. This is one possible way to use this spec.

In this diagram, the Broker relies on a separate service for fetching visas, which stores assertions from multiple sources. The visa assertions are obtained by the Clearinghouse after a successful login, and used to determine a researcher’s access in the Clearinghouse system.

The Broker, Clearinghouse, and Visa Issuer may be separate services (as shown in this diagram), but in other configurations they may be run as parts of a single service, or as separate services run by single organization. Data holders and data controllers should explore their options to decide what best fits their needs.

Flows

The following sequence diagrams are included to help explain the intended flows documented in the accompanying specification.

What is the complete end to end flow using token exchange?

(last updated June 2022)

This flow is the preferred flow for v1.2+ of the specification.

The exchange flow does not ever distribute the initial Passport-Scoped Access Token beyond the client application. A token exchange operation is executed by the client, in exchange for a Passport JWT that may be used downstream to access resources. In this example flow, the Passport is included as authorization in the POST to a DRS server. The token exchange has also specified known resources that will limit the audience of the Passport.

Researcher AAI Data Controller Data Holder User Agent Client Broker IdP Visa Issuer(s) Clearing House Data OIDC ref OIDC flow results in the client holding a *Passport-Scoped Access Token*. Exchange Token exchange (note: for clarity these form values have not actually been URL encoded) Content-Type: application/x-www-form-urlencoded   grant_type=urn:ietf:params:oauth:grant-type:token-exchange &requested_token_type=urn:ga4gh:params:oauth:token-type:passport &subject_token_type=urn:ietf:params:oauth:token-type:access_token &subject_token=<Passport-Scoped Access Token> &resource=https://drs.example.com/dataset1 &resource=https://drs.example.com/dataset2 Informative Only (not defined in the specification) Fetch signed visa(s) for user Return signed visa(s) for user [ "<visa1>", "<visa2>" ] Token exchange return { "access_token": "<Passport>", "issued_token_type": "urn:ga4gh:params:oauth:token-type:passport", "token_type": "Bearer" }   ▼ Passport content as example (decoded from the base64 of the JWT Passport) ▼   { "typ": "vnd.ga4gh.passport+jwt", "alg": "<algorithm-identifier>", "kid": "<key-identifier>" } . { "iss": "https://<issuer-website>", "sub": "<subject-identifier>", "aud": [ "https://drs.example.com" ], "iat": <seconds-since-epoch>, "exp": <seconds-since-epoch>, "jti": "<token-identifier>", "ga4gh_passport_v1": [ "<visa1>", "<visa2>", ... ] } . <signature> Use Client requests data POST /ga4gh/drs/v1/objects/dataset1/access/s3 HTTP/1.1 Host: drs.example.com Content-Type: application/json   { "passports": [ "<Passport>" ] } Decision is made to release data using information contained in the passport token - and this decision is coordinated with the data system to facilitate that actual data release. Client is given data

What is the complete end to end flow using /userinfo?

(last updated June 2022)

This flow is the flow for v1.0 implementations of the specification.

The flow as used by Elixir uses the initial Passport-Scoped Access Token as a token handed to downstream resource servers. These servers can use this token, in conjunction with a callback to the broker’s Userinfo Endpoint, to obtain the Passport content in JSON format.

Researcher AAI Data Controller Data Holder User Agent Client Broker IdP Visa Issuer(s) Clearing House Data OIDC ref OIDC flow results in the client holding a *Passport-Scoped Access Token*. Use Client requests data { Authorization: Bearer <Passport-Scoped Access Token> } Clearing house asks for list of signed visa(s) { Authorization: Bearer <Passport-Scoped Access Token> } Informative Only (not defined in the specification) Fetch signed visa(s) for user Return signed visa(s) for user [ "<visa1>", "<visa2>" ] UserInfo endpoint returns list of signed visa(s) { "iss": "https://<issuer-website>/", "sub": "<subject-identifier>", "ga4gh_passport_v1": [ "<visa1>", "<visa2>", ... ] } Decision is made to release data using information contained in the passport - and this decision is coordinated with the data system to facilitate that actual data release. Client is given data

Trust

What’s with all the signed passports and visas etc? Why so complex?

(last updated July 2022)

The practical operation of a loosely coupled federated ecosystem like GA4GH Passports requires establishing trust relationships between the participating entities (see OpenID Connect Federation for an interesting discussion of the properties of multilateral federations).

Any entity that is asked to make a decision about sharing data needs to have a priori made the decision “which other entities do I trust?”. In genomics, a single decision to allow data sharing might involve simultaneous trusting of multiple entities - human genomics is complex!

When presented with a Passport and Visas from entities that they trust - they can rely on the information (claims) provided to make data sharing decisions. If presented with information from entities that are untrusted - the content of the message is irrelevant as there is no basis on which to believe the content.

So we can see that trust is a crucial element of a working federation. How do we establish these trust relationships?

GA4GH Passports and Visas leverage the mechanisms present in JWT as used by the OIDC standards to cryptographically “sign” tokens containing claims. Signed tokens can be “verified” using public/private keys.

Why are the Visa claims formatted as JWTs inside the Passport?

(last updated August 2022)

If visas were not signed separately from a Passport, then Clearinghouses would require a high degree of confidence that the signing Passport Issuer or Broker was accurately representing the contents of Visas from the original Visa Issuer; there would be no mechanisms to prevent a malicious or defective Passport Issuer from misrepresenting Visa contents. By signing Visas separately from Passports, we prevent intermediate parties from tampering with Visa contents, allowing Clearinghouses to safely rely on Visa contents based on their trust relationship with the Visa Issuer.

What are the ways that key management is done in practice?

(last updated July 2022)

There are a variety of approaches that can be used for key management for Passports and Visas. We will first detail those that can be used for Passports and then discuss some extra wrinkles for Visas.

For this discussion we assume there is a concrete JWT from issuer https://issuer.example.org (possibly a Broker or a Visa Issuer).

My clearinghouse service ‘trusts’ the above issuer to help make data access decisions - and that trust is stated probably through some sort of explicit configuration. For our example, let’s imagine it has a hypothetical YAML configuration file

trusted_brokers:
  - https://issuer.example.org
  - https://login.broker.com

trusted_visa_issuers:
  - https://dac.gov.world

The service now wants to verify a Passport or Visa JWT purporting to be from that issuer.


Use jku in the header of the JWT

The JKU is the URL of a JSON file containing the issuers’ public keys.

{
  "iss": "https://issuer.example.org",
  "jku": "https://issuer.example.org/public-keys.json",
  "kid": "key-october-1"
}

For our concrete example we say that it is a JSON file residing at https://issuer.example.org/public-keys.json (see RFC 7517 “JSON Web Key”).

IMPORTANTLY, for the secure use of this key management technique - the JKU MUST also be allow-listed as part of the configuration of OUR service. For example:

trusted_brokers:
  - issuer: https://issuer.example.org
    jku: https://issuer.example.org/public-keys.json

  - issuer: https://login.broker.com
    jku: https://keys.broker.com/set

It is NEVER valid to even attempt to access a JKU from a JWT header - unless the URL is already known to belong to the given issuer. See “JWT Forgery via unvalidated jku parameter”.

To verify a JWT, the content of the JKU file is loaded and the kid is looked up in key set. The signature is validated using the public key found.

Although this configuration requires explicit registration of JKUs, the content of the key sets can allow the best practice of key rotation.

The content of the JKU file is designed to be cached aggressively, but as long as the file is fetched every few days/weeks, the set of keys can evolve/rotate.


Use the kid in the header of the JWT along with OIDC Discovery

The OpenID Connect Discovery protocol allows the use of the JWT issuer URL - in conjunction with a fetch of a /.well-known/openid-configuration - to look up the location of the public key set file. See OpenID Provider Metadata specification and the jwks_uri entry.

As with JKU - it is important that OIDC discovery is limited only to JWT issuer URLs that are in some way allow-listed. It is NEVER valid to perform discovery on an arbitrary issuer encountered in a JWT. Luckily, the concept of allow-listing issuers is already in some way inherent to the way trust relationships are established, and hence this allow list should already be present in the system.

Also as with JKU, the content of the discovery protocol and key sets can be cached aggressively. This means that the double step of the discovery protocol is not required on every JWT verification.


Exchange public keys beforehand with each trusted entity

This is an approach used by the NIH - and is appropriate if the number of trusted entities is small - such that the public keys of each trusted entity can be exchanged out of band (and their rotation/updating can also be managed out of band).

Configuration may be

trusted_brokers:
  - issuer: https://issuer.example.org
    keys:
      key-october-1: "ABCDTRTFDSFSDFSF...."

  - issuer: https://login.broker.com
    keys:
      kid123456: "ABCDTRTFDSFSDFSF...."

For any kid encountered in a JWT, the corresponding public key is already available for signature validation.

An even safer version of this approach is to perform the key verification across every public key you hold before even decoding the JWT JSON and then confirm the kid. This avoids ever even needing to JSON decode data from untrusted entities.


Requirement for JKU in Visas

When it comes to Visas, there is an extra wrinkle - unlike Brokers, Visa Issuers do not need to have been part of an OIDC flow. It is possible that the visa issuer URI is not even a locatable reference (e.g. urn:example.com:dac-world-visa-issuer).

Therefore the OIDC Discovery technique is not appropriate and hence the requirements of the AAI standard regarding the presence of JKUs.


Client Software

Use of GA4GH passport/visas in Single Page Apps (SPAs)

It is currently recommended that single page apps (SPA) such as a React/VueJS websites are NOT used for scenarios where the single page app code is solely responsible for the handling of genomic passports and tokens.

A SPA contains all the source of the application in public - and hence cannot possess a ‘client secret’ in an OIDC flow (the ‘client secret’ is used to prove the identity of the client software and is an important risk mitigation that prevents unconstrained use of an accidentally leaked/stolen token).

Techniques such as PKCE can be used to allow a SPA to participate in an OIDC flow - and this is not forbidden by the specification - but there are still unresolved questions about how SPAs can prove client identity in things like the token exchange that retrieves passports or other tokens.

Therefore if writing SPA websites for genomic data handling, it is recommended to use a backend set of services to execute OIDC flows and token exchanges (even if the rest of the SPA can operate purely on the front end). These backend service can hold secrets and hence can prove client identity - and the backend service can then securely participate in token exchange and retain tokens/passports.

There is an emerging standard DPoP that may remove some of these limitations - (OAuth 2.0 Demonstrating Proof-of-Possession at the Application Layer)

  • and it will be considered for future versions of the AAI specification.