Re: [OAUTH-WG] Second WGLC on "JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens"

Hi Vittorio,

I am referring to the email you sent on April 29th, which is copied
below.

1) You wrote:

    /> targeting of access tokens/

    Let me think about that a bit longer.

    I acknowledge that the decision of including an audience has the
    effect of letting the AS track when the client accesses a particular
    resource,
    but at the same time that’s completely mainstream and very much by
    design in a very large number of cases. As such, I find the language
    you are suggesting to be potentially confusing, as it positions this
    as an exception vs a privacy protecting mainstream that is in fact
    not common,
    and ascribes to the client more latitude than I believe is
    legitimate to expect or grant.

    *I’ll try to come up with concise language that clarifies to the
    reader that the current mechanism does allow AS tracking*.

Since the latest draft was published on the 27th, you have not
proposed any "concise language that clarifies to the reader
that the current mechanism does allow AS tracking".

2) You also wrote about the "sub" uniqueness:

    As long as an identifier identifies one resource only, it satisfies
    uniqueness. It doesn’t have to be a singleton.

RFC 7519 defines in section 4.1.2 the semantics of the "sub" claim using 
the following sentence:

    The subject value MUST either be scoped to be locally unique in the
    context of the issuer or be globally unique.

The text does NOT say that the subject value "MUST be scoped to be
locally unique in the context of the *resource server*".
Changing the semantics of an already defined claim is not permitted. If
you would like such semantics to be available,
a new claim should be defined (and it would be very nice to have it!).

3) The text in the privacy considerations section states:

    Although the ability to correlate requests might be required by
    design in many scenarios, there are scenarios where the authorization
    server might want to prevent correlation to preserve the desired
    level of privacy.

In the real world, clients or end-users may also want to
prevent correlation to preserve their desired level of privacy.

A better sentence would be:

    Although the ability to correlate requests might be required by
    design in many scenarios, there are scenarios where the authorization
    server *or the client* might want to prevent correlation to preserve
    the desired level of privacy.

4) The text continues with:

    Authorization servers should choose how to assign "sub" values
    according to the level of privacy required by each situation.
    For instance: if a solution requires preventing tracking
    principal activities across multiple resource servers, the
    authorization server should ensure that JWT access tokens meant
    for different resource servers have distinct "sub" values that
    cannot be correlated in the event of resource servers collusion.

Authorization servers are not necessarily able to choose the level of 
privacy required by each situation. When there are different
situations for the same resource server, the scope is (unfortunately at 
the moment) the only way to select the "level of privacy that is required".

The example ("For instance:") is only an example that provides a vague 
recommendation for the ASs which is NOT conformant
with the semantics of the "sub" claim as defined in RFC 7519.
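
To make the draft's example concrete, here is a minimal sketch of what it
seems to ask of an AS. The derivation (an HMAC over the resource server
identifier and a local user identifier) is only an assumption for
illustration: the draft does not specify any method, and the names
pairwise_sub, as_secret and the example.com URLs are hypothetical.

```python
import hashlib
import hmac


def pairwise_sub(as_secret: bytes, local_user_id: str, resource_server: str) -> str:
    """Derive a per-resource-server "sub" value (hypothetical scheme)."""
    # Keyed hash, so that resource servers cannot recompute or link the
    # values without the secret held by the AS.
    msg = f"{resource_server}|{local_user_id}".encode("utf-8")
    return hmac.new(as_secret, msg, hashlib.sha256).hexdigest()


as_secret = b"known-only-to-the-AS"  # assumption: never shared with any RS

sub_for_rs_a = pairwise_sub(as_secret, "alice", "https://rs-a.example.com")
sub_for_rs_b = pairwise_sub(as_secret, "alice", "https://rs-b.example.com")

# The same principal gets a stable value per RS, but different values
# across RSs, so two colluding RSs cannot join their records on "sub".
print(sub_for_rs_a != sub_for_rs_b)
```

With such a scheme the same principal is known to one issuer under as
many "sub" values as there are resource servers, which is the situation
discussed in point 2) above.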

What should be discussed here are not "examples" or what an
authorization server should do, but explanations about the implications
for the end-user or for the client of the various values that can be
placed into the "sub" claim by an AS. The problem is wider than simple
collusion between resource servers: it also involves other servers that
DO NOT participate in any OAuth exchange.

RFC 6973 (Privacy Considerations) states in Section 7 (Guidelines):

    This section provides guidance for document authors in the form of a
    questionnaire about a protocol being designed.
    The questionnaire may be useful at any point in the design process,
    particularly after document authors have developed
    a high-level protocol model as described in [RFC4101].

One of the questions is:

    f. *Correlation*.  Does the protocol allow for correlation of
    identifiers?  Are there expected ways that information exposed
    by the protocol will be combined or *correlated with information
    obtained outside the protocol*?

It is important to provide an answer to these two questions.

Below is some text, fully conformant with RFC 7519, that
should be incorporated into the privacy considerations section.
It explains the implications of the two (and only two) flavours of
the "sub" claim.

    When the sub claim contains a locally unique identifier in the
    context of the issuer, this allows the tracking of principal
    activities across multiple resource servers.

    When the sub claim contains a globally unique identifier, this
    allows principal activities to be correlated across multiple
    resource servers. In addition, this globally unique identifier may
    also allow principal activities to be correlated with servers that
    the principals have never accessed but that use the same globally
    unique identifiers.
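
The two cases can be illustrated with a small sketch. All the data below
is invented for illustration (the server names "rs-a", "rs-b" and "shop",
the issuer URL and the sub values are hypothetical); it only shows which
join key an observer would need in each case.

```python
from collections import defaultdict


def servers_per_subject(records, key):
    """Map each subject key to the set of servers where it was seen."""
    seen = defaultdict(set)
    for rec in records:
        seen[key(rec)].add(rec["server"])
    return dict(seen)


# Case 1: "sub" is locally unique in the context of the issuer.
# Colluding resource servers can join their logs on (iss, sub).
local_records = [
    {"server": "rs-a", "iss": "https://as.example.com", "sub": "user-42"},
    {"server": "rs-b", "iss": "https://as.example.com", "sub": "user-42"},
]
local_links = servers_per_subject(
    local_records, key=lambda r: (r["iss"], r["sub"])
)

# Case 2: "sub" is globally unique. "shop" never took part in any OAuth
# exchange, but it uses the same identifier, so it can be joined as well.
global_records = [
    {"server": "rs-a", "iss": "https://as.example.com", "sub": "alice@example.com"},
    {"server": "rs-b", "iss": "https://as.example.com", "sub": "alice@example.com"},
    {"server": "shop", "iss": None, "sub": "alice@example.com"},
]
global_links = servers_per_subject(global_records, key=lambda r: r["sub"])

print(local_links)
print(global_links)
```

In the first case the correlation stays within the servers that receive
tokens from the same issuer; in the second case it extends to any server
that happens to hold the same identifier.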

Denis

> Thanks Denis for the thorough commentary.
>
> /> The title of this spec./
>
> Fixed, thanks!
>
> /> The client MUST NOT inspect the content of the access token/
>
> This is really a sticky point. I really want to acknowledge your PoV 
> on this, but at the same time I found this to be one of the biggest 
> sources of issues in the use of JWT for access tokens hence I feel we 
> really need to give solid guidance here. Let me expand further on the 
> reasoning behind it, and perhaps we can get to language that satisfies 
> both PoVs.
>
> To me the key point is that clients should not write /code/ that 
> inspects access tokens. Taking a dependency on the ability to do so is 
> ignoring fundamental information about the architecture and 
> relationships between OAuth roles, and suggests an ability of the 
> client to understand the semantic of the content that cannot be 
> assumed in the general case. I expanded on the details in my former 
> reply to you on this topic, I would recommend referring to it. Clients 
> violating this simple principle has been one of the most common 
> sources of production issues I had to deal with in the past few years, 
> and one of the hardest to remediate given that clients are hard to 
> update and sometimes the things they relied on were irremediably lost. 
> This is why I am inclined to put in here strong language.
>
> That said: I have nothing against client developers examining a 
> network trace and drawing conclusions based on the content of what 
> they see. That doesn’t create any hard dependencies and has no 
> implications in respect to changes in the solution behavior. However I 
> am not sure how to phrase that in the specification, given that 
> referring to the client inevitably refers to its code. I am open to 
> suggestions.
>
> >  3)…
>
> I have a pretty hard time following the chain of reasoning in this 
> section. Let me attempt to tackle it to the best of my understanding.
>
> I think the key might be
>
> /> a client should be able to choose whether it wishes the sub claim 
> to contain [..]/
>
> I don’t think that should be a choice left to the client. In business
> systems, my experience is that the type of identifiers to be used
> (when the IdP gives any choice at all) is established at resource
> provisioning time. I am not aware of mechanisms through which a client
> signals the nature of the identifier to be used, nor would that be
> fully feasible (the resource knows what it needs to perform its function).
>
> Furthermore:
>
> /> which has nothing to do with uniqueness since the value changes for 
> every generated token./
>
> Again, this is something that was touched on in my former reply to 
> your message. As long as an identifier identifies one resource only, 
> it satisfies uniqueness. It doesn’t have to be a singleton.
>
> Finally, the scope is optional (for good reasons: 1st party and non
> delegation scenarios don’t require it) hence it cannot be relied upon
> for properties that should hold in every scenario.
>
> In summary: per the preceding thread on this topic, the consensus was
> that varying the sub content was a satisfactory way of protecting
> against correlation. I don’t agree that clients should have a
> mechanism to request different sub flavors, as that decision should be
> made out of band by the AS and RS; and the scope isn’t always
> available anyway.
>
> /> targeting of access tokens/
>
> Let me think about that a bit longer.
>
> I acknowledge that the decision of including an audience has the 
> effect of letting the AS track when the client accesses a particular 
> resource, but at the same time that’s completely mainstream and very 
> much by design in a very large number of cases. As such, I find the 
> language you are suggesting to be potentially confusing, as it 
> positions this as an exception vs a privacy protecting mainstream that 
> is in fact not common, and ascribes to the client more latitude than I 
> believe is legitimate to expect or grant.
>
> I’ll try to come up with concise language that clarifies to the reader 
> that the current mechanism does allow AS tracking.
>
> *From: *OAuth <oauth-bounces@ietf.org> on behalf of Denis 
> <denis.ietf@free.fr>
> *Date: *Wednesday, April 29, 2020 at 09:12
> *To: *"oauth@ietf.org" <oauth@ietf.org>
> *Subject: *Re: [OAUTH-WG] Second WGLC on "JSON Web Token (JWT) Profile 
> for OAuth 2.0 Access Tokens"
>
> You will find four comments numbered 1) to 4).
>
> *1) *The title of this spec. is:
>
> JSON Web Token (JWT) Profile for OAuth *2.0* Access Tokens
>
> So, this spec. is supposed to be targeted at OAuth *2.0*. However,
> the header at the top of the page omits to mention it.
>
> Currently, it is:
>
> Internet-Draft OAuth Access Token JWT Profile           April 2020
>
> It should rather be:
>
> Internet-Draft OAuth *2.0* Access Token JWT Profile           April 2020
>
> *2)* The following text is within section 6.
>
> The client MUST NOT inspect the content of
> the access token: the authorization server and the resource server
> might decide to change token format at any time (for example by
> switching from this profile to opaque tokens) hence any logic in the
> client relying on the ability to read the access token content would
> break without recourse.
> Nonetheless, authorization servers should
> not assume that clients will comply with the above.
>
> It is of primary importance that clients be able to inspect
> tokens before transmitting them.
> The "MUST NOT" is not acceptable.
>
> The above text should be replaced with:
>
> Reading the access token content may be useful for the user to verify
> that the access token content matches the user's expectations. However,
> the authorization server and the resource server might decide to
> change the token format at any time.  Thus, the client should not
> expect to always be in a position to read the access token content.
>
> The remainder of the text about this topic is fine.
>
>
> *3) *The next topic is about the sub claim.
>
> The text states:
>
> Although the ability to correlate requests might be required by
> design in many scenarios, there are scenarios where the authorization
> server might want to prevent correlation to preserve the desired
> level of privacy. Authorization servers should choose how to assign
> sub values according to the level of privacy required by each
> situation.
>
> I have a set of questions:
>
>  1. How can authorization servers choose how to assign sub values
>     according to the level of privacy required "by each situation"?
>  2. How can authorization servers know the level of privacy required
>     "by each situation"?
>  3. How can the users be informed of the level of privacy required "by
>     each situation"?
>  4. How can the users *consent* to the level of privacy required "by
>     each situation"?
>
> Currently, the request MUST include either a resource parameter or an 
> aud claim parameter, while it MAY include a scope parameter.
>
> The syntax of the scope parameter is a list of space-delimited,
> case-sensitive strings (RFC 6749). It is thus subject to private
> agreements between clients and Authorization Servers. Since the scope
> is being returned, it is of primary importance that the client checks
> that the returned scope matches its expectations before transmitting
> the token to a Resource Server.
>
> In theory, a client should be able to choose whether it wishes the sub
> claim to contain:
>
>   (a) a globally unique identifier for all ASs ("globally unique"),
>   (b) a unique identifier for each AS ("locally unique in the context of
>       the issuer"),
>   (c) a different pseudonym for each RS, or
>   (d) a different pseudonym for each authorization token request.
>
> The only variable parameter that it can use for this purpose in the 
> token request is the scope parameter.
>
> RFC 7519 states in section 4.1.2:
>
> The subject value MUST either be scoped to be locally unique in the
> context of the issuer or be globally unique.
>
> It is quite hard to recognize that the sub claim is able to carry a 
> different pseudonym for each RS, i.e. for case (c), or
> a different pseudonym for each authorization token request, i.e. for 
> case (d), which has nothing to do with uniqueness
> since the value changes for every generated token.
>
> This has implications about the following text:
>
> For instance: if a solution requires preventing tracking
> principal activities across multiple resource servers, the
> authorization server should ensure that JWT access tokens meant for
> different resource servers have distinct sub values that cannot be
> correlated in the event of resource servers collusion.
>
> Since it addresses case (c).
>
> and also about the following text:
>
> Similarly: if a solution requires preventing a resource server from
> correlating the principal’s activity within the resource itself, the
> authorization server should assign different sub values for every JWT
> access token issued.
>
> Since it addresses case (d).
>
> This means that the current text placed in the privacy considerations 
> section was a good attempt to address the case,
> but that the text needs to be revised.
>
> Proposed text replacement for all the previously quoted sentences:
>
> According to RFC 7519 (4.1.2): The subject value MUST either be scoped 
> to be locally unique in the context of the issuer or be globally unique.
>
> When the sub claim contains a globally unique identifier, this allows
> principal activities to be correlated across multiple resource
> servers. In addition, this globally unique identifier may also allow
> principal activities to be correlated with servers that the principals
> have never accessed but that use the same globally unique identifiers.
>
> When the sub claim contains a locally unique identifier in the context 
> of the issuer, this also allows the tracking of principal activities 
> across multiple resource servers.
>
> The scope request parameter is the only way to influence the
> content of the sub claim. Its meaning is subject to a private
> agreement between the client and the AS, which means that the use of
> the scope parameter is the only way to choose between a locally unique
> identifier in the context of the issuer or a globally unique identifier.
>
> Since the scope parameter is being returned, it is of primary
> importance that the client checks that the returned scope matches its
> expectations before transmitting the token to a Resource Server.
>
> However, there are other cases where the client would like to be able
> to choose whether it wishes the sub claim to contain:
>     - a different pseudonym for each RS, so that different resource
>       servers will be unable to correlate its activities, or
>     - a different pseudonym for each authorization token request, so
>       that the same resource server cannot correlate its activities
>       performed at different instants in time.
>
> Considering the semantics of the sub claim, these two cases cannot be 
> currently supported.
>
>
> *4) *The next topic is about the targeting of access tokens
>
> Text was proposed before the last conference call. The topic was then
> presented at the very end of that call, but no text has been included
> in the next draft.
>
> Here is revised text to be included in the privacy considerations section:
>
> For security reasons, some clients may be willing to target their
> access tokens but, for privacy reasons, may be unwilling to disclose
> to Authorization Servers an identification of the Resource Servers
> they are going to access, so that Authorization Servers will be unable
> to know which Resource Servers are being accessed. The disclosure of
> the Resource Server names allows an Authorization Server to list all
> the Resource Servers being accessed by all its users and, in addition,
> to list pairs of (Principal, Resource Server), which allows it to
> trace all user accesses to Resource Servers performed through that
> Authorization Server. When a token is targeted, this profile does not
> contain provisions to address these two threats.
>
> Denis
>
>     Hi all,
>
>     This is a second working group last call for "JSON Web Token (JWT)
>     Profile for OAuth 2.0 Access Tokens".
>
>     Here is the document:
>
>     https://tools.ietf.org/html/draft-ietf-oauth-access-token-jwt-06
>
>     Please send your comments to the OAuth mailing list by April 29, 2020.
>
>     Regards,
>
>      Rifaat & Hannes