Re: [Txauth] Multiple Access Tokens in XYZ

Justin Richer <jricher@mit.edu> Thu, 19 March 2020 13:37 UTC

To: Torsten Lodderstedt <torsten@lodderstedt.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/txauth/SDxHGE_FHG-TbH-gT_La9RDmlno>

I see what you’re going for here. I think the key point comes down to this:

> - The client knows what it wants to do and where

That knowledge is exactly why I would argue that the client would have to explicitly request multiple access tokens in order to get them.

I’m worried about requiring all clients to be prepared to accept multiple access tokens. In a lot of big cloud deployments, the security domain is absolutely based on location. But location is not the only way to dispatch between security domains. A client would need to know, ultimately, what a token is for and where to use it. And we’d also need to deal with cases that allow for subdomains, paths, query parameters, and other variability of an API’s URLs. After all, I’m probably going to send that same token to a bunch of different URLs in order to do a bunch of different things, even if they’re all within the same “RS” or “domain” or whatever. Which brings us to an underlying problem: I don’t think there’s a good way to reference the identity of an RS. Solid is attempting to do that using WebIDs as service identifiers, and while that’s interesting, it’s deeply tied to their system where everything knows what a WebID is and what to do with it. I think it’s a bad idea to depend on that kind of thing for a general purpose system.
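
To make the URL variability concern concrete, here is a minimal Python sketch of a client that picks a token by plain string-prefix matching against the "locations" values from the proposal quoted below. The function name, the placeholder token values, and the prefix rule itself are assumptions for illustration only; nothing in the thread defines how matching should work, which is rather the point.

```python
# Hypothetical sketch: pick tokens by string-prefix match on "locations".
# The prefix rule is an assumption, not something defined by XYZ or TxAuth.

def naive_tokens_for(url, tokens):
    """Return the names of all tokens with a location that is a prefix of url."""
    return [
        name
        for name, token in tokens.items()
        if any(url.startswith(location) for location in token["locations"])
    ]


# Placeholder values; real access tokens are opaque to the client.
tokens = {
    "token_a": {"value": "opaque-a", "locations": ["https://example.com/resource"]},
    "token_b": {"value": "opaque-b", "locations": ["https://other_example.com/resource"]},
}

# A path and query under the listed location: the prefix rule finds token_a.
sub_path = naive_tokens_for("https://example.com/resource/123?full=true", tokens)

# A subdomain of the same API: the prefix rule finds nothing.
subdomain = naive_tokens_for("https://api.example.com/resource", tokens)

# A different resource that merely shares a prefix: token_a is selected anyway.
lookalike = naive_tokens_for("https://example.com/resources-admin", tokens)

print(sub_path, subdomain, lookalike)
```

The same rule under-matches the subdomain and over-matches the lookalike path, so any real scheme would have to spell out its matching semantics in the protocol.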

I’m hesitant to have clients depend on being told that information. I think if we go down that route we’re going to have to also tell clients things like “this is only good for GET requests” and “this is good for subdomains on this location” and “this can be used for anything except this one exception”. And it doesn’t fit well when you’re trying to mix two different APIs that have really different structures. Things like GraphQL and REST lead to pretty different URL designs, and TxAuth should be usable for all of that. It feels like too much automated configuration of a client instead of the client just “doing something”, which I think is going to be the common case. In other words, I do think that the client software is going to be bespoke for the API that it’s calling. However, the security library that speaks the TxAuth piece doesn’t have to be, and the protocol itself doesn’t need to be. But the protocol should allow the client software to express to the security layer what it knows about the API.

And speaking of common cases, I think it’s actually much more rare to have multiple security domains covered by an AS in a way that would need multiple encryptions or targets. I think many of us see it because we work with large enterprise-scale systems with multiple domains that we want to manage all at once. In my experience, it’s much more common to have a client talk to one AS to get one token for one RS. One of my goals with this is to not make it complicated for simple clients, and I think having to be prepared to get multiple access tokens is too complex for simple clients. I might be wrong, but it’s based on my experience across a lot of different kinds of APIs. The idea of splitting up tokens like below feels REALLY complex, especially when I asked for a single one. I get why you want to do it, it makes sense for the AS to be able to do something like that, but from a client’s perspective it’s a lot more complicated without a clear idea of what the identity of an RS is. I think solving that problem is a HUGE issue that we should put firmly out of scope.

The thing is, though, you’re absolutely right that there’s a need for this kind of multiple tokens. So in my view, the lift of having a client know about the domains that it’s calling is a lot less than the client having to potentially deal with more tokens than it asked for, and knowing how to correctly dispatch those. Also the semantics of what the “resources” object represents change, since I could potentially be getting two tokens back that do parts of the one thing that I asked for, and the client now needs to know which to use for what. If the client has to name the splits itself, that implies the client knows what each “resources” sub-object represents and knows how to apply that to its “call the API” code. The security layer doesn’t know or care, but the code that knows how to manipulate the API and its data knows and cares.
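
As a rough illustration of the client naming the splits itself, the Python sketch below has a security layer hand tokens back under exactly the names the API code asked for. The request and response shapes are taken from the examples quoted below; the helper name and the strict "reject anything unexpected" policy are assumptions, not part of any draft.

```python
# Hypothetical client-side helper. The "resources" and "multiple_access_tokens"
# shapes follow the examples in this thread; everything else is assumed.

def tokens_by_name(response, requested_names):
    """Return {client-chosen name: opaque token value}.

    Raises ValueError if the AS returned names the client did not ask for,
    or left out names it did ask for.
    """
    issued = response.get("multiple_access_tokens", {})
    requested = set(requested_names)
    if set(issued) != requested:
        raise ValueError(f"requested {sorted(requested)}, got {sorted(issued)}")
    # Token values are opaque strings; the client never parses them.
    return {name: issued[name]["value"] for name in requested_names}


request = {
    "resources": {
        "token1": [{
            "actions": ["read", "write", "dolphin"],
            "locations": ["https://server.example.net/", "https://resource.local/other"],
            "datatypes": ["metadata", "images"],
        }],
        "token2": [{
            "actions": ["foo", "bar", "dolphin"],
            "locations": ["https://resource.other/"],
            "datatypes": ["data", "pictures"],
        }],
    }
}

response = {
    "multiple_access_tokens": {
        "token1": {"value": "OS9M2PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0", "type": "bearer"},
        "token2": {"value": "UFGLO2FDAFG7VGZZPJ3IZEMN21EVU71FHCARP4J1", "type": "bearer"},
    }
}

# The API-calling code knows what "token1" and "token2" mean; this layer does not.
tokens = tokens_by_name(response, request["resources"].keys())
```

With this policy, a response that splits "token1" into "token1/a" and "token1/b" raises an error, since the client has no code path that knows what to do with names it never chose.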

As for the token content — that’s solidly an implementation decision and orthogonal to this discussion. You can do all of this multi access token stuff with introspection and reference-based tokens. Access tokens themselves need to stay opaque to the client and to the protocol at large. I don’t believe that’s up for debate.

Thanks so much for pushing this conversation, and now I’m more convinced than ever that handling multiple tokens in the response is something we need to figure out within this group. 

 — Justin

> On Mar 17, 2020, at 5:22 AM, Torsten Lodderstedt <torsten@lodderstedt.net> wrote:
> 
> Hi Justin,
> 
> thanks for explaining the different options. I’m well aware of the super refresh token (and remember the discussions back then in Taipei :-)), I have implemented systems using this and other patterns, too.
> 
> The underlying assumption for most of those patterns is that the client upfront knows the boundaries between RS security domains, which typically means the solution is bespoke.
> 
> TXAuth is chartered to develop a protocol and not a framework. What I’m looking for is interoperable protocol support for use of RS-specific self-contained access tokens in multi-RS deployments.
> 
> Why RS-specific self-contained access tokens? 
> This is in my experience the most efficient way to empower high-volume/high-load services in a very efficient, secure, and privacy preserving fashion. 
> 
> - Every token contains exactly the data the RS needs to perform access control decisions locally. No need for further database lookups or AS callbacks, that’s really fast and keeps cost of the AS function low.
> - The token itself can be encrypted to protect this data using an RS-specific key, one could even use HMACs to protect integrity and authenticity (fast as well).
> - The token can have an RS-specific lifetime.
> - Since every token is restricted to a single RS audience, those tokens also have a baseline replay detection built-in. 
> 
> I think this pattern makes sense in environments with multiple RSs (e.g. different products) as well. But since every token is minted to the specific requirements of a certain RS, the AS must be able to mint different tokens. That doesn’t work properly without some support in the protocol.
> 
> Is there a need for multi access tokens support? 
> Well, you implemented it, I implemented it, and I think a couple of other implementers did it with OAuth 2 in the past. So there seems to be some need. Why does the rest use the single token pattern? I think some deployments will indeed only have a single service, but I bet a lot of implementers did it because their product does not support anything else. 
> 
> I have experienced this myself when I designed the architecture of the yes ecosystem. It is a federation of authorization servers with associated services where every AS represents a certain bank. Since our partners shall be able to implement their AS using the product they like, I needed to go with the least common denominator - single access token. This has significant consequences: our tokens are basically handles, so every service calls back to the AS to obtain its data for every service request. This degrades performance significantly and, since those tokens are good for multiple audiences, it forces us to generally use sender-constrained tokens, which increases complexity for clients.
> 
> I would like to give implementers more options in the TXAuth space. That’s why I advocate building support for multiple access tokens into TXAuth.
> 
> My proposal is based on the following assumptions:
> - Token format, content, encryption keys and so on are defined as part of the interface between AS and RS
> - The client knows what it wants to do and where
> - Every party contributes the information it has to the overall process to make it work simply and effectively for everyone. 
> 
> There is no change/addition needed to the request syntax. All it takes is your new multi token syntax (+ a small addition) in the response. 
> 
> The client uses the “resources” structure to communicate what (actions, further elements) it wants to do and where (locations).
> 
> [
>    {
>      "actions": ["read", "write"],
>      "locations": ["https://example.com/resource"],
>      "data": ["foo", "bar"]
>    },
>    {
>      "actions": ["write"],
>      "locations": ["https://other_example.com/resource"],
>      "data": ["foo", "bar"]
>    }
> ]
> 
> One deployment might use a single token for all RSs, in this case the token response remains unchanged: 
> 
> {
> "access_token":{
>   "value":"08ur4kahfga09u23rnkjasdf",
>   "type":"bearer"
> }
> }
> 
> If the AS has the need to issue multiple access tokens, it could, for example, use the “locations” elements to determine what tokens it needs to create. Such an AS then uses the multiple_access_tokens structure augmented by corresponding “locations” entries in the token response:
> 
> "multiple_access_tokens":{
>   "token_a":{
>     "value":"OS9M2PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0",
>     "type":"bearer",
>     "locations":[
>       "https://example.com/resource"
>     ]
>   },
>   "token_b":{
>     "value":"UFGLO2FDAFG7VGZZPJ3IZEMN21EVU71FHCARP4J1",
>     "type":"bearer",
>     "locations":[
>       "https://other_example.com/resource"
>     ]
>   }
> }
> 
> Since the client passed the locations values in the request, it is also able to determine where to use what access token. 
> 
> I think that’s pretty simple, especially from a client perspective.  
> 
> And if the client wants to split access tokens further apart, e.g. to obtain tokens with fewer privileges, it can do so using the request syntax you defined:
> 
> resources: {
>   token1: [{
>           actions: ["read", "write", "dolphin"],
>           locations: ["https://server.example.net/", "https://resource.local/other"],
>           datatypes: ["metadata", "images"]
>    }],
>    token2: [{
>           actions: ["foo", "bar", "dolphin"],
>           locations: ["https://resource.other/"],
>           datatypes: ["data", "pictures"]
>    }]
> }
> 
> In the simplest case, the AS would return the data as in your proposal.
> 
> If the client asks for a partitioning of privileges that goes across RS security domains like this
> 
> {
> "resources":{
>   "token1":[
>     {
>       "actions":[ "read", "write","dolphin" ],
>       "locations":[ "https://server.example.net/","https://resource.local/other"],
>       "datatypes":[ "metadata","images"]
>     },
>     {
>       "actions":["read","write"],
>       "locations":["https://example.com/resource"]
>     }
>   ],
>   "token2":[
>     {
>       "actions":["foo","bar", "dolphin"],
>       "locations":["https://resource.other/"],
>       "datatypes":["data","pictures"]
>     }
>   ]
> }
> }
> 
> the AS would need to further partition the pre-defined tokens like this:
> 
> "multiple_access_tokens":{
>   "token1/a":{
>     "value":"OS9M2PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0",
>     "type":"bearer",
>     "locations":["https://server.example.net/","https://resource.local/other"]
>   },
>   "token1/b":{
>     "value":"OS9M2PMHKUR64TB8N6BW7OZB8CDFONP219RP1LT0",
>     "type":"bearer",
>     "locations":["https://example.com/resource"]
>   },
>   "token2":{
>     "value":"UFGLO2FDAFG7VGZZPJ3IZEMN21EVU71FHCARP4J1",
>     "type":"bearer",
>     "locations":[
>       "https://other_example.com/resource"
>     ]
>   }
> }
> 
> Naming of the tokens is a bit tricky but I think solvable.
> 
> What do you think?
> 
> best regards,
> Torsten.
> 
>> On 15. Mar 2020, at 15:26, Justin Richer <jricher@mit.edu> wrote:
>> 
>> On Mar 15, 2020, at 8:58 AM, Torsten Lodderstedt <torsten@lodderstedt.net> wrote:
>>> 
>>>> On 15. Mar 2020, at 03:25, Justin Richer <jricher@mit.edu> wrote:
>>>> 
>>>> So if the AS needs a client to get different access tokens to call different RS domains, it does exactly what we do in OAuth 2 today — it tells the client to get two different access tokens. 
>>> 
>>> How does this work in XYZ?
>>> 
>> 
>> Without using the multi-access-token thing I’m proposing in this thread, the client would just make two separate transaction calls to get two different tokens. There are a few ways that shakes out depending on some of the details. In the OAuth world that amounts to involving the user twice, and it might be the same in XYZ if you’re asking for different things:
>> 
>> 1. Client: Start TX-1 (R-1)
>> 2. User: Approve R-1
>> 3. AS: Issue AT-1 (R-1)
>> 4. Client: Start TX-2 (R-2)
>> 5. User: Approve R-2
>> 6. AS: Issue AT-2 (R-2)
>> 
>> Unless you’re getting a super refresh token upfront and then calling for two downgraded access tokens later — which does work, and I’ve built out systems that do exactly that. XYZ can do that trick too.
>> 
>> 1. Client: Start TX-1 (R-1, R-2)
>> 2. User: Approve R-1, R-2
>> 3. AS: Issue AT-1 (R-1, R-2)
>> 4. Client: Continue TX-1 (R-2)
>> 5. AS: Issue AT-2 (R-2)
>> 
>> But we’ve got another thing we can use in XYZ to help this, the user handle. This lets a trusted client tell the AS that it believes the same user is still there and asking the question, so if the access rights are OK then you don’t need to involve the user again. We invented this construct with UMA2, where it’s called the persisted claims token (PCT).
>> 
>> 1. Client: Start TX-1 (R-1)
>> 2. User: Approve R-1
>> 3. AS: Issue AT-1 (R-1), user handle U-1
>> 4. Client: Start TX-2 (R-2, U-1)
>> 5. AS: Issue AT-2 (R-2)
>> 
>> Now: With the multi-token request, we can collapse this all back to a single transaction with multiple outputs:
>> 
>> 1. Client: Start TX-1 (token1: R-1, token2: R-2)
>> 2. User: Approve R-1, R-2
>> 3. AS: Issue AT-1 (token1: R-1), AT-2 (token2: R-2)
>> 
>> I haven’t liked any of the multi-access-token solutions to date because they make things weird for single access token requests. I like this idea because it’s an optimization for a complex case that doesn’t change the behavior for the simple case, and in fact doesn’t even change the expectations for the simple case. To me, that’s important.
>> 
>> — Justin
>