Re: [GNAP] Review of draft-ietf-gnap-core-protocol-00 (first half)

Hi Fabien,

I think we all have a bad experience with huge issue backlogs, which is one reason why creating issues for all these comments would be a pain.

I would suggest that the editor(s) should address all the simple issues (those where the text can be easily fixed or clarified, or those where I clearly don’t know what I’m talking about). Then I will open issues for the remaining ones.

Thanks,

                Yaron

From: Fabien Imbault <fabien.imbault@gmail.com>
Date: Monday, October 19, 2020 at 10:08
To: Yaron Sheffer <yaronf.ietf@gmail.com>
Cc: GNAP Mailing List <txauth@ietf.org>
Subject: Re: [GNAP] Review of draft-ietf-gnap-core-protocol-00 (first half)

Hi, 

These are very useful feedback. We do need to take the time to read again the entire document, as it was patched up to the last minute.

There are many places where several options are considered (some of which might need further explanation, from some of the questions I read). The objective was to openly expose some design choices for the group to consider, especially when it was new to the spec or when there was no consensus. And then decide what to keep with the group. 

It will be worth investigating every comment. Considering their number, I'm just wondering if the mailing list is the best place to follow up the details. Wouldn't issues on github be better suited for that task? (as the document source will be on github anyway). 

Cheers

Fabien

On Sun, Oct 18, 2020 at 9:44 PM Yaron Sheffer <yaronf.ietf@gmail.com> wrote:

Hi Justin, all,

The draft has undergone major changes since we started the design team, so I spent some time reviewing it. I only got as far as Sec. 3, and I fully intend to review the remainder of the document. But I figured since I have so many comments, I might as well send out what I have.

* General comment: we have dozens, maybe hundreds of variants/options here. I think we need to define a must-implement subset, or we will never reach interoperability.

* Abstract: the boilerplate text (MUST etc.) usually goes to the Introduction, not the Abstract.

* 1. "The requesting party operating the software" - I think these actions are typically performed by the resource owner, who is not necessarily the requesting party.

* RS, aka API: we are using the word "API" several times in a more generic sense (management API, identity API), so maybe replace with "RS, typically a protected API".

* 1.2, "key" is a too generic term in the context of a security protocol. Can we call it something more specific, maybe "holder identity key"?

* 1.3.1. "augmented with a hash of the security information" - this is too implementation-specific, suggest "is cryptographically bound to the security information". In general, step (7) is way too detailed, and verges on the normative.

* 1.3.2: it is unclear to me what the goal of the user-code interaction is, in other words what is the threat model. If it's just "second factor, something you have", then an OTP MFA device would be simpler and offer the same security.

* 1.3.5 (4): Typically the access token would not be sent once it is expired, right? Let's make the example more realistic.

* 2: "OpenID Connect claims request" - why not "identity claims request" (which just happens to be similar to the OpenID request)?

* Never liked the "dolphin" here... If this is listed under "actions", it should be a verb, e.g. "swim".

* 2.1: an array for a single AT vs. an object for multiple ATs? This is extremely non-intuitive and probably ugly to code. Also, why should (multiple) tokens be named if a single token doesn't have to, why not list them in an array instead?

* 2.1.1, "locations": are these really URIs or just URLs? On the other hand, if these are URIs, why do we need a separate "identifier"? There's too much semantic overlap between "type", "location" and "identifier" - specifying just "type" may be sufficient.

* 2.1.1: it's implied but should be spelled out that the semantics of this example is the cross product of permissions: I am requesting to read and write and "dolphin" both "metadata" and "images", and that applies to both "locations" - 12 different possible actions.

* 2.1.2: I don't see the value of the notion of "expansion". It is sufficient that the string is understood by the AS.

* 2.1.2, "in some situations the value is intended to be seen and understood be the RC developer" - shouldn't we require this value to be always fully documented by the AS owner, so that the RC knows what it is requesting?

* 2.1.4: why are flags represented as resources? The answer, "this is how OAuth 2 does it" is not great.

* "the each access tokens" -> each access token.

* I suggest to extend the editor's note to clarify what is the use case, so that the group can decide whether split tokens are indeed necessary.

* Why is key binding of the access token even optional?

* 2.2, "Subject identifiers requested by the RC serve only to identify the RO in the context of the AS and can't be used as communication channels by the RC" - this sounds a bit naive, people will do that anyway. So why is this a useful statement? (And similarly in 3.4)

* 2.3: what's the benefit of including a URI (URL really) in "display"? Do we really want the user (RO) to click links provided by the RC?

* 2.3: why are we sending a JWK *as well as* a cert? Are we checking that the cert contains the same public key as the cert?

* 2.3, " the AS MUST ensure that the key used to sign the request... is associated with the instance identifier" - can we be much more specific here, e.g. the instance_id MUST be byte-identical to the Common Name of the cert? This "association" is never clarified, either here or in Sec. 8.

* 2.3.2: allowing to send the key in multiple formats virtually invites security vulnerabilities. Why is this a good thing?

* 2.3.2: cert#256 is not a key, it's an identifier. Why do we include it here? Note that in Sec. 8.3 you are using the full certificate rather than the fingerprint.

* 2.3.2: why are we using raw PEM certificates instead of the JOSE representation?

* 2.3.3: For obvious security reasons, I suggest adding: The AS MAY choose to avoid displaying an arbitrary URI. RC developers must only assume that the "name" field will be displayed.

* 2.3.4, "created just for this request" -> created just for this series of requests.

* 2.3.4: I suggest referring to RCs that present unknown keys as "anonymous RC", and then in 2.4, say that the AS MUST NOT accept any assertions from an anonymous RC.

* 2.4: IMO assertion validation is a MUST, and we should specify what we require for every assertion type we allow.

* 2.4: moreover, when possible, the AS MUST match the assertions to the presented sub_id. It may be a hint, but you're not allowed to lie.

* 2.4, "If the identified RQ does not match the RO present at the AS" - I thought there are cases where the RQ is different from the RO. Also, I am wondering why this is a SHOULD, not a MUST.

* 2.4.1 (but probably applicable to any references): what is the lifetime of the reference? As an RC, how long can I assume the user reference is still valid?

* 2.5: "preferred_locales" is obviously not a mode of interaction. Why is it bundled here?

* 2.5: the way "callback" is described here is as a capability, not an interaction mode. The example only strengthens this view, "callback" is a modifier of the "redirect" mode.

* 2.5: "use code" -> user code.

* 2.5.1: this unnecessary polymorphism IMO just complicates implementations and prevents extensibility. Instead, I would suggest:

redirect: {}

redirect:{max_length: 255}

redirect:{max_length: 255, callback: {...}}

* 2.5.2: the "app" option is problematic, as you already note. The RC may not even know if a given URL will result in an application launching.

* 2.5.2: and shouldn't the RC include a list of supported applications? (Which would have lovely privacy implications.)

* 2.5.3: there's a parenthetical discussion on whether the URL is unique. But the nonce parameter implies that the "hash" parameter is unique per request, making this discussion moot.

* 2.5.3: even if we need a different kind of post-interaction "redirect", where the RC is not involved, there must be a way to ensure that the RC receives positive (cryptographically protected) confirmation that the authorization was successful. In other words, what we're discussing here is not an RC interaction mode, it is something else, and should probably have a separate protocol element to describe it.

* 2.5.3: In the example for "interact" as an array (which I support), the second "redirect" and "user_code" should be independent elements of the array, not members of an object. This assumes they are truly independent interaction modes.

* 2.5.3.1: what does "RQ is present on the request" mean?

* 2.5.3.2: typo: HTTP the request.

* 2.7: this seems to assume a short access token (which implies a bearer token). What if we want to use longer self-contained access tokens containing full keys? Should we have a separate "grant_id" value instead?

* 2.8: another problem with this stanza is that the RC needs to know a priority that the AS supports it. What happens if the AS doesn't?

* 2.8: I suggest to move this section into a separate I-D, or at least a separate appendix.

* 2.9: it makes sense to bundle extensions and the "capabilities" section into one.

* 2.9: The AS MUST ignore any unknown extension.

* 3: it's confusing that we're using access_token for two different things. Maybe we could change the RC/AS one to "as_access_token"?

* 3.1: MAY vary *by* request.

* 3.1: how is the client behavior "deterministic in all cases" - how does the RC know whether the AS allows continuation or only allows one request at a time?

* 3.2.1: the "value" of the access token is opaque in some cases (bearer token) but not in others.

* 3.2.1: The ASCII restriction on "value" is insufficient. Needs to be printable ASCII, 0x20<v<=0x7f. And I'm not sure if this doesn't leave some characters that still need to be escaped.

* 3.2.1: why is "expires_in" optional? Typically access tokens are not revoked, and the RC needs a way to manage their lifetime.

* 3.2.1: "resources" refers to this same section. Did you mean 2.1.2, or are we missing a subsection describing resources/rights?

* 3.2.1: "key": another polymorphism alert!

* 3.2.1: the example at the bottom of the section talks about a way to present the token, I believe this discussion is out of context here.

* 3.3: the AS knows now which interaction modes are available. Why not pick just one for its response, to simplify the protocol? As one example, the AS can then more easily decide on timeout behavior.

* 3.3.3: Suggest to add to the last paragraph: If the RC does not receive a callback until an RC-determined timeout occurs, the RC MUST act as if the entire request failed.

* 3.4: "updated_at": Unix time is nice, but for absolute time it's common to use ISO 8601 format.

* 3.6: YES, the "error" element is exclusive!

* 3.6: in SecEvent we tried to use RFC 7807 "problem details", I think it could work nicely here.

* 10.4: "indicating that GNAP" - sentence is broken.

* This "speculative" access is not useful if the response is only a MAY. Why would the RC try it if it is unlikely to provide useful information?

* 15: the BCP195 ref is broken - the author names are missing (yes I am one of them...).

* Appendix D: can we name the first scenario, maybe "Simple API Access"?

-- 
TXAuth mailing list
TXAuth@ietf.org
https://www.ietf.org/mailman/listinfo/txauth