Re: [GNAP] DID as sub_id or assertion?

Denis,

Many of your comments here would be best addressed to the SECEVENT working group as feedback on the subject identifiers draft. There is new text in -07 that helps establish this as useful outside of SET/JWT, and I would guess the author would be amenable to clarifying that further.

 — Justin

> On Mar 11, 2021, at 1:50 PM, Denis <denis.ietf@free.fr> wrote:
> 
> Hi Fabien,
> 
> Thank you for launching the discussion. 
> 
> SECEVENT draft-07 does not really offer new possibilities over draft-06. The major change is simply in the syntax naming.
> 
> The response to your question is not limited to an exclusive choice between option 1 and option 2. :-)
> 
> The definition of a subject identifier in draft-07 is :
> 
> 3.  Subject Identifiers 
>    A Subject Identifier is a JSON [RFC7159] object whose contents may be
>    used to identify a subject within some context.  An Identifier Format
>    is a named definition of a set of information that may be used to
>    identify a subject, and the rules for encoding that information as a
>    Subject Identifier; they define the syntax and semantics of Subject 
>    Identifiers.
> With such a definition, it is possible to request that some types of subject identifiers be placed into an access token,
> to place subject identifiers (with their values) into an access token, and for an RS to compare these subject identifiers
> (with their values) against validation rules enacted by an RO acting as an ADF (Access Decision Function), which also use
> the same subject identifiers (with their values).
> 
> A DID is a persistent global unique identifier. 
> 
> In issue #210  ("sub_id" claims #210) I already proposed the following examples for globally unique identifiers (gu_id) :
> (1) For a globally unique identifier:
> 
> "sub_id": {
>        "format": "gu_id",
>        "gu_id": "email",
>        "email": "john@hughes.com"
>      }
> 
> or
> 
> "sub_id": {
>        "format": "gu_id",
>        "gu_id": "ssn",
>        "ssn": {
>                     "cn": "250",
>                     "value": "160087404506527"
>                   }
>      }
> In order to address DIDs, I would simply add the following example:
> 
> "sub_id": {
>        "format": "gu_id",
>        "gu_id": "did",
>        "did": "did:DID:Method:DID Method Specific String"
>      }
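A minimal sketch of how objects in the "gu_id" layout proposed above could be built and sanity-checked. The helper name make_gu_id, the loose DID shape check, and the sample value "did:example:123456789abcdefghi" are assumptions for illustration; the "gu_id" format itself is the proposal from this message, not something the SECEVENT draft defines.

```python
import json
import re

# Loose shape check only: "did:" + lowercase method name + ":" + method-specific id.
# This does not implement full DID syntax validation.
_DID_SHAPE = re.compile(r"^did:[a-z0-9]+:\S+$")


def make_gu_id(kind, value):
    """Build a sub_id object in the proposed (hypothetical) "gu_id" layout."""
    if kind == "did" and not (isinstance(value, str) and _DID_SHAPE.match(value)):
        raise ValueError(f"not a DID-shaped string: {value!r}")
    # The identifier value is stored under a key named after its kind.
    return {"format": "gu_id", "gu_id": kind, kind: value}


did_sub_id = make_gu_id("did", "did:example:123456789abcdefghi")
email_sub_id = make_gu_id("email", "john@hughes.com")

# Serialize the DID variant the same way the examples above are written.
print(json.dumps({"sub_id": did_sub_id}, indent=2))
```

Building the object through one helper keeps the "format", "gu_id", and value keys consistent, which is where the hand-written examples are easiest to get wrong.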
> "sub_ids" claims are much better when compared to "sub" claims.
> 
> The definition of Subject Identifier in the draft-07 (section 3) is : 
>    A Subject Identifier is a JSON [RFC7159] object whose contents may be
>    used to identify a subject within some context.  
> However, page 2 of the draft-07 states:
>    As described in Section 1.2 of SET [RFC8417], subjects related to
>    security events may take a variety of forms, including but not
>    limited to a JWT [RFC7519] principal, an IP address, a URL, etc.
> RFC 8417 (Security Event Token (SET)) states in its abstract:
> Abstract
> 
>    This specification defines the Security Event Token (SET) data
>    structure.  A SET describes statements of fact from the perspective
>    of an issuer about a subject.  These statements of fact represent an
>    event that occurred directly to or about a security subject, for
>    example, a statement about the issuance or revocation of a token on
>    behalf of a subject.  This specification is intended to enable
>    representing security- and identity-related events.  A SET is a JSON
>    Web Token (JWT), which can be optionally signed and/or encrypted.
>    SETs can be distributed via protocols such as HTTP.
> 
> There is a need to be able to use subject identifiers that will not necessarily be used in
> a Security Event Token (SET) data structure.
> 
> These Subject Identifiers might still be used in a Security Event Token (SET) data structure, but not necessarily.
> When supporting RBAC (Role Based Access Control), it would be nice to define "sub_ids" able to support roles.
> When supporting ABAC (Attribute Based Access Control), it would be nice to define "sub_ids" able to support
> group memberships, in two flavours: (a) hierarchical groups and (b) functional groups.
> The current title of the draft is: "Subject Identifiers for Security Event Tokens" .
> More appropriate titles would be:
> Subject Identifiers for Security Event Tokens and other purposes
> or simply
> Subject Identifiers
> The first sentence of the abstract should also be slightly modified.
> 
> Denis
> 
> 
>> Hi Tobias,
>> 
>> Option 2's pros and cons are the consequence of relying solely on an opaque identifier managed by the AS, independently of external identifiers.
>> 
>> That's an additional level of indirection, not unlike how indexes are usually managed in a database. But there's obviously the additional complexity of making a just in time translation into a practical identifier (like email for account recovery, as discussed by Justin). 
>> 
>> "Assertions" should be what the AS (possibly with the assistance of other parties) can confidently say about the subject. 
>> 
>> Thanks for the great feedback, as well as Justin's. Since the slides were written, SECEVENT draft-07 already offers new possibilities which is good news. 
>> 
>> Fabien
>> 
>> On Wed, Mar 10, 2021 at 22:02, Tobias Looker <tobias.looker@mattr.global> wrote:
>> > Thanks for the feedback, although I'm not sure what you mean by too abstract.
>> 
>> Apologies, what I meant was that I feel Option 2 would expose too much complexity to clients that would be required to construct a request to the AS. Option 1, IMO, strikes a better balance by still offering the required flexibility while keeping things relatively simple.
>> 
>> > Just a quick comment: an assertion in option 2 is not "the fact the identifier is resolvable to cryptographic material that can be used to validate cryptographic assertions". It is merely a statement by the AS that "the subject corresponding to the reference has a DID" ("has" is the assertion here). 
>> 
>> Yes, I understand. To drill down on this more, though: couldn't the entire request made by the client to the AS also be considered an assertion? Therefore, isn't the "sub_ids" request element an assertion by the client about who it believes the subject to be? If this is true, then to me the "assertions" request element is there for assertions that parties other than the client are making about the subject (e.g. an id_token).
>> 
>> Thanks,
>> Tobias Looker
>> Mattr
>> +64 (0) 27 378 0461
>> tobias.looker@mattr.global
>> This communication, including any attachments, is confidential. If you are not the intended recipient, you should not read it - please contact me immediately, destroy it, and do not copy or use any part of this communication or disclose anything about it. Thank you. Please note that this communication does not designate an information system for the purposes of the Electronic Transactions Act 2002.
>> 
>> 
>> On Thu, Mar 11, 2021 at 9:20 AM Fabien Imbault <fabien.imbault@gmail.com> wrote:
>> Thanks Justin.
>> 
>> It's good to clarify what we mean by assertion. I used a more mundane meaning. 
>> 
>> I dislike email/phone as security disasters waiting to happen. I get that they are practical. 
>> 
>> I think I have what I need to write the PRs based on sub_ids (based on the current draft-07). 
>> 
>> Fabien 
>> 
>> On Wed, Mar 10, 2021 at 20:18, Justin Richer <jricher@mit.edu> wrote:
>> Hi Fabien,
>> 
>> First, thanks for doing the heavy lift of writing this topic up. Of these options, I would strongly prefer option (1). The difference between assertions and identifiers is that an assertion is a fully packaged set of attributes, one of which can be an identifier. They are, generally speaking, cryptographically protected bundles. They’re designed to carry a bunch of different information about a user and the authentication event in a form that’s independently verifiable and auditable. SAML assertions (XML document) and ID Tokens (signed JWTs) are the most common ones we see today, but other formats exist and will keep being invented. 
>> 
>> An identifier is a different kind of creature. It is a single statement that identifies the party in question in the context of the party doing the identification. It’s this full context that the receiver of the identifier needs to interpret it in. When using a federated identity protocol, each RP is going to have :some: kind of local user information store that they’re going to tie into, some field in a database they’ll key off of. It’s just never going to be the case that all RPs will be able to use server-provided opaque identifiers for all cases. Account recovery and binding is one such use case where a user-facing identifier like an email address makes a LOT of sense, for one small example. This is an array in the request and response precisely because a user could be known by several different identifiers to a party, and an RP will have a better idea about what kinds of things it wants to correlate against, and it’s up to the AS to provide that mapping. I think that if we don’t define a way to describe the format, the “as_id” field is going to get structure back-patched onto it like is being proposed here:
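A small sketch of the point above that the RP knows best which kind of identifier it wants to correlate against. It assumes the subject identifiers arrive as an array of objects that each carry a "format" member; the function name pick_identifier and the sample values are hypothetical.

```python
from typing import Dict, List, Optional


def pick_identifier(sub_ids: List[Dict], preferred_formats: List[str]) -> Optional[Dict]:
    """Return the first sub_id matching the RP's formats, in preference order."""
    for fmt in preferred_formats:
        for sub_id in sub_ids:
            if sub_id.get("format") == fmt:
                return sub_id
    return None  # none of the offered identifiers is usable by this RP


# Identifiers the AS might return for one user (sample values only).
sub_ids = [
    {"format": "opaque", "id": "XUT2MFM1XBIKJKSDU8QM"},
    {"format": "email", "email": "user@example.com"},
]

# An RP doing account recovery prefers email and falls back to the opaque id.
chosen = pick_identifier(sub_ids, ["email", "opaque"])
print(chosen)
```

Because every entry names its own format, the RP never has to guess whether a bare string is an email address, a pointer, or an opaque handle.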
>> 
>> https://mattrglobal.github.io/oidc-portable-identities/
>> 
>> I’m personally not a fan of that approach as it mixes the data layering too much, and you end up making guesses about whether a field is meaningful or a pointer. I think GNAP will be much better served by using a data structure for this from the start. DIDs should be another “format” of the subject identifier. I don’t think GNAP should define this format, and I’ve dropped a note to the SECEVENT mailing list asking about adding DIDs to the subject identifiers draft before it gets published. DIDs are not assertions, even in the cases where they point to fully functional DID documents with user information in them. (And for what it’s worth, I think the OIDC world should also use the SECEVENT draft inside ID Tokens to solve the use case of the draft above).
>> 
>> For the user != RO cases, I think that potentially having a way to signal those two items separately in the request and response could be useful. Right now all the user/subject stuff is about identifying “the user that’s currently here at the client instance”, in a variety of forms that the client instance and AS can negotiate. If the client or AS has some other notion about who the RO is and how that could be separate from the current user, we could provide a way to signal that like you’re proposing below. In my mental model, there are going to be a lot of use cases where the “RO” isn’t known to the client at all, but rather to the AS and is based on what the client’s asking for. So I could identify the current user and ask for a specific medical record, and the AS realizes it has to go bug a particular doctor to approve things directly. This is instead of interacting with the current user and failing in an awkward way when the current user doesn’t have the rights to do things, but someone else would. But we’d want the system to be able to detect if the current user is the doctor, so we can just interact with them directly. Does that all track?
>> 
>>  — Justin
>> 
>>> On Mar 10, 2021, at 4:10 AM, Fabien Imbault <fabien.imbault@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> Notice the question mark in the title. Not to be considered as an editor's comment.
>>> This thread intends to provide a detailed comment on the interesting feedback by Kristina: "I would catch up on the thread to understand why DIDs are thought to be treated as assertions".
>>> In the chat, I answered: "Happy to get the discussion on DIDs. Actually, could be either or, depending on what we intend to do." Let me expand a bit on that, to make it understandable (didn't have the time during the meeting, unfortunately).
>>> 
>>> This relates to the items described on slides 20 and 34, which I'll explain further. I'll highlight as transparently as possible what I think would be the benefits and downsides of each. Both are probably viable options, but they correspond to different scopes and objectives. I'll spend a bit more time on option 2, since it is a different approach compared to the current draft.
>>> 
>>> 1. The first possibility is to use SECEVENT throughout. 
>>> In that case, DIDs should be a part of sub_ids (in the core or in the extension registry, depending on what Annabelle's WG decides). More generally, the WG would have to decide which additional sub_ids it needs (as per Yaron's comment).
>>> 
>>> That's obviously the closest to what is described in draft-04. It allows a communication of subject information between the client and the AS, by using global identifiers (mail, phone, etc.), or using a local opaque reference - i.e. governed by the AS (that's probably what we would put in the examples, unless someone has a better idea - cf issues #16 and #42). 
>>> 
>>> Assertions relate for instance to the proof of presence of the RO. The current draft-04 therefore plans for id_tokens and saml2, maybe there could be other needs later (extension registry).
>>> 
>>> - Benefits: identifiers are a key component of GNAP, which makes integration easier. Possibly one could choose to use DIDs throughout.
>>> - Downsides: email/phone identifiers will probably be used throughout by most devs (since that's the correlation information they have in their user DB). This means we should limit what assertions contain, to avoid creating links between weak global identifiers and hopefully stronger assertions. That's not really a problem if it's mostly limited to authentication events (from that perspective, samlv2 could make sense in the core). An assertion is then probably a single value (not an array).
>>> 
>>> Option 1 can be summarized as: 
>>> 
>>> "subject": {
>>>    "sub_ids": { },                      // request and response (SECEVENT -> including DID possibly)
>>>    "assertions": [ ],                   // request and response (id_token or samlv2)
>>> }
>>> ("hints" do not exist currently, but are partially covered separately via "request.user" / the relevance of "principal", discussed in the slide, is a question mark too - which is a different discussion on end-user != RO).
>>> 
>>> 
>>> 2. An alternate possibility is to use SECEVENT only as input/hint
>>> In that case, DIDs should be a part of assertions (which have a different meaning compared to option 1).
>>> 
>>> The AS only returns a local opaque reference as_ref (while in option 1, it was only one of many possibilities). Its only job is to help the client differentiate the response subject. The AS policy might further define the scope of that reference, as suggested in https://github.com/ietf-wg-gnap/gnap-core-protocol/issues/210, although I personally viewed that by default as (2) "as_id", an identifier locally unique to that AS for all the RSs.
>>> 
>>> SECEVENTS are still useful as an input; that's what I presented in request.hints.self.sub_ids (but it's the responsibility of the AS to match it with its own reference system, for instance from the email to the AS's local record XUT2MFM1XBIKJKSDU8Q). Note: since the slides, a new draft-ietf-secevent-subject-identifiers-07 has been published. Notably, this includes an opaque identifier, which the client can use to pass the as_ref hint to the AS. Thus we wouldn't need any further addition to SECEVENT, although hints would benefit from further additions (like DIDs, for instance). Notice also that subject_types_supported becomes less important because, in the worst-case scenario, it's simply a hint that wouldn't be understood and therefore wouldn't be taken into account by the AS.
>>> 
>>> The assertions array has a more important role than in option 1 (with the explicit aim of separating client hints from validated info asserted by the AS). It serves as a generic/extensible mapping structure, e.g. "the AS asserts that opaque reference XUT2MFM1XBIKJKSDU8Q matches these (more or less verifiable) statements." The name assertion therefore corresponds to the responsibility of the AS in delivering those statements (cf. should/must, issue #49). Some of these statements relate to identity or auth events (e.g. DID, id_token); some might be directly useful for authorization decisions (like the ZKP in the parental control example). Possibly some of these validated assertions could be reused in subsequent interactions.
>>> 
>>> Another question that popped up is the scope of sub_ids. Should they be about the same user? (I think so.) In the case of remote ROs especially, it might be important that a DID be considered as an assertion (about someone else) and not as a sub_id, because the AS needs to be careful about who the client wants to reach (avoid spam, etc.). This is consistent with the rest of the discussion on what to do when end-user != RO, although using DIDComm might seem a bit early (the paint is very fresh).
>>> 
>>> Option 2 was summarized as: 
>>> 
>>> // (suggestion only, not as an editor)
>>> "subject": {
>>>    "as_ref": {                          // response only (wouldn't require SECEVENT)
>>>       "as": "https://ex1.as.com",
>>>       "ref": "XUT2MFM1XBIKJKSDU8QM"
>>>    },
>>>    "assertions": [ ],                   // request and response (id_token/jwkthumb/DID/VC/etc. -> whatever info can be validated by the AS)
>>> 
>>>    "hints": {                           // request only (optional)
>>>       "self": {                         // replaces request.user (support SECEVENT here)
>>>          "sub_ids": { },                // SECEVENT (including opaque in draft-07 - that could contain the ref)
>>>          "assertions": [ ]              // see examples: VC on DOB or ZKP on age (wider scope, possibly through extensions)
>>>       },
>>>       "principal": {                    // new proposal presented at IETF110 (just an idea)
>>>          "automated": true,             // rule engine
>>>          "async": { }                   // remote ROs
>>>       }
>>>    }
>>> }
>>> 
>>> - Benefits: as_ref is self-supporting (no strong coupling with SECEVENT), but can still take SECEVENT as an input hint for the AS. It avoids the risk of making an official association between a weak global identifier (e.g. email, used elsewhere) and hopefully stronger local assertions. GNAP would be fully agnostic to whatever identity system is used, only facilitating interoperability through assertions (second in importance only to the AS opaque reference). Some assertions (possibly extensions) could also be useful for the AS decision (e.g. VC or ZKP).
>>> - Downsides: opaqueness of as_ref is a feature, not a bug. The opaque reference is fully managed by the AS, and therefore not inherently portable (cf. portability discussions in OIDC currently). There's no explicit binding to an official identity, only a contextual mapping to AS-validated assertions. It makes it more difficult to match the identifiers from one AS to another or to correlate with the client user DB (still possible via assertions, if the AS allows it). Assertions are much more generic (possibly via extensions), but might require a more advanced mechanism to request and return only the most relevant information (which makes the AS policy critical here).
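A rough sketch of the indirection option 2 relies on: the AS takes a client hint and answers only with its own opaque reference. The ReferenceStore class, its in-memory dictionary, and the random reference generation are assumptions for illustration; the thread does not specify how an AS would store or mint references.

```python
import json
import secrets


class ReferenceStore:
    """In-memory stand-in for the AS's local record store (hypothetical)."""

    def __init__(self, as_url):
        self.as_url = as_url
        self._refs = {}  # canonical hint JSON -> opaque reference

    def as_ref_for(self, hint):
        """Map a client-supplied sub_id hint to the AS's own opaque reference."""
        # Canonicalize so that key order in the hint does not matter.
        key = json.dumps(hint, sort_keys=True)
        if key not in self._refs:
            # The reference is random, so it reveals nothing about the hint.
            self._refs[key] = secrets.token_hex(10).upper()
        return {"as": self.as_url, "ref": self._refs[key]}


store = ReferenceStore("https://ex1.as.com")
first = store.as_ref_for({"format": "email", "email": "user@example.com"})
again = store.as_ref_for({"email": "user@example.com", "format": "email"})
other = store.as_ref_for({"format": "email", "email": "other@example.com"})

print(first)
print(first == again)
```

The client only ever sees the "as"/"ref" pair, so correlating it with an email address or with another AS's reference has to go through whatever assertions the AS chooses to release.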
>>> 
>>> I hope that clarifies the reasoning for both options. Again I'm not saying option 2 is better, just saying the trade-offs are different (+ knowing that we need to keep it simple).
>>> - option 1 puts more in sub_ids and less in assertions
>>> - option 2 puts less in the reference and more in assertions 
>>> All of this is open to discussion. More generally speaking, on every part in orange and question mark (?) I would love your ideas and criticisms. What I intended as an editor is only to highlight where the WG has important choices to make, before we can make the related PRs.
>>> 
>>> A further side comment: IETF EAT RATS was suggested for assertions, but relates more to the client attestations. It was added as a comment in https://github.com/ietf-wg-gnap/gnap-core-protocol/issues/44, which is a distinct topic we'll work on.
>>> 
>>> Fabien (editor's hat off)
>>> 
>>> -- 
>>> TXAuth mailing list
>>> TXAuth@ietf.org
>>> https://www.ietf.org/mailman/listinfo/txauth
>> 
>> 
>> 
>> 
>