Re: [Id-event] I-D Action: draft-ietf-secevent-subject-identifiers-11.txt

Hi Justin,

Your changes look good to me, and I created another PR
<https://github.com/richanna/secevent/pull/9> incorporating your updates
plus some additional ones. Thanks again for creating the PR. I cannot merge
it since I don't have write access to the repo.
@Backman, Annabelle <richanna@amazon.com>  Can you please review and merge
the changes? I will publish draft-12 as soon as datatracker opens.

-Prachi

On Tue, Jul 12, 2022 at 10:36 AM Prachi Jain <prachi.jain1288@gmail.com>
wrote:

> Thanks for submitting the PR, Justin. I appreciate it!
> I will take a look later today and publish if no changes need to be made.
>
> -Prachi
>
> On Tue, Jul 12, 2022 at 10:30 AM Justin Richer <jricher@mit.edu> wrote:
>
>> I haven’t heard anything from the editors on this, so I went ahead and
>> created a PR to restore the DID language that was accepted during WGLC, as
>> well as add a generic URI format, as discussed in the email thread below.
>>
>> https://github.com/richanna/secevent/pull/8
>>
>> I would encourage the editors to accept this change and publish a new
>> version once the datatracker opens again, and hopefully we can move this
>> document forward to its next review stages.
>>
>>  — Justin
>>
>> On May 31, 2022, at 4:25 PM, Justin Richer <jricher@mit.edu> wrote:
>>
>> Annabelle and I have had a chance to discuss this directly, but I wanted
>> to take a moment to record my response here for the group as well. I
>> believe we now understand each other that the `did` format should be
>> restored, alongside a generic `uri` format, with overall guidance on where
>> and how to use each. Namely, use the most specific semantically appropriate
>> format that you can. The reasons for my stance, and what I believe are the
>> conclusions we agreed to, are discussed inline below:
>>
>> On May 18, 2022, at 6:29 PM, Backman, Annabelle <richanna@amazon.com>
>> wrote:
>>
>> There appear to be some issues with -11:
>>
>>    1. The definition for `did` was removed, but not the `did` entry in
>>    the format registry
>>    2. No replacement `url` format was added.
>>
>>
>> Justin, my understanding is that your concerns are directed at the
>> proposal to *replace* `did` with `url`, and thus would not be addressed
>> by adding the missing `url` format. Is that correct? Assuming that is the
>> case...
>>
>>
>> My concerns with the removal of `did` would NOT be addressed by the
>> addition of a generic `url` or `uri` format. The primary reason for this,
>> and to me a primary driver for the subject identifiers work, is that the
>> subject identifier format defines not only the syntax of the identifier but
>> also its semantic content. I do not believe that it is appropriate to
>> remove the semantic information from the format and push it all down into
>> the lower layer.
>>
>>
>> Replacing `did` with `url` doesn't push the semantic information
>> anywhere; the semantic information is there in the lower layer already.
>> Having a separate `did` format pulls that information up into the subject
>> identifier format layer, encoding the same information twice. That
>> significantly complicates processing and could hurt interoperability.
>>
>>
>> In fact, it does the opposite. One could make the argument that because
>> we have “mailto:” URLs (rfc2368) and “tel:” URLs (rfc3966) then we don’t
>> actually need the `email_address` or `phone_number` formats either, since
>> we could just encode all that in the URL itself. And then there’s no need
>> for an `opaque` because you could easily use a `urn` to solve that problem.
>> Even the issuer/subject pair COULD be formatted as a single URL, if someone
>> just sat down and made a syntax for it (and people argued for exactly that
>> in OIDC, but it didn’t get anywhere).
>>
>> So, in that world, why even bother with the subject identifiers? Let me
>> tell you why:
>>
>> When I’m creating a subject identifier block in my application, I know
>> what kind of identifier it is. I want to tell the receiver that I
>> specifically know what kind of identifier it is. The syntax for formatting
>> the identifier itself is incidental to this — particularly if that syntax
>> is itself a URL.
>>
>>
>> Consider the scenario where we have both `url` and `did` format types. An
>> issuer might encode a DID using either format type; do processors that
>> expect DIDs need to support both? If so then we've just made their lives
>> harder. More likely, some would support both and some wouldn't, leading to
>> unnecessary pain for parties that have to interoperate across processors
>> and/or issuers.
>>
>>
>> We’d expect to use `did` here. I would not expect a processor to support
>> both formats if they’re specifically looking for DIDs.
>>
>>
>> Now consider the scenario where we just have `url`. A processor that
>> accepts DID URLs (possibly alongside other non-URL identifier formats) and
>> no other URL types will see the `url` format, assume the value is a DID,
>> and attempt to validate it or otherwise process it as a DID. Note that this
>> step is necessary even if we have a `did` format, as it's always possible
>> that the issuer provided a malformed subject identifier. Likewise, a
>> processor that expects some other type of URL (e.g., an https URL) will
>> have to parse the URL and confirm it has the expected scheme, and depending
>> on the use case may also need to apply other security checks (e.g.,
>> matching against allowed origins, ensuring that the URL doesn't contain a
>> username or password, etc.).
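
A minimal sketch of the checks described above for a processor that expects
an https URL: parse the value, confirm the scheme, reject embedded
credentials, and match against allowed origins. The function name, the
allow-list, and the example host are hypothetical and not taken from the
draft.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list of (scheme, host, port) origins for illustration.
ALLOWED_ORIGINS = {("https", "idp.example.com", 443)}


def check_https_identifier(value: str) -> bool:
    """Return True only if value is an https URL from an allowed origin."""
    try:
        parts = urlsplit(value)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    # The scheme must be the one this processor expects.
    if parts.scheme != "https":
        return False
    # Reject URLs that carry a username or password.
    if parts.username is not None or parts.password is not None:
        return False
    host = parts.hostname
    if host is None:
        return False
    # Compare the origin against the allow-list, defaulting to port 443.
    return (parts.scheme, host, port or 443) in ALLOWED_ORIGINS
```

With a bare `url` format, every processor would need some variant of this
gate before it can decide what kind of identifier it has been handed.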
>>
>>
>> This is exactly why we shouldn’t have just `url` without other layers. If
>> I’m processing a URL as an identifier, I may or may not want to do specific
>> things with that URL. Or it might simply be an identifier string, like
>> someone’s homepage. I would be much more comfortable if the `url` format
>> did not have any additional processing implied, but that more specific
>> formats could require such processing, as you’d expect a DID to do in most
>> cases.
>>
>> I think the malformed subject identifier example is a strawman - any
>> identifier could be “malformed”. But instead of allowing the processor to
>> have a much more limited check of “is this a DID?”, we now have to have a
>> wider check of “is this a URL, is it a kind I know how to process, and is
>> there more processing that I need to do with it?”, and that’s where all of
>> the problems in the above example come into play.
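
To illustrate how narrow the "is this a DID?" check can be, here is a rough
sketch that tests a string against the bare DID syntax (`did:`, a lowercase
method name, then a method-specific identifier). It is an assumption-laden
simplification: it does not accept DID URLs with paths, queries, or
fragments, and the names `DID_PATTERN` and `looks_like_did` are hypothetical.

```python
import re

# One identifier character: letter, digit, ".", "-", "_", or a percent-escape.
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"

# Bare DID only: did:<method>:<method-specific-id>, where the id may contain
# colon-separated segments but must end with at least one identifier character.
DID_PATTERN = re.compile(
    r"did:[a-z0-9]+:(?:" + _IDCHAR + r"*:)*" + _IDCHAR + r"+"
)


def looks_like_did(value: str) -> bool:
    """Syntax-only check; it does not resolve or verify the DID."""
    return DID_PATTERN.fullmatch(value) is not None
```

A processor that only receives the `did` format can stop at a check of this
kind, without first deciding which URL scheme it is looking at.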
>>
>>
>> In the case where a processor accepts both DIDs and some other type of
>> URL, they have to parse and validate the URL and then branch based on the
>> scheme, instead of just branching based on the identifier format.
>>
>>
>> Could a processor figure out that there was a DID url inside of a `url`
>> block? Sure — but those are semantically different identifiers, just like
>> if I had put a `mailto:` URL inside of a `url` block, I would not expect
>> that to be treated with any particular equivalence to the same email
>> address in an `email_address` block. And I think the draft can actually be
>> explicit about that distinction:
>>
>>  - there’s no guarantee of equivalence between the information in
>> different formats
>>  - you should use the most specific format for the information you’re
>> trying to convey
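
The two points above can be sketched as a processor that branches on the
subject identifier's format name instead of inspecting the value. The member
names (`url` for `did`, `uri` for `uri`) and the handler names are
assumptions made for illustration; check the draft text for the actual
definitions.

```python
def handle_did(sub_id: dict) -> tuple:
    # A specific format may require specific processing of its value.
    value = sub_id["url"]
    if not value.startswith("did:"):
        raise ValueError("did format requires a DID value")
    return ("did", value)


def handle_uri(sub_id: dict) -> tuple:
    # The generic format is treated as an identifier string with no
    # additional processing implied.
    return ("uri", sub_id["uri"])


# Dispatch table keyed by the format name declared by the issuer.
HANDLERS = {"did": handle_did, "uri": handle_uri}


def process(sub_id: dict) -> tuple:
    handler = HANDLERS.get(sub_id.get("format"))
    if handler is None:
        raise ValueError("unsupported subject identifier format")
    return handler(sub_id)
```

In this sketch the same string carried in a `did` block and in a `uri` block
yields two distinct results, `("did", ...)` and `("uri", ...)`, so the
processor never treats them as equivalent.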
>>
>>
>> Are there other scenarios where the issuer or processor encounters more
>> significant pain if we just have `url` versus if we have `url` and `did`?
>>
>>
>> Yes, I think the entire act of punting everything to the lower layer
>> causes nothing BUT pain. This confusion stems from the fact that URIs and
>> the subject identifier formats both specify some level of semantic and
>> syntactic constraint. However, mixing them in the way proposed is deeply
>> problematic and would be disastrous in practice.
>>
>> As such, the subject identifiers format should continue to provide
>> semantic information about its contents, just as it did before draft -10,
>> and not simply turn into a meaningless way to put URLs into a JSON object.
>>
>>  — Justin
>>
>>
>> —
>> Annabelle Backman (she/her)
>> richanna@amazon.com
>>
>>
>>
>>
>> On Apr 26, 2022, at 5:36 PM, Justin Richer <jricher@mit.edu> wrote:
>>
>> I strongly disagree with the editor's removal of "did" from the spec and
>> the reasons for doing so. Pushing the semantic information off into a lower
>> layer is not helpful in terms of complexity or application. Now an
>> application will need to parse the various URLs to know what they are
>> instead of being told in the data structure what's in there.
>>
>> -Justin
>> ________________________________________
>> From: Id-event [id-event-bounces@ietf.org] on behalf of
>> internet-drafts@ietf.org [internet-drafts@ietf.org]
>> Sent: Thursday, April 21, 2022 3:56 PM
>> To: i-d-announce@ietf.org
>> Cc: id-event@ietf.org
>> Subject: [Id-event] I-D Action:
>> draft-ietf-secevent-subject-identifiers-11.txt
>>
>> A New Internet-Draft is available from the on-line Internet-Drafts
>> directories.
>> This draft is a work item of the Security Events WG of the IETF.
>>
>>        Title           : Subject Identifiers for Security Event Tokens
>>        Authors         : Annabelle Backman
>>                          Marius Scurtescu
>>                          Prachi Jain
>>        Filename        : draft-ietf-secevent-subject-identifiers-11.txt
>>        Pages           : 22
>>        Date            : 2022-04-21
>>
>> Abstract:
>>   Security events communicated within Security Event Tokens may support
>>   a variety of identifiers to identify subjects related to the event.
>>   This specification formalizes the notion of subject identifiers as
>>   structured information that describe a subject, and named formats
>>   that define the syntax and semantics for encoding subject identifiers
>>   as JSON objects.  It also defines a registry for defining and
>>   allocating names for such formats, as well as the sub_id JSON Web
>>   Token (JWT) claim.
>>
>>
>> The IETF datatracker status page for this draft is:
>> https://datatracker.ietf.org/doc/draft-ietf-secevent-subject-identifiers/
>>
>> There is also an htmlized version available at:
>>
>> https://datatracker.ietf.org/doc/html/draft-ietf-secevent-subject-identifiers-11
>>
>> A diff from the previous version is available at:
>>
>> https://www.ietf.org/rfcdiff?url2=draft-ietf-secevent-subject-identifiers-11
>>
>>
>> Internet-Drafts are also available by rsync at
>> rsync.ietf.org::internet-drafts
>>
>>
>> _______________________________________________
>> Id-event mailing list
>> Id-event@ietf.org
>> https://www.ietf.org/mailman/listinfo/id-event
>>
>>
>