Re: [OAUTH-WG] Namespacing "type" in RAR

Dick Hardt <dick.hardt@gmail.com> Tue, 21 July 2020 17:10 UTC

An explanation of the issues in Unicode can be found here:

https://en.wikipedia.org/wiki/Unicode_equivalence#Character_duplication
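
As a minimal sketch of the issue (Python, standard library only; the sample text "café" is an illustrative assumption, not a value from this thread), two strings that render identically can differ both code-point-wise and byte-wise until they are normalized:

```python
import unicodedata

# Two spellings of the same visible text (illustrative value only).
precomposed = "caf\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"   # "e" followed by U+0301 COMBINING ACUTE ACCENT


def same_after_nfc(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Code-point comparison and UTF-8 byte comparison both report a mismatch.
exact_match = precomposed == decomposed
bytes_match = precomposed.encode("utf-8") == decomposed.encode("utf-8")

# Comparison after normalization reports a match.
normalized_match = same_after_nfc(precomposed, decomposed)
```

Whether an AS should normalize at all is the open question in this thread; the sketch only shows what a plain comparison does with such input.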



On Tue, Jul 21, 2020 at 10:03 AM Dick Hardt <dick.hardt@gmail.com> wrote:

>
> The following are the same URI, but are different strings:
>
> “https://schema.example.org/v1”
> “HTTPS://schema.example.org/v1”
> “https://SCHEMA.EXAMPLE.ORG/v1”
>
> Before comparing them to each other, they must be canonicalized so that
> they become the same string.
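
A rough sketch of that canonicalization step, in Python with only the standard library. The helper name normalize_scheme_and_host is hypothetical, and the sketch assumes the URIs carry no userinfo component. It performs only the case normalization of scheme and host described in RFC 3986, section 6.2.2.1; it does not touch default ports or trailing slashes:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_scheme_and_host(uri: str) -> str:
    """Lowercase the scheme and host, the case-insensitive parts of a URI.

    Assumes no userinfo in the authority (userinfo is case-sensitive).
    """
    parts = urlsplit(uri)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path,
        parts.query,
        parts.fragment,
    ))


uris = [
    "https://schema.example.org/v1",
    "HTTPS://schema.example.org/v1",
    "https://SCHEMA.EXAMPLE.ORG/v1",
]

# As plain strings, all three values are distinct.
distinct_raw = len(set(uris))

# After case normalization, they collapse to a single string.
distinct_normalized = len({normalize_scheme_and_host(u) for u in uris})
```
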
>
> From earlier in this thread, I am NOT suggesting that it must be a URI,
> nor that it is required:
>
> Since the type represents a much more complex object than a JWT claim, a
> client developer's tooling could pull down the JSON Schema (or some such)
> for a type used in their source code, and provide autocompletion and
> validation which would improve productivity and reduce errors. An AS that
> is using a defined type could use the schema for input validation. Neither
> of these would be at run time. JSON Schema allows comments and examples.
>
> What is the harm in non-normative language around a retrievable URI?
>
>
> On Tue, Jul 21, 2020 at 9:58 AM Justin Richer <jricher@mit.edu> wrote:
>
>> String comparison works just fine when the strings happen to be URIs, and
>> you aren’t treating them as URIs:
>>
>> “https://schema.example.org/v1”
>>
>> Is different from
>>
>> “https://schema.example.org/v2”
>>
>> And both are different from
>>
>> “https://schema.example.org:443/v1/”
>>
>> All of these are strings, and the strings happen to be URIs but that’s
>> irrelevant to the comparison process. Can you please help me understand why
>> doing a string comparison on these values does not work in exactly the same
>> way it would for “foo”, “bar”, and “baz” values? Why would these need to be
>> canonicalized to be compared? The definition of a JSON string is an ordered
>> sequence of Unicode code points, and this can be compared byte-wise. (Or
>> code-point-wise, whatever’s most correct here.) Can you give me
>> counter-examples as to where string comparison doesn’t work? And can you
>> help me understand how this same worry doesn’t apply to all of the rest of
>> the values in the RAR specification, which are also strings and will need
>> to be compared?
>>
>> I’m still very confused as to the URI retrieval issue here, if there even
>> is one. It sounds like we’re both saying that it could be useful if type
>> values are retrievable when they’re URIs, but that would be something to
>> augment a process and not required for the RAR spec. I’m against requiring
>> the value to be a URI and against requiring the AS to process that URI *as
>> a URI* at runtime. Anything that an AS wants to do with the “type”
>> value, including providing additional tooling and validation, is up to the
>> AS and outside of the spec.
>>
>>  — Justin
>>
>> On Jul 21, 2020, at 12:35 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>>
>> This statement:
>>
>> “compare two strings so that they’re exact”
>>
>> does not work for either Unicode or URIs. A string and a canonicalized
>> Unicode string are not the same thing. The same goes for a URI. I have assumed
>> you understand the canonicalization requirement, but it does not sound like
>> you do. Would you like examples?
>>
>>
>> wrt. the AS and URI, *you* keep saying that *I* said the AS would
>> retrieve the URI. I HAVE NOT SAID THAT!
>>
>> I am suggesting that the URI MAY be retrievable, and I gave examples on
>> how that would be useful for tooling for client developers, and for an AS
>> in doing input validation. The URI would NOT be retrieved at run time.
>>
>>
>> On Tue, Jul 21, 2020 at 7:35 AM Justin Richer <jricher@mit.edu> wrote:
>>
>>> If we treat all the strings as just strings, without any special
>>> internal format to be specified or detected, then comparing the strings is
>>> a well-understood and well-documented process. I also think that we
>>> shouldn’t invent anything here, so if there’s a better way to say “compare
>>> two strings so that they’re exact” then that’s what I mean. Sorry if that
>>> was unclear.
>>>
>>> I’m saying the AS should *not* retrieve the URI passed in the “type”
>>> value. You brought that up and then described the process that the AS would
>>> take to do so. I have said from the start that the use of a URI is for name
>>> spacing and not for addressing content to be fetched, so I’m confused why
>>> you think I intend otherwise.
>>>
>>>  — Justin
>>>
>>> On Jul 20, 2020, at 2:59 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>>>
>>> Canonicalization of URIs and Unicode is fairly well specified. I was not
>>> suggesting we invent anything there.
>>>
>>> A byte comparison, as you suggested earlier, will be problematic, as I
>>> have pointed out.
>>>
>>> I'm confused why you are still talking about the AS retrieving a URI.
>>>
>>> On Mon, Jul 20, 2020 at 4:42 AM Justin Richer <jricher@mit.edu> wrote:
>>>
>>>> Since this is a recommendation for namespace, we could also just say
>>>> collision-resistant like JWT, and any of those examples are fine. But that
>>>> said, I think there’s something particularly compelling about URIs since
>>>> they have somewhat-human-readable portions. But again, I’m saying it should
>>>> be a recommendation to API developers and not a requirement in the spec. In
>>>> the spec, I argue that “type” should be a string, full stop.
>>>>
>>>> If documentation is so confusing that developers are typing in the
>>>> wrong strings, then that’s bad documentation. And likely a bad choice for
>>>> the “type” string on the part of the AS. You’d have the same problem with
>>>> any other value the developer’s supposed to copy over.  :)
>>>>
>>>> I agree that we should call out explicitly how they should be compared,
>>>> and I propose we use one of the handful of existing string-comparison RFCs
>>>> here instead of defining our own rules.
>>>>
>>>> While the type could be a dereferenceable URI, requiring action on the
>>>> AS is really getting into distributed authorization policies. We tried
>>>> doing that with UMA1’s scope structures and it didn’t work very well in
>>>> practice (in my memory and experience). Someone could profile “type” on top
>>>> of this if they wanted to do so, with support at the AS for that, but I
>>>> don’t see a compelling reason for that to be a requirement as that’s a lot
>>>> of complexity and a lot more error states (the fetch fails, or it doesn’t
>>>> have a policy, or the policy’s in a format the AS doesn’t understand, or
>>>> the AS doesn’t like the policy, etc).
>>>>
>>>> An AS is always free to implement its types in such a fashion, and
>>>> that could make plenty of sense in a smaller ecosystem. And this is yet
>>>> another reason that we define “type” as being a string to be interpreted
>>>> and understood by the AS — so that an AS that wants to work this way can do
>>>> so.
>>>>
>>>>  — Justin
>>>>
>>>> PS: thanks for pointing out the error in the example in XYZ, I’ll fix
>>>> that prior to publication.
>>>>
>>>> On Jul 18, 2020, at 8:58 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>>>>
>>>> Justin: thanks for kindly pointing out which mail list this is.
>>>>
>>>> To clarify, public JWT claims are not just URIs, but any
>>>> collision-resistant namespace:
>>>> "Examples of collision-resistant namespaces include: Domain Names,
>>>> Object Identifiers (OIDs) as defined in the ITU-T X.660 and X.670
>>>> Recommendation series, and Universally Unique IDentifiers (UUIDs)
>>>> [RFC4122]."
>>>>
>>>> I think letting the "type" be any JSON string and doing a byte-wise
>>>> comparison will be problematic. A client developer will be reading
>>>> documentation to learn what the types are, and typing them in. Given the wide
>>>> set of whitespace characters and Unicode equivalence, different byte
>>>> streams can look the same on screen, and a byte-wise comparison will fail.
>>>>
>>>> Similarly for URIs. If it is a valid URI, then a byte-wise comparison
>>>> is not sufficient. Canonicalization is required.
>>>>
>>>> These are not showstopper issues, but the specification should call out
>>>> how type strings are compared, and provide caveats to an AS developer.
>>>>
>>>> I have no idea why you would think the AS would retrieve a URL.
>>>>
>>>> Since the type represents a much more complex object than a JWT claim,
>>>> a client developer's tooling could pull down the JSON Schema (or some such)
>>>> for a type used in their source code, and provide autocompletion and
>>>> validation which would improve productivity and reduce errors. An AS that
>>>> is using a defined type could use the schema for input validation. Neither
>>>> of these would be at run time. JSON Schema allows comments and examples.
>>>>
>>>> What is the harm in non-normative language around a retrievable URI?
>>>>
>>>> BTW: the example in
>>>> https://oauth.xyz/draft-richer-transactional-authz#rfc.section.2 has
>>>> not been updated with the "type" field.
>>>>
>>>>
>>>>
>>>> On Sat, Jul 18, 2020 at 8:10 AM Justin Richer <jricher@mit.edu> wrote:
>>>>
>>>>> Hi Dick,
>>>>>
>>>>> This is a discussion about the RAR specification on the OAuth list,
>>>>> and therefore doesn’t have anything to do with alignment with XAuth. In
>>>>> fact, I believe the alignment is the other way around, as doesn’t XAuth
>>>>> normatively reference RAR at this point? Even though, last I saw, it uses a
>>>>> different top-level structure for conveying things, I believe it does say
>>>>> to use the internal object structures. I am also a co-author on RAR and we
>>>>> had already defined a “type” field in RAR quite some time ago. You did
>>>>> notice that XYZ’s latest draft added this field to keep the two in
>>>>> alignment with each other, which has always been the goal since the initial
>>>>> proposal of the RAR work, but that’s a time lag and not a display of new
>>>>> intent.
>>>>>
>>>>> In any event, even though I think the decision has bearing in both
>>>>> places, this isn’t about GNAP. Working on RAR’s requirements has brought up
>>>>> this interesting issue of what should be in the type field for RAR in OAuth
>>>>> 2.
>>>>>
>>>>> I think that it should be defined as a string, and therefore compared
>>>>> as a byte value in all cases, regardless of what the content of the string
>>>>> is. I don’t think the AS should be expected to fetch a URI for anything. I
>>>>> don’t think the AS should normalize any of the inputs. I think that any
>>>>> JSON-friendly character set should be allowed (including spaces and
>>>>> Unicode characters), and since RAR already requires the JSON objects to be
>>>>> form-encoded, this shouldn’t cause additional trouble when adding them in
>>>>> to OAuth 2’s request structures.
>>>>>
>>>>> The idea of using a URI would be to get people out of each other’s
>>>>> namespaces. It’s similar to the concept of “public” vs “private” claims in
>>>>> JWT:
>>>>>
>>>>> https://tools.ietf.org/html/rfc7519#section-4.2
>>>>>
>>>>> What I’m proposing is that if you think it’s going to be a
>>>>> general-purpose type name, then we recommend you use a URI as your string.
>>>>> And beyond that, that’s it. It’s up to the AS to figure out what to do with
>>>>> it, and RAR stays out of it.
>>>>>
>>>>>  — Justin
>>>>>
>>>>> On Jul 17, 2020, at 1:25 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>>>>>
>>>>> Hey Justin, glad to see that you have aligned with the latest XAuth
>>>>> draft on a type property being required.
>>>>>
>>>>> I like the idea that the value of the type property is fully defined
>>>>> by the AS, which could delegate it to a common URI for reuse. This gets
>>>>> GNAP out of specifying access requests, and enables other parties to define
>>>>> access without any required coordination with IETF or IANA.
>>>>>
>>>>> A complication in mixing plain strings and URIs is the
>>>>> canonicalization. A plain string can be a fixed byte representation, but a
>>>>> URI requires canonicalization for comparison. Mixing the two requires URI
>>>>> detection at the AS before canonicalization, and an AS MUST do
>>>>> canonicalization of URIs.
>>>>>
>>>>> If the URI is retrievable, it can provide machine and/or human readable
>>>>> documentation in JSON schema or some such, or any other content type. Once
>>>>> again, the details are out of scope of GNAP, but we can provide examples to
>>>>> guide implementers.
>>>>>
>>>>> Are you still thinking that bare strings are allowed in GNAP, and are
>>>>> defined by the AS?
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jul 17, 2020 at 8:39 AM Justin Richer <jricher@mit.edu> wrote:
>>>>>
>>>>>> The “type” field in the RAR spec serves an important purpose: it
>>>>>> defines what goes in the rest of the object, including what other fields
>>>>>> are available and what values are allowed for those fields. It provides an
>>>>>> API-level definition for requesting access based on multiple dimensions,
>>>>>> and that’s really powerful and flexible. Each type can use any of the
>>>>>> general-purpose fields like “actions” and/or add its own fields as
>>>>>> necessary, and the “type” parameter keeps everything well-defined.
>>>>>>
>>>>>> The question, then, is what defines what’s allowed to go into the
>>>>>> “type” field itself? And what defines how that value maps to the
>>>>>> requirements for the rest of the object? The draft doesn’t say anything
>>>>>> about it at the moment, but we should choose the direction we want to go.
>>>>>> On the surface, there are three main options:
>>>>>>
>>>>>> 1) Require all values to be registered.
>>>>>> 2) Require all values to be collision-resistant (eg, URIs).
>>>>>> 3) Require all values to be defined by the AS (and/or the RS’s that
>>>>>> it protects).
>>>>>>
>>>>>> Are there any other options?
>>>>>>
>>>>>> Here are my thoughts on each approach:
>>>>>>
>>>>>> 1) While it usually makes sense to register things for
>>>>>> interoperability, this is a case where I think that a registry would
>>>>>> actually hurt interoperability and adoption. Like a “scope” value, the RAR
>>>>>> “type” is ultimately up to the AS and RS to interpret in their own context.
>>>>>> We :want: people to define rich objects for their APIs and enable
>>>>>> fine-grained access for their systems, and if they have to register
>>>>>> something every time they come up with a new API to protect, it’s going to
>>>>>> be an unmaintainable mess. I genuinely don’t think this would scale, and
>>>>>> that most developers would just ignore the registry and do what they want
>>>>>> anyway. And since many of these systems are inside domains, it’s completely
>>>>>> unenforceable in practice.
>>>>>>
>>>>>> 2) This seems reasonable, but it’s a bit of a nuisance to require
>>>>>> everything to be a URI here. It’s long and ugly, and a lot of APIs are
>>>>>> going to be internal to a given group, deployment, or ecosystem anyway.
>>>>>> This makes sense when you’ve got something reusable across many
>>>>>> deployments, like OIDC, but it’s overhead when what you’re doing is tied to
>>>>>> your environment.
>>>>>>
>>>>>> 3) This allows the AS and RS to define the request parameters for
>>>>>> their APIs just like they do today with scopes. Since it’s always the
>>>>>> combination of “this type :AT: this AS/RS”, name spacing is less of an
>>>>>> issue across systems. We haven’t seen huge problems with scope value overlap
>>>>>> in the wild; it does occur from time to time, but it’s more than
>>>>>> manageable. A client isn’t going to just “speak RAR”, it’s going to be
>>>>>> speaking RAR so that it can access something in particular.
>>>>>>
>>>>>> And all that brings me to my proposal:
>>>>>>
>>>>>> 4) Require all values to be defined by the AS, and encourage
>>>>>> specification developers to use URIs for collision resistance.
>>>>>>
>>>>>> So officially in RAR, the AS would decide what “type” means, and
>>>>>> nobody else. But we can also guide people who are developing
>>>>>> general-purpose interoperable APIs to use URIs for their RAR “type”
>>>>>> definitions. This would keep those interoperable APIs from stepping on each
>>>>>> other, and from stepping on any locally-defined special “type” structure.
>>>>>> But at the end of the day, the URI carries no more weight than just any
>>>>>> other string, and the AS decides what it means and how it applies.
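
A hedged sketch of what proposal 4 could look like at a single AS, in Python. The registry KNOWN_TYPES, the type names, the field sets, and the helper validate_detail are all invented for illustration and are not part of RAR. The "type" value is looked up as an opaque string, with no URI processing, and that lookup determines which other fields the object may carry:

```python
# Hypothetical per-AS registry: each "type" string maps to the fields that
# the AS allows in objects of that type. All names here are illustrative.
KNOWN_TYPES = {
    "payment": {"type", "actions", "amount"},
    "https://schema.example.org/v1": {"type", "actions"},
}


def validate_detail(detail: dict) -> bool:
    """Accept an object only if its type is known to this AS and it uses
    only the fields that type allows."""
    type_value = detail.get("type")
    if not isinstance(type_value, str):
        return False
    # Exact string lookup: a URI-shaped value gets no special treatment.
    allowed = KNOWN_TYPES.get(type_value)
    if allowed is None:
        return False
    return set(detail) <= allowed


local_ok = validate_detail({"type": "payment", "actions": ["read"]})
uri_ok = validate_detail({"type": "https://schema.example.org/v1", "actions": ["read"]})
# A differently-cased spelling is simply an unknown type under exact matching.
other_case_ok = validate_detail({"type": "HTTPS://schema.example.org/v1"})
```
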
>>>>>>
>>>>>> My argument is that this seems to have worked very, very well for
>>>>>> scopes, and the RAR “type” is cut from similar descriptive cloth.
>>>>>>
>>>>>> What does the rest of the group think? How should we manage the RAR
>>>>>> “type” values and what they mean?
>>>>>>
>>>>>>  — Justin