Re: [GNAP] Feedback on polymorphism

Dick Hardt <dick.hardt@gmail.com> Fri, 23 October 2020 19:53 UTC

Return-Path: <dick.hardt@gmail.com>
X-Original-To: txauth@ietfa.amsl.com
Delivered-To: txauth@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BC1E13A0948 for <txauth@ietfa.amsl.com>; Fri, 23 Oct 2020 12:53:16 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.096
X-Spam-Level:
X-Spam-Status: No, score=-2.096 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, HTML_FONT_LOW_CONTRAST=0.001, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=unavailable autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id faDuNSu_vX0c for <txauth@ietfa.amsl.com>; Fri, 23 Oct 2020 12:53:13 -0700 (PDT)
Received: from mail-lf1-x12d.google.com (mail-lf1-x12d.google.com [IPv6:2a00:1450:4864:20::12d]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id DA1A73A0AE2 for <txauth@ietf.org>; Fri, 23 Oct 2020 12:53:12 -0700 (PDT)
Received: by mail-lf1-x12d.google.com with SMTP id a9so3489921lfc.7 for <txauth@ietf.org>; Fri, 23 Oct 2020 12:53:12 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=eSZfZA0VvCtmfiw1AlCP0sIZPQExQ0xeZbjQLGjL5ZY=; b=gAxyiof4La4dTcwcywYkmhATJrFKFhMJxpyTBYUEdsDQS8mT/5URPDkEzDgSD8UcwH LSVcowH2SaSeIhtAQv2YbY/nvdxbZpY2rlGoUT6fGq63thdxHP+53DLwRTqC3cmn/Dvc UOP0p38WYQZNHR00gB6Bqxorxe2UjLvMnwknHWmmSvCZ9huY8H7A80VYR8kTKsvWqH5G jM2PChRUF4bYuzJoqMMQYLenY5XXfht8knDx4vlvVVSH2dBGJx8or1FHuh/Bnz0jFYb5 mzj3qSn5Wok4IaEidLovMC/Hl4x44aAuJ50weLDi3UDZAiTsa/ZGTVQ2O75Gc5IBZ9ne +ndA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=eSZfZA0VvCtmfiw1AlCP0sIZPQExQ0xeZbjQLGjL5ZY=; b=JyPvO+3CZ2zX+de6Iyg9r67ZPDvt0uP8BfW2SzdTpQd/tp8Ap9ieq7X5ZzolSt3moK QW3KV/z61DrSpAaNrz7+U/MFExJ391liBt4Z9t6r0Z0Lhe4XZW49yOKtQRpRCtFkNPnJ EzjPGAMOj1TAmF4FiJjcmM5fspDnEIEnJIjX1P5d4q1YWYVI/s0JtP22pZBFsNzFs/am bi8iYiYc2ZAGkOR6uCXNmcxbk21/V9QE1dSJlaaufc4AARI+dUL5FNzxLOpVQTvTn0s1 O1cmnRXejfDqQZ0nC9KhlWZ4L8wDoUz1wrPhmH3jIQFyXtOCSvFa0akVWZxKFiafoxon tktA==
X-Gm-Message-State: AOAM533AOJrGz/gU5kZdNUG+ryIhpoSYfGGd+5kEhaRo5sdHmaTzxP6T IZ4Ki/qtHiPV1LJMwp6SeOE3rnldvDqRQPBlrVk=
X-Google-Smtp-Source: ABdhPJw0IbA8bYa9XIaN95jngy/18JjAP088ZvrUOLiP+ef1SmEvPvicYxVx+UnD3gCoEuXHHeSOP1DLAuHd8LY/Ats=
X-Received: by 2002:a05:6512:68a:: with SMTP id t10mr1210844lfe.221.1603482790604; Fri, 23 Oct 2020 12:53:10 -0700 (PDT)
MIME-Version: 1.0
References: <CADxMOMf016Rq2GhRF5utT6KsiQiS3ir4QXjrnW_1+buiqtG7uA@mail.gmail.com> <AAF5364C-F337-483F-B011-A2B11779290E@mit.edu> <CAD9ie-v-RygTSh4VBnAzMDFy6O21sH-Jhh+_7QMVTT0mn_ur2g@mail.gmail.com> <30AE73A0-6A18-454A-AA20-0ECE5AEBD49A@mit.edu>
In-Reply-To: <30AE73A0-6A18-454A-AA20-0ECE5AEBD49A@mit.edu>
From: Dick Hardt <dick.hardt@gmail.com>
Date: Fri, 23 Oct 2020 12:52:34 -0700
Message-ID: <CAD9ie-s1ENJv610qGYLB=OJaX7q2g3G1zWA+YeNWoRPdktVi2A@mail.gmail.com>
To: Justin Richer <jricher@mit.edu>
Cc: Mika Boström <mika.bostrom=40smarkets.com@dmarc.ietf.org>, GNAP Mailing List <txauth@ietf.org>
Content-Type: multipart/alternative; boundary="00000000000023335005b25bee10"
Archived-At: <https://mailarchive.ietf.org/arch/msg/txauth/moDkTIPVSFTPeAHvAclcRsG_pDI>
Subject: Re: [GNAP] Feedback on polymorphism
X-BeenThere: txauth@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <txauth.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/txauth>, <mailto:txauth-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/txauth/>
List-Post: <mailto:txauth@ietf.org>
List-Help: <mailto:txauth-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/txauth>, <mailto:txauth-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 23 Oct 2020 19:53:17 -0000

Justin

I did note that I was the one that argued for instance_id being in the
object. Since it is in the object in the current draft, not including a
pass by reference option would be preferable.

As for concrete examples:
- version of client
- version of OS
- security attestation of OS / device
- location of client device
- network client is operating on

These are all attributes of the client that an AS may require on the
initial grant request, and in future grant requests (which is when an
instance_id) would be used.

/Dick

ᐧ

On Fri, Oct 23, 2020 at 12:42 PM Justin Richer <jricher@mit.edu> wrote:

> Dick,
>
> As you’ll recall, I argued against including the client instance
> identifier inside of the object as a mutually-exclusive field precisely
> because of the principle violation that you are pointing out here, and so
> it’s important to point out that the current text is a compromise that
> needs to be examined in the wider experience of the working group. I am on
> the side of removing the mutually-exclusive “instance_id” option within an
> object, but this needs to be explored.
>
> The crux of my argument is that is exactly a case of pass-by-reference vs
> pass-by-value, and that runtime attestations are not part of the “client
> instance” value itself but rather belong outside of that object in a
> another part of the request. As stated in the editorial notes in this
> section, we need to look carefully at how these concepts fit within the
> model and where we would want to put them. Without concrete examples of
> what these extensions look like and how they’re generated, that is nearly
> impossible to do at this stage. I look forward to seeing examples of this
> kind of data and how it can fit into the protocol.
>
>  — Justin
>
> On Oct 23, 2020, at 3:07 PM, Dick Hardt <dick.hardt@gmail.com> wrote:
>
> Hey Justin,
>
> As the draft has evolved, I question the continued use of polymorphism.
> Note that I appreciate the elegance of using a string for
> pass-by-reference, and an object for pass-by-value.
>
> In the current draft, the
>
> Every time you create or process a field it will mean only one thing, and
> there’s only one field to look at to answer a question.
>
>
> is violated in 2.3.1.  Identifying the RC Instance
> <https://tools.ietf.org/html/draft-ietf-gnap-core-protocol-00#section-2.3.1>
>
>
>    instance_id  An identifier string that the AS can use to identify the
>       particular instance of this RC.  The content and structure of this
>       identifier is opaque to the RC.
>
>    "client": {
>        "instance_id": "client-541-ab"
>    }
>
>    If there are no additional fields to send, the RC MAY send the
>    instance identifier as a direct reference value in lieu of the
>    object.
>
>    "client": "client-541-ab"
>
>
> The instance identifier can be sent two ways. Polymorphism is a
> convenience for the client, but requires the server to have two code paths
> for "instance_id".  We discussed this in the design team, and I argued for
> having "instance_id" in the "client" object so that any updates, such as
> new devices assertions, could be in the "client" object. As noted above,
> while I appreciate the elegance of using a string (handle) to reference a
> previously provided object, it complicates how to update an existing object
> while providing the reference.
>
> In your example of the "key" object below, setting "proof" to bearer would
> avoid the issue you describe:
>
> {
>  "key": {
>      "proof": "bearer"
>     }
> }
>
> In your example, when processing the "key" object, code is having to check
> both the JSON type of the property, as well as check the value of the
> "proof" property. In the example I provided, only the value of "proof"
> needs to be checked. The "proof" property is acting as a type for the "key"
> object.
>
> Not being a Java programmer, I don't know how this would work in a Java
> implementation, but node.js, the processing would need to be done as above.
>
> On a related note, there was significant negative feedback on handles and
> polymorphism in the Hacker News article
> https://news.ycombinator.com/item?id=24855750
>
> /Dick
>
>
> On Fri, Oct 23, 2020 at 10:20 AM Justin Richer <jricher@mit.edu> wrote:
>
>> Hi Mika,
>>
>> Thanks for bringing this topic here — I was able to see the forum
>> discussion that brought you here, and hopefully I can help clear up what I
>> mean with how polymorphism is used in the proposal. The short version is
>> that the goal is to *avoid* the kinds of ambiguity that make insecure
>> protocols, and so in that goal we’re fully aligned. I think that using
>> polymorphism in very specific ways can help that goal — just as I agree
>> that misusing it or applying it sloppily can lead to ambiguous and insecure
>> systems.
>>
>> Some background: I built out the XYZ protocol (one of the predecessors to
>> the initial GNAP Draft) in Java using strongly typed parsers and Java
>> objects specifically to prove to myself that it could be done in a way that
>> made any sense in the code. (My own open source implementation is at
>> https://github.com/bspk/oauth.xyz-java, but note that it’s not yet up to
>> date with the GNAP spec). It was important to me that I be able to use the
>> system-wide configured parsers to implement this and not have to resort to
>> stepping through elements completely by hand. Java doesn’t make it simple
>> to get the hooks into the right places (especially with the Jackson parser
>> that I used), but it is definitely possible to create a deterministic and
>> strongly-typed parser and serializer for this kind of data structure. Some
>> of the rationale for using polymorphism is covered in the trailing appendix
>> of the draft document (
>> https://www.ietf.org/archive/id/draft-ietf-gnap-core-protocol-00.html#name-json-structures-and-polymor),
>> but it’s still good to discuss this here as the working group decides which
>> approaches to take.
>>
>> The driving reason for using polymorphism at the protocol level was to
>> simplify the protocol and make it :more: deterministic to create and
>> process, not less. Every time you create or process a field it will mean
>> only one thing, and there’s only one field to look at to answer a question.
>> Without polymorphic field values, you usually need to rely on mutual
>> exclusivity of fields, which is prone to failure and requires additional
>> error checking. Take for example the key binding of access tokens. An
>> access token could be bound to the RC’s key used during the request, to a
>> different key chosen by the AS, or it could be a bearer token with no key
>> at all. By making the “key” field polymorphic, we can define it in terms of
>> boolean values and objects and express this set of mutually-exclusive
>> options in a non-ambiguous way. Without that, you’d need to have different
>> fields for the options and include additional checks in your parser to make
>> sure they weren’t sent simultaneously, otherwise you could get hit with
>> this potential security vulnerability in an object:
>>
>> {
>>     key: {
>>       proof: httpsig,
>>       jwk: { … key value … }
>>     },
>>     bearer_token: true,
>>     bind_to_rc_key: true
>> }
>>
>> This would be an illegal object as per this alternate proposal, but then
>> you’d have to check each field and make sure it wasn’t put next to others
>> in the same object. I’ve done this exercise with many other protocols and
>> it’s both error prone and easy to ignore since all the “good” examples
>> would pass code that doesn’t check this. With the polymorphic approach to
>> this same field, each of these three mutually-exclusive states is written
>> in a way that they cannot be sent together. It’s not just illegal, it’s
>> impossible and enforced by the syntax of JSON itself.
>>
>> {
>>     key: {
>>       proof: httpsig,
>>       jwk: { … key value … }
>>     }
>> }
>>
>> // bearer token
>>
>> {
>>     key: false
>> }
>>
>> // bound to the RC’s presented key
>>
>> {
>>     key: true
>> }
>>
>> If someone sends a different type for this field, like an array or number
>> or a null, this doesn’t have a defined interpretation in the protocol and
>> would be a protocol level error.
>>
>> While it might sound like polymorphism means that any field could have
>> any type or value, the opposite is true: each possible value is explicitly
>> typed, it’s just that there are potentially different types that express
>> meaning for the field. This applies to all members of all objects
>> (dictionaries) as well as all members of an array (list). Every time you
>> process a field value or other element, you look at the type and then the
>> value to determine what to do with that typed value.
>>
>> In your example below, each field within the dictionary would also need
>> to be typed, and each type would need to have a clear indication of its
>> meaning. To take your strawman key format below, the “modulus” field could
>> be defined polymorphically as either a “bigint” (a JSON number) or an
>> “encoded string” (a JSON string). The definition would further say what
>> exactly the encoding of the string would be. That means that when you read
>> the “modulus” field there wouldn’t be any confusion on what the value was
>> or how it was represented, regardless of the input format. Seeing a number
>> there means exactly one interpretation and seeing a string means exactly
>> one (different) interpretation — but importantly, both of them are a
>> “modulus”, since that’s the field that determines the type. An
>> implementation would likely use an internal BigInteger type of object to
>> represent the field value after parsing, so the question is how to go from
>> the JSON value (which is typed) into the BigInteger value.You don’t just
>> apply the type rules on the “public_key” field, you apply it to all
>> sub-fields of that object.
>>
>> So let’s dig into the specific bug you bring up in the strawman, because
>> it’s interesting: A JSON encoder that encodes numbers as strings, and not
>> numbers, is not compliant with the JSON definitions of the field in
>> question. For another example, the quoted string value of “true” is not
>> equivalent to the boolean value true in JSON, and they shouldn’t be treated
>> the same by a parser implementation when mapping to a concrete object. It’s
>> in this kind of automated guessing that this class of bugs occur, and
>> that’s going to be the case whether or not you take  advantage of JSON’s
>> polymorphic nature. I’ve run into cases where a parser library was trying
>> to be overly “helpful” in doing this kind of mapping, but ended up
>> introducing errors in more strict components downstream. This is something
>> that protocol designers need to be aware of and guard against in the design
>> of the protocol to reduce possible ambiguities. Within GNAP today, we
>> generally have things that branch whether they’re an object (for a rich
>> description of something) or some non-structured special value (for a
>> reference or other item).
>>
>> The design team created some simple JSON Schemas for parts of the
>> protocol during our discussion, but we didn’t include them in the design
>> document due to both lack of time to keep it updated with the rapid changes
>> to the protocol during the design team discussion, and not knowing if there
>> would be interest in such material. I personally think it would be helpful
>> to include as an informative reference in the final document, but that’s
>> something for the working group to take up eventually.
>>
>>  — Justin
>>
>> On Oct 23, 2020, at 10:18 AM, Mika Boström <
>> mika.bostrom=40smarkets.com@dmarc.ietf.org> wrote:
>>
>> Hello, everyone.
>>
>> For background: GNAP/TxAuth/XYZ/Oauth3 came up on a discussion forum and
>> when I made note about certain concerns, I was requested to send my
>> comments to this working group.
>>
>> In short, I believe that the use of polymorphic JSON in the protocol
>> invites subtle and confusing implementation problems. I also searched
>> through the WG archives, and noticed that similar concerns were noted,
>> briefly, in a thread in July.
>>
>> The problem with polymorphic values, as I see it, is that implementations
>> will need to branch on the (inferred) type of a given field. This isn't
>> quite as bad if the types are strictly different, but allows for subtle
>> bugs when the value in question is a dictionary. What makes this
>> unappealing is that "subtle bugs" in security protocols have a habit of
>> turning into vulnerabilities.
>>
>> Let's say we have these imaginary payloads, both possible and valid in
>> the same protocol step:
>>
>> # payload 1
>> {
>>   ...,
>>   "public_key": {
>>     "alg": "rsa",
>>     "modulus": <BIGINT>
>>   }
>> }
>>
>> # payload 2
>> {
>>   ...,
>>   "public_key": {
>>     "alg": "rsa",
>>     "modulus": "<encoded string>"
>>   }
>> }
>>
>> In both cases, the type of "public_key" field is a dictionary. In both
>> cases, they even have the same keys. However, the values in the
>> dictionaries are entirely different, and an implementation will have to
>> branch to at least two possible decoding mechanisms. To make things worse,
>> some JSON implementations may choose to encode non-dictionary values as
>> strings, so it is possible for an originator to transmit what they expect
>> and believe to be payload 1 format, but which the receiver will interpret
>> to be in payload 2 format. And if the encoded string contains only digits,
>> it will even parse correctly as a bignum.
>>
>> While the above is clearly a manufactured scenario, it nonetheless
>> demonstrates the potential for logic bugs with polymorphic JSON. With
>> richer types and more complex dictionaries, there will surely be more room
>> for errors.
>>
>> Ambiguity in protocols is always a source of implementation complexity
>> and interoperability snags, but in an AuthN/AuthZ protocol it is worse:
>> it's terrifying. If GNAP/Oauth3 is intended to supersede Oauth1/2, wouldn't
>> it be in everyone's interest to keep implementation complexity and mistake
>> potential to a minimum?
>>
>> Best regards,
>> Mika
>>
>> --
>> Mika Boström
>> Smarkets
>> --
>> TXAuth mailing list
>> TXAuth@ietf.org
>> https://www.ietf.org/mailman/listinfo/txauth
>>
>>
>> --
>> TXAuth mailing list
>> TXAuth@ietf.org
>> https://www.ietf.org/mailman/listinfo/txauth
>>
> ᐧ
>
>
>