Re: [GNAP] Feedback on polymorphism

Justin Richer <jricher@mit.edu> Fri, 23 October 2020 17:19 UTC

Return-Path: <jricher@mit.edu>
X-Original-To: txauth@ietfa.amsl.com
Delivered-To: txauth@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id D0BAE3A10A3; Fri, 23 Oct 2020 10:19:13 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.895
X-Spam-Level:
X-Spam-Status: No, score=-1.895 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_BLOCKED=0.001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xtoaha5eoroi; Fri, 23 Oct 2020 10:19:10 -0700 (PDT)
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id CCBDB3A11F9; Fri, 23 Oct 2020 10:18:38 -0700 (PDT)
Received: from [192.168.1.19] (static-71-174-62-56.bstnma.fios.verizon.net [71.174.62.56]) (authenticated bits=0) (User authenticated as jricher@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 09NHIaHV005325 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 23 Oct 2020 13:18:37 -0400
From: Justin Richer <jricher@mit.edu>
Message-Id: <AAF5364C-F337-483F-B011-A2B11779290E@mit.edu>
Content-Type: multipart/alternative; boundary="Apple-Mail=_9DAA2289-876C-4A61-9E5E-1AB8986B210C"
Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.120.23.2.1\))
Date: Fri, 23 Oct 2020 13:18:36 -0400
In-Reply-To: <CADxMOMf016Rq2GhRF5utT6KsiQiS3ir4QXjrnW_1+buiqtG7uA@mail.gmail.com>
Cc: txauth@ietf.org
To: Mika Boström <mika.bostrom=40smarkets.com@dmarc.ietf.org>
References: <CADxMOMf016Rq2GhRF5utT6KsiQiS3ir4QXjrnW_1+buiqtG7uA@mail.gmail.com>
X-Mailer: Apple Mail (2.3608.120.23.2.1)
Archived-At: <https://mailarchive.ietf.org/arch/msg/txauth/0RRynhG7hwxpE341K-gSZhRXPIo>
Subject: Re: [GNAP] Feedback on polymorphism
X-BeenThere: txauth@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <txauth.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/txauth>, <mailto:txauth-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/txauth/>
List-Post: <mailto:txauth@ietf.org>
List-Help: <mailto:txauth-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/txauth>, <mailto:txauth-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 23 Oct 2020 17:19:14 -0000

Hi Mika,

Thanks for bringing this topic here — I was able to see the forum discussion that brought you here, and hopefully I can help clear up what I mean with how polymorphism is used in the proposal. The short version is that the goal is to avoid the kinds of ambiguity that make insecure protocols, and so in that goal we’re fully aligned. I think that using polymorphism in very specific ways can help that goal — just as I agree that misusing it or applying it sloppily can lead to ambiguous and insecure systems.

Some background: I built out the XYZ protocol (one of the predecessors to the initial GNAP Draft) in Java using strongly typed parsers and Java objects specifically to prove to myself that it could be done in a way that made any sense in the code. (My own open source implementation is at https://github.com/bspk/oauth.xyz-java <https://github.com/bspk/oauth.xyz-java>, but note that it’s not yet up to date with the GNAP spec). It was important to me that I be able to use the system-wide configured parsers to implement this and not have to resort to stepping through elements completely by hand. Java doesn’t make it simple to get the hooks into the right places (especially with the Jackson parser that I used), but it is definitely possible to create a deterministic and strongly-typed parser and serializer for this kind of data structure. Some of the rationale for using polymorphism is covered in the trailing appendix of the draft document (https://www.ietf.org/archive/id/draft-ietf-gnap-core-protocol-00.html#name-json-structures-and-polymor <https://www.ietf.org/archive/id/draft-ietf-gnap-core-protocol-00.html#name-json-structures-and-polymor>), but it’s still good to discuss this here as the working group decides which approaches to take. 

The driving reason for using polymorphism at the protocol level was to simplify the protocol and make it :more: deterministic to create and process, not less. Every time you create or process a field it will mean only one thing, and there’s only one field to look at to answer a question. Without polymorphic field values, you usually need to rely on mutual exclusivity of fields, which is prone to failure and requires additional error checking. Take for example the key binding of access tokens. An access token could be bound to the RC’s key used during the request, to a different key chosen by the AS, or it could be a bearer token with no key at all. By making the “key” field polymorphic, we can define it in terms of boolean values and objects and express this set of mutually-exclusive options in a non-ambiguous way. Without that, you’d need to have different fields for the options and include additional checks in your parser to make sure they weren’t sent simultaneously, otherwise you could get hit with this potential security vulnerability in an object:

{ 
    key: {
      proof: httpsig,
      jwk: { … key value … }
    },
    bearer_token: true,
    bind_to_rc_key: true
}

This would be an illegal object as per this alternate proposal, but then you’d have to check each field and make sure it wasn’t put next to others in the same object. I’ve done this exercise with many other protocols and it’s both error prone and easy to ignore since all the “good” examples would pass code that doesn’t check this. With the polymorphic approach to this same field, each of these three mutually-exclusive states is written in a way that they cannot be sent together. It’s not just illegal, it’s impossible and enforced by the syntax of JSON itself.

{ 
    key: {
      proof: httpsig,
      jwk: { … key value … }
    }
}

// bearer token

{
    key: false
}

// bound to the RC’s presented key

{
    key: true
}

If someone sends a different type for this field, like an array or number or a null, this doesn’t have a defined interpretation in the protocol and would be a protocol level error.

While it might sound like polymorphism means that any field could have any type or value, the opposite is true: each possible value is explicitly typed, it’s just that there are potentially different types that express meaning for the field. This applies to all members of all objects (dictionaries) as well as all members of an array (list). Every time you process a field value or other element, you look at the type and then the value to determine what to do with that typed value.

In your example below, each field within the dictionary would also need to be typed, and each type would need to have a clear indication of its meaning. To take your strawman key format below, the “modulus” field could be defined polymorphically as either a “bigint” (a JSON number) or an “encoded string” (a JSON string). The definition would further say what exactly the encoding of the string would be. That means that when you read the “modulus” field there wouldn’t be any confusion on what the value was or how it was represented, regardless of the input format. Seeing a number there means exactly one interpretation and seeing a string means exactly one (different) interpretation — but importantly, both of them are a “modulus”, since that’s the field that determines the type. An implementation would likely use an internal BigInteger type of object to represent the field value after parsing, so the question is how to go from the JSON value (which is typed) into the BigInteger value.You don’t just apply the type rules on the “public_key” field, you apply it to all sub-fields of that object. 

So let’s dig into the specific bug you bring up in the strawman, because it’s interesting: A JSON encoder that encodes numbers as strings, and not numbers, is not compliant with the JSON definitions of the field in question. For another example, the quoted string value of “true” is not equivalent to the boolean value true in JSON, and they shouldn’t be treated the same by a parser implementation when mapping to a concrete object. It’s in this kind of automated guessing that this class of bugs occur, and that’s going to be the case whether or not you take  advantage of JSON’s polymorphic nature. I’ve run into cases where a parser library was trying to be overly “helpful” in doing this kind of mapping, but ended up introducing errors in more strict components downstream. This is something that protocol designers need to be aware of and guard against in the design of the protocol to reduce possible ambiguities. Within GNAP today, we generally have things that branch whether they’re an object (for a rich description of something) or some non-structured special value (for a reference or other item). 

The design team created some simple JSON Schemas for parts of the protocol during our discussion, but we didn’t include them in the design document due to both lack of time to keep it updated with the rapid changes to the protocol during the design team discussion, and not knowing if there would be interest in such material. I personally think it would be helpful to include as an informative reference in the final document, but that’s something for the working group to take up eventually.

 — Justin

> On Oct 23, 2020, at 10:18 AM, Mika Boström <mika.bostrom=40smarkets.com@dmarc.ietf.org> wrote:
> 
> Hello, everyone.
> 
> For background: GNAP/TxAuth/XYZ/Oauth3 came up on a discussion forum and when I made note about certain concerns, I was requested to send my comments to this working group.
> 
> In short, I believe that the use of polymorphic JSON in the protocol invites subtle and confusing implementation problems. I also searched through the WG archives, and noticed that similar concerns were noted, briefly, in a thread in July. 
> 
> The problem with polymorphic values, as I see it, is that implementations will need to branch on the (inferred) type of a given field. This isn't quite as bad if the types are strictly different, but allows for subtle bugs when the value in question is a dictionary. What makes this unappealing is that "subtle bugs" in security protocols have a habit of turning into vulnerabilities.
> 
> Let's say we have these imaginary payloads, both possible and valid in the same protocol step:
> 
> # payload 1
> {
>   ...,
>   "public_key": {
>     "alg": "rsa",
>     "modulus": <BIGINT>
>   }
> }
> 
> # payload 2
> {
>   ...,
>   "public_key": {
>     "alg": "rsa",
>     "modulus": "<encoded string>"
>   }
> }
> 
> In both cases, the type of "public_key" field is a dictionary. In both cases, they even have the same keys. However, the values in the dictionaries are entirely different, and an implementation will have to branch to at least two possible decoding mechanisms. To make things worse, some JSON implementations may choose to encode non-dictionary values as strings, so it is possible for an originator to transmit what they expect and believe to be payload 1 format, but which the receiver will interpret to be in payload 2 format. And if the encoded string contains only digits, it will even parse correctly as a bignum.
> 
> While the above is clearly a manufactured scenario, it nonetheless demonstrates the potential for logic bugs with polymorphic JSON. With richer types and more complex dictionaries, there will surely be more room for errors.
> 
> Ambiguity in protocols is always a source of implementation complexity and interoperability snags, but in an AuthN/AuthZ protocol it is worse: it's terrifying. If GNAP/Oauth3 is intended to supersede Oauth1/2, wouldn't it be in everyone's interest to keep implementation complexity and mistake potential to a minimum?
> 
> Best regards,
> Mika
> 
> -- 
> Mika Boström
> Smarkets
> -- 
> TXAuth mailing list
> TXAuth@ietf.org
> https://www.ietf.org/mailman/listinfo/txauth