Re: [saag] ASN.1 vs. DER Encoding

Carsten Bormann <cabo@tzi.org> Sun, 31 March 2019 07:39 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <C3D9DD15-AB23-4B42-BA61-A4E4CD826B77@huitema.net>
Date: Sun, 31 Mar 2019 09:39:16 +0200
Cc: Benjamin Kaduk <kaduk@mit.edu>, "Dr. Pala" <madwolf@openca.org>, "saag@ietf.org" <saag@ietf.org>
Message-Id: <F6387640-20F3-4B3C-8E61-58CAF7828CA1@tzi.org>
References: <20190326164951.GX4211@localhost> <20190326214816.GB4211@localhost> <1553679912618.8510@cs.auckland.ac.nz> <20190327151545.GG4211@localhost> <20190330153101.GT35679@kduck.mit.edu> <C3D9DD15-AB23-4B42-BA61-A4E4CD826B77@huitema.net>
To: Christian Huitema <huitema@huitema.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/saag/QNs3-gylwru0uhOQvM1y0Sn9dFA>
Subject: Re: [saag] ASN.1 vs. DER Encoding

On Mar 30, 2019, at 19:43, Christian Huitema <huitema@huitema.net> wrote:
> 
> The TLS syntax appears specifically designed to avoid many of the pitfalls of TLV encodings. For example, the syntax defines the fixed encoding length of all integer and length fields,

… which causes the result to be larger than it needs to be.
Compare https://tools.ietf.org/html/draft-rescorla-tls-ctls-01 or other proposals in this field.

> and uses intermediate octet array encodings for extensions. It is certainly much easier to get right

Yeah, as we all saw in Heartbleed. :-)
[Don’t forget to celebrate the fifth anniversary of its discovery tomorrow.]

> than BER or DER.

Oh, that I can concede.

There is no free lunch in this space.

The TLS encoding works by opting for simplicity, but it requires bespoke encoders and decoders, which may be compiled from the TLS “presentation language” but more often are not. These are built from a small set of primitives that are reasonably easy to get right even when coded by hand.
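
A minimal sketch of what such hand-coded primitives can look like, assuming big-endian fixed-width integers and vectors whose length prefix has a fixed width. The names put_uint, put_vector, and Reader are hypothetical and not taken from any TLS implementation. The bounds check in take() is the kind of check whose absence was at the heart of Heartbleed.

```python
def put_uint(value, width):
    """Encode an unsigned integer in exactly `width` bytes, big-endian."""
    return value.to_bytes(width, "big")


def put_vector(payload, len_width):
    """Encode a byte vector preceded by a fixed-width length field."""
    if len(payload) >= 1 << (8 * len_width):
        raise ValueError("payload too long for length field")
    return put_uint(len(payload), len_width) + payload


class Reader:
    """Cursor over a byte string; every read is bounds-checked."""

    def __init__(self, data):
        self.data = data
        self.pos = 0

    def take(self, n):
        # Never trust a length field: compare it with what is actually left.
        if n > len(self.data) - self.pos:
            raise ValueError("truncated input")
        chunk = self.data[self.pos:self.pos + n]
        self.pos += n
        return chunk

    def get_uint(self, width):
        return int.from_bytes(self.take(width), "big")

    def get_vector(self, len_width):
        return self.take(self.get_uint(len_width))
```

With these, Reader(put_vector(b"abc", 2)).get_vector(2) returns b"abc", while a length field that claims more bytes than are present raises ValueError instead of reading past the end.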

TLS can do that because it delegates its more complex parts to other formats such as ASN.1 DER.

Since the mid-1990s, there has been a trend in the industry towards encodings that enable generic encoders and decoders, which handle the lexical level of serialization and deserialization. This, of course, was not new (RFC 713 had it in 1976, and ASN.1 BER decoding is generic on the lowest level, too). The good thing is that these generic codecs can be hardened and used in a variety of applications; the bad thing is that the hardening does not always happen and becomes harder as the generic codec becomes more complex. While the focus has been on text codecs (XML, JSON) for a while, binary codecs also exist (CBOR being an example from this decade, patterned on the earlier msgpack).

Binary generic codecs can opt to carry redundant bytewise length information (as BER does), or opt to count items instead of bytes except at the lowest level (as CBOR does). Bytewise lengths cause pain when encoding serially, because the length of a nested structure must be known before its content is emitted, and the redundancy they introduce is an attacker’s playground: inconsistent inner and outer lengths can be used to confuse a decoder. They are also useful, however, when entire subtrees need to be skipped in one go; CBOR requires visiting all nodes of the subtree being skipped (unless the “wrap in byte string” design pattern is employed).
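
The trade-off can be sketched as follows. This assumes a deliberately simplified TLV (one tag byte, a two-byte length) as a stand-in for BER-style byte counting, not real BER, and a definite-length-only subset of CBOR; all function names are hypothetical. tlv_skip jumps over a subtree with one addition, tlv_children_fit shows the consistency check that the redundant lengths make necessary, and cbor_skip has to visit every nested item.

```python
def tlv_skip(data, pos=0):
    """Skip one byte-counted item: tag (1 byte), length (2 bytes), value."""
    if pos + 3 > len(data):
        raise ValueError("truncated header")
    end = pos + 3 + int.from_bytes(data[pos + 1:pos + 3], "big")
    if end > len(data):
        raise ValueError("length exceeds input")
    return end


def tlv_children_fit(data, pos=0):
    """Check that nested lengths stay inside the parent's declared length."""
    end = tlv_skip(data, pos)
    child = pos + 3
    while child < end:
        try:
            child = tlv_skip(data[:end], child)  # confine child to parent
        except ValueError:
            return False
    return True


def _cbor_head(data, pos):
    """Read a CBOR initial byte plus argument; definite lengths only."""
    if pos >= len(data):
        raise ValueError("truncated input")
    major, info = data[pos] >> 5, data[pos] & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("indefinite/reserved encoding not handled here")
    width = 1 << (info - 24)  # 1, 2, 4, or 8 argument bytes
    if pos + width > len(data):
        raise ValueError("truncated input")
    return major, int.from_bytes(data[pos:pos + width], "big"), pos + width


def cbor_skip(data, pos=0):
    """Skip one item-counted item; returns (new position, heads visited)."""
    major, arg, pos = _cbor_head(data, pos)
    visited = 1
    if major in (2, 3):  # byte/text string: the only byte-counted level
        if pos + arg > len(data):
            raise ValueError("truncated string")
        return pos + arg, visited
    children = {4: arg, 5: 2 * arg, 6: 1}.get(major, 0)  # array, map, tag
    for _ in range(children):
        pos, n = cbor_skip(data, pos)
        visited += n
    return pos, visited
```

For the CBOR encoding of [1, [2, 3], "ab"], bytes.fromhex("8301820203626162"), cbor_skip reads six heads to find the end of the item, whereas tlv_skip reads a single header regardless of how deeply the value is nested.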

Generic codecs can be intricately tied to an arcane data model that calls for a data description language. ASN.1 is one example, where the tie is perceived to be so tight that it triggered this entire exchange; XML is another, although it at least saw an evolution of data description languages over its lifetime. Alternatively, they can try to map to a generic data model that a programmer might want to use directly. JSON, msgpack, and CBOR share most of one such generic data model and differ mostly in coverage and extensibility at this level. Data description languages can then help map from that generic data model to the application data model, possibly also validating input data in the process, but they are not required (“schemaless” decoding); CDDL is one such data description language.
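
To illustrate the two steps, here is a small sketch using JSON from the Python standard library: json.loads performs the schemaless decoding into the generic data model, and a hand-written matches() function stands in for a data description language. It is not CDDL and not a CDDL tool; the shape notation and the example document are made up for this sketch. In CDDL, the same shape would be written roughly as { name: tstr, ports: [* int] }.

```python
import json


def matches(value, shape):
    """Check a decoded value against a tiny, hypothetical shape notation.

    A shape is a Python type (str, int, ...), a one-element list [T] for
    "array of T", or a dict mapping required keys to shapes.
    """
    if isinstance(shape, type):
        # bool is a subclass of int in Python; keep the two apart.
        if shape is int and isinstance(value, bool):
            return False
        return isinstance(value, shape)
    if isinstance(shape, list):
        return isinstance(value, list) and all(
            matches(item, shape[0]) for item in value)
    if isinstance(shape, dict):
        return (isinstance(value, dict)
                and value.keys() == shape.keys()
                and all(matches(value[k], s) for k, s in shape.items()))
    raise TypeError("unsupported shape")


SHAPE = {"name": str, "ports": [int]}

# Step 1: schemaless decoding into dicts, lists, strings, and numbers.
doc = json.loads('{"name": "saag", "ports": [443, 8443]}')

# Step 2 (optional): validate against the application's expectations.
ok = matches(doc, SHAPE)
```

The decoded document is usable directly after step 1; step 2 only adds a check that it has the shape the application expects, so a document with "ports": ["443"] would decode fine but fail matches().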

Grüße, Carsten