Re: [saag] ASN.1 vs. DER Encoding

Nico Williams <nico@cryptonector.com> Tue, 26 March 2019 21:48 UTC

Date: Tue, 26 Mar 2019 16:48:18 -0500
From: Nico Williams <nico@cryptonector.com>
To: "Dr. Pala" <madwolf@openca.org>
Cc: "saag@ietf.org" <saag@ietf.org>
Message-ID: <20190326214816.GB4211@localhost>
In-Reply-To: <20190326164951.GX4211@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/saag/wcrOD59Xl8yziQ12Rsx8Tm6EaiU>
Subject: Re: [saag] ASN.1 vs. DER Encoding

I wrote earlier that:

> And then they (rightly!) hate BER/DER/CER, so they propose inventing
> something new, often badly.  In this field, there is nothing new.  I'm
> sure even flatbuffers isn't new.
> 
> I say "rightly" because TLV encodings are just terrible.  We really do
> need non-TLV encodings (see below).

Now to back up that assertion:

1) TLV encodings are bloated by nature due to being highly redundant:
   every value, including every nested structure, carries its own tag
   and length.

2) That redundancy is a source of errors when manually coding a codec.

   Now, this is not that big a deal because we should all be using code
   generators, and no one should be writing a codec by hand.  Yet there
   is so much hand-rolled BER/DER codec code out there...

   Also, all encodings will have lengths buried in structures whose
   lengths are also written elsewhere -- this redundancy is not entirely
   avoidable, but TLVs add more of it than is absolutely necessary.
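
To make points 1 and 2 concrete, here is a minimal sketch in Python (my
own illustration, not taken from any particular codec) that hand-encodes
SEQUENCE { INTEGER 5, OCTET STRING "hi" } in DER.  The helper names
der_length and tlv are assumptions made up for this example, and only
single-byte tags are handled.

```python
def der_length(n: int) -> bytes:
    """Encode a length in DER's minimal definite form."""
    if n < 0x80:
        return bytes([n])                       # short form: a single byte
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body     # long form: count byte, then length

def tlv(tag: int, value: bytes) -> bytes:
    """Prefix a value with a single-byte tag and its DER length."""
    return bytes([tag]) + der_length(len(value)) + value

# SEQUENCE { INTEGER 5, OCTET STRING "hi" }
inner = tlv(0x02, b"\x05") + tlv(0x04, b"hi")
outer = tlv(0x30, inner)

payload = 1 + 2                     # actual data bytes: 0x05 and "hi"
overhead = len(outer) - payload     # everything else is tags and lengths

print(outer.hex(" "))               # 30 07 02 01 05 04 02 68 69
print(payload, overhead)            # 3 6
```

Three bytes of data cost six bytes of tags and lengths here, and the
outer length (07) merely restates the sum of the inner TLVs, which is
exactly the kind of redundancy a hand-written codec can get wrong.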

3) DER is a canonical variant of BER, using a) definite length
   encodings of structures (SEQUENCEs) and other things, and b)
   minimal-length variable-length encodings of lengths and values.

   This has a few negative side-effects:

   a) it's not possible to know a length until the value it is the
      length of has been encoded, which means one must encode "from the
      right",

   b) there is no possibility of on-line encoding.

   On the other hand, CER uses indefinite length encoding, which avoids
   those two issues, but then nobody uses CER, not in Internet protocols
   anyway.  IIRC some other choice made in CER's specification turns
   out to be suboptimal, thus the choice of DER or CER is always
   dissatisfying in some sense.

   To be fair, we shouldn't need canonical encodings at all.  And yet
   isn't there a canonical JSON effort?  We never seem to fully stop
   having to re-encode structures...

   Also, it's fair to note that while DER has no possibility of on-line
   encoding, CER's use of indefinite length encodings means that
   decoding is necessarily online, with the recipient not able to know
   the total size of a message as it reads it.

   On the whole, indefinite length encodings are better.
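
The difference between the two styles can be sketched as follows.  This
is an illustrative Python sketch under simplifying assumptions: the
function names are invented for the example, only short-form lengths
(under 128 bytes) are handled, and the second function shows BER's
indefinite-length form for a SEQUENCE rather than a complete CER
encoder.

```python
import io

def definite(tag: int, value: bytes) -> bytes:
    """Definite-length TLV; this sketch handles short-form lengths only."""
    n = len(value)
    assert n < 0x80, "sketch handles short-form lengths only"
    return bytes([tag, n]) + value

def encode_definite_sequence(items) -> bytes:
    """DER style: all children must be encoded and buffered before the
    parent's length byte can be written."""
    body = b"".join(definite(0x04, item) for item in items)
    return definite(0x30, body)

def stream_indefinite_sequence(items, out) -> None:
    """Indefinite-length style: the header goes out immediately and
    children are written as they become available."""
    out.write(b"\x30\x80")              # SEQUENCE, length octet 0x80 = indefinite
    for item in items:                  # items may be a generator
        out.write(definite(0x04, item))
    out.write(b"\x00\x00")              # end-of-contents octets

items = [b"ab", b"cd"]
print(encode_definite_sequence(items).hex(" "))
# 30 08 04 02 61 62 04 02 63 64

buf = io.BytesIO()
stream_indefinite_sequence(iter(items), buf)
print(buf.getvalue().hex(" "))
# 30 80 04 02 61 62 04 02 63 64 00 00
```

The streaming version never needs to hold the whole SEQUENCE in memory,
but a reader sees 30 80 and cannot tell how much data follows until it
reaches the end-of-contents octets.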

4) BER (and DER and CER) is supposed to be self-describing, which means
   "you don't need to know the schema in order to parse the message",
   but this is only half true, as type information is lost when using
   IMPLICIT tags (you get to know if a value is of structured or scalar
   type, but not the actual type).

   Using EXPLICIT tags, on the other hand, makes the encoding a TLTLV
   encoding, thus adding even more bloat!  We use EXPLICIT tags in
   Kerberos, FYI.
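
A small Python sketch of the two tagging modes, applied to the value
INTEGER 5 under context tag [0].  The tlv helper is an assumption made
up for this example and handles only single-byte tags and short-form
lengths.

```python
def tlv(tag: int, value: bytes) -> bytes:
    """Single-byte tag, short-form length, value."""
    assert len(value) < 0x80, "sketch handles short-form lengths only"
    return bytes([tag, len(value)]) + value

# Plain INTEGER 5 with its universal tag (0x02).
plain = tlv(0x02, b"\x05")

# [0] IMPLICIT INTEGER: the context tag (0x80, primitive) replaces the
# universal tag, so the bytes no longer say "this is an INTEGER".
implicit = tlv(0x80, b"\x05")

# [0] EXPLICIT INTEGER: the context tag (0xA0, constructed) wraps the
# complete inner TLV, giving tag-length-tag-length-value.
explicit = tlv(0xA0, plain)

print(plain.hex(" "))       # 02 01 05
print(implicit.hex(" "))    # 80 01 05
print(explicit.hex(" "))    # a0 03 02 01 05
```

A reader without the schema can tell that 80 01 05 is a one-byte
primitive value but not that it is an INTEGER; the EXPLICIT form keeps
the type at the cost of two extra bytes per tagged field.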

5) If you're doing anything other than dumping a structure, you don't
   need it to be self-describing -- you'll know the schema, not least
   because we publish the schemas.  As long as there's an indicator of
   top-level type on the outside, you can decode the inside by reference
   to the schema and encoding rules without needing TLV encoding rules.

   Thus there is almost zero benefit to self-describing encodings.
   
   Self-describing encodings are merely a crutch.  Perhaps they had
   their utility before compilers became the norm, but they provide
   no real benefit now.
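
As a rough illustration of schema-driven decoding, here is a Python
sketch of a made-up record format.  Everything in it is hypothetical
(the TYPE_RECORD indicator, the field layout, the function names); it is
not PER, XDR, or any other standardized encoding, only a demonstration
that a decoder which knows the schema needs no per-field tags.

```python
import struct

TYPE_RECORD = 1          # hypothetical top-level type indicator
HEADER = ">BIH"          # type (1 byte), id (uint32), name length (uint16)

def encode_record(ident: int, name: bytes) -> bytes:
    """Schema: Record { id: uint32, name: octet string }, big-endian."""
    return struct.pack(HEADER, TYPE_RECORD, ident, len(name)) + name

def decode_record(buf: bytes):
    """Decode purely by position: the schema says what comes next."""
    kind, ident, n = struct.unpack_from(HEADER, buf, 0)
    if kind != TYPE_RECORD:
        raise ValueError("unexpected top-level type")
    start = struct.calcsize(HEADER)
    name = buf[start:start + n]
    if len(name) != n:
        raise ValueError("truncated record")
    return ident, name

wire = encode_record(5, b"hi")
print(decode_record(wire))      # (5, b'hi')
```

The only type information on the wire is the single outer indicator;
the one length that remains is the one a variable-length field cannot
do without.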

I think I could make more arguments in this vein.  I'll stop here.

Nico
--