Re: [Acme] [jose] [Json] Signed JSON document / Json Content Metaheader / JSON Container

Daniel Kahn Gillmor <dkg@fifthhorseman.net> Thu, 29 January 2015 19:40 UTC

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: acme@ietf.org
Date: Thu, 29 Jan 2015 14:40:26 -0500
Archived-At: <http://mailarchive.ietf.org/arch/msg/acme/_bvueuvTuO-LVXFM8yEi1qbJuw4>

On Thu 2015-01-29 08:53:58 -0500, Phillip Hallam-Baker wrote:
> Canonicalization is the stupidest idea in computer security. It is never
> ever necessary and never ever implemented reliably.
>
> A digital signature signs a sequence of bits. So if you ever want to check
> a signature again, make sure you keep hold of your original sequence of
> bits. Simple!
>
> I see people say that canonicalization is 'essential' in every discussion
> of signatures. What I have never seen is an example of something that is a
> reasonable thing to do that goes wrong if you don't have C15N.

RFC 1847 (Security Multiparts for MIME) [0] recommended this "keep the
original sequence of bits intact" variant way back in 1995:

>>    The entire contents of the multipart/signed container must be treated
>>    as opaque while it is in transit from an originator to a recipient.

But some implementations still get this wrong, and it breaks some signed
messages today, depending on how those messages are handled in transit.
(This is about RFC 822 header canonicalization, fwiw, not JSON
canonicalization, but the point is the same.)

For example, in 2004 (probably earlier), it was observed that Python's
e-mail parser failed to "keep the original sequence of bits intact" [1],
and the bug is still not fixed today, afaict.

We're talking about nearly 20 years of failing at this "easy"
approach.

Taking the other approach, OpenPGP signatures of textual data use a
canonicalized document format (specifying <CR><LF> line-endings,
ignoring whitespace at the end of lines, back in 1998) [2], and that
seems to have actually worked out OK in terms of interop, though i've
seen some (quickly-fixed) errors pop up over the years.
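
To make the OpenPGP-style rule concrete, here is a rough sketch in
Python of the two normalizations mentioned above (CRLF line endings,
trailing whitespace ignored).  The function name canonicalize_text and
the sample inputs are made up for illustration, and this is not the
full procedure from [2] (it skips dash-escaping and the handling of the
final line ending, among other things):

```python
import hashlib


def canonicalize_text(data: bytes) -> bytes:
    """Strip trailing spaces/tabs from each line and join lines with CRLF.

    Hypothetical helper: a simplified take on text canonicalization,
    not a complete implementation of any RFC.
    """
    # Fold every line-ending style down to LF, then split into lines.
    lines = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n").split(b"\n")
    # Drop trailing spaces and tabs, then re-join with CRLF.
    return b"\r\n".join(line.rstrip(b" \t") for line in lines)


# Two copies of the "same" text, as they might look after passing
# through different tools (sample data, assumed for this sketch).
unix_copy = b"hello world  \nsecond line\t\n"
dos_copy = b"hello world\r\nsecond line\r\n"

# The raw bytes differ, so a digest over the raw bytes differs too...
raw_differs = hashlib.sha256(unix_copy).digest() != hashlib.sha256(dos_copy).digest()

# ...but the digest over the canonical form is the same for both.
digest_unix = hashlib.sha256(canonicalize_text(unix_copy)).hexdigest()
digest_dos = hashlib.sha256(canonicalize_text(dos_copy)).hexdigest()
same_digest = digest_unix == digest_dos
```

A signature computed over the canonical form would survive that kind
of line-ending and trailing-whitespace mangling, which is the whole
appeal.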

People want to investigate and parse and display the content they're
working with, even if there is a signature on it.  It's very tempting
for implementors to treat the parsed form as the sole internal data, and
not to keep around the other binary blob.  Having a well-documented
canonicalization procedure makes that easier.
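
As a small illustration of that temptation, here is a sketch using
Python's standard json module.  The sample document and the variable
names are made up; the point is only that a parse/re-serialize round
trip usually does not give back the bytes that were signed, so an
implementation that keeps only the parsed form has nothing left to
verify against:

```python
import hashlib
import json

# The bytes that were (hypothetically) signed, odd spacing and all.
original = b'{"b": 1,  "a":2}'

# An implementation that keeps only the parsed form...
parsed = json.loads(original)

# ...and later tries to regenerate the bytes for verification.
reserialized = json.dumps(parsed).encode("utf-8")

# Same data, different byte sequence, hence a different digest.
same_data = json.loads(reserialized) == parsed
bytes_changed = reserialized != original
digest_changed = (
    hashlib.sha256(original).digest() != hashlib.sha256(reserialized).digest()
)

# Keeping the original blob next to the parsed form avoids the problem,
# at the cost of carrying two representations around.
kept = {"raw": original, "parsed": parsed}
still_verifiable = (
    hashlib.sha256(kept["raw"]).digest() == hashlib.sha256(original).digest()
)
```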

i'm not saying that canonicalization is necessarily the right solution
here, but it's not like the "just keep your bits intact" proposal is
particularly robust either.

       --dkg

[0] https://tools.ietf.org/html/rfc1847#page-4
[1] http://bugs.python.org/issue968430
[2] https://tools.ietf.org/html/rfc2440#section-7.1