Re: [saag] post-X509 cryptographic identities

Henry Story <henry.story@gmail.com> Sat, 15 February 2020 08:36 UTC

From: Henry Story <henry.story@gmail.com>
In-Reply-To: <CAMm+LwjiGGyjwRMJSid664bry4-0YfEVu8Nj_gu2qfwE2RdVxw@mail.gmail.com>
Date: Sat, 15 Feb 2020 09:36:43 +0100
Cc: IETF SAAG <saag@ietf.org>, Manu Sporny <msporny@digitalbazaar.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <11112C67-D11A-4240-9A14-32B3603E0387@gmail.com>
References: <26497.1581418516@dooku> <20200212002125.GO18021@localhost> <alpine.DEB.2.20.2002131443470.25433@grey.csi.cam.ac.uk> <20200213171324.GP18021@localhost> <d3d01f1f-5784-da84-1c59-e636d349bd2a@netmagic.com> <20200213175626.GR18021@localhost> <65357327-e2d7-89cc-221e-ed8ac2875048@netmagic.com> <A91F5BD6-BFBA-4BA7-9158-3F41A8F0F7D9@gmail.com> <20200213191952.GS18021@localhost> <9FEBBD2A-3578-436A-92E3-192CADC9FA8B@gmail.com> <20200213205158.GT18021@localhost> <CAMm+LwhAXWbVL=j3Cek_Sf9eK-aKsQgZ+Gsh55nP3nvur_JSEQ@mail.gmail.com> <CB2A6E0B-E48D-4C1B-9F85-BA6A93963ED6@gmail.com> <CAMm+LwjiGGyjwRMJSid664bry4-0YfEVu8Nj_gu2qfwE2RdVxw@mail.gmail.com>
To: Phillip Hallam-Baker <phill@hallambaker.com>
X-Mailer: Apple Mail (2.3608.60.0.2.5)
Archived-At: <https://mailarchive.ietf.org/arch/msg/saag/8JpCSDYCP1mdvPxp_aI7i4k1Rig>
Subject: Re: [saag] post-X509 cryptographic identities

On 14 Feb 2020, at 23:26, P. Hallam-Baker <phill@hallambaker.com> wrote:

> On Fri, Feb 14, 2020 at 3:42 PM Henry Story <henry.story@gmail.com> wrote:
>
>> On 14 Feb 2020, at 20:44, Phillip Hallam-Baker <phill@hallambaker.com> wrote:
>>> Syntax is the least important part of PKI but it is a part of
>>> the puzzle. Why oh why do people think canonicalization is relevant? If
>>> you want to be able to verify a signature, you have to keep the original
>>> bits that were signed. End of story.
>>
>> The way to make syntax unimportant is to work on the semantic level. 
>> That is in a way what the Semantic Web does by starting from naming, 
>> and leaving syntax decisions open, allowing multiple ones: 
>> RDF/XML, JSON-LD, Turtle, NTriples, Binary RDF, CSV …
>>
>> But once one moves away from syntax it then becomes very important to
>> canonicalize data, exactly so as to move away from being tied to syntax.
>> A canonicalisation of the data then even allows one to discard the
>> original bytes, if one does not wish to keep two versions of
>> everything around.
>>
> No, it does not become important to canonicalize. We have been trying and
> failing at that for 30 years and there is no reason to believe things will
> change in the future.

It is true that I am not sure what the computational properties 
of the RDF Normalisation algorithm are. Manu Sporny will have more 
to say on that. 
 
 https://json-ld.github.io/normalization/spec/

>
> Digital signatures authenticate the presentation of the data. If you are
> using Schnorr signatures, the usual form of the signature isn't even
> deterministic. So sign the same octet sequence twice and you get two
> different signatures.

(I wonder if we are talking at cross purposes here.) 
If one has a working Normalisation algorithm, one can reconstruct 
the exact byte sequence from the data at any time. Hence there is no
need to keep the signed byte-sequence around. Clearly one should
not discard the signature itself, but add that to the database.
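
To make that concrete, here is a minimal sketch in Python. It is not the
RDF normalisation algorithm linked above: it assumes the data consists only
of ground triples (no blank nodes, which is where the real algorithm does
most of its work), and the helper name canonical_bytes is made up for
illustration. The point is only that the signed bytes can be regenerated
from the data however it happens to be stored.

```python
import hashlib

# Hypothetical helper, NOT the RDF normalisation algorithm:
# it only handles ground triples (no blank nodes).
def canonical_bytes(triples):
    """Serialise (subject, predicate, object) strings into one
    deterministic byte sequence: de-duplicated, sorted lines."""
    lines = sorted({f"{s} {p} {o} ." for (s, p, o) in triples})
    return ("\n".join(lines) + "\n").encode("utf-8")

def digest(triples):
    """SHA-256 over the canonical bytes; this is what would be signed."""
    return hashlib.sha256(canonical_bytes(triples)).hexdigest()

# The same data as it might come back from two different stores:
# different order, and one duplicated triple.
store_a = [
    ("<https://example.org/#me>", "<http://xmlns.com/foaf/0.1/name>", '"Alice"'),
    ("<https://example.org/#me>", "<http://xmlns.com/foaf/0.1/knows>", "<https://example.org/#bob>"),
]
store_b = list(reversed(store_a)) + [store_a[0]]

same = digest(store_a) == digest(store_b)
```

Here same is true because both stores yield identical canonical bytes, so
only the signature needs to be kept next to the data.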

> I have never seen a situation where discarding the signed octet sequence
> makes the slightest sense. I am currently using a machine with 64GB of RAM
> and 2TB of disk and that isn't uncommon. A RaPi4 comes with up to 4GB 
> these days. Signed assertions (SAML, PKIX, Mesh) are rarely more than 
> a few KB.

Ah I see. You have a hash of a document which you then sign, and that is
no more than a few KB, which one should keep. I am fine with that. 
And you are right that that is probably all you need if you are signing
a document. But what if you are signing data?

The problem there is that in order to verify the hash, one needs to be able
to reconstruct the same byte sequence from the data. The data would usually
be a lot larger than a few KB.

Furthermore, the data may be stored in any of the formats I listed
above, including binary ones. So one would rather avoid having to
recompute a signature every time someone requests a different format.
If you tie a signature to a data format, then you can never get rid of the
format in which the data was initially signed. On the other hand, if it
is not tied to the format, it can be calculated once, and then served up
in whatever format is requested.
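
A rough Python sketch of "sign once, serve in any format" follows. The two
toy formats (a JSON list and tab-separated lines), the prefixed names and
all function names are assumptions made up for illustration. An HMAC with a
demo key stands in for a real public-key signature so that the example
needs only the standard library.

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # assumption: stand-in for a real signing key

def canonical_bytes(triples):
    # Simplified canonical form: sorted, de-duplicated lines.
    # Ground triples only; blank nodes are out of scope here.
    return "\n".join(sorted({" ".join(t) for t in triples})).encode("utf-8")

def sign(triples):
    # HMAC used purely as a placeholder for a public-key signature.
    return hmac.new(KEY, canonical_bytes(triples), hashlib.sha256).hexdigest()

def verify(triples, signature):
    return hmac.compare_digest(sign(triples), signature)

def from_json(text):
    return [(d["s"], d["p"], d["o"]) for d in json.loads(text)]

def from_tsv(text):
    return [tuple(line.split("\t")) for line in text.splitlines() if line]

# The same two statements in two serialisations, in different order.
as_json = json.dumps([
    {"s": "ex:me", "p": "foaf:name", "o": '"Alice"'},
    {"s": "ex:me", "p": "foaf:knows", "o": "ex:bob"},
])
as_tsv = 'ex:me\tfoaf:knows\tex:bob\nex:me\tfoaf:name\t"Alice"\n'

signature = sign(from_json(as_json))          # computed once
ok_tsv = verify(from_tsv(as_tsv), signature)  # checked against another format
```

The signature is computed over the canonical form rather than over either
serialisation, so the TSV copy verifies without the JSON bytes being kept.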

>
> The immense complexity of canonicalization means that it should be avoided
> wherever possible. Instead we have people swooping in demanding that it
> be required. Well time to put a lid on that.

So it makes sense for documents that are tied to a byte sequence to not
go through canonicalization. But for data it is a different problem.

Henry