Re: [dmarc-ietf] indeterminisim of ARC-Seal b= value

Peter Goldstein <peter@valimail.com> Thu, 30 March 2017 16:10 UTC

In-Reply-To: <CAL0qLwbYCD2nsj62HxjqZ=Wt5oK8W8kbTJ+H5GiMN5M7rMSRAA@mail.gmail.com>
References: <CANtLugO_D1Mz_v_341pc5O1mZ7RhOTrFA3+Ob5-onp72+5uRfA@mail.gmail.com> <20170324212304.85346.qmail@ary.lan> <CANtLugOK4tXqA3ztYwchYsc8+t6KhyNj6mvgEu2wzvwKm_rK7A@mail.gmail.com> <CAL0qLwbYCD2nsj62HxjqZ=Wt5oK8W8kbTJ+H5GiMN5M7rMSRAA@mail.gmail.com>
From: Peter Goldstein <peter@valimail.com>
Date: Thu, 30 Mar 2017 09:10:22 -0700
Message-ID: <CAOj=BA0p2soUa2xoCFq=TAOKuB35AqgNeLuBMjO25EpPeukk2g@mail.gmail.com>
To: "Murray S. Kucherawy" <superuser@gmail.com>
Cc: Gene Shuman <gene@valimail.com>, "dmarc@ietf.org" <dmarc@ietf.org>, John Levine <johnl@taugh.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dmarc/g7XRN8qPi1Nw0r6Rilz137bfq0A>
Subject: Re: [dmarc-ietf] indeterminisim of ARC-Seal b= value

>
> I agree that it's impossible to design a signer test suite that confirms
> correct output, without doing crypto checks of its own with a known-good
> verifier, unless we nail down the syntax the output will use.


Great.  :)  I wasn't sure there was agreement on that in the wider thread.


> For this to work, we'd have to mandate a lot of things, including at least
> these:
> - the order of the tags as presented to the hash algorithm
> - which tags will be present (note that many are not required, including
> "t=")
> - the specific values they will all contain
> - for ARC-Message-Signature, which canonicalization will be used
> - the spacing between the tags; since "relaxed" header canonicalization
> compresses spaces but does not add them, "a=foo;b=foo" is not the same as
> "a=foo; b=foo", but "a=foo;\r\n\tb=foo" is
> - similarly, how signature fields will be wrapped (if at all)
> - what signing key will be used
> - the body content to be signed
> - the header content to be signed
> - the set of header fields that will be signed (which becomes "h=")
> It certainly removes many variables to nail down a test in this way, and
> it's faster to run tests that don't require crypto functions or a
> known-good verifier to confirm.  To be honest, I can't remember if we ever
> considered this sort of approach during the development of DKIM.  I don't
> think we did, and after an admittedly brief search I couldn't find anything
> in the archives.  I think when canonicalization was developed, it was plain
> that tag order wouldn't matter to the verifier, and that was sort of the
> end of it.


So while it is true that for a given test case you would need to constrain
all of these items, this is a much larger list than would need to be
specified as part of the proposal under consideration.

Many of these items are already constrained by the controlling RFC (ARC or
DKIM) and the other values specified in the signature.  For example, the
order in which different header fields are combined and hashed is defined
as a function of the tag/value pairs in the signature (i.e., in a DKIM
signature the 'b' value depends on the values chosen for the 'a', 'c', 'h',
etc. tags).  That ordering is therefore an input to the test
case.  The input message content (headers + body), the signing key(s), etc.
would all also be input to a particular test case.
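
As a rough illustration of how the signature's own tags fix the hash input,
here is a minimal Python sketch of the header selection that DKIM's 'h' tag
drives: each name in 'h' consumes the bottom-most unused instance of that
field, and a name with no remaining instance contributes nothing.  The
function name and the sample message are made up for illustration and are
not taken from any existing library.

```python
def select_signed_headers(headers, h_tag):
    """Return the (name, value) pairs to hash, in the order given by h=.

    headers: list of (name, value) tuples, top of the message first.
    h_tag:   colon-separated field names, as in a DKIM 'h' tag.
    """
    remaining = list(headers)
    selected = []
    for wanted in (n.strip().lower() for n in h_tag.split(":")):
        # Walk upward from the bottom so a repeated name takes the
        # last instance first, then the one above it, and so on.
        for idx in range(len(remaining) - 1, -1, -1):
            if remaining[idx][0].lower() == wanted:
                selected.append(remaining.pop(idx))
                break
        # A name with no instance left is simply skipped.
    return selected


# Illustrative message: two Received fields, no To field.
message = [
    ("Received", "hop 2"),
    ("From", "a@example.org"),
    ("Received", "hop 1"),
    ("Subject", "hello"),
]
picked = select_signed_headers(message, "from:received:received:to")
```

Given the same message and the same 'h' value, the selection is fully
determined, which is why it can be treated as test input rather than as
something the proposal has to specify.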

A test suite should handle this by having test cases that exercise the
library with different inputs that give good coverage across allowed
variations.  That has the side benefit of allowing the test suite
implementer(s) to easily develop targeted test cases that cover those
specific variations in isolation, making it easier to find bugs.
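
To make "targeted test cases" a bit more concrete, here is a minimal sketch
in Python of table-driven cases for one isolated piece, the body hash under
"relaxed" body canonicalization.  The helper names and the sample bodies
are my own assumptions; a real suite would cover header handling and
signing as well.

```python
import base64
import hashlib
import re


def relaxed_body(body: bytes) -> bytes:
    """Apply DKIM 'relaxed' body canonicalization to CRLF-delimited bytes."""
    lines = body.split(b"\r\n")
    # Collapse runs of space/tab to one space, then drop trailing space.
    lines = [re.sub(rb"[ \t]+", b" ", line).rstrip(b" ") for line in lines]
    # Ignore empty lines at the end of the body.
    while lines and lines[-1] == b"":
        lines.pop()
    return b"".join(line + b"\r\n" for line in lines)


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonicalized body, as carried in 'bh='."""
    digest = hashlib.sha256(relaxed_body(body)).digest()
    return base64.b64encode(digest).decode()


# Each case varies exactly one thing (names and bodies are illustrative).
CASES = [
    ("empty body", b"", b""),
    ("missing final CRLF", b"Hi", b"Hi\r\n"),
    ("trailing blank lines", b"Hi\r\n\r\n\r\n", b"Hi\r\n"),
    ("trailing whitespace", b"a b \t\r\n", b"a b\r\n"),
    ("interior whitespace run", b"a \t  b\r\n", b"a b\r\n"),
]


def failing_cases():
    """Names of cases whose canonical form differs from the expected bytes."""
    return [name for name, raw, want in CASES if relaxed_body(raw) != want]
```

Because each case changes one variable, a failure points directly at the
rule that was implemented incorrectly.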

That said, some of the items in the list (inter-tag spacing, required
tags, canonical values for some of the tags) would need to be constrained
as part of this proposal.  This is clearly a change, but it's a change to a
draft standard that has a small number of implementers and should be relatively
easy to lock down.  And I think it's far less of a change than some others
on this list do.
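
For a sense of scale, constraining tag order and spacing could look roughly
like the Python sketch below.  The tag order and the "; " separator are
purely hypothetical choices on my part; the draft does not mandate either,
and the function name is invented for this example.

```python
# Hypothetical fixed order and separator; nothing in the current draft
# mandates these, they only show the shape such a rule could take.
TAG_ORDER = ["i", "a", "t", "cv", "d", "s", "b"]
SEPARATOR = "; "


def serialize_tags(tags):
    """Render a dict of tag -> value as one deterministic string."""
    unknown = sorted(set(tags) - set(TAG_ORDER))
    if unknown:
        raise ValueError(f"tags outside the fixed order: {unknown}")
    return SEPARATOR.join(
        f"{name}={tags[name]}" for name in TAG_ORDER if name in tags
    )


# The input order of the dict does not matter; the output order is fixed.
seal = serialize_tags({
    "cv": "none",
    "d": "example.org",
    "i": "1",
    "a": "rsa-sha256",
    "s": "sel",
    "b": "",
})
```

With a rule like this, two signers given the same inputs emit the same
bytes, so a test suite can compare output directly against a stored value.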

> The obvious counter-argument here is that DKIM has been successful without
> ever being strict about what a signature looks like.  ...  We reached a
> point where we had organically developed a bunch of "good" implementations
> based on the fact that they all appeared to agree with each other.


I think the bigger point being argued is that DKIM has been successful
without ever having a global test suite, because that's the impact of not
making this choice (or committing to a global reference implementation).

Which is mostly true, but frankly not entirely true.  I'm aware of at least
two major vendors who have production DKIM signing implementations with
edge case bugs that would've been caught by such a suite.  Those
implementations made it through development, passed whatever
implementation-specific test cases the developers built, and made it to production.
That's what a good test suite helps you avoid.

> Since the canonicalization algorithms in ARC are the same as the ones in
> DKIM, and the tag=value and key syntaxes are also the same, we've got a lot
> of concepts and code being recycled here.  I'm not sure how much new
> fragility we should really be concerned about.  I'll know more as I
> complete my implementation.


> Paul's point that video and security protocols are traditionally much more
> rigid is of course quite true, and I presume he's thinking of things like
> the headers of DNS, IP, TCP, etc.  But I note that those are based on a
> different model where fields have fixed sizes and predictable positions.
> That's antithetical to email, however, where the fields come in arbitrary
> order, and the only way you know what a given header means is that its name
> precedes its value.  There can even be duplication.  And from one MTA to
> another, header fields can be added, removed, or rearranged.  That's
> impossible in a rigid header definition, so it's perhaps no surprise that
> this community isn't exactly warm to the idea.


I'm assuming that I'm Paul.  :)

I am thinking of protocols like DNS, IP, TCP, but also standards that are
higher up the stack like some video protocols.  The latter standards often
allow large scale re-ordering of the individual chunks and streams of the
video, but usually have byte level constraints on the content of those
chunks.  In my view that's exactly analogous to what I'm proposing here.

I'm also recognizing that, once a standard has a binary component (i.e.
hashes), the level of flexibility that's being claimed no longer actually
exists.  As a specific example, take any email message you like that has
passed DKIM.  Insert a single space between two tag/value pairs in the
passing DKIM signature where there was no whitespace before (or, under
"simple" header canonicalization, add any whitespace at all).  The message
now fails DKIM.  That means the semantic content of the message (and more
importantly, how receivers will treat it) will change.
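
The whitespace sensitivity can be checked with a few lines of Python.  This
is a minimal sketch of "relaxed" header canonicalization applied to the
placeholder tag strings from earlier in the thread; it is a simplification
that hashes only the one field, and the helper names are mine.

```python
import hashlib
import re


def relaxed_header(name: str, value: str) -> bytes:
    """Apply DKIM 'relaxed' header canonicalization to one field.

    value is the field body without its terminating CRLF.
    """
    value = re.sub(r"\r\n(?=[ \t])", "", value)  # unfold continuation lines
    value = re.sub(r"[ \t]+", " ", value)        # collapse whitespace runs
    return f"{name.strip().lower()}:{value.strip(' ')}\r\n".encode()


def digest(value: str) -> str:
    """SHA-256 over the canonicalized field (simplified hash input)."""
    return hashlib.sha256(relaxed_header("DKIM-Signature", value)).hexdigest()


tight = digest("a=foo;b=foo")          # no whitespace between tags
spaced = digest("a=foo; b=foo")        # one space added
folded = digest("a=foo;\r\n\tb=foo")   # folded line, unfolds to one space
doubled = digest("a=foo;  b=foo")      # two spaces, collapsed to one
```

Here `tight` differs from `spaced`, while `spaced`, `folded`, and `doubled`
all agree: whitespace introduced where none existed changes the hashed
bytes, and additional whitespace next to existing whitespace does not.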

There seems to be a lot of confusion in this thread about the difference
between strictly specifying fields inside a single new header, and changes
that cross headers.  None of the above mentioned changes to the header set
- adding, deleting, duplicating, etc. - is impacted in any way by the
proposal under discussion.  ARC already needs to handle those cases for the
construction of the ARC seal, and much of the complexity of ARC is around
canonicalizing the header set (specifying an order, eliminating duplicates)
and handling the corner cases in the face of such traditional flexibility.

> The question, then, is whether this is a desirable path to follow: Is the
> value of a quick evaluation of a deterministic signer high enough to
> surrender the flexibility of DKIM style of tag spacing and ordering?  And,
> perhaps more importantly, what's the cost of being wrong later?


Agreed.  But I'd rephrase it as:

1. What is the value of having an open, globally available test suite for
ARC against which any implementer can test a new implementation or changes
to an existing implementation to ensure compatibility?  This is compared
to the current system of ad hoc interops at industry events.
2. What is the value of the flexibility of DKIM style of tag spacing and
ordering in the context of ARC?
3. What's the cost of sacrificing #2 for #1?

I hope I've made it clear why I think #1 is valuable.

I haven't actually heard much of an argument for #2 in this thread - I'd
love to understand the perceived benefits.

Most of the arguments have centered around #3, and the idea that many
ARC implementations will be adapted from existing DKIM implementations,
which don't have this constraint, and that those implementations will ignore
the new constraint.

Frankly I'm skeptical about this.  DKIM is made up of relatively simple
components (RSA signing, Base64 encoding, even canonicalization) that are
mostly available as standalone libraries in many languages.  Moreover, if
you survey the number of existing libraries per language that implement
both DKIM signing and verification, you come up with a pretty small number
for most popular languages (1-3), and many of those projects are moribund.
In many cases it'll be easier to start from scratch, especially given the
bulk of ARC's complexity is in areas that don't really exist in DKIM
(header set ordering by i, etc.).  I know that if I write a Ruby
implementation, that's likely how I'd approach it.

Even if the DKIM code is adopted, we're talking about a very specific set
of changes to two isolated areas.  Looking at OpenARC in particular, that
code is already imposing a preferred tag/value ordering and spacing on the
signing side.  It's just buried in the implementation details.
Changing/standardizing it would not be particularly difficult.  Changing
the verifying code to validate the ordering/spacing would also not be
particularly difficult.  That's pretty much all of the work as it applies to
that implementation.
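
To show how small the verifier-side check could be, here is a minimal
Python sketch.  It assumes a hypothetical strict form (a fixed tag order,
exactly "; " between tags, no trailing semicolon); none of that comes from
the draft or from OpenARC, and the function name is invented.

```python
EXPECTED_ORDER = ["i", "a", "t", "cv", "d", "s", "b"]  # hypothetical order


def strict_problems(value):
    """List the ways a tag list deviates from the assumed strict form."""
    problems = []
    names = []
    for part in value.split("; "):
        name, sep, val = part.partition("=")
        # Any stray whitespace or leftover ';' means the spacing was off.
        if not sep or name != name.strip() or val != val.strip() or ";" in val:
            problems.append(f"spacing or syntax: {part!r}")
        names.append(name.strip())
    unknown = [n for n in names if n not in EXPECTED_ORDER]
    if unknown:
        problems.append(f"unexpected tags: {unknown}")
    if len(set(names)) != len(names):
        problems.append("duplicate tags")
    ranks = [EXPECTED_ORDER.index(n) for n in names if n in EXPECTED_ORDER]
    if ranks != sorted(ranks):
        problems.append("tags out of order")
    return problems
```

An empty list means the value already matches the strict form; anything
else tells the implementer which rule was broken before any crypto runs.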

Best,

Peter

-- 


Bringing Trust to Email

Peter Goldstein | CTO & Co-Founder

peter@valimail.com
+1.415.793.5783