Re: [dns-privacy] I-D Action: draft-ietf-dprive-xfr-over-tls-02.txt

Sara Dickinson <sara@sinodun.com> Tue, 21 July 2020 15:08 UTC



> On 13 Jul 2020, at 23:35, Tony Finch <dot@dotat.at> wrote:
> 
> I've had a read through and here are a few, er, I mean several things that
> caught my eye:
> 

Hi Tony, 

Many thanks for the detailed review!

> 
> In the intro, I think it's too strong to say that RFC 5155 was "to
> prevent" zone enumeration - its abstract says it "provides measures
> against" which is a more accurate guide to NSEC3's effectiveness.
> Also the same paragraph could probably be more clear that NSEC5 is not a
> practical thing (yet? or likely ever?). I.e., neither of them are really
> useful privacy mechanisms.

Yes - agree this could be more specific on both topics. Will re-word as suggested.

> 
> 4.2 IXFR - RFC 1995 doesn't use RFC 1123-style requirements keywords (and
> obviously it predates RFC 2119) so I don't think you can say the
> lower-case "should" is non-normative. Spelling "forth" -> "fourth" :-)

Both fixed.

> 
> The last paragraph in this section should have a cross-reference to the
> section that describes the new IXFR requirements in detail. If these
> requirements are supposed to apply to pure TCP as well as IXoT then it's
> probably worth promoting them to a top-level section to make it more
> obvious that they exist, independent of TLS. Apart from this paragraph,
> section 4 looks more like a non-normative summary of existing
> specifications, which is useful background information, but I think it's
> helpful to clearly separate normative and informative sections.

Agree. These requirements are meant to apply specifically to IXFR-over-TCP so I have created a separate top level section with the title ‘Update to RFC1995 for IXFR-over-TCP’ and moved the normative statements there.

> 
> 4.3 Is it worth discussing information leakage about which zones are
> present on a secondary? i.e. is that part of the threat model?

We didn’t include that in this threat model because it can in principle be discovered by active measurements of the DNS, but it might be worth a sentence to explain that.

> 
> 5.3 I'm not sure I understand what this section is getting at. Is it
> saying that a client can have either an XoTCP or an XoTLS connection, but
> not both? Because it should try to limit itself to one connection of any
> kind for zone transfers?

Not at all - perhaps it needs re-wording. RFC7766 recommended client/server connections be:

* one TCP connection for regular queries
* one TCP connection for zone transfers
* one connection per other transport on top of TCP (the implication being this is used for everything)

As an author on RFC7766 I was surprised when I went back to read it and could not remember the specific rationale for it!

The intention in this section is to update this guidance to say that all connection-based transports should use separate connections for regular queries and zone transfers, just like TCP. So in principle the client/server interaction _could_ look something like:

* one TCP connection for regular queries
* one TCP connection for zone transfers
* one DoT connection for regular queries
* one XoT connection for zone transfers
* one DoH connection for regular queries
* one XFR-over-DoH connection for zone transfers

Listing the potential connections out like this might make it more obvious? 
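
As a rough illustration of that list (a sketch only, not text from the draft), a client could key its connections on the pair (transport, purpose) so that there is at most one connection per pair towards a given server. The names ConnectionTable, Transport and Purpose are hypothetical, and the connection handles are placeholder strings rather than real sockets:

```python
from enum import Enum
from typing import Dict, Tuple


class Transport(Enum):
    TCP = "TCP"
    DOT = "DoT"
    DOH = "DoH"


class Purpose(Enum):
    QUERY = "regular queries"
    XFR = "zone transfers"


class ConnectionTable:
    """Hypothetical per-server table: one connection per (transport, purpose)."""

    def __init__(self) -> None:
        self._conns: Dict[Tuple[Transport, Purpose], str] = {}

    def get(self, transport: Transport, purpose: Purpose) -> str:
        # Re-use the existing connection for this pair, or create a
        # placeholder handle the first time the pair is requested.
        key = (transport, purpose)
        if key not in self._conns:
            self._conns[key] = f"conn-{len(self._conns) + 1}"
        return self._conns[key]

    def __len__(self) -> int:
        return len(self._conns)
```

With three transports and two purposes the table can hold at most six connections, matching the six bullets above.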

> 
> 5.4 What is the base DNS RCODE for non-XoT traffic on an XoT connection?
> (extended errors do not have a fixed association with RCODEs)
> What about non-EDNS queries?

Ah yes - we should have specified REFUSED here as the base RCODE - fixed.
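
To make that concrete, here is a minimal sketch of how a server might pick the base RCODE on an XoT connection. The function name and the idea of a configurable set of permitted query types are assumptions for illustration; only the QTYPE and RCODE numbers are the standard assigned values. Because the base RCODE lives in the DNS header, the same answer applies whether or not the query carried EDNS.

```python
from typing import FrozenSet, Optional

# Standard assigned values.
QTYPE_IXFR = 251
QTYPE_AXFR = 252
RCODE_REFUSED = 5

# Assumption: which query types count as "XoT traffic" is a policy input here.
DEFAULT_XOT_QTYPES: FrozenSet[int] = frozenset({QTYPE_AXFR, QTYPE_IXFR})


def base_rcode_on_xot_connection(
    qtype: int, allowed: FrozenSet[int] = DEFAULT_XOT_QTYPES
) -> Optional[int]:
    """Return REFUSED for non-XoT traffic, or None to process the query normally."""
    if qtype in allowed:
        return None
    return RCODE_REFUSED
```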

> 
> 5.6.2 AXoT
> 
> In the keepalive discussion, is the intention that a server can use a
> timeout of 0 to abort a connection in the middle of a transfer, or is it
> supposed to indicate that there can be no more transfers on the
> connection, but existing transfers in progress are allowed to finish?

RFC7828 says "A DNS client that receives a response that includes the
   edns-tcp-keepalive option with a TIMEOUT value of 0 SHOULD send no more
   queries on that connection and initiate closing the connection as
   soon as it has received all outstanding responses."

The intention of a 0 keepalive timeout is to stop further use of the existing connection; if the server needs to terminate a particular AXFR immediately then it still needs to close the connection at its end.
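
A small sketch of that behaviour, for illustration only. The option code (11) and the 2-octet TIMEOUT field are from RFC7828; the ClientConnection class and its attribute names are hypothetical. It shows a timeout of 0 stopping new queries while outstanding transfers are allowed to finish:

```python
import struct
from typing import Optional, Set

EDNS_TCP_KEEPALIVE = 11  # option code assigned in RFC7828


def encode_keepalive(timeout: Optional[int] = None) -> bytes:
    """Encode the option; TIMEOUT (units of 100 ms) is omitted when None."""
    if timeout is None:
        return struct.pack("!HH", EDNS_TCP_KEEPALIVE, 0)
    return struct.pack("!HHH", EDNS_TCP_KEEPALIVE, 2, timeout)


def decode_keepalive(option: bytes) -> Optional[int]:
    """Return the TIMEOUT value, or None if the option carries no timeout."""
    if len(option) < 4:
        raise ValueError("truncated option")
    code, length = struct.unpack("!HH", option[:4])
    if code != EDNS_TCP_KEEPALIVE or length not in (0, 2):
        raise ValueError("not a valid edns-tcp-keepalive option")
    if len(option) != 4 + length:
        raise ValueError("length mismatch")
    return struct.unpack("!H", option[4:6])[0] if length == 2 else None


class ClientConnection:
    """Hypothetical client-side state for one XoT connection."""

    def __init__(self) -> None:
        self.outstanding: Set[int] = set()  # message IDs awaiting responses
        self.accept_new_queries = True

    def on_keepalive(self, timeout: Optional[int]) -> None:
        # A timeout of 0 means: send no more queries on this connection.
        if timeout == 0:
            self.accept_new_queries = False

    def should_close(self) -> bool:
        # Close only once every outstanding response has arrived.
        return not self.accept_new_queries and not self.outstanding
```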

The mention of abort in this text is misleading I think. I suggest updating:

OLD:
Note that this requirement, combined
with the use of EDNS0 Keepalive, enables AXoT servers to signal the
desire to close a connection due to low resources by sending an EDNS0
Keepalive option with a timeout of 0 on any AXoT response (in the
absence of another way to signal the abort of a AXoT transfer).

NEW:
Note that this requirement, combined
with the use of EDNS0 Keepalive, enables AXoT servers to signal the
desire to close a connection (when existing transactions have completed)
due to low resources by sending an EDNS0 Keepalive option with a 
timeout of 0 on any AXoT response. Aborting an AXFR during the transfer 
still requires the server to close the connection.

We did toy with the idea of defining a new EDNS0 option/extended error code for a server to signal an abort of an individual AXoT without closing the connection, but weren’t convinced there was a use case. Perhaps we should revisit this (as per your next point)…?

> 
> Is there a reason for allowing concurrent AXFRs of the same zone?
> Actually, thinking about this more generally, I can't see a way in RFC
> 5936 for the server to impose backpressure to limit the number of
> concurrent AXFRs. And there isn't an extended error code for concurrency
> control or backpressure. If the server had a suitable response, that would
> allow it to control xfer resources in general, as well as to choose
> whether or not it wants to allow multiple AXFRs for the same zone at the
> same time.

I don’t believe RFC5936 says anything explicitly about concurrent transfer behaviour, and while there may not be a use case for it do you think we should actually prohibit it?

Of course a server can error any AXFR if it chooses [RFC5936]:

"To indicate an error in an AXFR response, the AXFR server sends a
   single DNS message when the error condition is detected, with the
   response code set to the appropriate value for the condition
   encountered.  Such a message terminates the AXFR session;…"

so it _could_ already answer SERVFAIL if it didn’t have the resources, or REFUSED if a transfer is already underway and it doesn’t want to do another one? I’m not actually sure what existing implementations do in this case (will double check).

I suppose the advantage of adding an extended error code would be so that well behaved clients didn’t continue to request transfers that were going to be refused. 
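
Purely as an illustration of the kind of policy being discussed (not a description of what any existing implementation does), a server could gate AXFR requests like this. The AxfrAdmission class, its limit parameter, and the choice of which RCODE goes with which condition are all assumptions:

```python
from typing import Optional, Set, Tuple

RCODE_SERVFAIL = 2
RCODE_REFUSED = 5


class AxfrAdmission:
    """Hypothetical admission control for concurrent AXFRs."""

    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent
        self.active: Set[Tuple[str, str]] = set()  # (client, zone) pairs

    def admit(self, client: str, zone: str) -> Optional[int]:
        """Return None to start the transfer, or the RCODE for a single error message."""
        if (client, zone) in self.active:
            # Same zone already being transferred to this client.
            return RCODE_REFUSED
        if len(self.active) >= self.max_concurrent:
            # No resources left for another transfer.
            return RCODE_SERVFAIL
        self.active.add((client, zone))
        return None

    def finish(self, client: str, zone: str) -> None:
        self.active.discard((client, zone))
```

A client seeing only SERVFAIL or REFUSED cannot tell these conditions apart from other failures, which is where an extended error code would help.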

> 
> Still 5.6.2
> 
> The connection re-use requirements seem to be restating 5.3 in more
> detail. Would it be more clear to put these related requirements in the
> same section?

Well 5.3 is a general update to RFC7766 about transport usage but 5.6.2 is specifically updating RFC5936… 

> 
> Re pipelining, I can't see in RFC 5936 whether concurrent AXFRs are just
> concurrent outstanding queries, with all the response messages for one
> zone sent back-to-back, or whether response messages for different
> concurrent AXFRs can be interleaved.

No, you are right - that behaviour isn’t explicitly specified there, but the discussion around using message IDs to match responses at the end of section 4.1.1 implies intermingling should work. Our draft doesn’t update RFC5936 at all (at the moment)… I hadn’t thought it necessary but perhaps we should actually make the normative statements around the updates to RFC1995 apply to RFC5936 as well for consistency?
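
For illustration, matching on message ID is all a client needs in order to untangle intermingled responses. This is a sketch under the assumption that each concurrent transfer on the connection uses a distinct message ID; the function name and the (id, payload) tuples are hypothetical stand-ins for parsed DNS messages:

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def demultiplex(messages: Iterable[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Group response messages by message ID, preserving per-transfer order."""
    per_transfer: Dict[int, List[str]] = defaultdict(list)
    for msg_id, payload in messages:
        per_transfer[msg_id].append(payload)
    return dict(per_transfer)
```

Whether the server sends each transfer back-to-back or interleaves them, the client ends up with the same per-ID sequences.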

> 
> 5.6.3 padding
> 
> Why would empty response messages be needed? Isn't it enough to pad the
> regular response messages that contain RRs? (Or maybe reduce the number of
> RRs per message and increase the padding if more obfuscation is needed?)
> Servers need to keep track of zone sizes in order to mitigate
> CVE-2016-6170 (DoS attack by sending an excessively huge AXFR response) so
> I would expect servers to be able to use that accounting to decide how to
> spread padding between AXFR response messages, without the need for extra
> padding-only messages.

Adding that requirement to this document was a question of flexibility and future proofing to allow padding for AXFR to happen in different ways and with simple algorithms. e.g. the maximum size a (tiny) zone could be padded to would theoretically be limited (to something very large admittedly) if there had to be a minimum of 1 RR per packet. I can add some text to clarify this.
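
A back-of-envelope sketch of that limit, with assumptions stated: a DNS message over TCP is at most 65535 octets because of the 2-octet length prefix, the EDNS(0) Padding option from RFC7830 has a 4-octet header, and the 468-octet block size is borrowed from the RFC8467 recommendation for server responses rather than anything this draft specifies. The function names are hypothetical.

```python
MAX_TCP_DNS_MESSAGE = 65535  # limit imposed by the 2-octet length prefix
PADDING_OPTION_HEADER = 4    # option code + option length


def padding_octets(unpadded_len: int, block: int = 468) -> int:
    """Padding payload needed so the padded message is a multiple of `block`."""
    return -(unpadded_len + PADDING_OPTION_HEADER) % block


def max_padded_transfer_size(rrs_sent: int) -> int:
    """Upper bound on transfer size if every message must carry at least one RR."""
    return rrs_sent * MAX_TCP_DNS_MESSAGE
```

For example, a transfer consisting of just three RRs could be padded to at most 3 * 65535 = 196605 octets if every message needs an RR, whereas allowing padding-only messages removes that ceiling.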

> 
> 5.7 IXoT
> 
> Looking back and comparing with section 4.2, it looks like the concurrency
> requirements in section 5.7 only apply to TLS. Are they supposed to apply
> to TCP as well?

The normative statements in section 4.2 (in particular, follow RFC7766) do require IXFR to support pipelining of queries, out-of-order processing, and re-use of one TCP connection for zone transfers. Section 5.7.1 should have a sentence referencing back to that update, which it doesn’t at the moment - I’ll add it.

The first paragraph in 5.7.1 should probably be moved to the new, earlier section on the normative update to RFC1995, and paragraph 2 in section 5.7.2 should probably either be removed or also apply to IXFR-over-TCP for consistency. Keeping it seems preferable in order to remove any doubt about behaviour.

Getting updates on top of updates for each type of transfer (or both) clear and consistent is a bit tricky here :-)

> 
> I think it would help to have some more explicit discussion of how IXoT
> and AXoT share a connection, wrt concurrency, interleaving of response
> messages (or not), and so forth. Perhaps as a subsection between 5.5 and
> 5.6? Or maybe as an expanded 5.3? 

Are you thinking of some text clarifying that servers can send AXoT responses for different zones intermingled with each other and with IXoT responses and clients have to handle them? I guess I thought that was implicit in the RFC7766 model but we could add some clarifying text. Again though, that would (I think) apply equally for AXFR and IXFR sharing a connection so perhaps it needs to appear earlier when they are discussed…. Do you have any error/problem cases in mind, or just clarifying what needs to be supported?


> Also covering other things that are
> common to IXoT and AXoT like keepalive timeouts, concurrency backpressure,
> presence or absence of EDNS, padding, and anything else I've missed.

Well, at the moment the detailed discussion of keepalive and EDNS0 happens in the AXoT section and the IXoT section just references back to it. I tried moving it earlier, where it seemed out of context, and placing it between the two sections seemed odd, so I landed on this structure, but I realise there is a lot of overlap. I expect the structure to evolve a bit again in the next version so thanks for all the feedback. Padding I think is better kept separate?

> 
> 7 authentication
> 
> It seems weird to mix up channel auth and data auth, since they are quite
> different things. As I understand it, ZONEMD isn't really authentication,
> it's just an integrity check (unless it is used in a signed zone). And if
> you are talking about data authentication it seems odd to leave out
> RRSIGs.
> 
> TSIG doesn't provide data authentication. It provides mutual
> authentication of the endpoints, and data integrity, but the server can
> lie to the client about the zone contents. (The server is not necessarily
> the ultimate authority for the zone.)
> 
> It would be useful to have terminology to distinguish between TLS where
> the client software tries on its own initiative, with fallback to TCP
> (which is what I think of when I read "opportunistic"); as opposed to TLS
> configured by the admin without fallback to TCP and without any client or
> server certificate auth. I'll call the latter "unauth".
> 
> I don't think strict TLS + TSIG adds any benefits beyond unauth TLS +
> TSIG, because TSIG already provides mutual auth. Well, there's some risk
> that the client may send requests to the wrong server, which goes back to
> my section 4.3 question about whether it is part of the threat model to
> worry about exposing which zones a client holds.
> 
> Mutual TLS is roughly comparable to unauth TLS + TSIG, but it has the
> advantage that it's a bit easier to set up in a way that prevents clients
> from being able to impersonate the server. If you want to do this with
> TSIG then every client needs its own key, and the server config has to be
> updated whenever a client is provisioned or decommissioned. With mutual
> TLS the server only needs a relatively static CA cert that can
> authenticate any client cert.
> 
> I think there should be something in the spec about how certificate
> subject names relate to how (in strict and mutual TLS) the client
> authenticates the server, and how (in mutual TLS) the server decides that
> the client's requests are authorized. I would like to be able to give my
> client a server name (and optional address) and have it authenticate the
> server using the system CA cert store and server certificate
> subjectAltNames. I would like to be able to give my server an ACL
> containing my private CA cert and a client cert subject name pattern.

For convenience, I’ll pull this topic out into a different thread if that is OK with you…?

> 
> 8 policies
> 
> I think the definition of xfer group can be slightly improved, like:
> 
>  We call the entire group of servers involved in XFR for a particular
>  set of zones (all the primaries and all the secondaries) the 'transfer
>  group' for those zones.
> 
> (My auth servers host multiple sets of zones belonging to several
> different institutions, with different and partially overlapping transfer
> groups, with different security configurations…)

Yup - that works.

> 
> I think "mTLS" should be written "Mutual" for consistency?

Sure. 

> 
> Finally, at last ...
> 
> The figures and tables were missing from the plain text version that I
> looked at so I didn't review them. I could guess what the diagrams showed
> but I got the impression that the table in section 7 was a bit more
> substantive.

At the moment they are just links to SVG images in the GitHub repo, so in the plain text version you do need to copy and paste the URIs from section 16.3 (in the HTML you can click to have them open in a new tab). Attempting proper SVG integration (or failing that, reverting to ASCII art) is on the TODO list!

Thanks.

Sara.