Re: [DNSOP] Eric Rescorla's No Objection on draft-ietf-dnsop-session-signal-12: (with COMMENT)

Ted Lemon <mellon@fugue.com> Wed, 01 August 2018 02:28 UTC

Return-Path: <mellon@fugue.com>
X-Original-To: dnsop@ietfa.amsl.com
Delivered-To: dnsop@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A3380130FA9 for <dnsop@ietfa.amsl.com>; Tue, 31 Jul 2018 19:28:56 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.009
X-Spam-Level:
X-Spam-Status: No, score=-0.009 tagged_above=-999 required=5 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_DKIMWL_WL_MED=-0.01, URIBL_BLOCKED=0.001] autolearn=unavailable autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=fugue-com.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 4JmteqDVlUHP for <dnsop@ietfa.amsl.com>; Tue, 31 Jul 2018 19:28:54 -0700 (PDT)
Received: from mail-it0-x244.google.com (mail-it0-x244.google.com [IPv6:2607:f8b0:4001:c0b::244]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 95CC5130F93 for <dnsop@ietf.org>; Tue, 31 Jul 2018 19:28:51 -0700 (PDT)
Received: by mail-it0-x244.google.com with SMTP id v71-v6so7273688itb.3 for <dnsop@ietf.org>; Tue, 31 Jul 2018 19:28:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fugue-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=vB6n56Gzei/mgd8z5sFgM/LbfvSzl//J6k/U1ACozpA=; b=NsFlPjKfxPyLF6hZbBzYRwh2lqpLTDHMwzK6b/AEnR7N0fMnkILMgV67OTH0oAZXBd cIL3+aLuhfGV+wFVTjwSIsMCWVzGQ7ks5eRjvC00G1HJl0WLkhXNpNN4ZTZyMjmJzpBB xERPuvA4JDYC7xeJZpJKZ24b46QS0iEj5pnY0EUpL87/b994X2ETd+FhuWdhMtm7glVH mPQiu158qKCoZnF4whFohr2w0yKbq5Hs87i1dvMLtr+r6Bx7O8Kxl5d+L62WjnYOKtQ6 ZpP6G9crw9oy24jv+AKW2xe8ighPySDiccElbePECEfOzNe5kLxinlKgkNd9UcHRq6yd yVyA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=vB6n56Gzei/mgd8z5sFgM/LbfvSzl//J6k/U1ACozpA=; b=JJHXA27ST7VmRm0YeIHNcvQgIC00rDYUnf5o2vASuKMH1kCuI9QoT5Z8JeZVR9113B aCw7P7Z4+H9nmDUbcA1TPfshiVFAY3dQNedH8YxDO4vEpF4T2b4cLXz/iwTKqRoKVeia eZQYxM1YXlBrNP8syvzEO47qxH9X4tn40nToUkEDk3iX0su3DgQKpDHqPm4PdTA8sqKF 8adFpQu4Q2bl5Zgo3tx0UE0Qem1gegFMYbQY1IPeJhWHg35NOyGYbhXrEk0zSKRhK35Y b7xx9iXj1GjOoGJJbJAWlJv3x8i9MhHMHPaelcc+qUrkyMe8hmcbQw4AvcyTV/r0t0+F 8REg==
X-Gm-Message-State: AOUpUlHe+I2Z92cFZpMaAlrpgVqGyYGhxEhMZTvt9ffJYbFSDq+O6kED f0bz/dwnBI1HRgY3tXVDiUnWTcNXdx8TDt4yUUFDaA==
X-Google-Smtp-Source: AAOMgpfwMfLnmVeQfgMIADMxYYq/3lyJp4Wcw+c9ya99n4XOrqvrHbE1Y9do551/dhVel4CEb9EYYY6xe+jGmNgwSB8=
X-Received: by 2002:a24:3d01:: with SMTP id n1-v6mr1912755itn.144.1533090530560; Tue, 31 Jul 2018 19:28:50 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:a4f:b442:0:0:0:0:0 with HTTP; Tue, 31 Jul 2018 19:28:10 -0700 (PDT)
In-Reply-To: <153307610896.3243.16744625445592133652.idtracker@ietfa.amsl.com>
References: <153307610896.3243.16744625445592133652.idtracker@ietfa.amsl.com>
From: Ted Lemon <mellon@fugue.com>
Date: Tue, 31 Jul 2018 22:28:10 -0400
Message-ID: <CAPt1N1m_LP2ZcG9yR1DggwRg7KOeuRwTzrsX67iBxkpXekjxUw@mail.gmail.com>
To: Eric Rescorla <ekr@rtfm.com>
Cc: The IESG <iesg@ietf.org>, Tim Wicinski <tjw.ietf@gmail.com>, dnsop WG <dnsop@ietf.org>, dnsop-chairs <dnsop-chairs@ietf.org>, draft-ietf-dnsop-session-signal@ietf.org
Content-Type: multipart/alternative; boundary="0000000000007baa07057256748a"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/XmmGlxoHBRd-D4Id5leRP-B4y14>
Subject: Re: [DNSOP] Eric Rescorla's No Objection on draft-ietf-dnsop-session-signal-12: (with COMMENT)
X-BeenThere: dnsop@ietf.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: IETF DNSOP WG mailing list <dnsop.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dnsop>, <mailto:dnsop-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dnsop/>
List-Post: <mailto:dnsop@ietf.org>
List-Help: <mailto:dnsop-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dnsop>, <mailto:dnsop-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 01 Aug 2018 02:28:57 -0000

On Tue, Jul 31, 2018 at 6:28 PM, Eric Rescorla <ekr@rtfm.com> wrote:

> IMPORTANT
> S 5.3.
> >      field set to zero, and MUST NOT elicit a response.
> >
> >      Every DSO request message (QR=0) with a nonzero MESSAGE ID field is
> >      an acknowledged DSO request, and MUST elicit a corresponding
> response
> >      (QR=1), which MUST have the same MESSAGE ID in the DNS message
> header
> >      as in the corresponding request.
>
> How do I handle duplicate message IDs on the responder? Did I miss
> where you said this? Is this just an error?
>

In order for the responder to detect this condition, it has to track
message IDs.   The existing implementation that I have doesn't track
message IDs—it just responds with the message ID that it got.   There's no
hash table of known incoming message IDs, and maintaining such a table
would add complexity.   I think that this is okay, because it's the
requester that loses in this case—it's effectively attacking itself.
 However, if the responder does track message IDs, then it could be an
issue.   So I've added the following text:

 If a client or server receives a response (QR=1) where the MESSAGE ID is
zero, or is
 any other value that does not match the MESSAGE ID of any of its
outstanding operations,
 this is a fatal error and the recipient MUST forcibly abort the connection
immediately.

+If a responder receives a request (QR=0) where the MESSAGE ID is not zero,
and
+the responder tracks query MESSAGE IDs, and the MESSAGE ID
+matches the MESSAGE ID of a query it received for which a response has not
yet been sent,
+it MUST forcibly abort the connection immediately.   This behavior is
required to prevent
+a hypothetical attack that takes advantage of undefined behavior in this
case.   However,
+if the server does not track MESSAGE IDs in this way, no such risk exists,
so tracking
+MESSAGE IDs just to implement this sanity check is not required.

 Does this address your concern?


> S 9.3.
> >
> >      Requests to register additional new DSO Type Codes in the
> >      "Unassigned" range 0040-F7FF are to be recorded by IANA after Expert
> >      Review [RFC8126].  At the time of publication of this document, the
> >      Designated Expert for the newly created DSO Type Code registry is
> >      [*TBD*].
>
> What is the standard for the expert to follow
>

I added this (the text has changed somewhat as a result of a previous
comment):

 Requests to register additional new DSO Type Codes
 in the "Unassigned" range 0040-F7FF
 are to be recorded by IANA after Expert Review {{!RFC8126}}.
+The expert review should validate that the requested type code
+is specified in a way that conforms to this specification, and
+that the intended use for the code would not be addressed with
+an experimental/local assignment.

 DSO Type Codes in the "experimental/local" range F800-FBFF
 may be used as Experimental Use or Private Use values {{!RFC8126}}

OK?


> COMMENTS
> S 1.
> >      is appended to the end of the DNS message header.  When displayed
> >      using packet analyzer tools that have not been updated to recognize
> >      the DSO format, this will result in the DSO data being displayed as
> >      unknown additional data after the end of the DNS message.  It is
> >      likely that future updates to these tools will add the ability to
> >      recognize, decode, and display the DSO data.
>
> I'm sure you will get to this soon, but what are the backward
> compatibility implications for the two endpoints.
>

Do you mean in terms of requesters and responders that don't implement DSO
but receive a DSO message?   I think we've specified that adequately—the
responder is expected to return "Not Implemented" because of the status
code.   As I mentioned in a different response, this is the behavior of at
least BIND 9 and Google's 8.8.8.8 server.   1.1.1.1 just didn't reply to
the message.   So I realized, and maybe this is what you intended me to
realize, that the text doesn't address several possible outcomes of trying
to establish a DSO session.   I believe the following changes should do it:

 sending DNS messages on that connection, but the client SHOULD NOT
 issue further DSO messages on that connection.

+Two other possibilities exist: the server might drop the connection, or
+the server might send no response to the DSO message.   In the first
+case, the client SHOULD mark the server as not supporting DSO, and not
+attempt a DSO connection for some period of time (at least an hour)
+after the failed attempt.   The client MAY reconnect but not use
+DSO, if appropriate.
+
+In the second case, the client SHOULD set a reasonable timeout, after
+which time the server will be assumed not to support DSO.   At this
+point the client MUST drop the connection to the server, since the
+server's behavior is out of spec, and hence its state is undefined.
+The client MAY reconnect, but not use DSO, if appropriate.
+
 When the server receives a DSO request message
 from a client, and transmits a successful NOERROR response to that
 request, the server considers the DSO Session established.
@@ -477,7 +490,11 @@ clients and servers should behave as described in this
specification with
 regard to inactivity timeouts and session termination, not as previously
 prescribed in the earlier specification for DNS over TCP {{!RFC7766}}.

-Note that for clients that implement only the DSO-TYPEs defined in
+Because the Keepalive TLV can't fail (that is, can't return an RCODE
+other than NOERROR), it is an ideal candidate for use in establishing
+a DSO session.   Any other option that can only succeed MAY also be
+used to establish a DSO session.
+For clients that implement only the DSO-TYPEs defined in
 this base specification, sending a Keepalive TLV is the only
 DSO request message they have available to initiate a DSO Session.
 Even for clients that do implement other future DSO-TYPEs, for simplicity


> S 3.
> >      The unqualified term "session" in the context of this document means
> >      the exchange of DNS messages over a connection where:
> >
> >      o  The connection between client and server is persistent and
> >         relatively long-lived (i.e., minutes or hours, rather than
> >         seconds).
>
> This is a surprising taxonomy. I would assume that some of the options
> you are proposing would be relevant with a 30s connection (very long
> by HTTP standards!)
>

Hm.   I think this text is motivated by the first and second identified use
cases for this document: DNS Push and DNSSD Discovery Relay.   Neither
would be expected to have 30-second connections.   However, in principle I
think that you are correct, and actually I think specifying the time is
unnecessary, so I'd propose the following change:

 DNS messages over a connection where:

 - The connection between client and server is persistent and relatively
-  long-lived (i.e., minutes or hours, rather than seconds).
+  long-lived.
 - Either end of the connection may initiate messages to the other.

 In this document the term "session" is used exclusively as described above.

S 3.
> >      Where this specification says, "close gracefully," that means
> sending
> >      a TLS close_notify (if TLS is in use) followed by a TCP FIN, or the
> >      equivalents for other protocols.  Where this specification requires
> a
> >      connection to be closed gracefully, the requirement to initiate that
> >      graceful close is placed on the client, to place the burden of TCP's
> >      TIME-WAIT state on the client rather than the server.
>
> Does this mean that the server will ask the client to close?
>

Yes, this is specified in 6.6.

S 3.
> >      connection to be closed gracefully, the requirement to initiate that
> >      graceful close is placed on the client, to place the burden of TCP's
> >      TIME-WAIT state on the client rather than the server.
> >
> >      Where this specification says, "forcibly abort," that means sending
> a
> >      TCP RST, or the equivalent for other protocols.  In the BSD Sockets
>
> Because you bother to mention TLS above, what about non-close_notify
> TLS alerts?
>

That's the only alert that would be used here.   TLS alerts used in
connection establishment and in ongoing connections are deferred to the
DNS-over-TLS spec, which in turn defers to BCP 195.

S 3.
> >      the server's listening socket.
> >
> >      The terms "initiator" and "responder" correspond respectively to the
> >      initial sender and subsequent receiver of a DSO request message or
> >      unacknowledged message, regardless of which was the "client" and
> >      "server" in the usual DNS sense.
>
> Might be helpful to say earlier that this is a request/response
> protocol
>

It's not.   Not all requests require responses: Unacknowledged requests
explicitly don't.


> S 3.
> >
> >      DNS Stateful Operations uses three kinds of message: "DSO request
> >      messages", "DSO response messages", and "DSO unacknowledged
> >      messages".  A DSO request message elicits a DSO response message.
> >      DSO unacknowledged messages are unidirectional messages and do not
> >      generate any response.
>
> This would be useful further up.
>

Do you mean in the introduction?


> S 5.1.
> >
> >      DNS over plain UDP [RFC0768] is not appropriate since it fails on
> the
> >      requirement for in-order message delivery, and, in the presence of
> >      NAT gateways and firewalls with short UDP timeouts, it fails to
> >      provide a persistent bi-directional communication channel unless an
> >      excessive amount of keepalive traffic is used.
>
> Note that this is going to make things not work super-well with DNS-
> over-QUIC unless you use one stream only.
>

We don't expect this to work with DNS-over-QUIC without some restriction
like this, hence the text in paragraph 3 of section 5.1.

S 5.1.
> >
> >      If the RCODE is set to any value other than NOERROR (0) or DSOTYPENI
> >      ([TBA2] tentatively 11), then the client MUST assume that the server
> >      does not implement DSO at all.  In this case the client is permitted
> >      to continue sending DNS messages on that connection, but the client
> >      SHOULD NOT issue further DSO messages on that connection.
>
> Why is this a SHOULD and not a MUST?
>

Agreed, fixed.


> S 5.1.3.
> >      to any problems that could be result from the inadvertent replay
> that
> >      can occur with zero round-trip operation.
> >
> >   5.1.3.  Middlebox Considerations
> >
> >      Where an application-layer middlebox (e.g., a DNS proxy, forwarder,
>
> I'm having trouble with this section. Is it a set of requirements on
> middleboxes or statements of fact? If the latter, it seems like there
> are a bunch of ways for middleboxes to mess things up,
>

This was covered in the gen-art review, and the text was updated to address
the comment that was made there about this.   We believe that we have
thought this through and that the text as written is correct.

S 5.4.
> >         generate a response, the application-layer software informs the
> >         TCP implementation that it should go ahead and send the TCP ACK
> >         and window update immediately, without waiting for the Delayed
> ACK
> >         timer.  Unfortunately it is not known at this time which (if any)
> >         of the widely-available networking APIs currently include this
> >         capability.
>
> Are you going to make a recommendation here?
>

Out of scope for DNSOP to update TCP stack APIs.


> S 6.6.1.2.
> >      client a Retry Delay message, or by forcibly aborting the underlying
> >      transport connection) the client SHOULD try to reconnect, to that
> >      service instance, or to another suitable service instance, if more
> >      than one is available.  If reconnecting to the same service
> instance,
> >      the client MUST respect the indicated delay, if available, before
> >      attempting to reconnect.
>
> Do you want to recommend some sort of randomness around this value to
> avoid avalanche?
>

The server is specifying the retry delay, so if the client adds randomness,
that could result in more collisions rather than fewer.


> S 8.2.
> >      The table below indicates, for each of the three TLVs defined in
> this
> >      document, whether they are valid in each of ten different contexts.
> >
> >      The first five contexts are requests or unacknowledged messages from
> >      client to server, and the corresponding responses from server back
> to
> >      client:
>
> Nit. This text is a tiny bit hard to read, because you don't list S-P,
> etc.
>

Hm, I believe that this should be using the requester/responder vernacular,
not the client/server vernacular.   I've asked the team to confirm.

S 10.
> >      messages are subject to the same constraints as any other DNS-over-
> >      TLS messages and MUST NOT be sent in the clear before the TLS
> session
> >      is established.
> >
> >      The data field of the "Encryption Padding" TLV could be used as a
> >      covert channel.
>
> Why not require this to be 0, then?
>

The text describing the option currently says this:

As specified for the EDNS(0) Padding Option {{!RFC7830}}
the PADDING bytes SHOULD be set to 0x00.  Other values MAY be used,
for example, in cases where there is a concern that the padded
message could be subject to compression before encryption.
PADDING bytes of any value MUST be accepted in the messages received.

Do you believe that this is an invalid concern?