Re: [DNSOP] Eric Rescorla's No Objection on draft-ietf-dnsop-session-signal-12: (with COMMENT)

Eric Rescorla <ekr@rtfm.com> Wed, 01 August 2018 02:56 UTC

Return-Path: <ekr@rtfm.com>
X-Original-To: dnsop@ietfa.amsl.com
Delivered-To: dnsop@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 880E8130F8E for <dnsop@ietfa.amsl.com>; Tue, 31 Jul 2018 19:56:43 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -0.008
X-Spam-Level:
X-Spam-Status: No, score=-0.008 tagged_above=-999 required=5 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, T_DKIMWL_WL_MED=-0.01, URIBL_BLOCKED=0.001] autolearn=unavailable autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=rtfm-com.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DZNZjeS53keq for <dnsop@ietfa.amsl.com>; Tue, 31 Jul 2018 19:56:39 -0700 (PDT)
Received: from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com [IPv6:2a00:1450:4864:20::12e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 8AE8E130E23 for <dnsop@ietf.org>; Tue, 31 Jul 2018 19:56:33 -0700 (PDT)
Received: by mail-lf1-x12e.google.com with SMTP id a4-v6so12210456lff.5 for <dnsop@ietf.org>; Tue, 31 Jul 2018 19:56:33 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rtfm-com.20150623.gappssmtp.com; s=20150623; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=Q830QKHDn9QXKQU5zyu0QmyzID7VIgBCI9EQrZervL8=; b=TAq5H9+uih3oTfGyqjohUt6yrveGlL6Li5jOPIlrvgBZ86gIuQc93Oq3QF+9yIYk7O rJnKNad+suiIdZ0epEuT4Shto4LiwDJvRzUnPB9xbTrKAijfJ9jDR94YbyNRjIZjwMWQ IV0U2MzQrJaHrb6CSRmTjBF2vxyKtwUwWgpq7dVAlK4J+esuSgfiCSsumQ9aOlfHUM2z msAs/J8lOvbDAEtFd3216u4aWuKMVysxCDnZUbE2V4bPhHtkfo73XcEfnK0VjgiDSCou kHH8oAXiPShwHCd/cgpo1ZIsxRWlp4v7su3q9weo1ovJwtNLKfQsPINYtleF7EvUpM2x urPg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=Q830QKHDn9QXKQU5zyu0QmyzID7VIgBCI9EQrZervL8=; b=t5vWuP4M4SrnmmIx7H47UlJA35IosJ6pnvFS0nSeaNcI1iNCwbJgyUyNM8k8OfETL8 bGEqpSCYxbobNk6/64ksa1dXcr4uTTh72JmG728eh0lmzeCPVsT7hUReAjrksMplRbHC lz3/oh3eHFjC/NbwH1KdwPhxTDi5iLBSZ0Ihk54IHuHnrAtLrc7HQLHkCiMgDNakPTET 9lhlu2z9RwKjQzhC+lGJ8YU1FjYc3z1idMvyXKetT6mqP+XW6dfMi7EHY8Wz3gQHmxSn P/PI9HSZjuY5RCm7xD/mo/DpSOv9MMYdWGYzLXqjAGBkURkEDKrRim6hN8x8RZSQlXwI q0gw==
X-Gm-Message-State: AOUpUlH+I1dC3RpxxdVA3kr4VjkOjRB4bQtcMcMcDWzlX0DfiqjwwaNG pNnE5oLxZIq/6EkHzb+iTRWuzsD3BZc3slyTVd3p8A==
X-Google-Smtp-Source: AAOMgpfBtS0TCXgXLax5jiQRTNPJHjYAj4Ao0C52vnPB2q3VEqcxHzjF076FIS6BKFlnUJsRbbr2zQUPebTIX+gL3Iw=
X-Received: by 2002:a19:a417:: with SMTP id q23-v6mr14057455lfc.59.1533092191539; Tue, 31 Jul 2018 19:56:31 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:ab3:4091:0:0:0:0:0 with HTTP; Tue, 31 Jul 2018 19:55:50 -0700 (PDT)
In-Reply-To: <CAPt1N1m_LP2ZcG9yR1DggwRg7KOeuRwTzrsX67iBxkpXekjxUw@mail.gmail.com>
References: <153307610896.3243.16744625445592133652.idtracker@ietfa.amsl.com> <CAPt1N1m_LP2ZcG9yR1DggwRg7KOeuRwTzrsX67iBxkpXekjxUw@mail.gmail.com>
From: Eric Rescorla <ekr@rtfm.com>
Date: Tue, 31 Jul 2018 19:55:50 -0700
Message-ID: <CABcZeBO+5Nxv88HW6xVkDtru5DUnCHxeytX1gdRAc4Z3S7K+Jg@mail.gmail.com>
To: Ted Lemon <mellon@fugue.com>
Cc: The IESG <iesg@ietf.org>, Tim Wicinski <tjw.ietf@gmail.com>, dnsop WG <dnsop@ietf.org>, dnsop-chairs <dnsop-chairs@ietf.org>, draft-ietf-dnsop-session-signal@ietf.org
Content-Type: multipart/alternative; boundary="0000000000007c2ff3057256d789"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/YX0ExmhWQpRBoVsy8TSVsXQaTsY>
Subject: Re: [DNSOP] Eric Rescorla's No Objection on draft-ietf-dnsop-session-signal-12: (with COMMENT)
X-BeenThere: dnsop@ietf.org
X-Mailman-Version: 2.1.27
Precedence: list
List-Id: IETF DNSOP WG mailing list <dnsop.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dnsop>, <mailto:dnsop-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dnsop/>
List-Post: <mailto:dnsop@ietf.org>
List-Help: <mailto:dnsop-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dnsop>, <mailto:dnsop-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 01 Aug 2018 02:56:44 -0000

On Tue, Jul 31, 2018 at 7:28 PM, Ted Lemon <mellon@fugue.com> wrote:

> On Tue, Jul 31, 2018 at 6:28 PM, Eric Rescorla <ekr@rtfm.com> wrote:
>
>> IMPORTANT
>> S 5.3.
>> >      field set to zero, and MUST NOT elicit a response.
>> >
>> >      Every DSO request message (QR=0) with a nonzero MESSAGE ID field is
>> >      an acknowledged DSO request, and MUST elicit a corresponding
>> response
>> >      (QR=1), which MUST have the same MESSAGE ID in the DNS message
>> header
>> >      as in the corresponding request.
>>
>> How do I handle duplicate message IDs on the responder? Did I miss
>> where you said this? Is this just an error?
>>
>
> In order for the responder to detect this condition, it has to track
> message IDs.   The existing implementation that I have doesn't track
> message IDs—it just responds with the message ID that it got.   There's no
> hash table of known incoming message IDs, and maintaining such a table
> would add complexity.   I think that this is okay, because it's the
> requester that loses in this case—it's effectively attacking itself.
>  However, if the responder does track message IDs, then it could be an
> issue.   So I've added the following text:
>
>  If a client or server receives a response (QR=1) where the MESSAGE ID is
> zero, or is
>  any other value that does not match the MESSAGE ID of any of its
> outstanding operations,
>  this is a fatal error and the recipient MUST forcibly abort the
> connection immediately.
>
> +If a responder receives a request (QR=0) where the MESSAGE ID is not
> zero, and
> +the responder tracks query MESSAGE IDs, and the MESSAGE ID
> +matches the MESSAGE ID of a query it received for which a response has
> not yet been sent,
> +it MUST forcibly abort the connection immediately.   This behavior is
> required to prevent
> +a hypothetical attack that takes advantage of undefined behavior in this
> case.   However,
> +if the server does not track MESSAGE IDs in this way, no such risk
> exists, so tracking
> +MESSAGE IDs just to implement this sanity check is not required.
>
>  Does this address your concern?
>

Yeah, this seems fine. Didn't mean to make you do a lot of work here, just
noticed as I was reading.

>
>
>> S 9.3.
>> >
>> >      Requests to register additional new DSO Type Codes in the
>> >      "Unassigned" range 0040-F7FF are to be recorded by IANA after
>> Expert
>> >      Review [RFC8126].  At the time of publication of this document, the
>> >      Designated Expert for the newly created DSO Type Code registry is
>> >      [*TBD*].
>>
>> What is the standard for the expert to follow
>>
>
> I added this (the text has changed somewhat as a result of a previous
> comment):
>
>  Requests to register additional new DSO Type Codes
>  in the "Unassigned" range 0040-F7FF
>  are to be recorded by IANA after Expert Review {{!RFC8126}}.
> +The expert review should validate that the requested type code
> +is specified in a way that conforms to this specification, and
> +that the intended use for the code would not be addressed with
> +an experimental/local assignment.
>
>  DSO Type Codes in the "experimental/local" range F800-FBFF
>  may be used as Experimental Use or Private Use values {{!RFC8126}}
>
> OK?
>

LGTM.



>
>> COMMENTS
>> S 1.
>> >      is appended to the end of the DNS message header.  When displayed
>> >      using packet analyzer tools that have not been updated to recognize
>> >      the DSO format, this will result in the DSO data being displayed as
>> >      unknown additional data after the end of the DNS message.  It is
>> >      likely that future updates to these tools will add the ability to
>> >      recognize, decode, and display the DSO data.
>>
>> I'm sure you will get to this soon, but what are the backward
>> compatibility implications for the two endpoints.
>>
>
> Do you mean in terms of requesters and responders that don't implement DSO
> but receive a DSO message?   I think we've specified that adequately—the
> responder is expected to return "Not Implemented" because of the status
> code.   As I mentioned in a different response, this is the behavior of at
> least BIND 9 and Google's 8.8.8.8 server.   1.1.1.1 just didn't reply to
> the message.   So I realized, and maybe this is what you intended me to
> realize, that the text doesn't address several possible outcomes of trying
> to establish a DSO session.   I believe the following changes should do it:
>
>  sending DNS messages on that connection, but the client SHOULD NOT
>  issue further DSO messages on that connection.
>
> +Two other possibilities exist: the server might drop the connection, or
> +the server might send no response to the DSO message.   In the first
> +case, the client SHOULD mark the server as not supporting DSO, and not
> +attempt a DSO connection for some period of time (at least an hour)
> +after the failed attempt.   The client MAY reconnect but not use
> +DSO, if appropriate.
> +
> +In the second case, the client SHOULD set a reasonable timeout, after
> +which time the server will be assumed not to support DSO.   At this
> +point the client MUST drop the connection to the server, since the
> +server's behavior is out of spec, and hence its state is undefined.
> +The client MAY reconnect, but not use DSO, if appropriate.
> +
>  When the server receives a DSO request message
>  from a client, and transmits a successful NOERROR response to that
>  request, the server considers the DSO Session established.
> @@ -477,7 +490,11 @@ clients and servers should behave as described in
> this specification with
>  regard to inactivity timeouts and session termination, not as previously
>  prescribed in the earlier specification for DNS over TCP {{!RFC7766}}.
>
> -Note that for clients that implement only the DSO-TYPEs defined in
> +Because the Keepalive TLV can't fail (that is, can't return an RCODE
> +other than NOERROR), it is an ideal candidate for use in establishing
> +a DSO session.   Any other option that can only succeed MAY also be
> +used to establish a DSO session.
> +For clients that implement only the DSO-TYPEs defined in
>  this base specification, sending a Keepalive TLV is the only
>  DSO request message they have available to initiate a DSO Session.
>  Even for clients that do implement other future DSO-TYPEs, for simplicity
>

You're doing more than I asked for here (thanks!). I just found myself
wondering at this point in the spec how this worked and so was looking for
an overview. But this text is very helpful, so I think it's good to add.



>
>> S 3.
>> >      The unqualified term "session" in the context of this document
>> means
>> >      the exchange of DNS messages over a connection where:
>> >
>> >      o  The connection between client and server is persistent and
>> >         relatively long-lived (i.e., minutes or hours, rather than
>> >         seconds).
>>
>> This is a surprising taxonomy. I would assume that some of the options
>> you are proposing would be relevant with a 30s connection (very long
>> by HTTP standards!)
>>
>
> Hm.   I think this text is motivated by the first and second identified
> use cases for this document: DNS Push and DNSSD Discovery Relay.   Neither
> would be expected to have 30-second connections.   However, in principle I
> think that you are correct, and actually I think specifying the time is
> unnecessary, so I'd propose the following change:
>
>  DNS messages over a connection where:
>
>  - The connection between client and server is persistent and relatively
> -  long-lived (i.e., minutes or hours, rather than seconds).
> +  long-lived.
>  - Either end of the connection may initiate messages to the other.
>

LGTM.


>  In this document the term "session" is used exclusively as described
> above.
>
> S 3.
>> >      Where this specification says, "close gracefully," that means
>> sending
>> >      a TLS close_notify (if TLS is in use) followed by a TCP FIN, or the
>> >      equivalents for other protocols.  Where this specification
>> requires a
>> >      connection to be closed gracefully, the requirement to initiate
>> that
>> >      graceful close is placed on the client, to place the burden of
>> TCP's
>> >      TIME-WAIT state on the client rather than the server.
>>
>> Does this mean that the server will ask the client to close?
>>
>
> Yes, this is specified in 6.6.
>

Yep. I just make these comments as I go and it wasn't clear here. Sorry if
that was confusing.


> S 3.
>> >      connection to be closed gracefully, the requirement to initiate
>> that
>> >      graceful close is placed on the client, to place the burden of
>> TCP's
>> >      TIME-WAIT state on the client rather than the server.
>> >
>> >      Where this specification says, "forcibly abort," that means
>> sending a
>> >      TCP RST, or the equivalent for other protocols.  In the BSD Sockets
>>
>> Because you bother to mention TLS above, what about non-close_notify
>> TLS alerts?
>>
>
> That's the only alert that would be used here.   TLS alerts used in
> connection establishment and in ongoing connections are deferred to the
> DNS-over-TLS spec, which in turn defers to BCP 195.
>

OK.


> S 3.
>> >      the server's listening socket.
>> >
>> >      The terms "initiator" and "responder" correspond respectively to
>> the
>> >      initial sender and subsequent receiver of a DSO request message or
>> >      unacknowledged message, regardless of which was the "client" and
>> >      "server" in the usual DNS sense.
>>
>> Might be helpful to say earlier that this is a request/response
>> protocol
>>
>
> It's not.   Not all requests require responses: Unacknowledged requests
> explicitly don't.
>

Fair enough. I guess where I was going was that you talk about request here
but it's not clear till later that there might be responses. Not a big deal.


>
>> S 3.
>> >
>> >      DNS Stateful Operations uses three kinds of message: "DSO request
>> >      messages", "DSO response messages", and "DSO unacknowledged
>> >      messages".  A DSO request message elicits a DSO response message.
>> >      DSO unacknowledged messages are unidirectional messages and do not
>> >      generate any response.
>>
>> This would be useful further up.
>>
>
> Do you mean in the introduction?
>

Yes.


>
>> S 5.1.
>> >
>> >      DNS over plain UDP [RFC0768] is not appropriate since it fails on
>> the
>> >      requirement for in-order message delivery, and, in the presence of
>> >      NAT gateways and firewalls with short UDP timeouts, it fails to
>> >      provide a persistent bi-directional communication channel unless an
>> >      excessive amount of keepalive traffic is used.
>>
>> Note that this is going to make things not work super-well with DNS-
>> over-QUIC unless you use one stream only.
>>
>
> We don't expect this to work with DNS-over-QUIC without some restriction
> like this, hence the text in paragraph 3 of section 5.1.
>

Sure. I just wanted to point out that that was a consequence of this design


S 5.1.
>> >
>> >      If the RCODE is set to any value other than NOERROR (0) or
>> DSOTYPENI
>> >      ([TBA2] tentatively 11), then the client MUST assume that the
>> server
>> >      does not implement DSO at all.  In this case the client is
>> permitted
>> >      to continue sending DNS messages on that connection, but the client
>> >      SHOULD NOT issue further DSO messages on that connection.
>>
>> Why is this a SHOULD and not a MUST?
>>
>
> Agreed, fixed.
>
>
>> S 5.1.3.
>> >      to any problems that could be result from the inadvertent replay
>> that
>> >      can occur with zero round-trip operation.
>> >
>> >   5.1.3.  Middlebox Considerations
>> >
>> >      Where an application-layer middlebox (e.g., a DNS proxy, forwarder,
>>
>> I'm having trouble with this section. Is it a set of requirements on
>> middleboxes or statements of fact? If the latter, it seems like there
>> are a bunch of ways for middleboxes to mess things up,
>>
>
> This was covered in the gen-art review, and the text was updated to
> address the comment that was made there about this.   We believe that we
> have thought this through and that the text as written is correct.
>

Sorry, has this been changed in a new version?


> S 5.4.
>> >         generate a response, the application-layer software informs the
>> >         TCP implementation that it should go ahead and send the TCP ACK
>> >         and window update immediately, without waiting for the Delayed
>> ACK
>> >         timer.  Unfortunately it is not known at this time which (if
>> any)
>> >         of the widely-available networking APIs currently include this
>> >         capability.
>>
>> Are you going to make a recommendation here?
>>
>
> Out of scope for DNSOP to update TCP stack APIs.
>

Sorry, it wasn't clear from context what I was referring to here. You
discuss a number
of strategies and then say they are all bad. Do you had a recommendation
for what
people ought to do?


>
>> S 6.6.1.2.
>> >      client a Retry Delay message, or by forcibly aborting the
>> underlying
>> >      transport connection) the client SHOULD try to reconnect, to that
>> >      service instance, or to another suitable service instance, if more
>> >      than one is available.  If reconnecting to the same service
>> instance,
>> >      the client MUST respect the indicated delay, if available, before
>> >      attempting to reconnect.
>>
>> Do you want to recommend some sort of randomness around this value to
>> avoid avalanche?
>>
>
> The server is specifying the retry delay, so if the client adds
> randomness, that could result in more collisions rather than fewer.
>

Sure, if the server actually randomizes it itself. Perhaps that's worth
recommending?


>
>> S 8.2.
>> >      The table below indicates, for each of the three TLVs defined in
>> this
>> >      document, whether they are valid in each of ten different contexts.
>> >
>> >      The first five contexts are requests or unacknowledged messages
>> from
>> >      client to server, and the corresponding responses from server back
>> to
>> >      client:
>>
>> Nit. This text is a tiny bit hard to read, because you don't list S-P,
>> etc.
>>
>
> Hm, I believe that this should be using the requester/responder
> vernacular, not the client/server vernacular.   I've asked the team to
> confirm.
>
> S 10.
>> >      messages are subject to the same constraints as any other DNS-over-
>> >      TLS messages and MUST NOT be sent in the clear before the TLS
>> session
>> >      is established.
>> >
>> >      The data field of the "Encryption Padding" TLV could be used as a
>> >      covert channel.
>>
>> Why not require this to be 0, then?
>>
>
> The text describing the option currently says this:
>
> As specified for the EDNS(0) Padding Option {{!RFC7830}}
> the PADDING bytes SHOULD be set to 0x00.  Other values MAY be used,
> for example, in cases where there is a concern that the padded
> message could be subject to compression before encryption.
> PADDING bytes of any value MUST be accepted in the messages received.
>
> Do you believe that this is an invalid concern?
>

Mostly I just missed this. With that said, generally we are discouraging
compression in cryptographic protocols.
OTOH, I'm not that worried about the covert-channel either, so it's kind of
a tossup
-Ekr