Re: [DNSOP] Mirja Kühlewind's Discuss on draft-ietf-dnsop-session-signal-12: (with DISCUSS and COMMENT)

Ted Lemon <> Tue, 23 October 2018 19:52 UTC

From: Ted Lemon <>
Date: Tue, 23 Oct 2018 15:51:49 -0400
Message-ID: <>
To: Mirja Kuehlewind <>
Cc: The IESG <>,, Tim Wicinski <>, dnsop-chairs <>, dnsop WG <>

On Mon, Oct 15, 2018 at 10:02 AM Mirja Kuehlewind (IETF) <> wrote:

> sorry for the delay, however, as you performed a couple of changes it took
> me a while to re-review. I believe I’m unfortunately not fully ready to
> release my discuss at this point, but close..

No worries—it's a busy time.

> Regarding my first discuss point (delayed ACKs etc.), I think the text
> improved, and I would like to see my minor wording question (comment 2)
> below addressed before I finally release the discuss here. However, I still
> think the extensive discussion now provided in section 9.5 does not
> necessarily belong in this document. Therefore I would have
> preferred to move this text to a real appendix, or to remove it completely
> and maybe document it in its own informational RFC (in tcpm).
> Regarding my second discuss point (keep-alives), the text still seems not
> quite right yet, or I’m really confused. Please also see further below
> (comment 3).
> Anyway here are my comments on the edited/new text in the order they
> appear in the draft:
> 1) I think the following text in section 3 is not fully correct:
> "Fast Open message: A TCP SYN packet that begins a DSO connection and
>    contains early data ([RFC8446] section 2.3).  Fast Open is only
>    permitted when using TLS encapsulation: a TCP SYN message that does
>    not use TLS encapsulation but contains early data is not permitted.“
> If TLS 0-RTT is used this data will not be carried in the TCP SYN, it will
> „just“ be sent at the same time as the TLS handshake is performed (but
> after the TCP handshake). Only if TCP Fast Open (TFO) (see RFC7413) is
> used, data can also be sent in the TCP SYN. I guess you mainly need to fix
> the reference here, or maybe name both mechanisms separately.

If you look at the table on p. 18 of RFC8446, it shows early data being
sent in the first packet.  What am I missing here?

> 2) In section 5.5.1:
>    "With a DSO request message, the TCP implementation waits for the
>    application-layer client software to generate the corresponding DSO
>    response message, which enables the TCP implementation to send a
>    single combined IP packet containing the TCP acknowledgement, the TCP
>    window update, and the application-generated DSO response message.
>    This is more efficient than sending three separate IP packets.“
> The phrasing here is a bit confusing, to me at least. It sounds a bit like
> there is a special TCP for DSO… maybe the following is a bit better:
>    "With a DSO request message, TCP delayed acknowledge timer will usually
>    make the implementation wait for the
>    application-layer client software to generate the corresponding DSO
>    response message before it sends out a TCP acknowledgment.
>    This will generate a
>    single combined IP packet containing the TCP acknowledgement, the TCP
>    window update, and the application-generated DSO response message and
>    is more efficient than sending three separate IP packets.“
> (Note that the delayed ack timer can be configured to a very small value
> as well, and as such it depends on the processing time and the value of the
> timer if a TCP implementation will wait or not.)

I think using the passive voice here makes the text harder to follow, but I
see what you are saying.   How about this:

With a bidirectional exchange over TCP, as for example with a DSO request
message, the operating system TCP implementation waits for the
application-layer client software to generate the corresponding DSO
response message.  It can then send a single combined packet containing
the TCP acknowledgement, the TCP window update, and the
application-generated DSO response message.  This is more efficient than
sending three separate packets, as would occur if the TCP packet
containing the DSO request were acknowledged immediately.
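
As a rough illustration of the packet counts involved, here is a deliberately idealized model. It is not how any real TCP stack is structured, and the function name is made up for this sketch; it only contrasts the delayed-ACK case with the immediate-ACK case described above.

```python
def packets_from_receiver(ack_delayed: bool) -> list[str]:
    """Packets the receiving host emits for a single DSO request.

    Idealized assumption: the application produces its DSO response
    before the delayed-ACK timer would fire.
    """
    if ack_delayed:
        # The ACK and the window update ride along with the response.
        return ["ACK + window update + DSO response"]
    # The ACK goes out at once, the window update once the application
    # has read the request, and the response once it has been generated.
    return ["ACK", "window update", "DSO response"]


if __name__ == "__main__":
    for delayed in (True, False):
        sent = packets_from_receiver(delayed)
        print(f"ack_delayed={delayed}: {len(sent)} packet(s): {sent}")
```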

> 3) Section 6.5.2
> "For example, a (hypothetical and unrealistic)
>    keepalive interval value of 100 ms would result in a continuous
>    stream of ten messages per second or more, in both directions, to
>    keep the DSO Session alive.  And, in this extreme example, a single
>    packet loss and retransmission over a long path could introduce a
>    momentary pause in the stream of messages of over 200 ms, long enough
>    to cause the server to overzealously abort the connection.“
> I think this example is still not correct (and the changes might have made
> it worse: how can there be more than 10 messages?)

> So the point here is that there is a dependency on the RTT. Only if the
> RTT is smaller than 200ms this can happen, otherwise the connection is
> closed anyway after two keep-alives. However, if the RTT is much smaller
> than 100ms and e.g. TLP is used, it would still work even if one packet is
> lost.

Remember that keepalives are not synchronous.   That is, if we send a
keepalive, we don't wait for the response.   So it's perfectly possible for
there to be several keepalives in flight in this situation, if the RTT is

> In any case, I don’t think this example is actually very helpful. The
> point is that the keep-alive interval should always be much larger than
> the RTT to make this work appropriately. However, the point about keeping
> the network load low is rather independent of the question of when the
> mechanism actually breaks. I would recommend to simply remove this example
> and just say that the interval MUST NOT be smaller than 10 sec to keep the
> network load reasonably low.
> However, having read this and the previous section again, I think your
> implementation of the keep-alive mechanism could also be improved.
> Usually, there should be two intervals: one defines how long the
> connection can be idle before a keep-alive is sent, and one defines
> when a keep-alive should be retransmitted if it is deemed to be lost,
> where the first one must usually be larger than the second one (and both
> timers should always be larger than the RTT). That would enable faster
> failure if the connection is actually lost.

A possible point of confusion is that these are not TCP keepalive packets.
 These are DSO messages being sent over the TCP transport.   So it's not
possible for a keepalive to be lost.   If we don't get a response to a
keepalive during the keepalive interval, this means that the TCP connection
has stalled, or that the remote end is no longer reachable.   There is no
retransmission.   Is that where the confusion lies, or am I

> 4) Section (Reconnecting After an Unexplained Connection Drop)
>   "It is also possible for a server to forcibly terminate the
>    connection; in this case the client doesn't know whether the
>    termination was the result of a protocol error or a network outage.
>    The client could determine which of the two is occurring by noticing
>    if a connection is repeatedly dropped by the server; if so, the
>    client can mark the server as not supporting DSO.“
> How often should the client try and in which interval?

I've added the following text to address this question:

### Misbehaving Clients

A server may determine that a client is not following the protocol
correctly.  There may be no way for the server to recover the session,
in which case the server forcibly terminates the connection.  Since the
client doesn't know why the connection dropped, it may reconnect
immediately.  If the server has determined that a client is not
following the protocol correctly, it may terminate the DSO session as
soon as it is established, specifying a long retry-delay to prevent the
client from immediately reconnecting.


#### Reconnecting After an Unexplained Connection Drop {#dropreconnect}

It is also possible for a server to forcibly terminate the connection; in
this case the client doesn't know whether the termination was the result
of a protocol error or a network outage.   When the client notices that
the connection has been dropped, it can attempt to reconnect immediately.
However, if the connection is dropped again without the client being
able to successfully do whatever it is trying to do, it should mark the
server as not supporting DSO.
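
One way a client might track this, as a minimal sketch only. The class, method names, and return strings are invented for illustration and are not from the draft; "successfully do whatever it is trying to do" is modelled as a call to `on_operation_succeeded`.

```python
from dataclasses import dataclass


@dataclass
class DsoServerState:
    """Per-server bookkeeping for unexplained connection drops."""

    supports_dso: bool = True
    drops_since_success: int = 0

    def on_operation_succeeded(self) -> None:
        # The client got something done, so earlier drops no longer count.
        self.drops_since_success = 0

    def on_unexplained_drop(self) -> str:
        """Return the action to take after the connection is dropped."""
        self.drops_since_success += 1
        if self.drops_since_success == 1:
            # First drop: could be a network outage, so retry at once.
            return "reconnect"
        # Dropped again without any success in between: stop using DSO.
        self.supports_dso = False
        return "give_up_dso"


if __name__ == "__main__":
    state = DsoServerState()
    print(state.on_unexplained_drop())
    print(state.on_unexplained_drop())
    print(state.supports_dso)
```

A drop that follows a successful operation is treated as a fresh first drop, which matches the "without the client being able to successfully do whatever it is trying to do" condition in the text.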

These two bits of advice, in combination with the surrounding text, should
address the problem you're pointing to.

> 5) Section 9.2:
>    "In principle, anycast servers could maintain sufficient state that
>     they can both handle packets in the same TCP connection.“
> Really? I mean in theory yes but has this ever been done in practice? I
> would think that sharing TCP state is even harder than sharing DSO state.

I've just deleted this paragraph—I think we were trying to address a
hypothetical scenario here and got a little carried away. :)

> Please clarify that TLS 0-RTT can be used without TFO (or TFO can be used
> without TLS) and I would also recommend to discuss the respective issue
> separately.

As Benjamin said, I took out support for TCP Fast Open without TLS 1.3
because I didn't think it was practical to address the potential issues
with it.   However, in looking back at what I wrote, it's easy to see why
this was confusing.   I've substantially tightened up the text about this:
all cases where terms like "TCP Fast Open" and "0-RTT" are used now refer
to "early data."   The changes are relatively small, but sprinkled over a
whole section, so I don't think it's practical to enumerate them here, but
they should show up nicely in the diffs.   I believe these changes address
your concern, but please let me know if they do not.