Re: [DNSOP] Mirja Kühlewind's Discuss on draft-ietf-dnsop-session-signal-12: (with DISCUSS and COMMENT)

Ted Lemon <> Fri, 26 October 2018 15:13 UTC

From: Ted Lemon <>
Date: Fri, 26 Oct 2018 11:12:18 -0400
Message-ID: <>
To: Mirja Kuehlewind <>
Cc: Tim Wicinski <>, dnsop WG <>, dnsop-chairs <>, The IESG <>,

Definitely the text should be correct!   :)

We can't suggest using TCP keepalives as an alternative to app-layer
keepalives without breaking interop.   But I will make another pass at the
document later today to try to address the other points we've discussed.
The discussion about the TCP delayed ack text and the TFO text has been
helpfully clarifying, so thanks for that (not to say the keepalive
discussion wasn't clarifying too; I just don't think I can quite do what
you're asking there).
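
For readers following the keepalive point, here is a minimal sketch of what
"configurable TCP keepalives" looks like from an application. It is not from
the draft. The function names and the timer values are made up for
illustration, and the three timer options are only set where the platform's
Python socket module exposes them (they exist on Linux; other systems may
differ). The detection-time formula is the usual approximation, not a
guarantee.

```python
import socket


def enable_tcp_keepalive(sock, idle_s=30, interval_s=10, probes=3):
    """Turn on kernel TCP keepalives for sock (hypothetical helper).

    Returns a dict of the timer options that were actually applied.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {}
    # These option names are platform-specific, so each one is set only
    # if this Python build exposes it.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied


def approx_detection_time_s(idle_s, interval_s, probes):
    """Rough time to declare a silent peer dead: idle wait plus all probes."""
    return idle_s + interval_s * probes


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_tcp_keepalive(s))
    print(approx_detection_time_s(30, 10, 3), "seconds (approximate)")
    s.close()
```

Note that this only tests transport liveness and carries no DSO-level
information, which is the trade-off discussed in the quoted thread below.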

On Fri, Oct 26, 2018 at 11:09 AM Mirja Kuehlewind (IETF) <> wrote:

> Hi Ted,
> see inline.
> > On 26.10.2018 at 16:58, Ted Lemon <> wrote:
> >
> > Okay, I'm going to update the document to add a clarification about the
> handling of early data outside of TFO.   I think the keepalive issue is a
> matter of judgment, and we would almost have to rewrite the document to
> change how that works now.   Looking at the macOS documentation, I can see
> ways that I might be able to get the behavior we want, but I don't know how
> available this API is, and I think it's reasonable not to assume that it is
> available in all cases.   So unless you absolutely insist, I am not going
> to change this.
> I don’t insist. My initial comment was actually about the concrete text
> and the example you give (less about the mechanism). The point is that
> there is a dependency on the RTT, and your text doesn’t say anything
> about the RTT, which makes it hard to follow. I’m still not sure that
> example is needed at all, but if you want to keep it, it should be right.
> Also, now that I understand what you are doing, I don’t think it’s the best
> choice, but it should usually be fine.
> Still, I would recommend two additional things: 1) a short discussion
> noting that TCP keep-alives could be used instead, if available and
> configurable; and 2) making very clear that it doesn’t make sense to have a
> keep-alive timeout that is smaller than the RTT.
> >
> > As for the delayed ack timer implementation, the point of this is to
> call the implementor's attention to this.   If they are setting values for
> the delayed ack timer, then they already know about this issue; if not, the
> default behavior is as described.
> Actually, different systems have different default values, and given that
> this is a sysctl, the people deploying this may not be the same ones
> configuring the system, or might not even have the right to configure it.
> >   Furthermore, one of the goals of this text is to avoid having the
> implementor override TCP settings that are actually correct.   For example,
> we don't want TCP_NODELAY here.   We don't want to disable the delayed ack
> timer, or to mess with its duration.   What we want is for normal TCP
> processing to happen, but for the app to signal to the TCP stack that it's
> not going to send a response.   This is new behavior, which is already
> available in some commercial stacks, but not in all stacks, and we want the
> implementor to use it if it's there.   That's why this text is here.
> Understood. Still the text should be correct :-)
> Mirja
> >
> > On Fri, Oct 26, 2018 at 10:32 AM Mirja Kuehlewind (IETF) <> wrote:
> > Hi Ted,
> >
> > please see below.
> >
> > > On 26.10.2018 at 15:59, Ted Lemon <> wrote:
> > >
> > > On Fri, Oct 26, 2018 at 9:35 AM Mirja Kuehlewind (IETF) <> wrote:
> > > I guess you mean RFC8446 :-)
> > > Yup, sorry.
> > >
> > > The table there on p.18 shows only the TLS handshake, there is a TCP
> handshake before that.
> > >
> > > I don't believe this is correct, and given your comment at the end
> maybe I'm just misunderstanding.   Are you saying that the TLS handshake is
> never included in the SYN packet with TFO, or are you saying that it might
> not be in the SYN packet?
> >
> > Neither. If TFO is NOT used, there is first a normal TCP handshake
> and then possibly TLS 0-RTT. TLS 0-RTT only means that you can send data
> at the same time as the TLS handshake is performed.
> >
> > If TFO is used, there can be data in the SYN. If TLS is used, the TLS
> initial packet would usually be in that SYN data; however, there might not
> be enough space to send any other 0-RTT data in that packet because it is
> really just one packet.
> >
> > So the answer is that I actually don’t know how TFO with TLS 0-RTT is
> implemented in practice. But for sure, without TFO there is the TCP
> handshake first (one RTT) and then any TLS traffic happens.
> >
> > Does that help, or was that confusing again?
> >
> > > If the TLS handshake doesn't come in the TFO packet, there's a round
> trip, so we have some assurance that we are not being flooded with TCP SYN
> packets by an off-link spoofer faking its source address.
> >
> > No, flooding can happen; however, there are mitigation techniques. E.g. if
> you are overloaded, don’t process the data immediately, or TCP could even
> drop the data and it would be retransmitted. Yes, this causes additional
> latency, but you would only do that in an attack situation. I believe this
> is discussed in the TFO RFC7413.
> >
> > > So it's only in the case that TLS 1.3 early data in the TFO packet is
> an issue.   I suppose because of the way that early data is handled, if it
> were present in the third packet there might be some risk, but to be honest
> I do not know what the risk would be, and that was not part of the threat
> analysis that Benjamin asked us to do (or at least if it was, I didn't
> notice, and Benjamin seems okay with the outcome).
> >
> > There is a replay attack for 0-RTT data in TLS which is independent of
> whether TFO is used. This is why you should not send non-idempotent data in
> 0-RTT. However, I think this is correctly addressed in the draft.
> >
> > >   If you think we've missed something here, it would help to get
> clarity on that.   I do not claim to be an SME!   If there is an issue, I
> think the fix would be to just make the text about early data apply whether
> it's in a TFO or not.
> >
> > Yes, I think the cases with and without TFO must be discussed separately.
> >
> > >
> > > > The phrasing here is a bit confusing, to me at least. It sounds a
> bit like there is a special TCP for DSO… maybe the following is a bit
> better:
> > > >    "With a DSO request message, the TCP delayed acknowledgment timer
> > > >    will usually make the implementation wait for the
> > > >    application-layer client software to generate the corresponding
> > > >    response message before it sends out a TCP acknowledgment.
> > > >    This will generate a
> > > >    single combined IP packet containing the TCP acknowledgment, the
> > > >    window update, and the application-generated DSO response message
> > > >    and is more efficient than sending three separate IP packets."
> > > >
> > > > (Note that the delayed ack timer can be configured to a very small
> value as well, and as such, whether a TCP implementation will wait depends
> on the processing time and the value of the timer.)
> > > >
> > > > I think using the passive voice here makes the text harder to
> follow, but I see what you are saying.   How about this:
> > > >
> > > > With a bidirectional exchange over TCP, as for example with a DSO
> request
> > > > message, the operating system TCP implementation waits for the
> > >
> > > "the operating system’s TCP implementation will usually wait for"
> > >
> > > or even better
> > >
> > > "the delayed acknowledgment timer in TCP will usually wait for"
> > >
> > > Is there an error in the text as proposed?   I understand that the way
> the text expresses this point is not how you would express the point, but
> this feels like a nitpick, not an actual problem in the text.
> >
> > I think it would be important to add the "usually" because the TCP
> delayed acknowledgment timer is configurable, and whether what you write
> is true depends on its value and the exact timing of the application.
> >
> > >
> > > > Remember that keepalives are not synchronous.   That is, if we send
> a keepalive, we don't wait for the response.   So it's perfectly possible
> for there to be several keepalives in flight in this situation, if the RTT
> is >200ms.
> > >
> > > Yes, I misunderstood that initially. By the way, why did you decide
> to design it that way?
> > > >
> > > > In any case, I don’t think this example is actually very helpful.
> The point is that the keep-alive interval should always be much larger
> than the RTT to make this work appropriately. However, the point about
> keeping the network load low is rather independent of the question of when
> the mechanism actually breaks. I would recommend simply removing this
> example and just saying that the interval MUST NOT be smaller than 10 sec to
> keep the network load reasonably low.
> > > >
> > > > However, having read this and the previous section again, I think
> your implementation of the keep-alive mechanism could also be improved.
> Usually, there should be two intervals: one that defines how long the
> connection can be idle before a keep-alive is sent, and one that defines
> when a keep-alive should be retransmitted if it is deemed to be lost,
> where the first one must usually be larger than the second one (and both
> timers should always be larger than the RTT). That would enable faster
> failure detection if the connection is actually lost.
> > > >
> > > > A possible point of confusion is that these are not TCP keepalive
> packets.   These are DSO messages being sent over the TCP transport.   So
> it's not possible for a keepalive to be lost.   If we don't get a response
> to a keepalive during the keepalive interval, this means that the TCP
> connection has stalled, or that the remote end is no longer reachable.
>  There is no retransmission.   Is that where the confusion lies, or am I
> misunderstanding?
> > >
> > > Right, I actually got confused here. So you send data frequently, and
> if something goes wrong the connection will be closed at the sender side.
> Hm... it seems that if you want to test transport liveness with this (and
> not application liveness), you might rather want to use the existing
> keep-alive mechanism in TCP…? Why don’t you just recommend using that?
> > >
> > > You mean just use TCP keepalive?   It takes (I think) 90 seconds for a
> TCP keepalive to time out the connection.   In some cases this is fine; in
> others possibly not.   Doing it at the app layer gives us more flexibility,
> and also conveys more information.
> >
> > TCP keep-alives are also configurable. (Or at least should be in most
> implementations.)
> >
> > >
> > > > Please clarify that TLS 0-RTT can be used without TFO (or TFO can be
> used without TLS) and I would also recommend discussing the respective
> issues separately.
> > > >
> > > > As Benjamin said, I took out support for TCP Fast Open without TLS
> 1.3 because I didn't think it was practical to address the potential issues
> with it.
> > >
> > > Not sure why you think that, or which issues you mean. RFC7413 actually
> discusses all kinds of issues extensively.
> > >
> > > >   However, in looking back at what I wrote, it's easy to see why
> this was confusing.   I've substantially tightened up the text about this:
> all cases where terms like "TCP Fast Open" and "0-RTT" are used now refer
> to "early data."   The changes are relatively small, but sprinkled over a
> whole section, so I don't think it's practical to enumerate them here, but
> they should show up nicely in the diffs.   I believe these changes address
> your concern, but please let me know if they do not.
> > > >
> > > >
> > >
> > > Okay, I believe this is fine now. I guess you could further clarify
> somewhere that "early data" is always 0-RTT TLS, with or _WITHOUT_ TFO.
> > >
> > > Hm, okay.   As I said earlier, I'm not an SME here.   If there is an
> issue with 0-RTT in the third packet of a TCP three-way handshake that
> didn't use TFO, I can clarify.
> >
> > As I said above, there is a replay attack that is described in RFC8446
> (and which I think is sufficiently addressed in your doc). However, it
> would be good to clarify that both cases are possible, with and without TFO.
> >
> > Mirja
> >
> >
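
A back-of-the-envelope sketch of the point made in the thread about
asynchronous keepalives and RTT: if the sender emits a keepalive every
interval without waiting for the reply, the number outstanding is bounded by
the RTT divided by the interval, rounded up. The function names and the
sample numbers are assumptions for illustration only and are not taken from
the draft.

```python
def keepalives_in_flight(rtt_ms: int, interval_ms: int) -> int:
    """Upper bound on keepalives sent but not yet answered, assuming the
    sender emits one every interval_ms and never waits for a reply."""
    if rtt_ms < 0 or interval_ms <= 0:
        raise ValueError("rtt_ms must be >= 0 and interval_ms > 0")
    # Ceiling division: every keepalive sent during the last RTT is unanswered.
    return -(-rtt_ms // interval_ms)


def interval_exceeds_rtt(rtt_ms: int, interval_ms: int) -> bool:
    """True when each reply can arrive before the next keepalive is sent."""
    return interval_ms > rtt_ms


if __name__ == "__main__":
    # Illustrative values: 500 ms RTT with a 200 ms interval.
    print(keepalives_in_flight(500, 200))
    # Illustrative values: 100 ms RTT with a 10 s interval.
    print(keepalives_in_flight(100, 10_000), interval_exceeds_rtt(100, 10_000))
```

With an interval well above the RTT, at most one keepalive is outstanding at
a time, which is the regime the thread argues the text should assume.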