Re: [DNSOP] Mirja Kühlewind's Discuss on draft-ietf-dnsop-session-signal-12: (with DISCUSS and COMMENT)

Stuart Cheshire <> Wed, 01 August 2018 07:48 UTC

On 30 Jul 2018, at 13:19, Mirja Kühlewind <> wrote:

I’m responding to the “DISCUSS” items right now. I’ll get to the “COMMENT” items shortly.

> 1) In addition to the bullet point in the 6.2 that was flagged by Spencer, I
> would like to discuss the content of section 5.4.  (DSO Response Generation). I
> understand the desire to optimize for the case where the application knows that
> no data will be sent as reply to a certain message, however, TCP does not have
> a notion of message boundaries and therefore cannot and should not act based on
> the reception of a certain message. Indicating to the TCP that an ACK can be
> set immediately in an specific situation is also problematic as ACK processing
> is part of the TCP's internal machinery. However, why it is important at all
> that an TCP-level ACK is send out fast than the delayed ACK timer? The ACK
> receiver does not expose the information when an ACK is received to the
> application and the delayed ACK timer only expires if no further data is
> received/send by the ACK-receiver, therefore this optimization should not have
> any impact in the application performance. I would just recommend to remove
> this section and any additional discussion about delayed ACKs.
> Please note that the problem described in [NagleDA] only occurs for
> request-response protocols where no further request can be sent before the
> response is received. This is not the case in this protocol (as pipelining is
> supported).

The problem here is not further requests; it’s further responses. Consider a client that subscribes to mDNS relay service <>.

If the server gets an mDNS packet and relays it, Nagle blocks relaying of a further mDNS packet until an ack is received. On a campus GigE backbone with sub-millisecond round-trip times, this potentially delays the relaying of a subsequent mDNS packet for up to 200 ms. That’s a long time on a sub-millisecond network. If the client were to send a reply to the first relayed mDNS packet, then TCP would piggyback its ack on that data packet, and Nagle would then free the server to relay the next mDNS packet.

The optimization advocated here is the observation that if a networking API were to allow the server to explicitly indicate an empty reply, then that lets the TCP stack know that it doesn’t need to wait 200 ms in the hope that it can piggyback its ack on an outbound data packet.
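To make the timing concrete, here is a rough back-of-the-envelope model of the Nagle/delayed-ACK interaction described above. It is only a sketch: the function name and parameters are invented for illustration, the path is assumed symmetric, the delayed-ACK timer is assumed to run its full 200 ms (the worst case), and the second small segment is assumed to be ready to send immediately.

```python
def second_send_time_ms(rtt_ms, delayed_ack_ms=200.0, receiver_acks_immediately=False):
    """Earliest time (ms after the first send) at which Nagle lets the
    sender transmit a second small segment. Illustrative model only."""
    one_way = rtt_ms / 2.0
    # The first small segment reaches the receiver after one one-way delay.
    arrival = one_way
    # The receiver either acks at once (it has reply data to piggyback on,
    # or it knows no reply is coming) or holds the ack for the delayed-ACK timer.
    ack_sent = arrival if receiver_acks_immediately else arrival + delayed_ack_ms
    # Nagle releases the next small segment when that ack reaches the sender.
    return ack_sent + one_way


# Sub-millisecond campus network, 0.5 ms round trip (assumed figure).
print(second_send_time_ms(0.5))
print(second_send_time_ms(0.5, receiver_acks_immediately=True))
```

Under these assumptions the second relayed packet waits roughly the whole delayed-ACK timer when the receiver stays silent, and only about one round trip when the ack goes out immediately.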

Without this, people are tempted to set TCP_NODELAY, which is worse overall for the network.
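For reference, this is the workaround being warned against, shown with the standard Python socket API. The helper name is made up for illustration; the snippet only shows how easy the option is to reach for, not a recommended configuration.

```python
import socket


def disable_nagle(sock):
    """Turn off Nagle's algorithm on a TCP socket (hypothetical helper).

    Returns True if the option reads back as enabled afterwards.
    """
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(disable_nagle(s))
    finally:
        s.close()
```

With this set, every small write can go out as its own segment, which removes the stall but gives up the coalescing that Nagle provides.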

> 2) Further regarding keep-alives:
> in sec 6.5.2: "For example, a hypothetical keepalive interval
>   value of 100ms would result in a continuous stream of at least ten
>   messages per second, in both directions, to keep the DSO Session
>   alive."
> This does not seems correct. There should be at max one keep-alives message in
> flight. Thus the keep-laives timer should only be restarted after the
> keep-alive reply was received.

On a campus GigE backbone with sub-millisecond round-trip times, even a hypothetical keepalive interval value of 100ms would still have only one keep-alive message in flight at a time. But it would still be an unreasonable keepalive interval.
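The arithmetic behind that can be sketched as follows. This is a rough estimate, not anything from the draft: the function name is invented, keepalives are assumed to be sent once per interval in each direction, and the in-flight figure is simply the round-trip time divided by the interval, rounded up.

```python
import math


def keepalive_load(interval_ms, rtt_ms):
    """Return (messages per second per direction, approximate messages in flight).

    Rough estimate only; assumes one keepalive is sent every interval.
    """
    per_second = 1000.0 / interval_ms
    # A keepalive stays outstanding for about one round trip, so roughly
    # rtt / interval of them overlap; there is always at least the current one.
    in_flight = max(1, math.ceil(rtt_ms / interval_ms))
    return per_second, in_flight


print(keepalive_load(100, 0.5))      # 100 ms interval, 0.5 ms RTT (assumed)
print(keepalive_load(10_000, 0.5))   # ten-second interval, same network
```

With a 100 ms interval on a sub-millisecond network there is still only one message in flight, but the rate is ten messages per second, which is the part that makes the interval unreasonable.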

>   And, in this extreme example, a single packet loss and
>   retransmission over a long path could introduce a momentary pause in
>   the stream of messages, long enough to cause the server to
>   overzealously abort the connection."
> This doesn't really make sense to me: As I said, TCP will retransmit and the
> keep-alive timer should not be running until the reply is received. If you want
> to abort the connection based on keep-alives quickly before the TCP connection
> indicates you a failure, you need to wait at minimum for an interval that is
> larger than the TCP RTO (with is uaually 3 RTTs) which means you basically need
> to know the RTT.

The point of this text is to illustrate that a keepalive interval value of 100ms would be unreasonable. I think you would agree with that. This is to support why the immediately following text mandates a minimum keepalive interval of ten seconds.

> Also sec 7.1: "If the client does not generate the
>      mandated keepalive traffic, then after twice this interval the
>      server will forcibly abort the connection."
> Why must the server terminate the connection at all if the client refuses to
> send keep-alives? Isn't that what the inactivity timer is meant for? Usually
> only the endpoint that initiates the keep-alive should terminate the connection
> if no response is received.

A client cannot refuse to send keep-alives. A connection with an active mDNS relay subscription is never considered “inactive”, but a server may still require reasonable keep-alives to verify that the client is still there.
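As a minimal sketch of the server-side rule quoted above (abort after twice the keepalive interval), something like the following would do. The names are hypothetical, the ten-second floor is the minimum mentioned earlier in this message, and treating any traffic from the client as resetting the timer is an assumption made here for simplicity.

```python
MIN_KEEPALIVE_INTERVAL_S = 10.0  # minimum keepalive interval mentioned in this thread


def server_should_abort(now_s, last_client_traffic_s, keepalive_interval_s):
    """Return True once the client has been silent for twice the keepalive interval.

    Hypothetical helper; assumes any client traffic resets the timer.
    """
    if keepalive_interval_s < MIN_KEEPALIVE_INTERVAL_S:
        raise ValueError("keepalive interval below the ten-second minimum")
    silence_s = now_s - last_client_traffic_s
    return silence_s >= 2 * keepalive_interval_s


print(server_should_abort(now_s=25.0, last_client_traffic_s=0.0, keepalive_interval_s=15.0))
print(server_should_abort(now_s=30.0, last_client_traffic_s=0.0, keepalive_interval_s=15.0))
```

Note that this check is independent of whether the session counts as active; it only verifies that the client is still there.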

> 3) There is another contraction regarding the inactive timer:
> Sec 6.2 say
>   "A shorter inactivity timeout with a longer keepalive interval signals
>   to the client that it should not speculatively keep an inactive DSO
>   Session open for very long without reason, but when it does have an
>   active reason to keep a DSO Session open, it doesn't need to be
>   sending an aggressive level of keepalive traffic to maintain that
>   session."
> which indicates that the client may leave the session open longer than
> indicated by the inactive timer of the server. However section 7.1.1 say that
> the client MUST close the connection when the timer is expired.

A connection with an active mDNS relay subscription is never considered “inactive”, because there is still active client/server state, even if no traffic is flowing. A server may still require reasonable keep-alives to verify that the client is still there.
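The distinction between the two timers can be sketched like this. The class and field names are invented for illustration, and the model is deliberately simplified to track only subscriptions as "active client/server state".

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Simplified, hypothetical view of a DSO session on the client side."""
    active_subscriptions: int  # e.g. mDNS relay subscriptions still in place
    idle_s: float              # seconds since the last message in either direction


def client_must_close(session, inactivity_timeout_s):
    """The inactivity timeout applies only when no client/server state remains."""
    if session.active_subscriptions > 0:
        # Still active state, so the session is not "inactive",
        # even if no traffic has flowed for a long time.
        return False
    return session.idle_s >= inactivity_timeout_s


quiet_but_subscribed = Session(active_subscriptions=1, idle_s=3600.0)
quiet_and_idle = Session(active_subscriptions=0, idle_s=3600.0)
print(client_must_close(quiet_but_subscribed, inactivity_timeout_s=30.0))
print(client_must_close(quiet_and_idle, inactivity_timeout_s=30.0))
```

In this sketch a subscribed session stays open past the inactivity timeout and is governed only by keepalives, while a session with no remaining state is closed once the timeout expires.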

Stuart Cheshire