Re: [DNSOP] Mirja Kühlewind's Discuss on draft-ietf-dnsop-session-signal-12: (with DISCUSS and COMMENT)

Ted Lemon <mellon@fugue.com> Thu, 02 August 2018 05:39 UTC

In-Reply-To: <F4BF5104-7FE1-4586-80E6-1B2708E0EF38@kuehlewind.net>
References: <153298197116.8154.9156104510824888266.idtracker@ietfa.amsl.com> <406F498C-03CF-45AE-93A5-D0632F862E0A@apple.com> <F4BF5104-7FE1-4586-80E6-1B2708E0EF38@kuehlewind.net>
From: Ted Lemon <mellon@fugue.com>
Date: Thu, 02 Aug 2018 01:39:04 -0400
Message-ID: <CAPt1N1mXgZA0FnL16xYk+fHe1p6F5LztFTASRCsdUB-0W-kZMA@mail.gmail.com>
To: "Mirja Kuehlewind (IETF)" <ietf@kuehlewind.net>
Cc: Stuart Cheshire <cheshire@apple.com>, Tim Wicinski <tjw.ietf@gmail.com>, dnsop WG <dnsop@ietf.org>, dnsop-chairs <dnsop-chairs@ietf.org>, The IESG <iesg@ietf.org>, draft-ietf-dnsop-session-signal@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/sbB6vb_oQ351DUv2-3tPMBubJeE>

On Wed, Aug 1, 2018 at 8:02 AM, Mirja Kuehlewind (IETF) <ietf@kuehlewind.net> wrote:

> >> 1) In addition to the bullet point in Section 6.2 that was flagged by
> >> Spencer, I would like to discuss the content of Section 5.4 (DSO Response
> >> Generation). I understand the desire to optimize for the case where the
> >> application knows that no data will be sent as a reply to a certain
> >> message; however, TCP does not have a notion of message boundaries and
> >> therefore cannot and should not act based on the reception of a certain
> >> message. Indicating to TCP that an ACK can be sent immediately in a
> >> specific situation is also problematic, as ACK processing is part of
> >> TCP's internal machinery. However, why is it important at all that a
> >> TCP-level ACK is sent out faster than the delayed ACK timer? The ACK
> >> receiver does not expose to the application when an ACK is received,
> >> and the delayed ACK timer only expires if no further data is
> >> received/sent by the ACK receiver; therefore this optimization should not
> >> have any impact on application performance. I would just recommend
> >> removing this section and any additional discussion about delayed ACKs.
> >>
> >> Please note that the problem described in [NagleDA] only occurs for
> >> request-response protocols where no further request can be sent before
> >> the response is received. This is not the case in this protocol (as
> >> pipelining is supported).
> >
> > The problem here is not further requests, it’s further responses.
> > Consider a client that subscribes for mDNS relay service
> > <https://tools.ietf.org/html/draft-ietf-dnssd-mdns-relay-01>.
> >
> > If the server gets an mDNS packet and relays it, Nagle blocks relaying of
> > a further mDNS packet until an ack is received. On a campus GigE backbone
> > with sub-millisecond round-trip times, this potentially delays the
> > relaying of a subsequent mDNS packet for up to 200 ms. That’s a long time
> > on a sub-millisecond network. If the client were to send a reply to the
> > first relayed mDNS packet, then TCP would piggyback its ack on that data
> > packet, and Nagle would then free the server to relay the next mDNS packet.
> >
> > The optimization advocated here is the observation that if a networking
> > API were to allow the server to explicitly indicate an empty reply, then
> > that lets the TCP stack know that it doesn’t need to wait 200 ms in the
> > hope that it can piggyback its ack on an outbound data packet.
>
> I understand the point; I just don’t really think that this is a problem
> that is specific to this use case, and therefore it should not necessarily
> be discussed in this document (given the problem is quite complex). However,
> I guess that could be good input to taps, given taps is working on a
> message-based interface on top of TCP.
>
> >
> > Without this, people are tempted to set TCP_NODELAY, which is worse
> > overall for the network.
>
> Not sure. In the described scenarios this might actually not be a bad
> thing to do.
>

The problem is that we can't give good advice on this—in some scenarios,
TCP_NODELAY probably won't cause any harm.   My personal opinion on this
(which Stuart has not confirmed) is that indeed this is a general problem,
and that it does make sense to raise it in taps rather than trying to solve
it here.   Consequently, I've updated the section as follows:

## Flow Control Considerations

Because unacknowledged DSO messages do not generate an immediate response
from the responder, if there is no other traffic flowing from the responder
to the initiator, this can result in a 200ms delay before the TCP
acknowledgment is sent to the initiator {{NagleDA}}.  If the initiator has
another message pending, but has not yet filled its output buffer, this can
delay the delivery of that message by more than 200ms.  In many cases, this
will make no difference.  However, implementors should be aware of this
issue.  Some operating systems offer ways to disable the 200ms TCP
acknowledgment delay; this may be useful for relatively low-traffic
sessions, or sessions with bursty traffic flows.
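
As a rough illustration of the kind of operating-system knobs this text
alludes to, the Python sketch below sets TCP_NODELAY (which disables Nagle's
algorithm on the sending side) and, where the platform exposes it, the
Linux-specific TCP_QUICKACK (which asks the receiver not to delay its ACKs).
The helper name tune_dso_socket is hypothetical and is not taken from the
draft or from any DSO implementation; whether either option is appropriate
depends on the traffic pattern discussed above.

```python
import socket


def tune_dso_socket(sock, disable_nagle=True, quick_ack=True):
    """Apply low-latency TCP options; return the names of those actually set.

    Hypothetical helper for illustration only.
    """
    applied = []
    if disable_nagle:
        # TCP_NODELAY: transmit small segments immediately rather than
        # holding them until outstanding data has been acknowledged.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        applied.append("TCP_NODELAY")
    # TCP_QUICKACK exists only on some platforms (notably Linux), and it is
    # not sticky there: the kernel may fall back to delayed ACKs, so real
    # code typically re-arms it after each receive.
    opt = getattr(socket, "TCP_QUICKACK", None)
    if quick_ack and opt is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, 1)
            applied.append("TCP_QUICKACK")
        except OSError:
            pass  # constant is defined but this kernel rejected the option
    return applied


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        print(tune_dso_socket(s))
    finally:
        s.close()
```

Returning the list of options that were actually applied lets the caller log
or adapt when the platform lacks TCP_QUICKACK, instead of failing outright.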

> >> 2) Further regarding keep-alives:
> >> in sec 6.5.2: "For example, a hypothetical keepalive interval
> >>  value of 100ms would result in a continuous stream of at least ten
> >>  messages per second, in both directions, to keep the DSO Session
> >>  alive."
> >>
> >> This does not seem correct. There should be at most one keep-alive
> >> message in flight. Thus the keep-alive timer should only be restarted
> >> after the keep-alive reply was received.
> >
> > On a campus GigE backbone with sub-millisecond round-trip times, even a
> > hypothetical keepalive interval value of 100ms would still have only one
> > keep-alive message in flight at a time. But it would still be an
> > unreasonable keepalive interval.
>
> Not sure if that is an unreasonable keep-alive in a GigE backbone. I would
> actually hope that you don’t need a keep-alive mechanism at all in those
> scenarios, but it depends on whether there are any middleboxes and how
> quickly their state expires. Given you have a campus network, you might
> know what the timeouts are and set the keep-alive interval accordingly.
> Maybe that’s better advice to give.
>

I think this text (currently in the document) addresses this case:

A corporate DNS server that knows it is serving only clients on the internal
network, with no intervening NAT gateways or firewalls, can impose a higher
keepalive interval, because frequent DSO keepalive traffic is not required.

Indeed, in this case I agree that a fairly long interval could be used.
However, bear in mind that keepalive not only catches lost state in the
network, but also lost state on the endpoint, so getting rid of it entirely
means that a connection might appear open for a very long time with no
traffic flowing and no attempt to reconnect.   Sending regular keepalives
ensures that if state is lost on either endpoint, we get a timely TCP RST
when the next keepalive arrives and finds no open connection waiting for it.

However, I think that 100ms is pretty clearly too short for this
application—we aren't trying to do VRRP, and a packet that would produce a
user-visible delay would also trigger the TCP reset for the broken
connection.

> My problem with the example text above is that it seems to indicate that
> you just send a keep-alive every x time units, while you need to wait for
> the response before restarting the timer. This needs to be clarified.
> However, as I said, I’m not certain about the actual value of this section
> at all, as this does not seem to be the right document to discuss these
> more general issues.
>

I think I see where the confusion is.   The current text says that when a
client receives a keepalive timeout TLV, it resets the keepalive timer.
 This only makes sense when the server is sending the client an
unacknowledged Keepalive message.   I've made the following change, which I
think should clear up the confusion:

In the case of the keepalive timer, the handling of the received value is
straightforward. When a client receives a server-initiated message with the
Keepalive TLV as its primary TLV, it resets the keepalive timer.  Whenever
it receives a Keepalive TLV from the server, either in a server-initiated
message or a reply to its own client-initiated Keepalive message, it updates
the keepalive interval for the DSO Session. The new keepalive interval
indicates the maximum time that may elapse before another message must be
sent or received on this DSO Session, if the DSO Session is to remain alive.
If the client receives a response to a Keepalive message that specifies a
keepalive interval shorter than the current keepalive timer, the client MUST
immediately send a Keepalive message.  However, this should not normally
happen in practice: it would require that the keepalive interval specified
by the server be shorter than the round-trip time of the connection.
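
The client-side handling described in this text can be sketched roughly as
follows. This is only one reading of the paragraph, not code from the draft:
the class name KeepaliveState, its fields, and the use of millisecond
timestamps passed in by the caller are all assumptions made for the
illustration, and "shorter than the current keepalive timer" is interpreted
as "shorter than the time already elapsed on the timer".

```python
class KeepaliveState:
    """Client-side keepalive bookkeeping for one session (illustrative only)."""

    def __init__(self, interval_ms, now_ms=0):
        self.interval_ms = interval_ms   # currently agreed keepalive interval
        self.last_reset_ms = now_ms      # when the keepalive timer last restarted

    def elapsed(self, now_ms):
        """Time accumulated on the keepalive timer so far."""
        return now_ms - self.last_reset_ms

    def mark_sent(self, now_ms):
        """Restart the timer after the client sends a Keepalive message."""
        self.last_reset_ms = now_ms

    def on_keepalive_tlv(self, new_interval_ms, now_ms, server_initiated):
        """Process a received Keepalive TLV.

        Returns True when the client must send a Keepalive immediately.
        """
        if server_initiated:
            # Server-initiated message with Keepalive as its primary TLV:
            # this is the case in which the timer itself is reset.
            self.last_reset_ms = now_ms
        # In both cases (server-initiated message, or response to our own
        # Keepalive) the interval for the session is updated.
        self.interval_ms = new_interval_ms
        # If more time has already elapsed than the new interval allows,
        # a Keepalive is due right away.
        return self.elapsed(now_ms) > new_interval_ms
```

A caller would invoke mark_sent() after transmitting the Keepalive that
on_keepalive_tlv() asked for, so that the timer restarts from that point.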

> >> This doesn't really make sense to me: As I said, TCP will retransmit and
> >> the keep-alive timer should not be running until the reply is received.
> >> If you want to abort the connection based on keep-alives quickly, before
> >> the TCP connection indicates a failure to you, you need to wait at
> >> minimum for an interval that is larger than the TCP RTO (which is usually
> >> 3 RTTs), which means you basically need to know the RTT.
> >
> > The point of this text is to illustrate that a keepalive interval value
> > of 100ms would be unreasonable. I think you would agree with that.
>
> Yes. I understood that; however, for me this illustration was rather
> confusing. For me something like "the keep-alive interval should not be
> chosen too low, to reduce network load, and must be sufficiently larger
> than the RTT to avoid server termination if the keep-alive gets lost and
> needs to be retransmitted" would be enough.
>

I think the confusion here is rooted in the same problem that I hope I
fixed with the above text.

> > This is to support why the immediately following text mandates a minimum
> > keepalive interval of ten seconds.
> >
> >> Also sec 7.1: "If the client does not generate the
> >>     mandated keepalive traffic, then after twice this interval the
> >>     server will forcibly abort the connection."
> >> Why must the server terminate the connection at all if the client
> >> refuses to send keep-alives? Isn't that what the inactivity timer is
> >> meant for? Usually only the endpoint that initiates the keep-alive should
> >> terminate the connection if no response is received.
> >
> > A client cannot refuse to send keep-alives. A connection with an active
> > mDNS relay subscription is never considered “inactive”, but a server may
> > still require reasonable keep-alives to verify that the client is still
> > there.
>
> Ah, thanks. I don’t think this case was explained in the text (or I missed
> it); please clarify. Maybe also provide more general reasoning on why the
> client is required to send the keep-alives as requested. At the beginning
> of the doc it seemed that the keep-alive interval is more a recommendation
> than a requirement, and it is important to understand that to correctly
> understand the rest of the doc.
>

This is the current text that covers this:

If a client disconnects from the network abruptly,
without cleanly closing its DSO Session,
perhaps leaving a long-lived operation uncancelled,
the server learns of this after failing to
receive the required DSO keepalive traffic from that client.
If, at any time during the life of the DSO Session,
twice the keepalive interval value (i.e., 30 seconds by default) elapses
without any DNS messages being sent or received on a DSO Session,
the server SHOULD consider the client delinquent,
and SHOULD forcibly abort the DSO Session.

I think that this means that if a non-infinite keepalive interval has been
negotiated, the client has to send keepalives.   Of course, the spec does
allow the client and server to negotiate an infinite keepalive interval, in
which case no keepalives are ever sent; in this sense, keepalives are
indeed optional.
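
The server-side rule quoted above amounts to a simple comparison, sketched
here in Python. The function name client_is_delinquent is hypothetical, the
15-second default is inferred from the "30 seconds by default" figure in the
quoted text, and representing an infinite keepalive interval as None is
purely a convenience of this sketch, not the wire encoding.

```python
DEFAULT_KEEPALIVE_S = 15.0  # inferred: twice this is the quoted 30-second default


def client_is_delinquent(last_activity_s, now_s,
                         keepalive_interval_s=DEFAULT_KEEPALIVE_S):
    """Return True if the server should consider the client delinquent.

    last_activity_s: time the last DNS message was sent or received
    now_s:           current time, on the same clock
    keepalive_interval_s: negotiated interval, or None for "infinite"
    """
    if keepalive_interval_s is None:
        # Infinite interval negotiated: no keepalives are ever required.
        return False
    # Delinquent once twice the keepalive interval has elapsed in silence.
    return (now_s - last_activity_s) >= 2 * keepalive_interval_s
```

With the default interval, a session that has been silent for 29 seconds is
still in good standing, while one silent for 30 seconds is not.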

> >> 3) There is another contradiction regarding the inactivity timer:
> >> Sec 6.2 says
> >>  "A shorter inactivity timeout with a longer keepalive interval signals
> >>  to the client that it should not speculatively keep an inactive DSO
> >>  Session open for very long without reason, but when it does have an
> >>  active reason to keep a DSO Session open, it doesn't need to be
> >>  sending an aggressive level of keepalive traffic to maintain that
> >>  session."
> >> which indicates that the client may leave the session open longer than
> >> indicated by the inactivity timer of the server. However, Section 7.1.1
> >> says that the client MUST close the connection when the timer is expired.
> >
> > A connection with an active mDNS relay subscription is never considered
> > “inactive”, because there is still active client/server state, even if no
> > traffic is flowing. A server may still require reasonable keep-alives to
> > verify that the client is still there.
>
> However, the cited text above says
> "should not speculatively keep an inactive DSO Session open for very long
> without reason, but when it does have an active reason to keep a DSO
> Session open,"
> which explicitly talks about keeping an INACTIVE session open speculatively
> for a longer time than the inactivity timeout.
>

I added the following text after the text you quoted.

An example of this would be a client that has subscribed to DNS Push
notifications: in this case, the client is not sending any traffic to the
server, but the session is not inactive, because there is a pending request
to the server to receive push notifications.
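
The distinction drawn here, that a session with outstanding long-lived
operations is not inactive even when no traffic is flowing, could be
expressed as in the sketch below. The function and parameter names are
hypothetical, and counting outstanding operations with a plain integer is an
assumption of this illustration rather than anything the draft prescribes.

```python
def session_is_inactive(outstanding_operations, idle_s, inactivity_timeout_s):
    """Return True if the inactivity timeout applies and has expired.

    outstanding_operations: number of pending long-lived operations, such as
                            an active push-notification subscription
    idle_s:                 seconds since the session last had such an
                            operation or carried request/response traffic
    inactivity_timeout_s:   inactivity timeout for the session
    """
    if outstanding_operations > 0:
        # A pending subscription keeps the session active regardless of how
        # long it has been since any message was exchanged.
        return False
    return idle_s >= inactivity_timeout_s
```

Under this reading, keepalive traffic and the inactivity timeout answer
different questions: the former checks that the peer is still there, the
latter whether the session still has a reason to exist.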

I have not finished reviewing all your comments, but I wanted to
particularly review your conversation with Stuart before going through the
stuff you and Stuart didn't discuss, so that I'd have that context.   I
will send a second response to your message that covers the issues I
haven't yet covered here.
