Re: [TLS] Transport Issues in DTLS 1.3

Martin Duke <martin.h.duke@gmail.com> Tue, 30 March 2021 18:47 UTC

From: Martin Duke <martin.h.duke@gmail.com>
Date: Tue, 30 Mar 2021 11:47:47 -0700
To: Mark Allman <mallman@icsi.berkeley.edu>
Cc: Eric Rescorla <ekr@rtfm.com>, draft-ietf-tls-dtls13.all@ietf.org, Lars Eggert <lars@eggert.org>, Gorry Fairhurst <gorry@erg.abdn.ac.uk>, "<tls@ietf.org>" <tls@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/sV04JNqR0UBJ6jiIyM4kS1An6oE>

Thank you, Eric (and Mark).

To reiterate, I believe introducing latency regressions with respect to
DTLS 1.2 would be bad for the internet. So what's new in the area under
discussion is (a) lowering the timeout from 1s to 100ms, and (b) the
introduction of ACKs.

I would characterize ekr's reply as making the following points:

(1) *DTLS practice at Mozilla and elsewhere already uses timeouts << 1 sec*.

Thanks for this report about the real world. I have no doubt that for
WebRTC and other use cases, a short timeout is fine. However, DTLS is a
general-purpose protocol and the standard should be quite conservative
about the paths this thing is going to run over. Obviously, people are
going to ignore this requirement when they think they can get an advantage
no matter what the RFC says.

I see three acceptable ways to proceed:
(a) stick with 1 second with words saying that given some OOB knowledge you
can go lower;
(b) the same, but having an explicit floor of 100ms or 200ms; or
(c) having a shorter threshold for small flights, as I proposed in my
DISCUSS.

(2) *DTLS 1.2 does full retransmissions on each timeout, and there is no
window halving.*

This is a good point, but I will note that 1.2 always has an RTO-based
timeout, so the sending rate is halved because the timeout doubles each
time. With an ACK, there will be no rate halving, unless the ACK clears
half the window or more.
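
To make the comparison concrete, here is a rough Python sketch (not taken
from the draft). It assumes a sender that resends the whole flight on each
timer expiry and doubles the timer each time, and it separately computes the
reduction ekr describes when only un-ACKed packets are resent. The function
names and the 60-second cap parameter are illustrative assumptions.

```python
from fractions import Fraction


def timeout_schedule(initial_timeout_ms, retransmissions, cap_ms=60_000):
    """Return the send times (ms) of a flight that is resent in full on
    every timer expiry, with the timer doubling after each expiry."""
    times = [0]
    timeout = initial_timeout_ms
    for _ in range(retransmissions):
        times.append(times[-1] + timeout)
        timeout = min(timeout * 2, cap_ms)  # exponential backoff, capped
    return times


def ack_driven_reduction(flight_size, acked):
    """Fraction by which the next burst shrinks when only the un-ACKed
    packets of a flight are retransmitted."""
    if not 0 <= acked <= flight_size:
        raise ValueError("acked must be between 0 and flight_size")
    return Fraction(acked, flight_size)


if __name__ == "__main__":
    print(timeout_schedule(1000, 3))   # gaps between sends double each round
    print(ack_driven_reduction(3, 1))  # ekr's example: 3 packets, 1 ACKed
```

With a 1000 ms initial timer the full flight goes out at 0, 1000, 3000, and
7000 ms, so the gap doubles each round. With ACKs, a 3-packet flight with 1
packet ACKed shrinks by only 1/3, and the burst is halved only when the ACK
covers at least half of the flight.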

That said, Mark doesn't seem to be too concerned about it. The
constrained-network problem where these bursts are just too large already
exists in DTLS 1.2 so I'm increasingly persuaded that it's OK to drop this
issue.

Mark said a lot about RTT measurement in his reply. I gather from the draft
that there is no such measurement going on, but including it would be
another way to address some of the backoff issues.
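
As a rough illustration of the two RTT-based options Mark describes in the
quoted reply below, the following Python sketch computes a timeout from a
single RTT sample. The first function follows the RFC 6298 first-sample rule
(SRTT = R, RTTVAR = R/2, RTO = SRTT + 4*RTTVAR), which works out to 3 times
the sample; the second applies Mark's suggested factor of 1.5. The function
names are mine, and the sketch deliberately omits the clock-granularity term
and any minimum or maximum clamps.

```python
def rto_first_sample_rfc6298(rtt_ms):
    """RTO after the first RTT measurement per the RFC 6298 formula,
    ignoring clock granularity and min/max clamps."""
    srtt = rtt_ms
    rttvar = rtt_ms / 2
    return srtt + 4 * rttvar  # equals 3 * rtt_ms


def rto_scaled(rtt_ms, factor=1.5):
    """Timeout as a fixed multiple of the single measured RTT, leaving
    some headroom for variance that one sample cannot reveal."""
    return factor * rtt_ms


if __name__ == "__main__":
    sample_ms = 80
    print(rto_first_sample_rfc6298(sample_ms))  # 3x the sample
    print(rto_scaled(sample_ms))                # 1.5x the sample
```

For an 80 ms sample this gives 240 ms under the RFC 6298 rule and 120 ms
under the 1.5x rule.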

(3) *The applicability of this algorithm is at most a few packets, which
strictly limits the risk in a way that renders RFC 8085, etc.
considerations largely irrelevant.*

The strawman in my DISCUSS was that bursts of <= 2 packets could be more
aggressive; that's a negotiable number, and the de jure TCP 4*MSS initial
window, for example, is one I can easily be persuaded of. I feel some
desire to guard against giant post-quantum certificates, or what have you,
but some sufficiently wide guardrails here will probably have little or no
short-term real-world impact, and I trust we can reach a mutually agreeable
number. The largest flights today in DTLS 1.2 seem like a good number that
addresses my concerns while respecting my no-regressions principle.
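
For reference, the window-halving alternative ekr floats in the quoted
message below ("10, 5, 3, ...") could look something like this Python
sketch. Rounding up when halving is my assumption, chosen because it
reproduces that sequence; the function names and the floor of one packet
are likewise illustrative, not from the draft.

```python
def window_sequence(initial_window=10, rounds=3):
    """Window sizes over successive retransmission rounds, halving
    (rounding up) after each loss event, never going below 1."""
    windows = [initial_window]
    for _ in range(rounds - 1):
        windows.append(max(1, (windows[-1] + 1) // 2))
    return windows


def next_burst(unacked_packets, window):
    """Pick at most `window` un-ACKed packets to send in this round."""
    return unacked_packets[:window]


if __name__ == "__main__":
    print(window_sequence(10, 3))            # [10, 5, 3]
    print(next_burst(list(range(8)), 5))     # first 5 un-ACKed packets
```

For a flight that already fits in two or four packets, the window never
binds, which is why the cap mostly matters for unusually large flights.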

Thanks,
Martin

On Tue, Mar 30, 2021 at 10:48 AM Mark Allman <mallman@icsi.berkeley.edu>
wrote:

>
> Hi Ekr!
>
> > This means that we have rather more latitude in terms of how
> > aggressively we retransmit because it only applies to a small
> > fraction of the traffic.
>
> (Strikes me as a bit of a weird formulation.)
>
> > Firefox uses 50ms and AIUI Chrome
> > uses a value derived from the ICE handshake (which is probably
> > better because there are certainly times where 50ms is too short).
>
> Yes, the best thing to do is to use a measured value instead of
> assuming one static number will always work.  But, you have to get a
> measurement to do that, so you have to start somewhere.
>
> >> Relatedly, in section 5.8.3 there is no specific recommendation for a
> >> maximum flight size at all. I would think that applications SHOULD
> >> have no more than 10 datagrams outstanding unless they have some OOB
> >> evidence of available bandwidth on the channel, in keeping with de
> >> facto transport best practice.
> >
> > I agree that this is a reasonable change.
>
> I like this, too.  I think that limits the impact of any sort of
> badness.
>
> >> Granted, doubling the timeout will reduce the rate, but when
> >> retransmission is ack-driven there is essentially no reduction of
> >> sending rate in response to loss.
> >
> > I don't believe this is correct. Recall that unlike TCP, there's
> > generally no buffer of queued packets waiting to be transmitted.
> > Rather, there is a fixed flight of data which must be delivered.
> > With one exceptional case [1], an ACK will reflect that some but
> > not all of the data was delivered and processed; when
> > retransmitting, the sender will only retransmit the un-ACKed
> > packets, which naturally reduces the sending rate. Given the quite
> > small flights in play here, that reduction is likely to be quite
> > substantial. For instance, if there are three packets and 1 is
> > ACKed, then there will be a reduction of 1/3.
>
> I tend to agree with ekr here.  This doesn't tend to worry me
> greatly.
>
> > Note that the timeout is actually only reset after successful loss-free
> > delivery of a flight:
> >
> >    Implementations SHOULD retain the current timer value until a
> >    message is transmitted and acknowledged without having to
> >    be retransmitted, at which time the value may be
> >    reset to the initial value.
> >
> > There seems to be some confusion here (perhaps due to bad
> > writing).  When the text says "resets the retransmission timer" it
> > means "re-arm it with the current value" not "re-set it to the
> > initial default". For instance, suppose that I send flight 1 with
> > retransmit timer value T. After T seconds, I have not received
> > anything and so I retransmit it, doubling to 2T. After I get a
> > response, I now send a new flight. The timer should be 2T, not T.
>
> I agree that is how to manage the timer.
>
> > With that said, I think it would be reasonable to re-set to whatever
> > the measured RTT was, rather than the initial default. This would
> > avoid potentially resetting to an overly low default (though it's
> > not clear to me how this could happen because if your RTT estimate
> > is too low you will never get a delivery without retransmission).
>
> That's one problem with a too-low initial RTT and a reason why RFCs
> 8085 & 8961 use a conservative initial value.
>
> However, I might suggest not setting the timeout to the measured
> RTT, but to something based on the measured RTT.  The best guidance
> here (8085 & 8961) is that this value should be based on both the
> RTT and the variance in the RTT.  With one sample you don't have
> variance.  TCP handles this by setting the RTO to 3 times the first
> measured RTT.  That's just old VJCC.  It has always struck me as a
> bit conservative, but ultimately this is a blip in the TCP context
> and so I have never thought deeply about it.  But, perhaps if you
> did something like 1.5 times the measured RTT you'd account for a
> bit of variance that will no doubt be present.
>
> > On point (1), I think that the fact that we have extensive
> > deployment of timeout-driven retransmission in the field with
> > short timers is fairly strong evidence that it will not destroy
> > the Internet and more generally that the "retransmit the whole
> > flight" design is safe in this case. I certainly agree that there
> > might be settings in which 100ms is too short. Rather than
> > litigate the timer value, which I agree is a judgement call, I
> > suggest we increase the default somewhat (250 ms? 500 ms?) and then
> > indicate that if the application has information that a shorter
> > timer is appropriate, it can use one.
>
> I think that sounds fine.  And, if you could wedge some words about
> experience into the document that'd seem useful, as well, IMO.
>
> > With that said, given that your concern seems to be large flights,
> > I could maybe live with halving the *window* rather than the size
> > of the flight. In your example, you suggest an initial window of
> > 10, so this would give us 10, 5, 3, ... This would have little
> > practical impact on the vast majority of handshakes, but I suppose
> > might slightly improve things on the edge cases where you have a
> > large flight *and* a high congestion network.
>
> I dunno ... I'd be interested in Martin's thought here.  But, at
> these levels I am just not sure if the complexity of tracking a
> flight size is really worth it.
>
> >>   - "Though timer values are the choice of the implementation,
> >>     mishandling of the timer can lead to serious congestion
> >>     problems"
> >>
> >>     + Gorry flagged this and I am flagging it again.  If this is
> >>       something that can lead to serious problems, let's not just
> >>       leave it to "choice of the implementation".  Especially if we
> >>       have some idea how to make it less problematic.
> >
> > I'm not sure what you'd like here. I think the guidance in this
> > specification is reasonable, so I'd be happy to just remove this
> > text.
>
> I don't find the two halves of the sentence consistent with each
> other and therefore the message seems muddled.
>
> Removing is fine.
>
> >>   - "The retransmit timer expires: the implementation transitions to
> >>     the SENDING state, where it retransmits the flight, resets the
> >>     retransmit timer, and returns to the WAITING state."
> >>
> >>     + Maybe this is spec sloppiness, but boy does it sound like the
> >>       recipe TCP used before VJCC to collapse the network.  I.e.,
> >>       expire and retransmit the window.  Rinse and repeat.  It may
> >>       be the intention is for backoff to be involved.  But, that
> >>       isn't what it says.
> >
> > It says it elsewhere, in the section you quoted:
> >
> >    a congested link.  Implementations SHOULD use an initial timer value
> >    of 100 msec (the minimum defined in RFC 6298 {{RFC6298}}) and double
> >    the value at each retransmission, up to no less than 60 seconds
> >    (the RFC 6298 maximum).
> >
> > As I said to Martin, I think some of the confusion is that this
> > specification uses "reset" to mean both "re-arm" and "set the
> > value back to the initial" and depends on context to clarify
> > that. Obviously that's not been entirely successful, so I propose
> > to use "re-arm" where I mean "start a timer with the now current
> > value".
>
> I agree this is mostly a writing issue.  I would suggest looking for
> the word "reset" and just using more than one word so it's
> absolutely clear what you mean.  E.g., something like "double the
> timeout value and start a new timer" instead of "reset" or "rearm".
>
> >>   - “When they have received part of a flight and do not immediately
> >>     receive the rest of the flight (which may be in the same UDP
> >>     datagram). A reasonable approach here is to set a timer for 1/4 the
> >>     current retransmit timer value when the first record in the flight
> >>     is received and then send an ACK when that timer expires.”
> >>
> >>     + Where does 1/4 come from?  Why is it "reasonable"?  This just
> >>       feels like a complete WAG that was pulled out of the air.
> >
> > Yes, it was in fact pulled out of the air (though I did discuss it
> > with Ian Swett a bit). To be honest, any value here is going to be
> > somewhat pulled out of the air, especially because during the
> > handshake the retransmit timer values are incredibly imprecise,
> > consisting as they do of (at most) one set of samples.  In
> > general, this value is a compromise between ACKing too
> > aggressively (thus causing spurious retransmission of in-flight
> > packets) and ACKing too conservatively (thus causing spurious
> > retransmission of received packets).
>
> Well, perhaps what is needed here is some of the words from your
> email.  I.e., a bit of an explanation of things instead of simply
> declaring 1/4 to be reasonable.
>
> allman
>
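
A minimal Python sketch of the re-arm versus reset distinction discussed in
the quoted exchange above: the timer doubles on each expiry and keeps that
value for the next flight, returning to the initial value only after a
flight is acknowledged without any retransmission. The class and method
names are hypothetical, and the 100 ms and 60 s defaults are simply the
values quoted from the draft under discussion, not a recommendation.

```python
class RetransmitTimer:
    """Tracks the retransmission timer value across flights."""

    def __init__(self, initial_ms=100, max_ms=60_000):
        self.initial_ms = initial_ms
        self.max_ms = max_ms
        self.current_ms = initial_ms
        self.retransmitted = False  # did the current flight need a resend?

    def on_timeout(self):
        """Timer expired: double the value and start a new timer with it."""
        self.current_ms = min(self.current_ms * 2, self.max_ms)
        self.retransmitted = True
        return self.current_ms

    def on_flight_acknowledged(self):
        """Flight delivered: go back to the initial value only if no
        retransmission was needed; otherwise keep the current value."""
        if not self.retransmitted:
            self.current_ms = self.initial_ms
        self.retransmitted = False
        return self.current_ms


if __name__ == "__main__":
    timer = RetransmitTimer(initial_ms=100)
    timer.on_timeout()                     # flight 1 lost once: T -> 2T
    print(timer.on_flight_acknowledged())  # next flight starts at 2T
    print(timer.on_flight_acknowledged())  # clean delivery: back to T
```

This follows ekr's example: after one expiry the next flight starts with 2T,
not T, and the value drops back to T only after a loss-free flight.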
