Re: [TLS] Transport Issues in DTLS 1.3

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 25 March 2021 18:45 UTC

To: Mark Allman <mallman@icsi.berkeley.edu>, Martin Duke <martin.h.duke@gmail.com>
Cc: draft-ietf-tls-dtls13.all@ietf.org, Lars Eggert <lars@eggert.org>, tls@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/l7ygNmTIRvcGqoY6awNmqX5vScM>

Thanks Mark. My comments were also written in haste, and I think they
relate to only some specific sections of the draft. It would be good to
know what next steps are planned.

Gorry

On 25/03/2021 17:29, Mark Allman wrote:
> Folks-
>
> I am appending what I sent Martin & Gorry this morning.  I looked
> quite quickly as Martin was looking for quick input.  I am happy to
> iterate if things aren't all that understandable.
>
> allman
>
>
> [Quick hit.]
>
> I agree with Martin's DISCUSS and Gorry's notes.  A couple more
> things here ...
>
>> This spec looks a mess!
> A generous reading is that this is most of the problem.  I think
> maybe there is some intent in here that isn't stated very well.  It
> needs to be explicit and not as sloppy.
>
> A few specific things (in addition to what Gorry said, which I
> absolutely agree with):
>
>    - "Though timer values are the choice of the implementation,
>      mishandling of the timer can lead to serious congestion
>      problems"
>
>      + Gorry flagged this and I am flagging it again.  If this is
>        something that can lead to serious problems, let's not just
>        leave it to "choice of the implementation".  Especially if we
>        have some idea how to make it less problematic.
>
>    - "Implementations SHOULD use an initial timer value of 100 msec
>      (the minimum defined in RFC 6298 [RFC6298])"
>
>      + I wrote RFC 6298 and I have no idea where this is coming from!
>
>      + Even if this value of 100msec is OK for DTLS it shouldn't lean
>        on RFC 6298 because RFC 6298 doesn't say that is OK.  I.e.,
>        the parenthetical is objectively wrong.
>
>      + RFC 6298 says the INITIAL RTO should be 1sec (point (2.1) in
>        section 2).  RFC 8961 affirms this and also says the INITIAL
>        RTO should be 1sec (requirement (1) in section 4).
>
>    - "Note that a 100 msec timer is recommended rather than the
>      3-second RFC 6298 default in order to improve latency for
>      time-sensitive applications."
>
>      + Again, this mis-states RFC 6298, which says the initial RTO is
>        1sec (not 3sec).  (Previous to RFC 6298 the initial RTO was
>        3sec, which is probably where the notion comes from.  Most of
>        the purpose of RFC 6298 was to drop the initial RTO to 1sec.)
>
>      + This is a statement of desire, not any sort of principled
>        justification for using 100msec.  At the least this should be
>        much better argued.
>
>      + To me 100msec feels much too close to the RTT of some network
>        paths to be appropriate here.  To be clear, deviations from
>        RFC 8961 that gather consensus are fine, but you should say
>        why that deviation is OK.  And, I'd think the further you
>        deviate the more you need to say (for me).  I.e., dropping
>        from 1sec to 900msec may not be that big of an issue.  But,
>        dropping to 1/10-th of the guideline and to something pretty
>        close to not rare RTTs should require some care and some
>        discussion, IMO.
>
>      + And, I am not trying to be a picky protocol lawyer and say
>        this document "didn't check the RFC 8085 / RFC 8961 box".
>        Rather, RFC 8085 & 8961 say things for a reason and I don't
>        think we should implicitly ignore them because they come from
>        experience on how to do these sorts of things.
>
>    - "The retransmit timer expires: the implementation transitions to
>      the SENDING state, where it retransmits the flight, resets the
>      retransmit timer, and returns to the WAITING state."
>
>      + Maybe this is spec sloppiness, but boy does it sound like the
>        recipe TCP used before VJCC to collapse the network.  I.e.,
>        expire and retransmit the window.  Rinse and repeat.  It may
>        be the intention is for backoff to be involved.  But, that
>        isn't what it says.
>
>    - "When they have received part of a flight and do not immediately
>      receive the rest of the flight (which may be in the same UDP
>      datagram). A reasonable approach here is to set a timer for 1/4 the
>      current retransmit timer value when the first record in the flight
>      is received and then send an ACK when that timer expires."
>
>      + Where does 1/4 come from?  Why is it "reasonable"?  This just
>        feels like a complete WAG that was pulled out of the air.
>
> And, +1 on all the flight size stuff Martin mentioned.
>
> allman
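
[Illustration of the timer points quoted above.] For readers who want the RFC 6298 numbers in concrete form, here is a minimal Python sketch of that RFC's retransmission timer: a 1 second initial RTO, the SRTT/RTTVAR update, the 1 second floor, and doubling on each expiry. It is not the DTLS 1.3 draft's algorithm and not anyone's implementation. The class name RtoEstimator, its method names, the 1 ms clock granularity, and the choice of 60 seconds as the upper bound are all assumptions made for this example (RFC 6298 only says an upper bound, if used, must be at least 60 seconds).

```python
from dataclasses import dataclass
from typing import Optional

# Constants taken from RFC 6298, section 2.
ALPHA = 1 / 8        # gain for the smoothed RTT
BETA = 1 / 4         # gain for the RTT variance
K = 4                # variance multiplier
INITIAL_RTO = 1.0    # seconds, used until the first RTT sample (2.1)
MIN_RTO = 1.0        # seconds, computed RTO is rounded up to this (2.4)
MAX_RTO = 60.0       # seconds, ASSUMED cap; the RFC allows any cap >= 60 s


@dataclass
class RtoEstimator:
    """Hypothetical helper, named for this example only."""
    granularity: float = 0.001      # assumed clock granularity G, in seconds
    srtt: Optional[float] = None    # smoothed RTT, unset before first sample
    rttvar: Optional[float] = None  # RTT variance, unset before first sample
    rto: float = INITIAL_RTO

    def on_rtt_sample(self, r: float) -> float:
        """Fold in one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement (2.2).
            self.srtt = r
            self.rttvar = r / 2
        else:
            # Later measurements (2.3): update RTTVAR before SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        rto = self.srtt + max(self.granularity, K * self.rttvar)
        self.rto = min(max(rto, MIN_RTO), MAX_RTO)
        return self.rto

    def on_timeout(self) -> float:
        """Back off the timer when it expires (5.5): double it, up to the cap."""
        self.rto = min(self.rto * 2, MAX_RTO)
        return self.rto
```

Under these assumptions, three consecutive expirations starting from the initial value give timers of 2, 4 and 8 seconds, which is the backoff that the quoted state-machine text does not mention. A single 100 msec RTT sample yields a computed value of 0.3 seconds, which the RFC 6298 floor rounds up to 1 second.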