Re: [tsvwg] UDP-Options: UDP has two "maximums"

Joseph Touch <touch@strayalpha.com> Sun, 04 April 2021 01:50 UTC

From: Joseph Touch <touch@strayalpha.com>
In-Reply-To: <20210404012903.qirmrspgkjjk6a64@family.redbarn.org>
Date: Sat, 3 Apr 2021 18:49:46 -0700
Cc: Gorry Fairhurst <gorry@erg.abdn.ac.uk>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Message-Id: <15883BDF-644B-408B-A575-B73C4127471E@strayalpha.com>
References: <8296B6C0-0010-4EAE-A6C9-6C3D43AC5BAB@strayalpha.com> <28f28347-b6a8-9f38-e03c-70bf06322c48@erg.abdn.ac.uk> <93556D3A-3C42-4944-9202-DE75AE864CBA@strayalpha.com> <853caba2-b7ce-db2e-338c-ad1d161a5fe9@erg.abdn.ac.uk> <48DA3058-3380-46AC-951E-27B28489AAF6@strayalpha.com> <846f084a-c441-1d2f-a858-e4d34d528c83@erg.abdn.ac.uk> <20210402231200.4q5czwbxswdneinr@family.redbarn.org> <2d36e27c-1470-35f9-3079-6a150e83c713@erg.abdn.ac.uk> <20210403202313.ojof3hcwj35xs67b@family.redbarn.org> <B1E3E640-42B5-452F-BB04-424B0AF10FE7@strayalpha.com> <20210404012903.qirmrspgkjjk6a64@family.redbarn.org>
To: Paul Vixie <paul@redbarn.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/UxNSEYmkMiLlogBtufFhqXDYd0o>
Subject: Re: [tsvwg] UDP-Options: UDP has two "maximums"

Hi, Paul,

> On Apr 3, 2021, at 6:29 PM, Paul Vixie <paul@redbarn.org> wrote:
> 
>> On Apr 3, 2021, at 1:23 PM, Paul Vixie <paul@redbarn.org> wrote:
>>> PLPMTUD is exactly what i wanted for DNS over UDP,
>>> and we refer to it here:
>>> 
>>> ... draft-ietf-dnsop-avoid-fragmentation-04.txt
>>> 
>>> note, a lot of others expect DNS to move to HTTP/3 or TCP, but even in
>>> those cases i would like to use the largest discrete PDUs that will fit.
> 
> On Sat, Apr 03, 2021 at 04:07:22PM -0700, Joseph Touch wrote:
>> You might take a closer look at UDP options:
>> https://tools.ietf.org/html/draft-ietf-tsvwg-udp-options-09
>> 
>> It supports UDP-layer fragmentation and reassembly with a 32-bit ID field.
> 
> you're implicitly positing a situation wherein a DNS speaker could know that
> the far endpoint knew about UDP options and could reassemble, and that the
> local endpoint (kernel) knows about UDP options and could fragment, and that
> using UDP Fragmentation would be seen as a better choice than leaving out
> optional data or signalling a need to retry with TCP.

Yes. Note, though, that UDP-layer fragmentation can be used safely: legacy endpoints that do not understand UDP options see, at most, packets with zero data.
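
A minimal sketch of why a legacy receiver sees zero data, assuming the layout used by UDP options, where the options (and therefore any fragments) sit in the surplus area between the end of the UDP Length and the end of the IP payload. The helper names and the opaque surplus bytes are illustrative only; they are not the draft's wire format for the fragment option:

```python
import struct

UDP_HEADER_LEN = 8

def build_udp(src_port, dst_port, user_data, surplus=b""):
    """Build a UDP header plus payload; surplus bytes follow the UDP Length."""
    # UDP Length covers the header and user data only, not the surplus area.
    udp_length = UDP_HEADER_LEN + len(user_data)
    # Checksum is left as 0 to keep the sketch short.
    header = struct.pack("!HHHH", src_port, dst_port, udp_length, 0)
    return header + user_data + surplus

def legacy_receive(ip_payload):
    """Parse like a receiver unaware of UDP options: trust UDP Length, ignore the rest."""
    _, _, udp_length, _ = struct.unpack("!HHHH", ip_payload[:UDP_HEADER_LEN])
    return ip_payload[UDP_HEADER_LEN:udp_length]

# A fragment carried entirely in the surplus area: UDP Length is 8, user data is empty.
packet = build_udp(53, 40000, b"", surplus=b"\x00" * 32)
print(len(packet), legacy_receive(packet))
```

The legacy parser never looks past the UDP Length, so whatever the surplus area carries is invisible to it.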

> in practice this means if a DNS UDP initiator asked its networking stack to
> include a UDP option list (perhaps empty, but present), this signal could be
> made available to the DNS UDP responder by its networking stack, so that the
> capability of UDP fragmentation and reassembly could be considered when
> deciding how to respond.

Yes.
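
A rough sketch of the responder-side decision described above. Everything here is hypothetical: the function and parameter names are made up, the `peer_sent_udp_options` flag stands in for whatever signal the networking stack would expose, and the order of preference is an assumption, not something the draft specifies:

```python
from enum import Enum

class Strategy(Enum):
    SEND_WHOLE = "send the full response in one datagram"
    UDP_FRAGMENT = "fragment at the UDP layer"
    TRIM_OPTIONAL = "leave out optional data"
    RETRY_TCP = "signal a retry over TCP"

def choose_strategy(full_size, minimal_size, limit, peer_sent_udp_options):
    """Pick how to answer, given response sizes and a per-datagram size limit."""
    if full_size <= limit:
        return Strategy.SEND_WHOLE
    # Assumption: a request carrying a UDP option list means the peer can reassemble.
    if peer_sent_udp_options:
        return Strategy.UDP_FRAGMENT
    if minimal_size <= limit:
        return Strategy.TRIM_OPTIONAL
    return Strategy.RETRY_TCP

print(choose_strategy(4000, 900, 1232, peer_sent_udp_options=True).value)
print(choose_strategy(4000, 900, 1232, peer_sent_udp_options=False).value)
```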

> i worry about microbursts. the DNS UDP responder knows how much data it has
> committed to the network recently, and how much was committed to a given
> initiator, and can slow down a little to avoid back-to-back transmissions,
> but certainly will slow down a little and reach such avoidance if the
> initiator is not pipelining its requests.
> 
> what we've learned from NFS, and high volume authoritative UDP DNS, is that
> the network doesn't love minimum-spaced back-to-back packets, and that if
> an 8KiB NFS result gets chopped into ~1500B chunks, tail drop is likely.
> this is the biggest source of operator pain from IP fragmentation, fwiw.

Although I appreciate this concern, TCP produces the same kind of bursts, and DNS over TCP seems likely to exacerbate them, either by sending up to 10-packet bursts at the start of every new connection or by sending potentially larger bursts if persistent connections are used.

I had thought we had known about this long enough that vendors no longer used tail drop; they should have been doing AQM or at least something akin to RED.
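
For a sense of scale, here is a back-of-the-envelope comparison of the two bursts. It assumes IPv4 with a 1500-byte MTU, a 20-byte IP header with no options, and a 1460-byte MSS; the function names are illustrative:

```python
import math

def ip_fragment_count(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Number of IP fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    return math.ceil((udp_payload + udp_header) / per_fragment)

def tcp_initial_burst_bytes(mss=1460, initial_window_segments=10):
    """Bytes a sender may emit back-to-back with a 10-segment initial window."""
    return mss * initial_window_segments

print(ip_fragment_count(8 * 1024))   # fragments for an 8 KiB NFS result
print(tcp_initial_burst_bytes())     # bytes in a 10-packet TCP burst
```

Under these assumptions the 8 KiB result goes out as 6 back-to-back fragments, while a 10-segment TCP burst is 14600 bytes, so the TCP burst is the larger of the two.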

> (bizarrely, this was not an issue in 10base5 or 10baseT due to CSMA/CD, even
> where repeaters were present. but once we add bridging, such as "switches",
> there's no reliable interface-driver signal for non-local congestion.)
> 
> candidly, i would not want to have to teach a kernel network stack how to
> pace its transmitted fragments. this observation was one of my few
> contributions to RFC6013, in which a TCB could revert to embryonic state
> but retain its CWND until either reused or pruned. alas, TCPM felt they
> had to make a choice, and chose TCPFO, which has since proved unworkable,
> thus leading to QUIC.

A few of us at ISI explored ways to adjust TCP slow start restart to avoid this issue too:
https://tools.ietf.org/html/draft-hughes-restart-00

> further digression: the framer of messages (like TCP, or DNS, or NFS) ought
> to know the PMTU, which is why PMTUD was originally a non-optional feature
> of IPv6 until we learned that ICMPv6 was dangerous as hell and threw out
> PMTUD, thus leaving us with the pessimal and never-expected-to-be-used 1280
> and 1232 numbers. if we can get PLPMTUD then we can make IPv6 better than
> IPv4 in terms of header amortization rather than (as it currently is) worse.

Agreed; UDP options help with that, as per Gorry’s draft.
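
To make the 1280 and 1232 numbers and the amortization point concrete, here is a small calculation. It assumes fixed header sizes (IPv4 without options, IPv6 without extension headers); the MTU values other than 1280 are just examples:

```python
IPV4_HEADER = 20  # bytes, no IP options
IPV6_HEADER = 40  # bytes, no extension headers
UDP_HEADER = 8

def udp_payload(mtu, ip_header):
    """Largest UDP payload that fits in one unfragmented packet."""
    return mtu - ip_header - UDP_HEADER

def overhead_fraction(mtu, ip_header):
    """Share of each full-sized packet spent on IP and UDP headers."""
    return (ip_header + UDP_HEADER) / mtu

# 1280 is the IPv6 minimum MTU; 1232 is what is left for the UDP payload.
print(udp_payload(1280, IPV6_HEADER))

for label, mtu, header in [
    ("IPv6 @ 1280", 1280, IPV6_HEADER),
    ("IPv4 @ 1500", 1500, IPV4_HEADER),
    ("IPv6 @ 1500", 1500, IPV6_HEADER),
    ("IPv6 @ 9000", 9000, IPV6_HEADER),
]:
    print(f"{label}: {overhead_fraction(mtu, header):.2%} header overhead")
```

Pinned at 1280, IPv6 spends a larger share of each packet on headers than IPv4 does at 1500; the comparison only reverses once the path MTU is discovered to be considerably larger.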

Joe