Re: [DNSOP] Reply: Call for Adoption: draft-song-atr-large-resp

Brian Dickson <> Mon, 28 January 2019 07:08 UTC

On Fri, Jan 25, 2019 at 7:10 PM Davey Song(宋林健) <> wrote:

> Hi Brian,
> Thanks for your questions. Reply inline
> >(1) Has your testing revealed *where* the IPv6 fragmentation is
> occurring? IIRC, IPv6 requires the originating host to do so. And
> originating UDP packet size will be the smaller of the authority servers'
> configs and the EDNS bufsize on the request.
> IPv6 fragmentation is done by the end host, as specified in RFC 8200.
> The differences between IPv4 and IPv6 fragmentation are covered
> comprehensively in draft-ietf-intarea-frag-fragile-05.

> An EDNS0 bufsize initialized to 4096 octets has no meaning for large UDP
> DNS responses.

Right, that's what I'm getting at.

If the EDNS0 bufsize is configured (on the resolver specifically, to be
clear) as MTU*, then no fragmentation on UDP packets will occur.

(*MTU minus fixed IP and UDP headers.)
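
A minimal sketch of that arithmetic, assuming fixed-size headers (no IPv4
options, no IPv6 extension headers). The function and constant names are
illustrative only, not taken from any resolver:

```python
# Header sizes in bytes. Assumes no IPv4 options and no IPv6 extension headers.
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8


def max_bufsize(mtu: int, ipv6: bool) -> int:
    """Largest EDNS0 bufsize whose response fits in one unfragmented packet."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return mtu - ip_header - UDP_HEADER


print(max_bufsize(1500, ipv6=False))  # 1472
print(max_bufsize(1500, ipv6=True))   # 1452
print(max_bufsize(1280, ipv6=True))   # 1232 (IPv6 minimum link MTU)
```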

This is what I suggest you test against the scenarios. Set the EDNS0
bufsize to various values, and observe the results (including whether or
not TC gets set).

If it is possible to fix the fragmentation problem simply by changing the
EDNS0 bufsize, that would be much better to write up as an I-D than this ATR.

My strong suspicion is that (modulo the individual operator's localized,
observed "internet MTU") setting the EDNS0 bufsize appropriately will
prevent fragmentation without degrading/breaking DNS resolution.
(E.g. if only some of the Additional section fits, it may be necessary for
the resolver to request individual records, and/or do DNSSEC validation for
missing RRSIGs).

> Because an IPv6 packet larger than 1500 bytes will be fragmented by the
> Ethernet interface. AFAIK, I'm not sure there is a configuration on
> authority servers for the UDP packet size. I think you are referring to
> the size of the DNS message.

Yes, technically that is correct. However, the authority servers will have
a configured maximum DNS message size.

The actual maximum DNS message size for a given resolver<->authority pair
will be MIN(EDNS0 bufsize, authority configuration).

The actual maximum UDP packet size will be that value, plus some fixed
stuff (IP and UDP headers).

The actual maximum layer 2 frame size will be the UDP packet size plus
whatever layer 2 stuff gets added.

You can generally work backwards from the functional limits (at which no
fragmentation happens) to determine the maximum EDNS0 bufsize to configure,
assuming a given destination.
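
The layering above can be sketched as follows. This is only an illustration
under the assumption of fixed-size IP and UDP headers; the names are
hypothetical, and layer 2 framing is left out because the path MTU already
describes the largest IP packet:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
IPV6_HEADER = 40  # bytes, assuming no extension headers
UDP_HEADER = 8


def max_dns_message(edns_bufsize: int, authority_limit: int) -> int:
    """MIN(EDNS0 bufsize, authority configuration)."""
    return min(edns_bufsize, authority_limit)


def max_ip_packet(edns_bufsize: int, authority_limit: int, ipv6: bool = True) -> int:
    """Largest IP packet a response can occupy for this resolver/authority pair."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_HEADER
    return max_dns_message(edns_bufsize, authority_limit) + UDP_HEADER + ip_header


def may_fragment(edns_bufsize: int, authority_limit: int, path_mtu: int,
                 ipv6: bool = True) -> bool:
    """True if a maximum-size response would exceed the path MTU."""
    return max_ip_packet(edns_bufsize, authority_limit, ipv6) > path_mtu


print(may_fragment(4096, 4096, 1500))  # True
print(may_fragment(1452, 4096, 1500))  # False: 1452 + 8 + 40 == 1500
print(may_fragment(4096, 1232, 1280))  # False: the authority limit wins
```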

A particular operator would probably need to be cognizant of the set of
such values across observed authority server paths, and pick the lowest
common denominator (i.e. that will fit in the vast majority of paths with
no fragmentation).
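
One way an operator might pick such a value, sketched under the assumption
that a list of observed per-path MTUs is already available. The function
name, the sample data, and the coverage threshold are all hypothetical:

```python
def pick_bufsize(path_mtus, coverage_percent=99, ipv6=True):
    """Largest bufsize that avoids fragmentation on at least
    coverage_percent of the observed paths."""
    if not path_mtus:
        raise ValueError("need at least one observed path MTU")
    overhead = (40 if ipv6 else 20) + 8  # fixed IP header + UDP header
    ordered = sorted(path_mtus, reverse=True)
    # Number of paths that must fit, rounded up (integer ceiling division).
    needed = max((len(ordered) * coverage_percent + 99) // 100, 1)
    # The smallest MTU among the 'needed' largest paths is the binding limit.
    return ordered[needed - 1] - overhead


observed = [1500] * 8 + [1480, 1280]  # made-up sample, not measured data
print(pick_bufsize(observed, coverage_percent=90))   # 1432
print(pick_bufsize(observed, coverage_percent=100))  # 1232
```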

> >(2) Have you experimented with setting EDNS0 UDP bufsize to the *actual
> max size* that IPv6 allows *without fragmenting* (or MTU?), and what
> happens when you do that? (Actual MTU may vary topologically, YMMV, etc.)
> It requires a change on resolvers to set the EDNS0 bufsize below a certain
> number. Usually the authority server will work around it, as stated in the
> ATR draft.

No, they won't. There are some authorities that have configured their own
max DNS message size low enough to not fragment, but that is not
necessarily in response to observed fragmentation.
And in particular, this is the operator doing this, not the DNS software.
Saying "server" conflates the two. It is better to say either
"operator" or "software", as it will only be one or the other, and knowing
which is which, along with how and why, is important in figuring out what
the problem is and what solutions are feasible.

And yes, I am aware that it requires resolvers to make the change. My point
is that they can, and if that fixes the problem for them, that should be the
recommendation (which might be something the software folks look at, either
adaptively or by changing their default value for EDNS0 bufsize).

> " To avoid that issue, some  authoritative servers may adopt a policy
>    ignoring the UDP payload size in EDNS0 extension and always
>    truncating the response when the response size is large than a
>    expected one."

No, you are mischaracterizing what is happening.

It is ALWAYS the case (and required by the RFCs) that the answering server
sends a DNS message no larger than the requested EDNS0 bufsize. The
authority server may be configured with a lower value than the supplied
EDNS0 bufsize, but that is not the same as "ignoring" the bufsize. It is
doing "MIN(server_config, EDNS0 bufsize)" to determine the max DNS message
size. If the actual DNS message is smaller than that, the limit is
irrelevant, of course. If the DNS message would be larger, the smaller of
the two values is used, always. It isn't "some" servers "ignoring" the
supplied value at all.

If the EDNS0 bufsize is smaller than the configured server limit, the EDNS0
bufsize will be used, and specifically NOT ignored.

The only time an EDNS0 bufsize is ignored is when it is below the minimum
allowed value (512), again, per the RFCs.
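
Put as a sketch (hypothetical names, and assuming the server's own limit is
at least 512), the rule looks like this. A query without an OPT record is
treated as the classic 512-octet limit:

```python
from typing import Optional

MIN_BUFSIZE = 512  # advertised values below this are treated as 512


def effective_limit(server_config: int, edns_bufsize: Optional[int]) -> int:
    """MIN(server_config, EDNS0 bufsize), with the 512 floor applied.

    edns_bufsize is None when the query carried no OPT record.
    """
    if edns_bufsize is None:
        requested = MIN_BUFSIZE
    else:
        requested = max(edns_bufsize, MIN_BUFSIZE)
    return min(server_config, requested)


def must_truncate(message_size: int, server_config: int,
                  edns_bufsize: Optional[int]) -> bool:
    """True if the response does not fit and TC has to be set."""
    return message_size > effective_limit(server_config, edns_bufsize)


print(effective_limit(1232, 4096))      # 1232: server limit is lower
print(effective_limit(4096, 1452))      # 1452: bufsize is used, not ignored
print(effective_limit(4096, 100))       # 512: below-minimum value is ignored
print(must_truncate(1600, 4096, 1452))  # True
```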

> It has been reported that some root operators did this during the KSK
> rollover.
> This is because some end users may be behind a resolver which is not
> TCP-capable (17% according to APNIC measurement). That is part of the
> background for proposing ATR:
>  "ATR will helpful for resolver without TCP capacity, because the resolver
>    still has a fair chance to receive the large response. "
> I noticed there is a typo in the above sentence in the 02 version, in which
> the text is "ATR will helpful for resolver with TCP capacity".
> >My suspicion is that the better approach for resolvers might actually be
> to do their IPv6 stuff "better", for some value of "better", in a way that
> does not require DNS protocol changes (or changes to transport specs like
> UDP or IPv6).
> Or maybe we could add a new edns0 ip6-bufsize option in future so v4 vs v6
> limits can be separated (and thus standardize (and kind of simplify)
> resolver and auth server configs).
> Asking either the resolver or the server to adopt that change is a
> choice, and falls in the solution category of Tony Finch. I think both
> work, and both have an incentive to change. The difference is that a
> change on resolvers takes a long time (installed base), whereas a change
> on servers takes effect in a timely manner.

Resolvers do not require any software updates to make changes to their
EDNS0 bufsize. All resolver software that does EDNS0 supports configurable
(user-supplied) bufsize values, at least as far as I'm aware.

Communicating experimental results with recommendations for changes to
operational parameters is something that various folks can help with.
Getting those experimental results is the first step, IMHO.
Not doing those experiments first, and continuing to push ATR without any
evidence showing that this specific suggestion (lower the EDNS0 bufsize)
WON'T work, would be premature (and/or irresponsible). If there is an easy
solution that will work, we should be able to agree that it is a path that
should at least be investigated.
(All the folks involved in the KSK roll are eager to learn any observed
issues, and to communicate recommendations to the larger DNS community,
including resolver operators.)

Would you be willing to redo your tests with EDNS0 bufsize set to some
smaller values, like 1500, 1472, 1452?
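
For anyone wanting to script such a test without extra libraries, here is a
rough sketch using only the Python standard library. It builds a query that
advertises a given EDNS0 bufsize and checks the TC bit on the reply. The
helper names are mine, and the query name and server address are for you to
supply; nothing here comes from the ATR test setup:

```python
import socket
import struct

TC_BIT = 0x0200   # truncation flag in the DNS header flags word
OPT_TYPE = 41     # EDNS0 OPT pseudo-RR type
DO_BIT = 0x8000   # DNSSEC OK flag in the OPT TTL field


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels, no compression."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int, bufsize: int, query_id: int = 0,
                dnssec_ok: bool = True) -> bytes:
    """DNS query with one question and one OPT record advertising bufsize."""
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 1)
    question = encode_name(name) + struct.pack("!HH", qtype, 1)  # class IN
    ttl = DO_BIT if dnssec_ok else 0
    # OPT RR: root name, TYPE=41, CLASS=bufsize, TTL=flags, RDLEN=0
    opt = b"\x00" + struct.pack("!HHIH", OPT_TYPE, bufsize, ttl, 0)
    return header + question + opt


def is_truncated(response: bytes) -> bool:
    """True if the TC bit is set in a DNS response."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_BIT)


def probe(server: str, name: str, qtype: int, bufsize: int,
          family: int = socket.AF_INET6, timeout: float = 3.0) -> bool:
    """Send one UDP query and report whether the reply had TC set."""
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype, bufsize), (server, 53))
        response, _ = sock.recvfrom(65535)
    return is_truncated(response)


# Building the queries needs no network; probe() is what actually sends them.
for size in (1500, 1472, 1452):
    print(size, len(build_query("example.com", 48, size)))  # 48 = DNSKEY
```

A timeout from probe() (rather than a reply with TC set) would be the
interesting failure case, since it suggests a fragmented response was
dropped somewhere on the path.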


> However, ATR does not necessarily require the DNS protocol to change, or
> more specifically a change to the running authoritative server. Akira
> KATO once suggested to me that ATR can be implemented as an on-path fix
> independent of the DNS server. For example, use a hack in iptables or a
> separate device monitoring the DNS response size on the path and
> generating an additional truncated response if the size is beyond a
> certain number.
> Davey