Re: [DNSOP] CNAME chain length limits

dagon <dagon@sudo.sh> Wed, 27 May 2020 20:06 UTC

Date: Wed, 27 May 2020 20:06:14 +0000
From: dagon <dagon@sudo.sh>
To: Evan Hunt <each@isc.org>
Cc: John R Levine <johnl@taugh.com>, dnsop@ietf.org
Message-ID: <20200527200614.GC3582@sudo.sh>
References: <alpine.OSX.2.22.407.2005271341530.35268@ary.qy> <20200527180846.GA51895@isc.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20200527180846.GA51895@isc.org>
User-Agent: Mutt/1.5.23 (2014-03-12)
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/tq5FJQWLiKGMVRehI-p03MgNMKc>
Subject: Re: [DNSOP] CNAME chain length limits

On Wed, May 27, 2020 at 06:08:46PM +0000, Evan Hunt wrote:
> On Wed, May 27, 2020 at 01:48:32PM -0400, John R Levine wrote:
> > is there any consensus as to the maximum CNAME chain length
> > that works reliably, and what happens if the chain is too long? Hanging
> > seems sub-optimal.
> 
> BIND cuts CNAME chains off at 16. As far as I know that was an arbitrarily-
> selected value, but it's been in the code since 1999 and so far as I can
> recall, no one's complained. The maximum reliable chain length won't be any
> longer than that; it might be shorter.
> 
> When a chain is too long, I think BIND just returns a response with the 16
> CNAMEs it's found so far, and without a final answer. The client can start a
> new query from where the response left off, but I would expect most to
> treat it as a non-answer.

This is an interesting topic.

Some recursives cut CNAME chains off at 8 links.  Some (Level 3, if I
recall) fail at fewer, but retry right after sending SERVFAIL or
RCODE!=0 to the stub, perhaps to populate cache farms.  Some major
"cloud DNS" (e.g., Google, if I recall) chase 30 links before failing.
Some appear to have a ~3-sec window for the outbound queries (meaning
they have no chain count limit, only a time limit); some appear to
have a hard numbered limit like BIND.
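
To make the two limit styles concrete, here is a minimal sketch (mine,
not code from any resolver) of a chaser that stops on a hop count, on
a time budget, or on both.  The function name chase_cnames, the
lookup callback, and the status strings are all hypothetical; the
default of 16 only mirrors the BIND value quoted above, and the loop
check is an extra I added so the sketch cannot spin forever.

```python
import time


def chase_cnames(name, lookup, max_hops=16, budget_s=None, clock=time.monotonic):
    """Follow CNAMEs from `name` until a terminal name or a limit is hit.

    lookup(name) is a caller-supplied function (hypothetical here) that
    returns the CNAME target of `name`, or None if `name` is terminal.
    max_hops=None disables the count limit; budget_s=None disables the
    time limit.  Returns (status, chain), where status is one of
    "ok", "hop-limit", "timeout", or "loop".
    """
    start = clock()
    chain = [name]
    seen = {name.lower()}          # DNS names compare case-insensitively
    while True:
        target = lookup(chain[-1])
        if target is None:
            return "ok", chain     # reached a name with no CNAME
        if budget_s is not None and clock() - start > budget_s:
            return "timeout", chain
        if target.lower() in seen:
            return "loop", chain   # target already visited: a cycle
        if max_hops is not None and len(chain) - 1 >= max_hops:
            return "hop-limit", chain
        chain.append(target)
        seen.add(target.lower())
```

Passing a dict's get method as lookup is enough to try it against a
synthetic chain; a real test would plug in a function that queries
the resolver under study and would need to decide how to treat
SERVFAIL and timeouts, which this sketch does not model.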

Poorly crafted DNS crawler scripts seem to follow CNAMEs forever (up
to some script-mediated timeout period, or until the operator stops
the script and complains to the parent zone's registrar, on the theory
that unexpected behavior is abuse, despite CNAME chains being useful
for path diagnosis in VPN operation, passive DNS monitoring, etc.).

The CNAME behavioral matrix can also be extended to include:

  -- Tests for ("improper") horizontal vs. vertical CNAMEs.  Some
     recursive speakers fail; some complain ("BAD (HORIZONTAL)
     REFERRAL") but still answer; and some follow without complaint.

  -- All should avoid graph cycles in CNAME chains back to ancestral
     records.

  -- Tests for slow responses, where the authority crafting the CNAME
     delays by some variable millisecond time period, to test whether
     the chain depth is time or count based.

  -- One could test for the RFC 1034 section 3.6.2 restart to chase
     discovered CNAMEs, absent additional records being added to
     cache.  Some platforms (Azure, if I recall) return just the
     CNAME, even if local cache (evidently) holds a terminating
     record.  I've not
     tested if this (re)introduces circular dependencies, but
     Azure(?)'s explanation would be that the restart and cycle
     detection must now occur in the stub.  But one should test
     with/without BIND's minimal-responses (and similar configurations
     in other recursives), and implications for cycles.

  -- All of the above, but for DNAME instead of CNAME.  PowerDNS will
     not support this of course, for what I infer are (understandable)
     architectural and commercial demand reasons.  But conceptually an
     authority creating synthetic CNAME records is a workable
     substitute for DNAMEs.  Some DNAME chase limits follow the CNAME
     chain limits.  One can chain multiple CNAME chains together using
     DNAME, and this may count against the original chain counter, or
     start a new one (and sometimes within some timeout period of
     course).  This also stops many script-based crawlers, which don't
     handle DNAMEs or don't bother to synthesize the substituted query
     under the new zone tree.  (I.e., they appear to cut/grep DNAME
     answers, and not handle out-of-zone synthesis, making them blind
     to the referral subzones.)

  -- Same, but for so-called ANAME or 'rooted CNAME' records.

  -- Measure all of these behaviors where the NS has essentially
     zero TTL (e.g., to measure whether retries are with/without NS
     delegation rediscovery).

  -- Measure all of the above with/without DNS authorities that are
     whitelisted for EDNS Client Subnet.  (It is hard to think of a
     reason to allow ECS in a CNAME, PTR, or similarly constrained
     query type.  But some architectures use "turtles all the way
     down", and the CDN itself needs a CDN via ECS, which then needs a
     CDN via ECS, etc.)
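
For the DNAME item above, the step that script-based crawlers tend to
skip is the name substitution itself.  A minimal sketch, assuming
plain presentation-format names with no escaped dots, and ignoring
the case where the rewritten name becomes too long (which a real
server must answer with YXDOMAIN); dname_substitute is a hypothetical
helper name, not an API from any DNS library:

```python
def dname_substitute(qname, owner, target):
    """Rewrite qname under a DNAME at `owner` pointing to `target`.

    Returns the name a crawler should query next, or None when the
    DNAME does not apply.  A DNAME only redirects names strictly below
    its owner, never the owner name itself.  Output is lowercased,
    which is harmless because DNS names are case-insensitive.
    """
    q = qname.lower().rstrip(".").split(".")
    o = owner.lower().rstrip(".").split(".")
    # qname must have more labels than owner and end with owner's labels.
    if len(q) <= len(o) or q[-len(o):] != o:
        return None
    prefix = q[:-len(o)]           # labels to the left of the owner
    return ".".join(prefix + [target.lower().rstrip(".")]) + "."
```

So with a DNAME at a.example. pointing to b.example.net., a query for
www.a.example. should be retried as www.b.example.net., while
a.example. itself and a sibling such as xa.example. are left alone.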
     
I've tested all of these combinations and more.  There are also many
valid commercial uses for CNAME chains beyond CDN and zone agility
(e.g., path discovery, EDNS Client Subnet testing, etc.).  So
blocking/limiting CNAME chains seems unwise.

-- 
David Dagon
dagon@sudo.sh
D970 6D9E E500 E877 B1E3  D3F8 5937 48DC 0FDC E717