Re: [DNSOP] Fundamental ANAME problems

Brian Dickson <brian.peter.dickson@gmail.com> Fri, 02 November 2018 01:38 UTC

From: Brian Dickson <brian.peter.dickson@gmail.com>
Date: Thu, 01 Nov 2018 18:38:17 -0700
Message-ID: <CAH1iCioGbweYndujWRsHFJ5ZJz+NXkL-_cyB13Xq4m5Espbmpw@mail.gmail.com>
To: John Levine <johnl@taugh.com>
Cc: "dnsop@ietf.org WG" <dnsop@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/2aTr6xPgsPT3te6OEwZlpSKY1wo>
Subject: Re: [DNSOP] Fundamental ANAME problems

On Thu, Nov 1, 2018 at 5:14 PM John Levine <johnl@taugh.com> wrote:

> I can't help but note that people all over the Internet do various
> flavors of ANAME now, and the DNS hasn't fallen over.  Let us not make
> the same mistake we did with NAT, and pretend that since we can't find
> an elegant way to do it, we can put our fingers in our ears and it
> will go away.
>
>
Did you not read my full message?
I didn't say don't do that, I said let's do it in an elegant way.
Then I provided a few examples of how to do that.

What is being done now is not ANAME by any stretch; it is
vertically-integrated apex CNAME flattening.
Yes, there are several providers doing it.
Their customers are locked in to a single provider, precisely because of
that vertical integration.
None of their customers can have multi-vendor redundancy with feature
parity.
While multi-vendor redundancy is not a prime motivation for ANAME or its
alternatives, it certainly is (or should be) one of their goals.

The fact that each existing vendor's solution is, and requires, vertical
integration, means each is fundamentally a closed system, with no interop
possible.

What ANAME and the other suggested approaches are trying to do is find an
interoperable way for something CNAME-like to co-exist with the other
records at a zone apex.

Can you point me to a non-closed, non-vertically-integrated ANAME-like
thing that offers interoperable multi-vendor support?

I think you are confusing "dynamic update of A based on
meta-data-configured FQDN" with actual ANAME.

So the fact that DNS has not fallen over yet has nothing at all to do with
ANAME.


> In article <CAH1iCirXYsYB3sAo8f1Jy-q4meLmQAPSFO-7x5idDufdT_unXQ@mail.gmail.com> you write:
> >The requirement on update rate, is imposed externally by whichever entity
> >operates the ANAME target. In other words, this is not under the direct
> >control of the zone operator, and is potentially (and very
> >likely) an UNBOUNDED operational impact/cost.
>
> "Something very bad will happen if I do that."  "OK, so don't do
> that."  My aname-ish code has a maximum update rate, and I expect
> everyone else's does too.  Yeah, the ANAMEs won't be in sync with
> the hostile remote server, but I can't get too upset about that.
>

How many zones do you operate this way?
What is the maximum update rate?
Are those zones you operate on behalf of paying customers?
If those were paying customers, and the records got out of sync, don't you
think the customers would get upset?

That's the primary point; when non-toy situations with paying customers are
considered, it isn't up to you to decide what the update rate is, and you
don't have the luxury of not caring.

It isn't whether it works for you; it's whether it works for EVERYBODY.
If it doesn't, then we need to work harder on the problem.


>
> >Third, there is an issue with the impact to anycast operation of zones with
> >ANAMEs, with respect to differentiated answers, based on topological
> >locations of anycast instances.
>
> How is this different from CNAMEs via to 8.8.8.8 and other anycast
> caches?  The cache has no relation to the location of the client unless
> you use one of the client location hint hacks.
>

Because authority servers for the same zone, when not doing stupid DNS
tricks, are in sync.
This is by design, and is the expectation of clients, resolvers, and
registrants.

Anycast caches do not have any expectation or requirement to be sync'd, and
in particular, due to stupid DNS tricks, are typically topologically sync'd
to regional answers.

Anycast caches with smaller footprint or odd customer bases, might do those
hacks, but even without them, there will be significant differences in the
contents of those caches, in different locations.

The problem is the ANAME *target* -- that will typically also be
topologically diverse, e.g. answers supplied will involve stupid DNS tricks.

You can't have your ANAME use only a single view and push that SAME answer
to all anycast nodes.
Doing so would break the client->resolver->(anycast auth)->ANAME-target
model of diversified answers.
If client/resolver are supposed to hit ANAME-targets (which are themselves
anycast, but which do stupid DNS tricks to give different answers) and get
DIFFERENT answers, then having only one instance of the ANAME-target
returned by the anycast auth (regardless of location) will be an
"#EpicFail".

Example:

   - client in Los Angeles -> resolver somewhere in California -> ??? ->
   AWS obfuscated-name -> California IP address (based on resolver IP, or
   maybe client-subnet)
   - client in Boston -> resolver somewhere in New England -> ??? -> AWS
   obfuscated-name -> New York IP address (based on resolver IP, or maybe
   client-subnet)
   - If ??? is an ANAME, which does a tracking query FROM ONE LOCATION, and
   mirrors that out to many anycast instances, then one of two results will be
   seen in the mini-example case:
      - The client in Los Angeles will receive the New York IP address, or
      - The client in Boston will receive the California IP address
      - According to the HTTP folks, neither of those is "acceptable".
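
The mini-example above can be sketched as a toy model. This is only an
illustration under stated assumptions: the region labels, the addresses
(documentation ranges) and the function names are all made up, and the
"target" is reduced to a lookup table keyed on where the query comes from.

```python
# Toy model: the ANAME target gives region-dependent answers.
# All region labels and addresses are hypothetical (documentation ranges).
TARGET_ANSWERS = {
    "us-west": "192.0.2.10",     # stands in for the "California" address
    "us-east": "198.51.100.10",  # stands in for the "New York" address
}

def target_answer(asker_region):
    """What the ANAME target returns to a querier located in asker_region."""
    return TARGET_ANSWERS[asker_region]

def via_resolver_chase(client_region):
    """A resolver near the client queries the target itself (CNAME-like)."""
    return target_answer(client_region)

def via_single_tracker(client_region, tracker_region):
    """The auth operator tracks the target from ONE location and mirrors the
    result to every anycast instance, so the client's region is ignored."""
    return target_answer(tracker_region)

for client in ("us-west", "us-east"):
    print(client,
          via_resolver_chase(client),
          via_single_tracker(client, tracker_region="us-east"))
```

With the tracker placed in "us-east", the "us-west" client is handed the
east-coast address; moving the tracker to "us-west" just flips which client
gets the wrong answer.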

The alternative is having EACH anycast instance for the auth server (which
has an ANAME target of the AWS blob) doing its OWN tracking, which requires:

   - Widely distributed DNSSEC signing (which requires placing the ZSKs
   everywhere)
   - Even more costly load out in every anycast location (multiply the
   original Master lookups, by the number of anycast instances, times the
   number of zones)
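
A back-of-envelope sketch of that multiplication follows. The model is my
simplifying assumption, not a measurement: every anycast instance re-resolves
every zone's ANAME target once per target TTL. The instance count and the TTL
are hypothetical numbers, and the TTL is the part set by the target's
operator, not by the zone operator.

```python
def tracking_qps(zones, anycast_instances, target_ttl_seconds):
    """Steady-state tracking queries per second, assuming each anycast
    instance re-resolves each zone's ANAME target once per target TTL."""
    return zones * anycast_instances / target_ttl_seconds

# Hypothetical inputs: 10^6 zones, 50 anycast instances, 60-second target TTL.
central = tracking_qps(zones=1_000_000, anycast_instances=1,
                       target_ttl_seconds=60)
per_site = tracking_qps(zones=1_000_000, anycast_instances=50,
                        target_ttl_seconds=60)
print(f"single tracker:        {central:,.0f} qps")
print(f"per-instance tracking: {per_site:,.0f} qps")
```

If the target operator lowers the TTL, both numbers scale up in proportion,
which is the externally imposed part of the cost.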



>
> I'm not wedded to the current ANAME spec but we have plenty of experience
> showing that it's possible to implement without causing disasters?
>
>
The issue isn't whether it is possible to IMPLEMENT, it is whether it is
feasible to OPERATE at scale.

I don't doubt that your implementation is easy, or that any single
(non-anycast) implementation is equally easy.

The question is, would the incremental operational load, if you upped the
zone count to O(10^6), be within reason?
Or, how about the complexity of implementing the anycast stuff, especially
differentiated answers?

I submit that currently, in the face of possible externally-imposed update
rates, it isn't possible to guarantee that the incremental operational cost
would be negligible.

On the other hand, the impact on resolvers (whose scaling is determined by
actual query load, not the authority-update side of things) would
definitely be marginal if a solution that placed the burden on them were
agreed upon.

Resolver operators can trivially shoulder the burden.

Here's why: the operational difference would be that, when QTYPE=A and the
Answer includes an ANAME or WCRR, the resolver does another query for the
name in the RDATA, exactly as if the Answer had been a CNAME. That is
functionally identical, and identical load-wise, to having an actual apex
CNAME.
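
A minimal sketch of that resolver behaviour, assuming a toy in-memory table
in place of real upstream queries. The names, the address and the helper
functions are hypothetical, and this is not how any particular resolver is
written; it only shows that the ANAME case costs the same number of upstream
queries as the CNAME case.

```python
# Toy in-memory "authoritative data"; all names and addresses are hypothetical.
ZONES = {
    ("example.com.", "ANAME"): ["lb.cdn.example.net."],
    ("www.example.com.", "CNAME"): ["lb.cdn.example.net."],
    ("lb.cdn.example.net.", "A"): ["192.0.2.10"],
}

def query(name, rtype):
    """Stand-in for one upstream query: returns (rtype, rdata list) or None.
    If the requested type is absent, an alias record at the name is returned."""
    for t in (rtype, "CNAME", "ANAME"):
        if (name, t) in ZONES:
            return t, ZONES[(name, t)]
    return None

def resolve_a(name, max_chain=8):
    """Resolve QTYPE=A, following CNAME and ANAME answers the same way.
    Returns (addresses, number_of_upstream_queries)."""
    queries = 0
    for _ in range(max_chain):
        queries += 1
        answer = query(name, "A")
        if answer is None:
            return [], queries
        rtype, rdata = answer
        if rtype == "A":
            return rdata, queries
        name = rdata[0]  # CNAME or ANAME: restart with the target name
    raise RuntimeError("alias chain too long")

print(resolve_a("www.example.com."))  # non-apex name with a CNAME
print(resolve_a("example.com."))      # apex name with an ANAME
```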

For authority operators, scale is the problem, along with disproportionate
load. The cost of tracking the sibling records of an ANAME is the same
regardless of zone popularity. Amortized over a large query volume and a
small number of zones, that cost is easy to ignore. Multiplied by a large
number of zones in the commodity range of domain names, it is not. You
can't lose money on each and expect to make it up on volume.
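
To put rough numbers on the amortization argument, here is a small sketch.
Every figure in it is a hypothetical assumption (a 60-second target TTL, a
popular zone with ten million client queries a day, a commodity zone with a
hundred); only the shape of the result matters.

```python
def tracking_cost_per_query(tracking_lookups_per_day, client_queries_per_day):
    """Tracking lookups the authority performs per client query it answers."""
    return tracking_lookups_per_day / client_queries_per_day

# Assumed: one tracking lookup per 60-second target TTL, per zone.
lookups_per_day = 86_400 // 60

popular = tracking_cost_per_query(lookups_per_day, 10_000_000)  # busy zone
commodity = tracking_cost_per_query(lookups_per_day, 100)       # quiet zone

print(f"popular zone:   {popular:.6f} tracking lookups per client query")
print(f"commodity zone: {commodity:.1f} tracking lookups per client query")
```

The tracking work per zone is fixed, so for a quiet zone it exceeds the
client queries actually served, while for a busy zone it disappears into the
noise.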

Brian