Re: [DNSOP] [Ext] Questions / concerns with draft-ietf-dnsop-svcb-https (in RFC Editor queue)

Ben Schwartz <bemasc@google.com> Sat, 27 August 2022 22:00 UTC

References: <CAHw9_iKZJndu1100LBU3TiuhF9ACb0As2deA1oZWD2eA46tBbA@mail.gmail.com> <CAH1iCiqryY=u6MN2mkf7krHLmc7TQkoDaXe0k=ZZ+0e9uiMb-Q@mail.gmail.com> <YwaQrnoA3hifxCQW@straasha.imrryr.org> <CAMOjQcEcKQSWvb_LqmfkGwZ2dt_561jLZxHTMuMO0pMy2s9mbw@mail.gmail.com> <CAH1iCirnWdDY0p2-grQKN3PQWOM=JLevxbNskFFEzGwHvisGZA@mail.gmail.com> <B024358C-77FD-4E63-8E18-1CBCEA6C6B14@icann.org> <CAH1iCiry3VDS+dM+wEkPH5a_TSt5pEddxPjKOhL9_M20e_dR0A@mail.gmail.com> <8B970775-22CF-403B-9B8A-84DCC0932D76@icann.org> <CAHbrMsC_RO1J6qp_yOWOc3P4zpZ-cOCB6adXRwjoSQP7_yrWug@mail.gmail.com> <CAH1iCiqzeZORDmbE+XMs1wt6YZKYFZWnsnrvN8fbLHpFXEfDfw@mail.gmail.com>
In-Reply-To: <CAH1iCiqzeZORDmbE+XMs1wt6YZKYFZWnsnrvN8fbLHpFXEfDfw@mail.gmail.com>
From: Ben Schwartz <bemasc@google.com>
Date: Sat, 27 Aug 2022 18:00:01 -0400
Message-ID: <CAHbrMsDSbDapPFFfhU1iyi5BpEjb8NA7WXz+1pu78dGnuVkNzg@mail.gmail.com>
To: Brian Dickson <brian.peter.dickson@gmail.com>
Cc: Paul Hoffman <paul.hoffman@icann.org>, "dnsop@ietf.org WG" <dnsop@ietf.org>
Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg="sha-256"; boundary="000000000000c86abc05e7402727"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/ph8L-2RsIkFaX9WEO9786zrZO1g>
Subject: Re: [DNSOP] [Ext] Questions / concerns with draft-ietf-dnsop-svcb-https (in RFC Editor queue)

On Fri, Aug 26, 2022 at 10:49 PM Brian Dickson <
brian.peter.dickson@gmail.com> wrote:

>
>
> On Thu, Aug 25, 2022 at 1:35 PM Ben Schwartz <bemasc@google.com> wrote:
>
>> Thanks to Brian, Viktor, and others for the very close review, as
>> always.  While I disagree with many of the claims made here, it's clear
>> that some of the text has proven confusing.  I'm not familiar with the
>> process rules given the draft's current state, but perhaps we can improve
>> clarity with some modest revisions.  I'll try to keep that discussion
>> separate.
>>
>> The key text for this discussion is in Section 3:
>>
>>    If the client is SVCB-optional, and connecting using this list of
>>    endpoints has failed, the client now attempts to use non-SVCB
>>    connection modes.
>>
>> I believe most reviewers correctly understood this to mean that, if all
>> else fails, you can connect to https://www.example.com/ using the AAAA
>> or A records on www.example.com, as usual.
>>
>
> One of the problems here is that the term "connection mode" is somewhat
> ambiguous.
> Does that refer to the HTTP mode, or DNS queries?
>

It means, "connect however you would have if you didn't know about the SVCB
RR type".

> It could mean, "connect via HTTP using no SvcParameter elements", i.e.
> connect using vanilla HTTPS, using the last $QNAME.
>

No, that is covered separately two paragraphs earlier, and would not be
"non-SVCB" if $QNAME is derived from a SVCB TargetName.


> It does not explicitly identify the connection target address.
> (e.g. if there was not a DNS resolution failure, the connection failure
> should be "hard" rather than "soft", and only the final $QNAME should be
> used, vs explicitly stating "non-SVCB connection to addresses found by
> resolving A/AAAA records at <specified location relative to $origin>").
>

This is intentional, because there are many possible ways to use DNS
records to reach a service, and this specification does not know what the
client will do if SVCB is not working.
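
A minimal sketch of the SVCB-optional flow described above, assuming a
hypothetical try_connect() callback and an opaque non_svcb_connect()
callback (neither name comes from the draft).  The non-SVCB path is left
opaque on purpose, since the specification does not define it:

```python
from typing import Callable, Iterable, Optional


def connect_svcb_optional(
    svcb_endpoints: Iterable[str],
    non_svcb_connect: Callable[[], Optional[str]],
    try_connect: Callable[[str], bool],
    svcb_optional: bool = True,
) -> Optional[str]:
    """Return a label for the connection that succeeded, or None."""
    # Walk the SVCB-derived endpoints in the order given.
    for endpoint in svcb_endpoints:
        if try_connect(endpoint):
            return endpoint
    # Only an SVCB-optional client may use non-SVCB connection modes.
    # What those modes are is up to the client (hypothetical callback).
    if svcb_optional:
        return non_svcb_connect()
    return None
```

An SVCB-reliant client would call this with svcb_optional=False and simply
fail once the endpoint list is exhausted.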

>> The logic is simple: this draft does not make "HTTPS Record" support
>> mandatory for HTTP clients.
>>
>
> What the current version of the draft does, is it asserts HTTP ownership
> of any apex A/AAAA records, something that was not the case prior to this
> draft.
>

This draft says nothing about how HTTP connects in the absence of SVCB, so
it cannot "assert HTTP ownership" of anything beyond what was already the
case prior to this draft.

>
>    - The fallback logic assumes that A/AAAA address records MUST serve
>    the same content as the HTTPS record
>

Yes.

>
>    - This effectively forces the owner to choose between "no A/AAAA
>    record", or "have A/AAAA records that require maintenance (if they are the
>    result of doing DNS resolution on the HTTPS target)", or "do some other
>    thing that serves the same HTTP content".
>

This is already forced by the existence of non-SVCB-aware clients.

>
>    - This means that for as long as the current draft is published as-is,
>    and until it is given a -bis treatment or is deprecated, no other use of
>    apex A/AAAA records is possible (without impacting HTTPS-aware clients)
>       - This also means no OTHER SVCB-compatible apex RRTYPE can be used
>       that also requires fallback to A/AAAA records, as those would conflict with
>       the HTTPS usage
>

No, it is perfectly possible for "example.com" to serve many different
protocols today.  Defining SVCB mappings for all those protocols, and
publishing SVCB records for them at _fooN.example.com, would work fine,
because the fallback A/AAAA records on the origin hostname would continue
to serve all of those protocols.

...

> Thus, HTTP servers are still required to publish functional address
>> records on the origin hostname, as usual.
>>
>
> This is *only* true if the HTTPS record is added to a pre-existing
> address record's origin hostname.
>

Whether the hostname is pre-existing is immaterial.  HTTP hostnames must
carry address records if they want to reach the large fraction of clients
that do not (or cannot) query for HTTPS records.  This draft does not
"deprecate" such clients.

...

> (Similar logic applies to any other pre-existing protocol that may be used
>> with SVCB.)
>>
>
> I don't follow this logic, could you explain, please?
>

Protocols such as SSH/SFTP, IMAP(S), and XMPP all expect the hostname to
carry address records.  If SVCB mappings are defined for these protocols in
the future, the need for compatibility with non-SVCB-aware clients will
similarly require server operators to maintain these address records.

> Suppose (at some future point) there are half a dozen SVCB-compatible
> RRTYPEs, all published at a zone apex.
> Which service (corresponding to which RRTYPE) is the pre-existing protocol
> served at the apex A/AAAA address(es)?
>

If these services all predate SVCB, then the use of SVCB is optional for
clients, so these services need to function for non-SVCB-aware clients.


> There can be ONLY one protocol attached to an address, unless the address
> is forced to serve many protocols simultaneously (not an ideal situation at
> all).
>

SVCB does not make this any worse than before.  Indeed, it makes it better:
new protocols can define themselves as SVCB-reliant, disabling the fallback
behavior and achieving your architectural goal.


> Since these records are required to be present and working, we can hardly
>> forbid clients from using them.
>>
>
> Except, they aren't *required* to be present and working, at least not
> within the scope of this draft.
>

If they aren't present and working, your website will be broken for a lot
of users.  That's a pretty good definition of "required".

...

> This pre-existence assumption omits an entirely different logic branch:
> green-field deployments using HTTPS (where no current records of any type
> existed, or existed for other services, including "parking" generic pages,
> "cash parking", etc.).
>

No, it doesn't matter whether the domain is new or old, so long as there
are clients that can reach it only via address records.

> GoDaddy provides managed DNS hosting (which in some cases involves add-on
> services that are provisioned into the customer's zone), for tens of
> millions of domains. A significant portion of those will be updated to use
> HTTPS in the near future, presuming what gets published will work correctly.
>

That's great!

On Fri, Aug 26, 2022 at 10:57 PM Brian Dickson <
brian.peter.dickson@gmail.com> wrote:

>
>
> On Thu, Aug 25, 2022 at 1:35 PM Ben Schwartz <bemasc@google.com> wrote:
>
>
>> (Also, Brian's analysis indicates that the origin hostname's address
>> record TTL would bias the endpoint selection, but this is not correct.)
>>
>
> This statement intrigues me, and I think is highly relevant to the
> discussion.
>
> Could you explain further?
>

Nothing in the client behavior section recommends prioritizing the use of
address records that are in the DNS cache, and the SVCB endpoints have
higher priority than the origin endpoint.  (Section 5.1 describes an
interesting special case that would not apply to your architecture.)
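
As a rough illustration of that ordering (the Endpoint type and the
attempt_order() helper are hypothetical, not taken from the draft): the
SVCB endpoints are sorted by SvcPriority, lowest value first, and the
origin endpoint always comes last, whether or not its address records
happen to be cached.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    name: str
    priority: int          # SvcPriority; lower values are tried first
    cached: bool = False   # whether its addresses are already in the DNS cache


def attempt_order(svcb: List[Endpoint], origin: Endpoint) -> List[str]:
    """Order connection attempts: SVCB endpoints first, origin endpoint last."""
    # The 'cached' flag is deliberately ignored: cache state (and therefore
    # the address record TTL) does not influence the order in this sketch.
    ordered = sorted(svcb, key=lambda e: e.priority)
    return [e.name for e in ordered] + [origin.name]
```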

...

> For purposes of focusing on the draft, I would like to limit this to
> things in the draft; if it isn't in the draft, but it addresses the
> concerns raised, the obvious answer is: please add it to the draft
>
>    - The goal is to standardize the behavior, by explicitly including
>    everything authority servers, resolvers, and clients need to do to
>    interoperate with the same characteristics
>

That is not the goal of this draft.  The goal of this draft is to define
the semantics of SVCB/HTTPS records, allowing servers to publish "promises"
and "hints" that can be used flexibly by clients.

>
>    - If only some of the clients have this good behavior, but others do
>    not, that is not good for the usability of HTTPS records
>

Tightly constraining the behavior of clients would certainly allow server
operators to deploy fancier architectures, but I do not believe that this
harms the usability of HTTPS records in the use cases that are considered
in-scope.  Giving clients flexibility is also to the advantage of server
operators, who surely would like clients to be able to make adjustments
that optimize the performance and availability of connections to their
servers.

...

On Fri, Aug 26, 2022 at 11:12 PM Brian Dickson <
brian.peter.dickson@gmail.com> wrote:

>
>
> On Fri, Aug 26, 2022 at 4:29 AM Ben Schwartz <bemasc=40google.com@dmarc.ietf.org> wrote:
>
>>
>>
>> On Thu, Aug 25, 2022 at 7:19 PM Viktor Dukhovni <ietf-dane@dukhovni.org>
>> wrote:
>>
>>> On Thu, Aug 25, 2022 at 04:35:39PM -0400, Ben Schwartz wrote:
>>>
>>> Indeed it is a possible position to say that the Internet is not yet
>>> ready for semantically distinct services seen by SVCB-aware and legacy
>>> clients.
>>
>>
>> In addition to the deployment concerns I've mentioned earlier, a
>> deployment of this kind would be intrinsically insecure: a hostile
>> intermediary could override the choice of which semantically distinct
>> service is seen by the client.  That's another reason why this
>> configuration is not permitted.
>>
>
> I don't think it is the case that it is not permitted.
>
> Note that many/most of the cases in 3.1 do not account for one specific
> permutation:
>
>    - An apex AliasMode HTTPS record, with no prior or subsequent CNAMEs,
>    and no subsequent AliasMode records, in a DNSSEC signed zone, which also
>    has apex A/AAAA records. All the records in the zone are signed.
>    - It is literally impossible for a hostile intermediary to selectively
>    block service, without the client having the ability to detect this (if the
>    client is doing DNSSEC validation itself, or if the client is asking the
>    upstream DNS resolver to do DNSSEC validation and return data with the AD
>    bit set).
>    - If the client detects any failure (including SERVFAIL), and the
>    Chain length is 1, and the DNS lookups are cryptographically protected, the
>    client MUST hard-fail (per the current spec).
>
> This particular case appears to me (and I'd argue is also provably)
> intrinsically secure.
>

No, it is not secure.  Whether or not DNSSEC is used, a network
intermediary can (for example) rewrite the IP headers on the transport
packets so that the user connects to any endpoint of the attacker's
choice.  TLS authentication cannot distinguish which endpoint was
selected.  If the endpoints serve different content, the attacker chooses
which content the user sees.

> NB: When GoDaddy begins publishing HTTPS records in customer-managed DNS
> zones, it will do so only with DNSSEC signed zones, using AliasMode records
> with Chain length of 1, with or without apex A/AAAA records (most likely
> with).
> (The intent of making DNSSEC widely available has previously been
> discussed, so this isn't really news per se, except in the context of HTTPS
> and section 3.1)
>

That's great to hear!

On Sat, Aug 27, 2022 at 12:22 AM Brian Dickson <
brian.peter.dickson@gmail.com> wrote:

>
>
> On Thu, Aug 25, 2022 at 1:35 PM Ben Schwartz <bemasc@google.com> wrote:
>
>>
>> Brian proposes a use case of serving only a warning message on the origin
>> endpoint, in order to minimize the load on IP addresses that are likely
>> hardcoded into a customer's zone.
>>
>
> So, the major update to add to this is:
>
>    - We (GoDaddy) have revisited this approach, and are now considering a
>    much better design (summary follows below)
>
>
> The design we are considering is deployment of Web redirect servers (via
> apex A/AAAA records) which do HTTP 301 permanent redirect responses.
> These would respond to connections to the apex domain ("example.com") and
> redirect the client to a non-apex name ("www.example.com").
> The non-apex name would have a CNAME to redirect to the actual delegated
> authority.
> The RDATA on the CNAME would be identical to the RDATA on the apex HTTPS
> record.
>

I like it!  This achieves a nice balance of simplicity and compatibility.
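
For concreteness, here is a minimal sketch of such an apex redirect server
using Python's standard http.server module.  The names TARGET,
redirect_location(), and make_server() are illustrative assumptions, as is
the choice of an "https" scheme in the Location header.  A real deployment
would also need to terminate TLS for the apex name, which this sketch
omits.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical non-apex name that the apex ("example.com") redirects to.
TARGET = "www.example.com"


def redirect_location(path: str, target: str = TARGET) -> str:
    """Build the Location value, preserving the request path and query."""
    return f"https://{target}{path}"


class ApexRedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET/HEAD with a 301 pointing at the non-apex name."""

    def do_GET(self) -> None:
        self.send_response(301)
        self.send_header("Location", redirect_location(self.path))
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_HEAD = do_GET


def make_server(port: int = 0) -> HTTPServer:
    """Create (but do not start) the redirect server on localhost."""
    return HTTPServer(("127.0.0.1", port), ApexRedirectHandler)
```

Calling make_server(8080).serve_forever() would run it locally.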

...

> This means that for those domains (in the millions or tens of millions),
> the fallback in the draft will only result in added overhead while never
> actually achieving any successful connections (due to shared fate between
> legacy and HTTPS).
>

Statistically, it'll probably still help.  You'll get an extra bite at the
apple on flaky failures.

...

> Removing the language from the draft does not force implementers to not do
> their own thing. Individual client implementations could still do the
> fallback thing, but would not be required to do so.
> It does, however, put more responsibility on the implementers to respond
> to issues raised if adverse effects result. It might be advisable to be a
> user-configurable option, possibly off-by-default.
> Implementers would not be able to deflect blame for problems via the "it's
> what the RFC says" response, if problems do occur.
>

I agree that the fallback logic results in a somewhat more aggressive
client retry behavior.  I don't think this amounts to an "adverse effect".
The intention is to ensure that the retry behavior with SVCB is not _less_
aggressive than it would otherwise have been, so naturally it will
sometimes be _more_ aggressive.

>> Instead, the draft attempts to ensure that deploying and implementing the
>> HTTPS record "does no harm", by giving participating clients no worse
>> reliability than legacy clients.
>>
>
> This is one place where quantitative data would help the conversation
> immensely.
> Is there data concerning the failures observed (DNS resolution or HTTP
> connections) in following CNAME records from authoritative zones to CDNs?
> If the failure rates are really low, is that worth the effort in adding
> this fallback flow?
>

This fallback does not apply to CNAME.  It applies to SVCB/HTTPS, which is
expected (and recently measured) to have a higher failure rate due to
ossified DNS middleboxes and the need for client followup of non-default
TargetNames.  However, DNS resolution failures are not the only motivating
failure mode here.  The fallback procedure is principally relevant when SVCB
resolution succeeds but transport connections fail (e.g. a CDN outage).

...

> If not, perhaps the benefits of ServiceMode actually become more
> important, and falling back is actually likely to degrade, rather than
> improve, the user experience?
>

Given that the only alternative to fallback is total connection failure, I
don't see how it is likely to degrade the user experience.

> Is the implementation of fallback strictly speculative?
>

No, it is primarily logical.  These records are expected to exist and be
functional, so the client might as well try to use them if all else fails.
Moreover, the client will already have fetched them (due to the parallel
query optimization), so the cost of trying them is minimal.

> If so, perhaps leaving it out of the draft, presenting results at DNS-OARC
> once data is available, and publishing a -bis draft to include fallback (if
> the data supports doing so) is a better approach?
>

I don't think this would significantly alter the course of implementation.
Several major browser implementations are already complete or under-way,
and those developers will choose their precise behavior based on their own
best judgement.  (This logic has not elicited any concerns from those
developers thus far.)


> For example, post-deployment data from browsers may show that we could
>> eliminate the final fallback without reducing reliability.
>>
>
> Among the problems introduced by HTTPS-aware clients successfully
> obtaining AliasMode records, and then subsequently connecting via apex
> A/AAAA records (when fallback occurs) is that DNS-level observations are
> adversely affected.
>

To be clear: clients perform their initial HTTPS, A, and AAAA queries in
parallel, as they do not know whether the HTTPS record exists and cannot
delay for 1 RTT to find out.  This is independent of fallback.
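
A small sketch of that parallel query pattern, assuming a hypothetical
lookup(name, rrtype) function supplied by the caller (no particular
resolver library is implied):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def query_in_parallel(
    name: str,
    lookup: Callable[[str, str], List[str]],
) -> Dict[str, List[str]]:
    """Issue the HTTPS, A, and AAAA queries for 'name' concurrently."""
    rrtypes = ("HTTPS", "A", "AAAA")
    with ThreadPoolExecutor(max_workers=len(rrtypes)) as pool:
        # All three queries are submitted before any result is awaited, so
        # the client never waits a round trip to learn whether HTTPS exists.
        futures = {t: pool.submit(lookup, name, t) for t in rrtypes}
        return {t: f.result() for t, f in futures.items()}
```

Because the A and AAAA answers arrive regardless, using them later for
fallback adds no further DNS queries.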

...

> There will not be a clean "signal" identifying legacy-only clients.
> There will not be any ability to correlate fallback behavior with client
> software (browser "brand" generally, or brand+version).
>

In practice there is likely to be an extremely strong correlation, limited
only by the failure rate of the ServiceMode endpoints (which ideally will
be low!).

> So, attempting to optimize for failure can actually negatively impact
> measurement of failure and root cause analysis.
>

Measuring and diagnosing a surge in usage of the fallback IPs seems likely
to be much easier than diagnosing a mysterious loss of traffic at the
ServiceMode IPs, and it has the extra benefit of avoiding a user-visible
outage.

...

> Fail fast may not be appealing, but in some (probably the majority of)
> cases, it may be the most correct option.
>
> It may also be the case that the zone owner knows whether this is the case.
> I think it is much more likely that explicitly declaring the situation (if
> known) is more useful than having several billion clients independently
> attempting to infer whether the first option will even work, let alone
> provide a useful alternative to the second or third.
>

In fact, there is one way for the zone owner to disable fallback: enable
ECH.  Fallback is not compatible with ECH, so ECH-aware clients will
disable fallback when the ServiceMode records contain ECH.
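
As a sketch of that rule: the helper below treats "the ServiceMode records
contain ECH" as meaning every ServiceMode record carries an "ech"
parameter.  That reading, the dict-based record shape, and the function
name are assumptions made for illustration only.

```python
from typing import Dict, List


def fallback_allowed(
    service_mode_records: List[Dict[str, str]],
    client_supports_ech: bool,
) -> bool:
    """Decide whether a client may still fall back to non-SVCB connections."""
    # Clients that do not implement ECH, or that received no ServiceMode
    # records at all, keep the ordinary SVCB-optional fallback.
    if not client_supports_ech or not service_mode_records:
        return True
    # Assumption: fallback is disabled only when every record offers ECH.
    return not all("ech" in record for record in service_mode_records)
```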

On Sat, Aug 27, 2022 at 12:32 AM Brian Dickson <
brian.peter.dickson@gmail.com> wrote:

>
>
> On Thu, Aug 25, 2022 at 1:35 PM Ben Schwartz <bemasc@google.com> wrote:
>
>> For now, I think it's better to keep the current guidance, in order to
>> minimize the risk of disruptions as these new RR types begin to be deployed.
>>
>
> I have a small favor to ask.
>
> Could you try to "sell" the guidance from the hypothetical perspective of
> it not having been part of the draft?
> I.e. if it was not already in the draft, and you were proposing the
> fallback (after successful AliasMode response), is there a short pitch that
> makes it compelling?
>

I think of fallback as a lightweight form of "multi-CDN".  The basic pitch
for me is: HTTPS records should be safe to deploy.  If you add a CDN via
HTTPS records, and the CDN goes down, the website shouldn't go down,
because simply adding HTTPS records should not make anything worse.  If you
mess up your HTTPS record configuration, your site shouldn't go down if it
can be helped.  We want people to feel comfortable rolling out HTTPS
records, not worrying that it will reduce reliability.

However, the standardization pitch is really: the standard should describe
what clients will actually do, and this is what clients will actually do.

> Consider also that there are roughly 3 million resolvers (with vastly
> varying client bases), hundreds of millions of zones, and billions of
> clients. The cross product is probably sparse, but definitely not super
> sparse.
> There is no sharing of information between clients, and they are all
> implementing the logic from local knowledge only, correct?
>

Correct.


> How does this scale if a large proportion of fallback lookups don't/can't
> result in success if the primary lookup fails?
>

There is no "fallback lookup".  The A, AAAA, and (initial) HTTPS records
are queried simultaneously.

> On the client, on the resolver, on the authority server, on the apex web
> server (at the fallback address), on the CDN?
>

Fallback has no impact on the DNS behavior.  Its main effect is to ensure
that, if the CDN starts failing, traffic moves to the fallback address.
This makes CDN failures easier to measure and mitigate.