Re: [DNSOP] [Ext] DS glue for NS draft

Brian Dickson <brian.peter.dickson@gmail.com> Mon, 16 August 2021 23:41 UTC

MIME-Version: 1.0
References: <E64E409D-ABAA-4F09-8759-D3D8CEB36F13@gmail.com> <20210812162151.271D5261E7C4@ary.local> <CAHbrMsBTCBtOxjXXfUR3TQz+abaDdWRum2x_ZroZ2Wxyhscrfg@mail.gmail.com> <67B40C28-677F-42E1-B259-F90F6A78578D@icann.org> <CAH1iCirsTc78r5tS_suOpTe2DiX5tYtmN7mUnPtyM8VaVny3KQ@mail.gmail.com> <CAHbrMsCjuNh3UCMrpqxG33ES_sVjPDb+Z_N4QAxBOSRpMV7NGQ@mail.gmail.com> <CAH1iCir_ro2YvJYKWoyKv5qZ+Xe0JjE++mymdWbxeU70AvP_jw@mail.gmail.com> <CAHbrMsDPw_yRJdTYnxEctrqdeffHp_0oC9vgu+cnn2-C1MRiXA@mail.gmail.com>
In-Reply-To: <CAHbrMsDPw_yRJdTYnxEctrqdeffHp_0oC9vgu+cnn2-C1MRiXA@mail.gmail.com>
From: Brian Dickson <brian.peter.dickson@gmail.com>
Date: Mon, 16 Aug 2021 19:40:41 -0400
Message-ID: <CAH1iCioUvNxxePhrQSsyCO-Ea2Fyir2oMTksVWwstnEUVEuzWQ@mail.gmail.com>
To: Ben Schwartz <bemasc@google.com>
Cc: Paul Hoffman <paul.hoffman@icann.org>, dnsop <dnsop@ietf.org>
Content-Type: multipart/alternative; boundary="00000000000053306005c9b5bbfa"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/L_F2lFQM2HxwtEUgDBIT_7_ar48>
Subject: Re: [DNSOP] [Ext] DS glue for NS draft

On Mon, Aug 16, 2021 at 3:14 PM Ben Schwartz <bemasc@google.com> wrote:

>
>
> On Mon, Aug 16, 2021 at 2:05 PM Brian Dickson <
> brian.peter.dickson@gmail.com> wrote:
> ...
>
>> I'm arguing against the parent ever putting SVCB records in any
>> delegation response, regardless of whether the data is signed, or whether
>> the parent is capable of doing so. The data model for doing so is IMNSHO
>> incorrect.
>>
>> The parent is responsible for signaling the delegation, and everything
>> else is necessary for resolution, but not authoritative.
>>
>
> I think we can and should view the server configuration hint (e.g. SVCB)
> as "necessary for resolution".  Otherwise, we would be asking resolvers to
> add a round-trip delay (a "synchronous binding check") to every single DNS
> delegation, in order to check whether a SVCB record exists before
> proceeding.  I believe this is untenable: our first deployment step can't
> be to slow down the whole internet.  The resolver operators I've spoken to
> have made it clear that they will not implement any behavior with that kind
> of impact.
>

I think it is time to actually work through an example, as your assertion
("every single DNS delegation") does not align with my understanding.

Suppose we have the following domain delegation at the "example" TLD:
foo.example NS ns1.out-of-bailiwick.example

And suppose there is an SVCB record for the name server, served by the
authoritative server for the "out-of-bailiwick.example" domain:
_853._tcp.ns1.out-of-bailiwick.example SVCB (parameters)

In a cold cache, this SVCB query would need to be made (obviously).

However, now consider a subsequent stub client query to the same resolver,
for "bar.example", with delegation as follows:
bar.example NS ns1.out-of-bailiwick.example

If the cached SVCB record is still present in the resolver's cache, i.e.
the cache is warm, the SVCB record would be locally found in the cache (and
there would not be any additional RTTs.)
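
A minimal sketch of the cold-cache versus warm-cache behaviour described above. It is a toy model, not a real resolver: the class ToyResolver, its follow_delegation policy, and the placeholder RDATA strings are assumptions made for illustration, and TTL expiry is ignored.

```python
# Toy model of a resolver cache; hypothetical names, not a real resolver API.
class ToyResolver:
    def __init__(self):
        self.cache = {}            # (owner name, rrtype) -> rdata
        self.upstream_queries = 0  # each one costs a round trip

    def lookup(self, name, rrtype):
        key = (name, rrtype)
        if key not in self.cache:
            # Cache miss: one extra round trip to an authoritative server.
            self.upstream_queries += 1
            self.cache[key] = f"<{rrtype} rdata for {name}>"
        return self.cache[key]

    def follow_delegation(self, ns_name):
        # Assumed policy: fetch the name server's SVCB record before connecting.
        return self.lookup(f"_853._tcp.{ns_name}", "SVCB")


# Both delegations from the example point at the same name server.
delegations = {
    "foo.example": "ns1.out-of-bailiwick.example",
    "bar.example": "ns1.out-of-bailiwick.example",
}

resolver = ToyResolver()

resolver.follow_delegation(delegations["foo.example"])
cold = resolver.upstream_queries          # queries caused by the first domain

resolver.follow_delegation(delegations["bar.example"])
warm = resolver.upstream_queries - cold   # extra queries for the second domain

print(f"cold-cache SVCB queries: {cold}, warm-cache SVCB queries: {warm}")
```

In this model the SVCB lookup is paid once per name server name, not once per delegated domain.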

Am I misunderstanding the SVCB usage, or is the example set not correct in
some way?

In addition to the above, the "out-of-bailiwick.example" zone could also
contain things like:
ns1.out-of-bailiwick.example A (some address)
ns1.out-of-bailiwick.example AAAA (some address)
ns1.out-of-bailiwick.example TLSA (tlsa record parameters)

Depending on the resolver's policies and authoritative
server's functionality etc., the number of RTTs required and original
QNAME/QTYPE might vary.
But, once those are all cached, all subsequent queries for domains served
by ns1.out-of-bailiwick.example would not require additional RTTs.
(Add DNSKEY queries as necessary, and RRSIG validation processing as
first-iteration overhead).


>
> Even if we can somehow limit this "synchronous binding check" to domains
> that have opted-in, we would still be adding a round-trip delay to all
> delegations in the long term, when encryption is widely deployed.  I don't
> see any operational concern here that could outweigh the cost of making the
> whole internet slower.
>
> ...
>
>>> To avoid misunderstanding: SVCB itself does not alter the nameserver name
>>> used for TLS.  The nameserver name is the name in the NS record, regardless
>>> of what SVCB says.  ("TLS clients MUST continue to validate TLS
>>> certificates for the original service name",
>>> https://datatracker.ietf.org/doc/html/draft-ietf-dnsop-svcb-https-07#section-2.3
>>> )
>>>
>>
>> At issue is not whether the name is changed, it is who is authoritative
>> for the data at the owner name, which is the child zone.
>>
>
> I agree.  That's why I think it's wisest to require that any records
> provided by the parent are only used for delegation, and never returned to
> the stub.
>
>
>> The SVCB also is controlling the transport and potentially other
>> parameters.
>>
>
> These parameters are thoroughly analogous to the IP addresses in glue A
> records today.  In both cases, we are discussing information that is used
> to connect to the nameserver, not information that will be returned to the
> stub as zone contents.
>
>
>> Absent a discussion on how to securely get that data into the parent, it
>> cannot be accurately asserted as having the same security properties as
>> when the SVCB record is published in a signed child zone.
>>
>>
>>>
>>>
>>>> The resolver talking TLS will be talking to the name server, to make
>>>> encrypted queries about a domain served by the name server.
>>>> (The domain served that is being queried is presumed to not have an
>>>> in-bailiwick name server, since that particular corner case is a
>>>> chicken/egg problem.)
>>>>
>>>> The alternative (to requiring the SVCB record in the child, signed by
>>>> the child domain) might be an SVCB-like thing in the parent which contains
>>>> data signed by the child. Specifically:
>>>>
>>>>    - A DNSKEY algorithm, or component of the draft-schwartz encoding,
>>>>    which contains (SVCB + RRSIG_name_server_child_ksk(SVCB))
>>>>    - This would be signed by the parent too, but the critical
>>>>    component that provides the data integrity from the child is the
>>>>    RRSIG_name_server_child_ksk element
>>>>    - Only the operator of the name server's name's domain (which
>>>>    possesses the name_server_child_ksk) can generate this signature
>>>>
>>> Yes, but the parent controls the choice of nameserver name, so this
>>> does not create an effective defense against a hostile parent, which is
>>> impossible under standard DNSSEC assumptions.
>>>
>>
>> None of this has anything to do with a potential hostile parent, not sure
>> why you are raising that.
>>
>
> Security is defined by a threat model.  If the parent is excluded from the
> threat model, as we seem to agree, then anything signed by the parent is
> fully secure by definition.
>

The parent can be trustworthy, but not the communications to the parent
(EPP). Those are not equivalent elements of the threat model.
This is especially true concerning the signing processes.
The signing itself can be super-safe with all kinds of practice statements,
but that does not mean there is not a weakness on the data upload path.
The latter is input to the signed data, and is the weak point in the
overall system.


>
>> But I think it is probably a moot point to discuss any other method of
>> encoding SVCB in the parent.
>>
>> There are a lot of other reasons why SVCB in the parent has
>> characteristics that are not suitable.
>>
>>    - TTL on the parent side is controlled by the parent, not the child
>>       - This has operational impacts, particularly if problems occur and
>>       a roll-back is necessary
>>
> I don't see a difference here from the status quo with NS, A, and AAAA
> glue records.
>

The NS records reference names (RDATA values), but the names may be
"glue-less", e.g.

   - The delegated domain and NS are not under the same TLD
   - The NS domain is itself glueless (served by name servers not under the
   same domain as the NS names)
      - Example:
         - foo.example NS ns1.out-of-bailiwick.example
         - out-of-bailiwick.example NS ns1.dns-operator-domain.example
         - ns1.dns-operator-domain.example A glue-A-record
      - This is a counter-example, showing that there may not even be A and
      AAAA glue records involved that correspond to SVCB records
      - The SVCB record, if present, would be tied to
      "ns1.out-of-bailiwick.example"

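The glueless chain above can be sketched as a small check. This assumes the narrow rule that the parent carries address glue only for name server names at or below the delegated zone; the helper names are hypothetical, and the third delegation (dns-operator-domain.example to its own ns1) is an assumption inferred from the glue A record in the example.

```python
def is_in_bailiwick(ns_name, delegated_zone):
    """True if ns_name is at or below delegated_zone (narrow glue rule)."""
    ns = ns_name.rstrip(".").lower()
    zone = delegated_zone.rstrip(".").lower()
    return ns == zone or ns.endswith("." + zone)


# Delegations as they might appear in the "example" TLD zone.
tld_delegations = {
    "foo.example": ["ns1.out-of-bailiwick.example"],
    "out-of-bailiwick.example": ["ns1.dns-operator-domain.example"],
    # Assumed: the operator's own zone is served by an in-bailiwick name.
    "dns-operator-domain.example": ["ns1.dns-operator-domain.example"],
}


def glue_names(delegations):
    """For each delegated zone, list the NS names that would carry A/AAAA glue."""
    return {
        zone: [ns for ns in ns_names if is_in_bailiwick(ns, zone)]
        for zone, ns_names in delegations.items()
    }


glue = glue_names(tld_delegations)
for zone, names in glue.items():
    print(zone, "->", names if names else "glueless")
```

Under that rule, only the last delegation carries address glue, so an SVCB record tied to ns1.out-of-bailiwick.example would have no A or AAAA glue alongside it in the TLD.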


>
>>    - TTL on the child side is controlled by the child (which is likely
>>    to be important when rolling out new features, or making changes)
>>       - TTL can be adjusted up/down as needed for planned maintenance
>>       work
>>    - Frequency of updates for current DS records, is basically "when KSK
>>    rolls", which is on the order of year(s)
>>       - Encoding glue via DS, involves updates whose frequency aligns
>>       with changes to that glue, also typically stable for years at a time
>>
> There is no reason to expect frequent changes to any of the relevant
> records.  Even TLSA records would not be rotated more than a few times a
> year.
>

I think you are projecting from individual practices onto the entire
industry's operational practices.
Just because *some* TLSA records might not change often, does not mean
*all* TLSA records will not change often.
I can probably point to counter-examples.

Also, the frequency of changes is not the same as the impact of TTL on
changes (cached records in some resolvers' caches, vs newly queried records
in other resolvers' caches).
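
A rough sketch of that TTL effect, assuming a toy cache model: the record names "tlsa-old" and "tlsa-new", the 3600-second TTL, and the timestamps are all illustrative values, not taken from any deployment.

```python
class CachedRecord:
    """A cached RDATA value together with its absolute expiry time."""
    def __init__(self, rdata, ttl, fetched_at):
        self.rdata = rdata
        self.expires_at = fetched_at + ttl


def answer(cache_entry, now, authoritative_rdata, ttl):
    """Return (rdata served, cache entry) for one resolver at time `now`."""
    if cache_entry is not None and now < cache_entry.expires_at:
        # Still within TTL: the resolver keeps serving what it cached.
        return cache_entry.rdata, cache_entry
    # Expired or never cached: fetch the current authoritative value.
    fresh = CachedRecord(authoritative_rdata, ttl, now)
    return fresh.rdata, fresh


TTL = 3600  # assumed TTL, in seconds

# t=0: resolver A caches the original record.
served_a_0, cache_a = answer(None, 0, "tlsa-old", TTL)

# t=100: the operator publishes a changed record (a single change).

# t=200: resolver A still serves the old data; resolver B, with an empty
# cache, sees the new data. The two populations disagree until expiry.
served_a_200, cache_a = answer(cache_a, 200, "tlsa-new", TTL)
served_b_200, cache_b = answer(None, 200, "tlsa-new", TTL)

# t=3600: resolver A's entry has expired, so it converges on the new data.
served_a_3600, cache_a = answer(cache_a, 3600, "tlsa-new", TTL)

print(served_a_200, served_b_200, served_a_3600)
```

Even one change produces a window of up to one TTL in which resolvers hold inconsistent data, which is why who controls the TTL matters independently of how often changes happen.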

There are other issues that you haven't discussed here yet:

   - There may be different anycast locations, topologies, and policies,
   vis a vis TLSA records
      - Different DNS Operator anycast instances may serve different TLSA
      records (geographically aligned values)
      - TLD delegation glue does not generally support such differentiated
      views
      - Even if some TLDs did support these views, the locations may not
      align properly
      - Some locations may not even offer TLS service, for name servers
      with the same name
   - (Other things I can't think of right now...)



>
>>    - Changes to SVCB would require regenerating RRSIGs.
>>       - Doing that imposes impact to the signing zone (signature
>>       frequency is typically bound by hardware, and upping the former requires
>>       more of the latter)
>>       - That also presupposes that the proposed frequency can even be
>>       supported
>>       - If the signing is done by the child, the cost is borne by the
>>       party making the changes
>>       - If the signing is done by the parent, the cost is not borne by
>>       the party making the changes
>>    - The SVCB RDATA is not actually authoritative on the parent side
>>    (bears repeating, this has a lot of implications, including infosec issues)
>>
> Yes, but there is no need for it to be authoritative, because it would
> never be returned as zone contents.  We already use non-authoritative data
> to establish a connection to a nameserver, so that's nothing new.
>

Except this is new, and is precisely the case where using non-authoritative
data to establish a connection to the name server would break the
assumptions needed for privacy.


>
>> Also, it is not clear from the discussion thus far, on which record(s)
>> would have that SVCB record included:
>>
>>    - The registrant's zone name (delegation from the parent to the DNS
>>    Operator's server)
>>    - The DNS Operator's zone name (possibly glue-less XOR possibly
>>    having in-bailiwick glue required)
>>    - The Zone serving the name server names (referenced in the NS
>>    records of the registrant's zone)
>>
>> The first one would not be reasonable at all, since presumably the
>> registrant would have the ability to submit SVCB records even if the
>> registrant did not operate the authoritative DNS server.
>>
>
> SVCB records would only be accepted on in-bailiwick delegations, i.e. the
> same delegations that include A/AAAA glue today.  I believe this is your
> third case, although there can technically be multiple in-bailiwick
> delegation steps in the course of a single iterative resolution.
>

See above. The second bullet being glueless and the third being glue-full,
would mean the SVCB record would not have associated glue and thus no v4 or
v6 hints.


>
>>>>    - This is much stronger than being able to push data over an EPP
>>>>    channel, which is the mechanism available to a Registrar that polls the
>>>>    Registrant's domain for CDS records
>>>>    - EPP does not have a mechanism currently for validation of
>>>>    supplied DS records
>>>>
>>> I don't understand.  When using CDS, the CDS records are validated by
>>> DNSSEC from the current DS.  Otherwise, the DS records are authenticated by
>>> the registrar's usual login procedure.  Either way, any adversary who could
>>> forge DS records could also forge RRSIGs or any other RR type.
>>>
>>
>> This is a "chain of custody/control" thing. If the Registry is not doing
>> CDS directly, the CDS model has failed to protect the DS update end-to-end.
>> While another party (Registrar) polling and validating CDS would be
>> _necessary_, it would not be _sufficient_.
>>
>
> It sounds like you are saying that a compromised or hostile Registrar
> should be part of our threat model.  I do not think this is necessary or
> practical.  A compromised Registrar can always replace the entire child
> zone contents, so nothing we place in the child zone can defend against it.
>

It has nothing to do with whether the Registrar is hostile or not. If the
data is not authenticated by the TLD, the chain of custody has been broken.
This is a problem with lack of alignment in the security model between
Registrar polling (using CDS records, signed by the Registrant or the
Registrant's DNS Operator) and EPP submission (which uses Registrar
credentials, not Registrant credentials).

An example of a trustworthy Registry and trustworthy Registrar communicating
over TLS, still being potentially compromised:
If the Registry's TLS connection was MITM'd by an on-path adversary with a
valid TLS cert issued by a different CA, which was in the trusted CA set
for the Registrar, the DS update could be tampered with.
If the DS record was signed, and the Registry validated the signed DS
record, this tampering would not be possible, and there would not be any
reliance on CAs or TLS certs. Data integrity is different from channel
integrity.
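
A small sketch of the difference between channel integrity and data integrity. It makes a loud simplification: HMAC with a shared key stands in for the child's DNSSEC signature, whereas real validation would use the child's public DNSKEY and asymmetric RRSIGs. The key, the DS strings, and the function names are hypothetical.

```python
import hashlib
import hmac

# Stand-in for the child's signing key (assumption: symmetric for brevity;
# real DNSSEC signatures are asymmetric and verified with the public key).
CHILD_KEY = b"hypothetical-child-signing-key"


def sign(ds_record: bytes) -> bytes:
    """Signature over the DS data, made by the holder of the child's key."""
    return hmac.new(CHILD_KEY, ds_record, hashlib.sha256).digest()


def verify(ds_record: bytes, signature: bytes) -> bool:
    """Check that the DS data is exactly what the key holder signed."""
    return hmac.compare_digest(sign(ds_record), signature)


def mitm_tamper(ds_record: bytes) -> bytes:
    """On-path adversary inside the TLS session swaps in a different DS."""
    return b"foo.example. DS 99999 13 2 <attacker digest>"


original = b"foo.example. DS 12345 13 2 <child digest>"
signature = sign(original)

# What the Registry receives after the channel has been compromised.
received = mitm_tamper(original)

# Channel-only trust: the Registry has nothing to check the data against,
# so whatever arrives over the authenticated-looking channel is accepted.
accepted_channel_only = True

# Data integrity: the Registry validates the signature over the DS itself.
accepted_untampered = verify(original, signature)
accepted_tampered = verify(received, signature)

print(accepted_channel_only, accepted_untampered, accepted_tampered)
```

With a signature over the data, the tampered DS is rejected regardless of which CA issued the certificate used on the channel.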

The threat model is not relevant if there is a lower-trust component in the
update. CDS is cryptographically strong (CDS signed by existing DNSKEY
private key), while Registrar credentials are not (at least generally,
today).

It is a significant problem that I think needs to be addressed.

The interim mechanisms for protecting against these weaknesses do not scale
well - i.e. Registry Lock, which is incompatible with frequent DS changes.


>
>> There are only a handful of CCTLDs doing CDS, and I am not aware of any
>> significant plans beyond those currently.
>> CDS _by the Registry_ is incompatible with the RRR model, i.e. the gTLD
>> mechanism controlled by ICANN contracts.
>> That is highly unlikely to change in a timeframe faster than multiple
>> years, if ever.
>>
>> Basically, I don't foresee CDS done by Registries being relevant to any
>> proposals for securing the glue data.
>>
>
> Perhaps, but some Registrars have already implemented support, e.g.
> https://glauca.digital/blog/2020/08/10/cds-at-the-registrar-level.html.
> Domain owners that wish to publish via CDS can already choose such a
> Registrar, and use CDS for automated maintenance with any Registry.
>

Even if every Registrar supported this, it does not change the RRR security
model that underlies everything, outside of the CCTLDs doing CDS.
They all rely on EPP, with no data integrity check on submitted DS records
associated with such CDS updates.
The Registry can be completely trustworthy, but the weak link is unchanged
-- the Registrar-Registry update mechanism, which relies on Registrar
credentials.
Those credentials are applicable to use for updates to any/all Registrant
records (i.e. every Registrant that uses that Registrar).
It doesn't mean the Registrar isn't trustworthy either.
It only means the model is not as secure as would be the case if the data
required "proof of possession" of the private key, e.g. signing the DS.



>
> ...
>
>> Also, optimizing for cold cache vs warm cache should be explicitly called
>> out. It looks to me like "SVCB in parent" is only valuable in the cold
>> cache scenario.
>>
>
> I agree!
>
>
>> Stats are probably available showing how often queries are made for DNS
>> operator's zones that indicate a cold cache is being used. Absent a
>> compelling case for the cold cache, it does not seem to be worth any effort.
>>
>
> Perhaps we can find some stats, but I think the "cold cache" case is worth
> optimizing for reasons of tail latency and consolidation pressure.  Cold
> cache delegations already resolve slower, so making them even slower is
> likely to have a disproportionate impact on tail latency.  Low-traffic
> nameservers are much more likely to be cold in cache, so increasing the
> latency penalty for a cache miss would increase pressure to migrate to
> centralized nameservers.
>
>>