[DNSOP] Secdir early review of draft-ietf-dnsop-domain-verification-techniques-01

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 20 April 2023 00:40 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: secdir@ietf.org
Cc: dnsop@ietf.org, draft-ietf-dnsop-domain-verification-techniques.all@ietf.org
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <168195124881.58577.7223695021506805865@ietfa.amsl.com>
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Date: Wed, 19 Apr 2023 17:40:48 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/dnsop/Ksw4aypWpgyFKiSgvabvpFhFIzI>
Subject: [DNSOP] Secdir early review of draft-ietf-dnsop-domain-verification-techniques-01

Reviewer: Benjamin Kaduk
Review result: Has Issues

# SecDir review of draft-ietf-dnsop-domain-verification-techniques-01
CC @kaduk

## Discuss

## Comments

I'm a little nervous about allocating a new BCP number for a topic of fairly
narrow scope such as this, but the guidance is good and worth publishing, and
I have not better alternative to offer, so I will stifle any objection I might
have raised.

On the whole the content is reasonable, but read on for some suggestions on
how to tighten things up and some questions about why all the pieces are
needed.

I also made a PR with some editorial suggestions, 
https://github.com/ietf-wg-dnsop/draft-ietf-dnsop-domain-verification-techniques/pull/51

### Continuing Ownership

While I see the example in Appendix A.1.4.1 about Atlassian, when we say in
the introduction that sometimes the DNS record must be kept in the zone to
prove continued ownership of the domain, that doesn't exactly seem to hold
true, or at least only provides a weak indication of continued ownership.  I
think that to really prove continued ownership the challenge would need to be
rotated periodically; just putting one record in once and leaving it there
mostly only shows that you had control of the domain at the one point in time
it was added and that there isn't anything cleaning up old records.  A
wholesale migration of the domain to a different DNS server would clean up old
records, but staffing changes within a large organization would not.

### Underscore for attrleaf

In §3.1 we say "The provider constructs the validation domain name by
prepending a provider-relevant prefix followed by "-challenge" to the domain
name being validated (e.g. "_foo-challenge.example.com")." which implicitly
uses an underscore-prefixed atterleaf name component.  Is the underscore part
of the required or recommended behavior?

### Hashing the random token

The recommendations in §3.1 requires computing the SHA-256 digest of the
Random Token before base64url encoding and use as RDATA, but provides no
justification for the SHA-256 step that I can see.  If the intent is for the
Random Token to be a non-public identifier for the challeng with only the
digest value being public, we should say that.  Otherwise it's not clear what
we gain over just using base64url(random-data) as the RDATA.

### TXT RDATA contents

We say that the RDATA of the recommended TXT resource record construction must
"contain" a certain token construction (rather than "consist of").  Does the
format/structure of the RDATA also need to be one that unambiguously
identifies where the token is (or is a raw substring match sufficient)?

I think we might provide clearer guidance if we laid out the recommended
comma-separated key-value format first and talked about the token= contents,
and then only afterward mention that even if other formats are used, the RDATA
MUST contain the token value (and whether it must be identifiable as such via
some metadata).  Or we could state up front what properties we want from the
RDATA before going on to describe how we achieve those properties.  But the
current formulation starts out with a MUST-level requirement without much
context and then tries to build the context around it, which tends to cause
confusion in the reader.

Reading again, I think that the core of what is confusing me here is that the
SHOULD-level guidance for using the comma-separated key-value pair format is
buried as an aside in the mechanism for conveying the instructions for
removing a resource record.  It seems like it's a more generic recommendation
and should be framed as such.

### Instructions for TXT record removal

We say

> Providers MUST provide clear instructions on when a verifying record
> can be removed.  The user SHOULD de-provision the resource record
> provisioned for a DNS-based domain verification challenge once the
> one-time challenge is complete.  These instructions SHOULD be encoded
> in the RDATA via comma-separated ASCII key-value pairs [RFC1464]
> using the key expiry.  If this is done, the token should have a key
> token.  For example:

I think that the "these instructions" refers to the "clear instructions"
provided by the provider.  But the TXT record is something provisioned by the
user, not the provider.  If the instructions are to be encoded in the RDATA,
does that require that the RDATA contents themselves are provided by the
provider to the user?  I'd strongly recommend including some more description
of the workflow that's envisioned if it includes the provider giving the user
the entire RDATA contents to copy/paste in place (and also using those RDATA
contents as a separate information channel to the user).

Additionally, do we make any requirement/recommendation of the format used for
the value corresponding to the "expiry" key?

### CNAME chaining

In §3.2 we see the text "Another issue with CNAME records is that they must
not point to another CNAME" provided, with no reference.  But CNAME chains are
in heavy use on the internet in practice with no ill effect (case in point: my
employer's CDN business), so I'd recommend removing the discussion of CNAME
chaining, or supplying a reference and discussion around them that is closer
to the deployment reality.


### Relationship to other work

While I don't mind a focus on DNS-based domain-verification techniques in this
document, it does seem like we might want to spend a little more time placing
this work in the context of other schemes.  For example, ACME has a number of
challenge types, and one of them is essentially the DNS-based stuff covered by
this document.  Is what RFC 8555 specifies compliant with this document's
requirements and recommendations?  Do we expect ACME to use our guidance for
DNS-based verification and keep doing what they do for HTTP, TLS-SNI, etc.
verification?  Or should they use a wholistic body of work for all
verification such that this document is out of scope for ACME?

### AVOID-FRAGMENTATION reference classification

The actual text in Appendix A.1.1 that references [AVOID-FRAGMENTATION] does
not contain any usage that would obviously promote it to a normative
reference.  If there is intent to require implementors of this document to
read that one, some additional discussion of which parts are normative
requirements and why seems in order.

### Motivation for application-specific name prefix

In Appendix A.1.1 we have some good text about an attack where a malicious
service can provide to the user the challenge from a different provider, when
the TXT verification is on the actual domain name being verified (no
prefix/suffix).  I would say this fits better in the main document (security
considerations, probably) rather than the appendix with survey of current
usage.  We currently say that the provider and service "should be obvious from
the TXT content" in the security considerations but not much about how things
go wrong if the provider and service are not obvious.  (The existing prose in
the appendix is perhaps a little unclear about how the attack actually would
get pulled off in practice; it was clear to me based on my prior experience
but could be improved by, e.g., naming the parties involved (victim service,
malicious service, user, DNS administrator).)

## Nits

### APEX Capitalization

I don't remember seeing APEX used as a defined term in all-caps elsewhere.
What value do we think we're adding by writing it this way?

### Verifying Record

We seem to implicitly introduce the term "verifying record" in §3.1; it might
benefit from having an explicit definition.

### Self-referential Appendix

Appendix A.1.1 contains the text "See Appendix A for a survey of different
implementations.", which is perhaps not a very useful reference.

### DNS name component ordering

In Appendix A.1.1 we write "Some providers use a suffix of
_PROVIDER_NAME-challenge in the Name field of the TXT record challenge", when
that value is a suffix in the wire format of the name but a prefix in the
presentation form of the name.  This is especially confusing when the next
sentence goes on to include an example in presentation form,
_acme-challenge.<YOUR_DOMAIN>, where the _PROVIDER_NAME-challenge is a prefix.
Maybe talking about "leaf" rather than prefix or suffix would be most clear.

[DNSOP] Secdir early review of draft-ietf-dnsop-d… Benjamin Kaduk via Datatracker
Re: [DNSOP] Secdir early review of draft-ietf-dns… Paul Wouters
Re: [DNSOP] Secdir early review of draft-ietf-dns… Benjamin Kaduk
Re: [DNSOP] Secdir early review of draft-ietf-dns… Shumon Huque
Re: [DNSOP] Secdir early review of draft-ietf-dns… Shivan Kaul Sahib