Re: [Uta] Consensus call for proposed changes to draft-ietf-uta-rfc6125bis-10

Peter Saint-Andre <stpeter@stpeter.im> Wed, 08 February 2023 02:32 UTC

Message-ID: <2feec816-e95e-e650-c692-d1c5923e176d@stpeter.im>
Date: Tue, 07 Feb 2023 19:31:56 -0700
To: Patrik Fältström <paf@paftech.se>
Cc: uta@ietf.org, John C Klensin <john-ietf@jck.com>
References: <029901d93618$7ac97b80$705c7280$@smyslov.net> <DM6PR14MB21869E09C50E623CDAB2454E92D19@DM6PR14MB2186.namprd14.prod.outlook.com> <26d9a7fe-dd55-d4aa-4cc1-ee92c80b3bd8@stpeter.im> <2848770E-FF3C-4DDE-B3E9-68BD0FB2F24D@paftech.se>
From: Peter Saint-Andre <stpeter@stpeter.im>
In-Reply-To: <2848770E-FF3C-4DDE-B3E9-68BD0FB2F24D@paftech.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/uta/6BOvMz8RB9j8MdzAg20x_xLOcUo>

Hi Patrik,

Thanks for taking the time to provide such a detailed message, and my 
apologies for the delayed reply. Comments inline.

On 2/2/23 6:59 AM, Patrik Fältström wrote:
> On 2 Feb 2023, at 9:58, Peter Saint-Andre wrote:
> 
>> On 2/1/23 6:17 AM, Corey Bonnell wrote:
>>
>>> I think it would be unfortunate if the usage of terms that are defined in
>>> RFC 5890 is not aligned with their definitions.
>>>
>>> If we are not opposed to introducing new terminology to the document, then I
>>> suggest the following:
>>>
>>> 1.	Replace all instances of "A-label" with the term "P-label" from the
>>> CABF Baseline Requirements [1]: "P-Label: A XN-Label that contains valid
>>> output of the Punycode algorithm (as defined in RFC 3492, Section 6.3) from
>>> the fifth and subsequent positions."
>>> 2.	For U-label:
>>> 	a. Punt and call it "Unicode representation" instead (this is what
>>> the CABF Baseline Requirements does, although that may not be appropriate
>>> for this document).
>>> 	b. Create a new term that is defined as "A non-LDH label that
>>> contains valid output of the decoding algorithm for Punycode (as defined in
>>> RFC 3492, Section 6.2)." and use this new term instead of "U-label".
>>>
>>> I'd be happy to work on concrete text to this effect if there's agreement
>>> this is a good path to resolve the issue.
>>
>> I would very much like to hear what John Klensin and Patrik Fältström (cc'd) think about this proposal.
>>
>> As noted in my other message <https://mailarchive.ietf.org/arch/msg/uta/92tKoHT3Kjll1o_mCYQYQT8xON4/> I'm not immediately comfortable with referencing a CA/Browser Forum document instead of RFC 5890.
>>
>> Having looked at Corey's proposal more closely, I'm doubly unsure because (a) it is not fully clear to me how the P-label construct differs from the A-label construct in RFC 5890 and (b) coming up with new DNS-related terminology in a late-stage document about certificate validation just seems like a bad idea (e.g., I'm not sure how to get proper review) even if it were necessary (which I'm not sure it is).
> 
> Thanks for being brought into this discussion Peter.
> 
> I had a read of the document and have these direct comments:
> 
>>     delegated domain:  A domain name or host name that is explicitly
>>        configured for communicating with the source domain, either by the
>>        human user controlling the client or by a trusted administrator.
>>        For example, an IMAP server at mail.example.net could be a
>>        delegated domain for a source domain of example.net associated
>>        with an email address of user@example.net.
> 
> This might be confusing as it is using the term "delegated" and give indeed an example where "mail.example.net" might (or might not) be delegated from "example.net", while the administrator of an imap server at a specific domain name might have no similarities at all with the MX record of the domain to which email is to be sent to end up in the named IMAP server.
> 
> So I think a better example is to either use the term "delegated" when it really talks about DNS delegation, OR, you use a different term but have an example where you can have:
> 
> - IMAP server: imap.example.se.
> - MX target: mx.example.net.
> - Email domain: example.com.

Although you might be right that "delegated domain" is less than ideal, 
it's the term we used in RFC 6125. As a result, a number of 
specifications that cite RFC 6125 also use the term, so it seems 
inadvisable to change terminology now.

The original idea was not DNS delegation at the nameserver level, but 
service delegation at the application level such as one finds in this 
document (e.g., in order to retrieve email for addresses at example.net, 
one configures one's email client to connect to the server at 
imap.example.net).

At the least, it seems reasonable for us to explain this in more detail 
so that the reader doesn't confuse this perhaps bespoke notion of 
service delegation with the perhaps more established notion of DNS 
delegation.

>>     derived domain:  A domain name or host name that a client has derived
>>        from the source domain in an automated fashion (e.g., by means of
>>        a [DNS-SRV] lookup).
> 
> Also MX? 

I don't see why not. If DNS SRV records had existed from the beginning 
of time, it seems that email protocols would have used SRV rather than 
MX, right?

> What is then the difference or similarity between an MX related derivation of one domain name from another and an SRV related derivation?

It seems to me that they are functionally equivalent. But I am not a DNS 
expert or email expert, so (leaving aside various nuances) I might be 
missing some essential difference.

> Can a delegated domain also be derived?

Not really. The idea is that a delegated domain is explicitly configured 
client-side whereas a derived domain is obtained in an automated fashion 
via DNS. So they are two different constructs that play two different 
roles in protocols.

>>     source domain:  The FQDN that a client expects an application service
>>        to present in the certificate.  This is typically input by a human
>>        user, configured into a client, or provided by reference such as a
>>        URL.  The combination of a source domain and, optionally, an
>>        application service type enables a client to construct one or more
>>        reference identifiers.
> 
> I presume you also include domain names that one at a time is created using a search list construction in a DNS stub resolver? 

If I understand you correctly, I would say that we have not had a theory 
about how domain names are created (e.g., using a suffix search list). 
And it's not clear to me that we need to have such a theory here.

> I.e. what you talk about is really a FQDN?

That is the intent - no bare hostnames or, more generally, no domain 
names that do not include all labels.

> I think this is a good thing, but hope people to understand what this implies.
> 
> I hate search lists and relative domain names.
> 
>>     The DNS name conforms to one of the following forms:
>>
>>     1.  A "traditional domain name", i.e., a FQDN that conforms to
>>         "preferred name syntax" as described in Section 3.5 of
>>         [DNS-CONCEPTS] and for which all of its labels are "LDH labels"
>>         as described in [IDNA-DEFS].  Informally, such labels are
>>         constrained to [US-ASCII] letters, digits, and the hyphen, with
>>         the hyphen prohibited in the first character position.
>>         Additional qualifications apply (refer to the above-referenced
>>         specifications for details), but they are not relevant here.
>>
>>     2.  An "internationalized domain name", i.e., a DNS domain name that
>>         includes at least one label containing appropriately encoded
>>         Unicode code points outside the traditional US-ASCII range and
>>         conforming to the processing and validity checks specified for
>>         "IDNA2008" in [IDNA-DEFS] and the associated documents.  In
>>         particular, it contains at least one U-label or A-label, but
>>         otherwise may contain any mixture of NR-LDH labels, A-labels, or
>>         U-labels.
> 
> This is confusing 

What specifically do you think is confusing? We tried to get it right, 
but clearly didn't succeed...

> and it seems people misunderstand the big changes we went through in the IETF from IDNA2003 to IDNA2008.
> 
> In IDNA2008 we have:
> 
> - Got rid of mapping, i.e. mapping like case folding is something happening in application layer, and have nothing to do with "domain names".
> - Have a 1:1 mapping between A-label and U-label.
> - In theory because of this can have A-label and U-label for domain names that include by IDNA2008 not allowed Unicode code points (or not allowed code point by other policy rules, for example the ones a registry have).
> 
> I strongly recommend you have similar rules here. Separate potential mapping from comparison of domain names, which in turn must be separated from policy for what code points are allowed.

When you say "have similar rules here", are you suggesting that we
define such rules outside the context of IDNA2008 (e.g., in a way that
would be valid for both IDNA2008 and IDNA2003 + UTS-46)? I think it
would be a challenge to get that right, and I'm not confident that a
document about certificate matching is the correct place to do so.

> Ok, onwards...
> 
>>     If the DNS domain name portion of a reference identifier is a
>>     traditional domain name, then matching of the reference identifier
>>     against the presented identifier MUST be performed by comparing the
>>     set of domain name labels using a case-insensitive ASCII comparison,
>>     as clarified by [DNS-CASE].  For example, WWW.Example.Com would be
>>     lower-cased to www.example.com for comparison purposes.  Each label
>>     MUST match in order for the names to be considered to match, except
>>     as supplemented by the rule about checking of wildcard labels given
>>     below.
>>
>>     If the DNS domain name portion of a reference identifier is an
>>     internationalized domain name, then the client MUST convert any
>>     U-labels [IDNA-DEFS] in the domain name to A-labels before checking
>>     the domain name or comparing it with others.  In accordance with
>>     [IDNA-PROTO], A-labels MUST be compared as case-insensitive ASCII.
>>     Each label MUST match in order for the domain names to be considered
>>     to match, except as supplemented by the rule about checking of
>>     wildcard labels given below.
> 
> All of the above can be replaced by just saying that "A domain name is to be compared using case insensitive matching according to what DNS uses, and this because of this include domain names that have A-Labels in them" and reference IDNA2008.

It seems that we should at least say that U-labels need to be converted 
to A-labels first, no? Or do you think that is implied by referencing 
the DNS rules (which don't allow U-labels natively)?
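
As a rough illustration of the "convert U-labels to A-labels, then
compare as case-insensitive ASCII" step discussed above, here is a
minimal Python sketch. The function names are hypothetical, and the
sketch relies on Python's built-in "punycode" codec, which implements
only the RFC 3492 encoding. It performs none of the IDNA2008 validity
checks and no mapping, so it is not a conforming implementation.

```python
def to_a_label(label: str) -> str:
    """Return an ASCII form of one label (sketch: no IDNA2008 validation)."""
    if label.isascii():
        # LDH labels and existing A-labels: only ASCII case folding applies.
        return label.lower()
    # Assumption: the input is already a valid U-label. The "punycode"
    # codec only runs the RFC 3492 encoding; the "xn--" prefix is added here.
    encoded = label.encode("punycode").decode("ascii")
    return ("xn--" + encoded).lower()


def names_match(reference: str, presented: str) -> bool:
    """Compare two domain names label by label after conversion."""
    ref = [to_a_label(part) for part in reference.rstrip(".").split(".")]
    pres = [to_a_label(part) for part in presented.rstrip(".").split(".")]
    return ref == pres
```

With this sketch, names_match("bücher.example", "XN--BCHER-KVA.Example")
returns True, because both names reduce to the same sequence of
lower-cased ASCII labels.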

> It *might* also include wording about:
> 
> - If a domain name include unicode characters, and case folding equivalent approximate matching is expected by the client, mapping from one unicode character to another must take place before the A-label is created from the U-label. And reference section 4.2 in RFC 5894.

Thanks for the reminder about that section.

> Do not come up with your own words please!

Agreed.

> - If a domain name includes code points that are DISALLOWED according to IDNA2008 or any other policy, for example that of a registry, it MUST be defined in this document whether it SHOULD be allowed to do a comparison of the domain names or not. If a label includes 0x00 bytes for example (which is normally never allowed in any protocol), should such a label be able to get a "match" when the domain name is to be compared?

It seems like a bad idea to match on DISALLOWED code points! But see below.

> Please be specific in the general case!
> 
>>     A wildcard in a presented identifier can only match exactly one label
>>     in a reference identifier.  Note that this is not the same as DNS
>>     wildcard matching, where the "*" label always matches at least one
>>     whole label and sometimes more.  See [DNS-CONCEPTS], Section 4.3.3
>>     and [DNS-WILDCARDS].
> 
> Wow, wildcards in DNS is hairy. I know some people knows this, be careful, as wildcards in DNS is very different from (so far) wildcards in certificates.

I believe we included that text only to note that the wildcard matching
for certificates is more constrained than for DNS. Do you think that
further clarifications are needed?
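
To make the "exactly one label" constraint concrete, here is a small
Python sketch. The function name is hypothetical, and the sketch
assumes that both names are already in ASCII form (A-labels where
needed) and that the wildcard is permitted only as the complete
left-most label of the presented identifier.

```python
def wildcard_match(presented: str, reference: str) -> bool:
    """Match a presented identifier, possibly wildcarded, against a reference."""
    pres = presented.lower().rstrip(".").split(".")
    ref = reference.lower().rstrip(".").split(".")
    # A wildcard stands for exactly one label, so the label counts must agree.
    if len(pres) != len(ref):
        return False
    if pres[0] == "*":
        # Assumption: "*" is honored only as the entire left-most label.
        return pres[1:] == ref[1:]
    return pres == ref
```

Under these assumptions, "*.example.com" matches "www.example.com" but
matches neither "a.b.example.com" nor "example.com", unlike a DNS
wildcard, which can cover more than one label.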

>>     An IP-ID matches based on an octet-for-octet comparison of the bytes
>>     of the reference identity with the bytes contained in the iPAddress
>>     subjectAltName.  Because the iPAddress field does not include the IP
>>     version, a helpful heuristic for implementors is to distinguish IPv4
>>     addresses from IPv6 addresses by their length.
> 
> Why "octet by octet"?

Do you suggest some other text? Specifically do you have in mind "bit by 
bit" perhaps?

> The field include either a 32 bit or 128 bit field. If what is compared have different length, the match is False. If the length is the same, the values are compared. If they are the same, the match is True, otherwise False.

We were trying to be more precise about what "the same" means, but as we 
know it can be a challenge to get that right.
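
The length-then-value comparison described above can be sketched in
Python as follows. The function name and the textual form of the
reference identity are assumptions made for illustration; the
ipaddress module is part of the standard library.

```python
import ipaddress


def ip_id_matches(reference: str, san_bytes: bytes) -> bool:
    """Compare a reference IP address with raw iPAddress octets."""
    # .packed yields 4 octets for IPv4 and 16 octets for IPv6.
    ref_bytes = ipaddress.ip_address(reference).packed
    # Different lengths mean different IP versions, hence no match.
    if len(ref_bytes) != len(san_bytes):
        return False
    return ref_bytes == san_bytes
```

Because the comparison is on the binary form, different textual
spellings of the same IPv6 address (for example "2001:db8::1" and
"2001:0db8:0:0:0:0:0:1") compare as equal.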

>>     If the identifier is an SRV-ID, then the application service name
>>     MUST be matched in a case-insensitive manner, in accordance with
>>     [DNS-SRV].  Note that the _ character is prepended to the service
>>     identifier in DNS SRV records and in SRV-IDs (per [SRVNAME]), and
>>     thus does not need to be included in any comparison.
> 
> Please reference one place in this document where case sensitivity is explained. Do not repeat text.

Noted.

>> 7.3.  Internationalized Domain Names
>>
>>     As specified under Section 6, matching of internationalized domain
>>     names is performed on A-labels, not U-labels.  As a result, potential
>>     confusion caused by the use of visually similar characters in domain
>>     names is likely mitigated in certificate matching as described in
>>     this document.
>>
>>     As with URIs and URLs, there are in practice at least two primary
>>     approaches to internationalized domain names: "IDNA2008" (see
>>     [IDNA-DEFS] and the associated documents) and an alternative approach
>>     specified by the Unicode Consortium in [UTS-46].  (At this point the
>>     transition from the older "IDNA2003" technology is mostly complete.)
> 
> Not really...it is neither one or the other.
> 
> The basis for all domain names is what is defined in DNS, and that is IDNA2008.
> 
> The differences from UTS-46 are specifically two things:
> 
> - UTS-46 also include rules for mapping that IDNA2008 does not include. The mapping that might be performed according to UTS-46 is "out of scope" for IDNA2008.
> 
> - What code points are allowed in the ultimate domain name is slightly different.
> 
> But, we have people using domain names (i.e. in the wild) which are neither allowed in UTS-46 or IDNA2008.
> 
> And, then there are people using the algorithm in IDNA2008 applied to versions of Unicode that IETF have not approved yet.
> 
> So, once again, not "either or". It is "a little bit of everything".

I see what you mean. However, that makes it more difficult to specify 
recommended behavior.

As one example, it seems possible that these differences could lead to
someone using domain names in the wild that include DISALLOWED code
points (e.g., because the definition of which code points are DISALLOWED
can vary across Unicode versions). Thus if we say that applications MUST
NOT match on DISALLOWED code points, behavior could be inconsistent.

>>     Differences in specification, interpretation, and deployment of these
>>     technologies can be relevant to Internet services that are secured
>>     through certificates (e.g., some top-level domains might allow
>>     registration of names containing Unicode code points that typically
>>     are discouraged, either formally or otherwise).  Although there is
>>     little that can be done by certificate matching software itself to
>>     mitigate these differences (aside from matching exclusively on
>>     A-labels), the reader needs to be aware that the handling of
>>     internationalized domain names is inherently complex and can lead to
>>     significant security vulnerabilities if not properly implemented.
>>
>>     Relevant security considerations for handling of internationalized
>>     domain names can be found in [IDNA-DEFS], Section 4.4, [UTS-36], and
>>     [UTS-39].

Does that text seem correct or appropriate?

Do you have opinions on Corey's suggestion to use P-labels instead of
A-labels and to reference the CA/Browser Forum specifications?

https://mailarchive.ietf.org/arch/msg/uta/r5uJRGUzCC55XH4XSnwtMB2YWPA/

Again, many thanks for the thorough review.

Peter
