Re: [Uta] UTS-46 / WHATWG

Rob Sayre <sayrer@gmail.com> Tue, 31 January 2023 17:45 UTC

From: Rob Sayre <sayrer@gmail.com>
Date: Tue, 31 Jan 2023 09:45:35 -0800
Message-ID: <CAChr6SzwqpHaoxDy+fxTs-24DnLfQ4QxXKBmr_C1AKJ7cmY4=Q@mail.gmail.com>
To: Corey Bonnell <Corey.Bonnell@digicert.com>
Cc: Valery Smyslov <smyslov.ietf@gmail.com>, "uta@ietf.org" <uta@ietf.org>, Peter Saint-Andre <stpeter@stpeter.im>, "Salz, Rich" <rsalz=40akamai.com@dmarc.ietf.org>, "uta-chairs@ietf.org" <uta-chairs@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/uta/nALENZUVcnq9NMpV_GUeAx7AlDI>

Section 2, new:
"An "internationalized domain name", i.e., a DNS domain name that includes
at least one label containing appropriately encoded Unicode code points
outside the traditional US-ASCII range. In particular, it contains at least
one U-label or A-label, but otherwise may contain any mixture of NR-LDH
labels, A-labels, or U-labels. Refer to [[Section 7.3]] for further
details."
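
To make the label terminology in that definition concrete, here is a rough sketch (an illustration only, not proposed spec text). It sorts labels by syntax alone, following the LDH / reserved-LDH split in RFC 5890. The function names and the returned strings are made up for this example, and it deliberately performs no IDNA2008 or UTS-46 validity checks, so it can only report "candidates".

```python
import re

# LDH label syntax: ASCII letters, digits, and hyphens; 1 to 63 characters;
# no leading or trailing hyphen.
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

# Hypothetical category names, chosen for this sketch only.
_IDN_CLASSES = {"U-label candidate", "A-label candidate"}


def classify_label(label: str) -> str:
    """Return a rough syntactic class for one label (no validity checks)."""
    if not label.isascii():
        # Might be a U-label; IDNA2008 and UTS-46 disagree on which are valid.
        return "U-label candidate"
    if not _LDH.fullmatch(label):
        return "not an LDH label"
    if label[:4].lower() == "xn--":
        # Only a valid Punycode form of a valid U-label is a true A-label.
        return "A-label candidate"
    if label[2:4] == "--":
        return "other reserved LDH label"
    return "NR-LDH label"


def looks_internationalized(name: str) -> bool:
    """True if at least one label is a U-label or A-label candidate."""
    labels = name.rstrip(".").split(".")
    return any(classify_label(label) in _IDN_CLASSES for label in labels)
```

A name such as "www.bücher.example" has one U-label candidate and two NR-LDH labels, so looks_internationalized() returns True for it; whether that label is actually valid is exactly the IDNA2008 versus UTS-46 question.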

Section 7.3, new:
"The IETF document covering internationalized domain names is "IDNA2008"
[IDNA-DEFS]. The Unicode Consortium publishes a similar document known as
"UTS-46".

UTS-46 allows names that are valid in IDNA2003 but not IDNA2008, and
additionally allows characters that are not valid in either IETF document,
such as emoji characters. This more lenient approach carries additional
risk of semantic ambiguity and additional security considerations. ICANN
recommends IDNA2008 [
https://features.icann.org/ssac-advisory-use-emoji-domain-names] and
correspondingly recommends against emoji characters in DNS names. However,
the internet contains old content published under IDNA2003, and people
enjoy emoji characters, so consumer applications often end up using the
approach in [UTS-46]. DNS names that conform to IDNA2008 are likely to face
fewer interoperability barriers, while applications that conform to UTS-46
may be able to verify a broader range of certificates."
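
As a small illustration of why the choice of rules matters (again, not proposed spec text): Python's built-in "idna" codec implements IDNA2003, which maps "ß" to "ss" before encoding, whereas IDNA2008 treats "ß" as a valid character in its own right. The helper names below are invented for this sketch, and raw_a_label() only Punycode-encodes a label without any mapping or validation, so it stands in for, but is not, an IDNA2008 implementation.

```python
def idna2003_to_ascii(name: str) -> str:
    """Convert a name with Python's stdlib codec, which follows IDNA2003."""
    return name.encode("idna").decode("ascii")


def raw_a_label(label: str) -> str:
    """Punycode-encode one label as-is (no mapping, no validity checks)."""
    return "xn--" + label.encode("punycode").decode("ascii")


# The same user input yields two different DNS names.
mapped = idna2003_to_ascii("faß.de")    # IDNA2003 maps ß to "ss"
unmapped = raw_a_label("faß") + ".de"   # ß kept and Punycode-encoded

print(mapped)    # fass.de
print(unmapped)  # xn--fa-hia.de
```

A certificate issued for one of these names will not match the other, which is the kind of interoperability gap the paragraph above describes.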

So, that paragraph now includes some of Viktor's text. IETF people might
recognize Postel's Law here. That's not always the best way to do things,
but I think it's too late to take a stricter line in this case, given that
even some IETF software is using UTS-46. The new text also states its
reasoning plainly, unlike the old text, which used a lot of passive voice.

thanks,
Rob

There are two other pieces of feedback that seem reasonable to me, but they
didn't come with spec text, and I'm not sure how to incorporate them:

Regarding Section 2, Corey Bonnell writes:
"Section 4.2 of the Protocol Document (RFC 5891) [2] proceeds to define
requirements for IDNA2008-valid labels which would exclude strings that
would be valid in UTS-46 (as has been exhaustively discussed the past few
days on this list). Given this, I don’t believe that U-Label (and perhaps
the other terms defined in RFC 5890) would be the correct term to use to
encompass those labels that are valid for UTS-46 but not IDNA2008."

Regarding Section 7.3, Watson Ladd writes:
"I think we actually need to say more here: the A-label used in the
X509 comparison needs to be the A-label derived and used to do the DNS
lookup. Otherwise we have the issue of bugs that change the IDN
behavior between application and X509/TLS library breaking the
relation between what the user put in and the cert presented.

Also I don't think comparison is enough: don't name constraints need
to be included in the calculation?"
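
One way to read Watson's first point is sketched below: derive the A-label form once, then feed that same string to both the DNS lookup and the certificate comparison. This is only an illustration under stated assumptions. The function names are hypothetical, the stdlib "idna" codec (IDNA2003) stands in for whatever conversion an application really uses, and wildcards and name constraints are left out entirely.

```python
def to_a_label_name(user_input: str) -> str:
    """Convert user input to an all-ASCII name exactly once.

    Assumption: the stdlib "idna" codec (IDNA2003) is a stand-in for the
    application's real IDNA2008 or UTS-46 conversion.
    """
    return user_input.rstrip(".").encode("idna").decode("ascii").lower()


def dns_id_matches(reference: str, presented: str) -> bool:
    """Case-insensitive ASCII comparison of two A-label names."""
    return reference.lower() == presented.rstrip(".").lower()


def connect_plan(user_input: str, presented_ids: list[str]) -> dict:
    """Use one derived name for both the lookup and the certificate check."""
    reference = to_a_label_name(user_input)
    return {
        "lookup_name": reference,  # what would be handed to the resolver
        "cert_ok": any(dns_id_matches(reference, p) for p in presented_ids),
    }
```

Because connect_plan() never converts the user input a second time, a difference between the application's and the TLS library's IDN handling cannot silently change which name gets checked.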



On Mon, Jan 30, 2023 at 1:01 PM Corey Bonnell <Corey.Bonnell@digicert.com>
wrote:

>
>    - “An "internationalized domain name", i.e., a DNS domain name that
>    includes at least one label containing appropriately encoded Unicode code
>    points outside the traditional US-ASCII range. In particular, it contains
>    at least one U-label or A-label, but otherwise may contain any mixture of
>    NR-LDH labels, A-labels, or U-labels. Refer to [[Section 7.3]] for further
>    details.”
>
>
>
> RFC 5890, section 2.3.2.1 [1] defines “U-label” as:
>
>
>
> “A "U-label" is an IDNA-valid string of Unicode characters, in
> Normalization Form C (NFC) and including at least one non-ASCII
> character, expressed in a standard Unicode Encoding Form (such as
> UTF-8).  It is also subject to the constraints about permitted
> characters that are specified in Section 4.2 of the Protocol
> document and the rules in the Sections 2 and 3 of the Tables
> document, the Bidi constraints in that document if it contains any
> character from scripts that are written right to left, and the
> symmetry constraint described immediately below.”
>
>
>
> Section 4.2 of the Protocol Document (RFC 5891) [2] proceeds to define
> requirements for IDNA2008-valid labels which would exclude strings that
> would be valid in UTS-46 (as has been exhaustively discussed the past few
> days on this list). Given this, I don’t believe that U-Label (and perhaps
> the other terms defined in RFC 5890) would be the correct term to use to
> encompass those labels that are valid for UTS-46 but not IDNA2008.
>
>
>
> Thanks,
>
> Corey
>
>
>
> [1] https://www.rfc-editor.org/rfc/rfc5890#section-2.3.2.1
>
> [2] https://www.rfc-editor.org/rfc/rfc5891#section-4.2
>
>
>
>
>
> *From:* Uta <uta-bounces@ietf.org> *On Behalf Of *Rob Sayre
> *Sent:* Monday, January 30, 2023 1:49 PM
> *To:* Valery Smyslov <smyslov.ietf@gmail.com>
> *Cc:* uta@ietf.org; Peter Saint-Andre <stpeter@stpeter.im>; Salz, Rich
> <rsalz=40akamai.com@dmarc.ietf.org>; uta-chairs@ietf.org
> *Subject:* Re: [Uta] UTS-46 / WHATWG
>
>
>
> Hi,
>
>
>
> That is a reasonable thing to ask for, and I will supply edits below. They
> might sound like me rather than the authors, so I wouldn't mind if they
> write something substantially similar in their own voice.
>
>
>
> I also understand the point of view that says "Really all this draft says
> is 'compare A labels.'" But this is incompatible with strenuous objections
> to changes to IDNA text in the document, so I don't understand this
> behavior. When I read the document, I thought "that's not how it really
> works, and I sure wish I didn't have to read the WHATWG document to get the
> truth, because it's really long". In fact, this very mailing list even runs
> on UTS-46. See
> https://mailarchive.ietf.org/arch/msg/uta/KbtvWrG5vdW6iq0scwWpBc1xeM8/
>
>
>
> In Section 2, the current text is incorrect, because UTS-46 domains
> sometimes don't conform to these validity checks. So, the document is
> inconsistent in making this claim. Citing UTS-46 here would be correct,
> since that would also cover IDNA2008, but it doesn't seem like that will
> fly.
>
>
>
> Current:
>
> ---
>
> An "internationalized domain name", i.e., a DNS domain name that includes
> at least one label containing appropriately encoded Unicode code points
> outside the traditional US-ASCII range and conforming to the processing and
> validity checks specified for "IDNA2008" in [IDNA-DEFS] and the associated
> documents. In particular, it contains at least one U-label or A-label, but
> otherwise may contain any mixture of NR-LDH labels, A-labels, or U-labels.
>
>
>
> New:
>
> ---
>
> An "internationalized domain name", i.e., a DNS domain name that includes
> at least one label containing appropriately encoded Unicode code points
> outside the traditional US-ASCII range. In particular, it contains at least
> one U-label or A-label, but otherwise may contain any mixture of NR-LDH
> labels, A-labels, or U-labels. Refer to [[Section 7.3]] for further details.
>
> ---
>
>
>
>
>
> Then, in section 7.3:
>
> Current:
>
> ---
>
> As with URIs and URLs, there are in practice at least two primary
> approaches to internationalized domain names: "IDNA2008" (see [IDNA-DEFS]
> and the associated documents) and an alternative approach specified by the
> Unicode Consortium in [UTS-46]. (At this point the transition from the
> older "IDNA2003" technology is mostly complete.) Differences in
> specification, interpretation, and deployment of these technologies can be
> relevant to Internet services that are secured through certificates (e.g.,
> some top-level domains might allow registration of names containing Unicode
> code points that typically are discouraged, either formally or otherwise).
> Although there is little that can be done by certificate matching software
> itself to mitigate these differences (aside from matching exclusively on
> A-labels), the reader needs to be aware that the handling of
> internationalized domain names is inherently complex and can lead to
> significant security vulnerabilities if not properly implemented.
>
>
>
> New:
>
> The IETF document covering internationalized domain names is "IDNA2008"
> [IDNA-DEFS]. The Unicode Consortium publishes a similar document known as
> "UTS-46". This document allows names that are valid in IDNA2003 but not
> IDNA2008, and additionally allows characters that are not valid in either
> IETF document, such as emoji characters. This more lenient approach carries
> additional risk of semantic ambiguity and additional security
> considerations. ICANN recommends IDNA2008 [
> https://features.icann.org/ssac-advisory-use-emoji-domain-names] and
> against emoji characters in domain names. However, the internet contains
> old content published under IDNA2003, and people enjoy emoji characters, so
> consumer applications often end up using the more liberal approach in
> [UTS-46].
>
> ---
>
>
>
> That's it, and I must say I'm a bit dismayed that the argument has
> continually drifted to the /people/ rather than the content of the document.
>
>
>
> thanks,
>
> Rob
>
>
>
>
>
>
>
> On Mon, Jan 30, 2023 at 6:07 AM Valery Smyslov <smyslov.ietf@gmail.com>
> wrote:
>
> Hi,
>
> thanks to all for a very interesting discussion (and thanks
> to John and Patrik for the explanation of the history of the problem).
>
> Before issuing a consensus call, the first question is to Rob:
> can you propose concrete text changes that you want to see in the draft?
>
> Regards,
> Valery (for the chairs).
>
> > On 1/29/23 1:14 PM, Salz, Rich wrote:
> > >
> > >> It seems to me that it remains the case that this I-D is not the best
> > >> forum to litigate which U-labels are valid candidates for turning into
> > >> A-labels. Surely that belongs elsewhere.
> > >
> > > I agree that this kind of thing belongs in the DNS groups.
> > >
> > >> However it is that
> > >> applications (or their libraries) turn U-labels into A-labels, this I-D
> > >> describes how to match them against presented identifiers in
> > >> certificates.
> > >
> > > *EXACTLY*
> > >
> > > Really all this draft says is "compare A labels."
> > >
> > > What else do we need to say?  In my view nothing.
> >
> > Completely agree.
> >
> > And that's what draft-ietf-uta-rfc6125bis (even RFC 6125 before it) has
> > always done, with version -10 now including additional security
> > considerations and pointers to relevant specifications.
> >
> > Chairs, can you please initiate a consensus call on whether or not we
> > need to make changes to draft-ietf-uta-rfc6125bis on this topic? As far
> > as I can see, we have one person loudly in the rough, but a consensus
> > call would enable us to determine whether there is broader support for
> > modifications to the draft (which, I would like to point out, has
> > already completed two working group last calls).
> >
> > Peter
> >
>
>