Re: [Ltru] Mail regarding draft-ietf-ltru-4646bis and draft-ietf-ltru-matching

Mark Davis ☕️ <> Thu, 29 August 2019 06:19 UTC

To: Florian Rivoal <>
Cc: Doug Ewell <>, LTRU Working Group <>

Canonicalizing to the extlang form has a number of disadvantages, and I
would strongly recommend against it. I don't have time to discuss this now,
but will see about doing so next week.


On Wed, Aug 28, 2019 at 11:56 AM Florian Rivoal <> wrote:

> > On Aug 28, 2019, at 2:47, Doug Ewell <> wrote:
> >
> > On July 27, Florian Rivoal wrote:
> >
> >> However, RFC5646 Section 4.5, which defines canonicalization, only
> >> does so for language tags, not for language ranges. Presumably, the
> >> process is largely the same, with wildcards in the language subtag
> >> being preserved, and I suppose wildcards in other subtags would likely
> >> be dropped. But as it stands, that seems undefined.
> >
> > I think you are on the right track by assuming that ranges are
> > canonicalized just like tags, with asterisks left alone.
> Thanks for confirming.
> > It's not very likely that most LTRU participants will be eager to start
> > up a new IETF project to update 5646 for something like this. Best to go
> > with your assumption.
> >
> >> Also, while giving recommendations about canonicalization for the
> >> purpose of filtering, it would seem useful to mention (and possibly to
> >> recommend) canonicalizing to the "extlang form". The definition of the
> >> extlang form itself (in RFC5646 Section 4.5) mentions that it is
> >> useful for matching and selecting, but that information isn't relayed
> >> anywhere in RFC4647.
> >
> > At the time these documents were written, there was a strong sentiment
> > around de-emphasizing extlangs in general. It's good to know that
> > there's a real-world use case for using them here. Again, it's unlikely
> > that people will want to rev 4647 for this.
> The use case is CSS selectors, when writing rules for typography/styling
> in a document. On the one hand, the document is marked up to indicate which
> parts of it are in which language. On the other hand, the style sheet
> describes which parts of the document must be styled in which way, and can
> make that styling dependent on the language.
> The need for normalization comes from the fact that stylesheet authoring
> and document authoring are not coordinated in the general case, so a
> stylesheet author cannot know, generally speaking, if a document will be
> marked up with, for example, zh-yue or yue. The stylesheet author is then
> faced with two options, both unattractive for different reasons:
> * Use the deprecated tag: it's more likely to be found in existing
> documents due to being older. The first downside is that it doesn't always
> work. The second one is that this slows down adoption of the newer
> preferred tag, as document authors wanting to be compatible with existing
> stylesheets will keep on using the deprecated one as well, and we get into
> a vicious cycle of everybody continuing to use the deprecated variant.
> * Use both the deprecated and the preferred tag in the stylesheet's
> selector. This works, but it means that stylesheet authors need to be aware
> of, and manually replicate the information in
> Asking people to manually do what software could isn't great, as they tend
> not to, or to do it with bugs, or to not update when the registry updates,
> etc.
> So it seems preferable, given that this correspondence is maintained in a
> neatly usable format, to have CSS renderers deal with the correspondence
> between deprecated and preferred tags by way of canonicalizing to the
> extlang form and doing the selector matching on that.
> In the long run, both document authors and stylesheet authors should use
> the preferred tag without the extlang prefix, and the canonicalization to
> extlang form will be invisible to them. But even if some don't, everything
> works.
> —Florian
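
For readers following the thread, the approach Florian describes (canonicalize both the range and the tag to the extlang form, then match) might be sketched roughly as below. This is only an illustration under stated assumptions, not what any CSS renderer actually does: EXTLANG_PREFIX is a small hand-copied excerpt standing in for the registry, the function names to_extlang_form and lang_matches are made up for this example, and the matching step is plain prefix matching in the spirit of RFC4647 basic filtering rather than whatever scheme a given selector implementation uses. A "*" range is left alone, following the assumption discussed above.

```python
# Hypothetical excerpt of the subtag registry: extlang subtag -> its Prefix.
# A real implementation would load the full registry instead.
EXTLANG_PREFIX = {"yue": "zh", "cmn": "zh", "hak": "zh"}


def to_extlang_form(tag_or_range: str) -> str:
    """Return a lowercased tag or range with the extlang prefix present."""
    subtags = tag_or_range.lower().split("-")
    first = subtags[0]
    # Already written as prefix + extlang (e.g. "zh-yue"): nothing to add.
    if len(subtags) > 1 and EXTLANG_PREFIX.get(subtags[1]) == first:
        return "-".join(subtags)
    # Primary language that is also an extlang (e.g. "yue"): prepend prefix.
    prefix = EXTLANG_PREFIX.get(first)
    if prefix is not None:
        subtags.insert(0, prefix)
    # Anything else, including the wildcard "*", is returned unchanged.
    return "-".join(subtags)


def lang_matches(language_range: str, tag: str) -> bool:
    """Prefix-match a range against a tag after normalizing both."""
    r = to_extlang_form(language_range)
    t = to_extlang_form(tag)
    return r == "*" or t == r or t.startswith(r + "-")
```

With this sketch, a selector written with either "yue" or "zh-yue" matches content tagged either way, and a "zh" range also picks up content tagged "yue". That is the behaviour the use case asks for; it says nothing about the disadvantages Mark alludes to.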