Re: [Ltru] Mail regarding draft-ietf-ltru-4646bis and draft-ietf-ltru-matching

Mark Davis ☕️ <mark@macchiato.com> Thu, 29 August 2019 06:19 UTC

References: <20190827104755.665a7a7059d7ee80bb4d670165c8327d.0f79efb126.wbe@email03.godaddy.com> <910CB6C8-9F66-4255-B149-B146DA8E5695@rivoal.net>
In-Reply-To: <910CB6C8-9F66-4255-B149-B146DA8E5695@rivoal.net>
From: Mark Davis ☕️ <mark@macchiato.com>
Date: Thu, 29 Aug 2019 08:19:24 +0200
Message-ID: <CAJ2xs_GWQH=zOvzVqUqpFKHmLKWZTR=ybJOv+K_SMhCW==X23g@mail.gmail.com>
To: Florian Rivoal <florian@rivoal.net>
Cc: Doug Ewell <doug@ewellic.org>, LTRU Working Group <ltru@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ltru/i-iyzjg-Zm4wrpXkEdl_bFsqlIg>
Subject: Re: [Ltru] Mail regarding draft-ietf-ltru-4646bis and draft-ietf-ltru-matching

Canonicalizing to the extlang form has a number of disadvantages, and I
would recommend strongly against it. Don't have time to discuss now, will
see about next week.

Mark


On Wed, Aug 28, 2019 at 11:56 AM Florian Rivoal <florian@rivoal.net> wrote:

>
>
> > On Aug 28, 2019, at 2:47, Doug Ewell <doug@ewellic.org> wrote:
> >
> > On July 27, Florian Rivoal wrote:
> >
> >> However, RFC5646 Section 4.5, which defines canonicalization, only
> >> does so for language tags, not for language ranges. Presumably, the
> >> process is largely the same, with wildcards in the language subtag
> >> being preserved, and I suppose wildcards in other subtags would likely
> >> be dropped. But as it stands, that seems undefined.
> >
> > I think you are on the right track by assuming that ranges are
> > canonicalized just like tags, with asterisks left alone.
>
> Thanks for confirming.
>
> > It's not very likely that most LTRU participants will be eager to start
> > up a new IETF project to update 5646 for something like this. Best to go
> > with your assumption.
> >
> >> Also, while giving recommendations about canonicalization for the
> >> purpose of filtering, it would seem useful to mention (and possibly to
> >> recommend) canonicalizing to the "extlang form". The definition of the
> >> extlang form itself (in RFC5646 Section 4.5) mentions that it is
> >> useful for matching and selecting, but that information isn't relayed
> >> anywhere in RFC4647.
> >
> > At the time these documents were written, there was a strong sentiment
> > around de-emphasizing extlangs in general. It's good to know that
> > there's a real-world use case for using them here. Again, it's unlikely
> > that people will want to rev 4647 for this.
>
> The use case is CSS selectors, when writing rules for typography/styling
> in a document. On the one hand, the document is marked up to indicate which
> parts of it are in which language. On the other hand, the style sheet
> describes which parts of the document must be styled which way, and can
> make that styling dependent on the language.
>
> The need for normalization comes from the fact that stylesheet authoring
> and document authoring are not coordinated in the general case, so a
> stylesheet author cannot know, generally speaking, if a document will be
> marked up with, for example, zh-yue or yue. The stylesheet author is then
> faced with two options, both unattractive for different reasons:
> * Use the deprecated tag: it's more likely to be found in existing
> documents due to being older. The first downside is that it doesn't always
> work. The second is that this slows down adoption of the newer
> preferred tag, as document authors wanting to be compatible with existing
> stylesheets will keep on using the deprecated one, and we get into a
> vicious cycle of everybody continuing to use the deprecated variant.
>
> * Use both the deprecated and the preferred tag in the stylesheet's
> selector. This works, but it means that stylesheet authors need to be aware
> of, and manually replicate the information in
> https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry.
> Asking people to manually do what software could isn't great, as they tend
> not to, or to do it with bugs, or to not update when the registry updates,
> etc.
>
> So it seems preferable, given that this correspondence is maintained in a
> neatly usable format, to have CSS renderers deal with the correspondence
> between deprecated and preferred tags by way of canonicalizing to the
> extlang form and doing the selector matching on that.
>
> In the long run, both document authors and stylesheet authors should use
> the preferred tag without the extlang prefix, and the canonicalization to
> extlang form will be invisible to them. But even if some don't, everything
> works.
>
> —Florian
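
The approach Florian describes in the quoted message (canonicalize both the selector's language range and the document's tag to the extlang form, then match) could be sketched roughly as below. This is only an illustration under several assumptions: `EXTLANG_PREFIX` is a hypothetical hand-copied excerpt of the IANA registry (a real implementation would load the full registry), the function names are made up for this example, the other canonicalization steps of RFC5646 Section 4.5 are skipped, and the comparison uses simple prefix matching in the style of RFC4647 basic filtering, which may not be what a given CSS renderer does. A leading "*" wildcard in a range is left alone, in line with the assumption discussed in the thread.

```python
# Hypothetical excerpt of the IANA Language Subtag Registry:
# extlang subtag -> its registered Prefix. Not the full registry.
EXTLANG_PREFIX = {"yue": "zh", "cmn": "zh", "hak": "zh"}


def to_extlang_form(tag: str) -> str:
    """Return a simplified 'extlang form' of a language tag or range.

    Only the prefix-prepending step is sketched; the remaining
    canonicalization steps are omitted. Tags are lowercased because
    subtag comparison is case-insensitive.
    """
    subtags = tag.lower().split("-")
    # Already in extlang form (e.g. zh-yue): leave as is.
    if len(subtags) >= 2 and EXTLANG_PREFIX.get(subtags[1]) == subtags[0]:
        return "-".join(subtags)
    # Primary language subtag that is also an extlang (e.g. yue):
    # prepend its Prefix. A "*" wildcard has no entry, so it is preserved.
    prefix = EXTLANG_PREFIX.get(subtags[0])
    if prefix is not None:
        subtags.insert(0, prefix)
    return "-".join(subtags)


def matches(language_range: str, tag: str) -> bool:
    """Prefix-match a range against a tag after converting both
    to the simplified extlang form."""
    r = to_extlang_form(language_range)
    t = to_extlang_form(tag)
    return r == "*" or t == r or t.startswith(r + "-")
```

With this sketch, a selector written with either `yue` or `zh-yue` would match content tagged with either form, and a selector written with `zh` would also match content tagged `yue`, without the stylesheet author having to list both variants.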