Re: [Ltru] Mail regarding draft-ietf-ltru-4646bis and draft-ietf-ltru-matching

Florian Rivoal <florian@rivoal.net> Fri, 30 August 2019 01:14 UTC

From: Florian Rivoal <florian@rivoal.net>
In-Reply-To: <94B0FC03-B793-43BB-B864-38A306E6B5CA@rivoal.net>
Date: Fri, 30 Aug 2019 10:14:28 +0900
Cc: LTRU Working Group <ltru@ietf.org>, Doug Ewell <doug@ewellic.org>
Message-Id: <A8BF9658-CB3C-4DC5-AECB-840C0053A943@rivoal.net>
References: <20190827104755.665a7a7059d7ee80bb4d670165c8327d.0f79efb126.wbe@email03.godaddy.com> <910CB6C8-9F66-4255-B149-B146DA8E5695@rivoal.net> <CAJ2xs_GWQH=zOvzVqUqpFKHmLKWZTR=ybJOv+K_SMhCW==X23g@mail.gmail.com> <73BE5AF7-0C62-425F-834E-8759628D2C5F@rivoal.net> <CAD2gp_S+dDdgo9WsOixT_-jHkWZxmajWmx2MRKi0iDHVSwd-3g@mail.gmail.com> <94B0FC03-B793-43BB-B864-38A306E6B5CA@rivoal.net>
To: John Cowan <cowan@ccil.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ltru/mSfvRjpME2ETJ-BRGQXZbG_xANw>
Subject: Re: [Ltru] Mail regarding draft-ietf-ltru-4646bis and draft-ietf-ltru-matching


> On Aug 30, 2019, at 10:08, Florian Rivoal <florian@rivoal.net> wrote:
> 
> 
> 
>> On Aug 29, 2019, at 22:44, John Cowan <cowan@ccil.org> wrote:
>> 
>> 
>> 
>> On Thu, Aug 29, 2019 at 5:00 AM Florian Rivoal <florian@rivoal.net> wrote:
>> 
>> In case that influences what you have to say, note that I do not intend to store the canonicalized-to-extlang form anywhere. It would only be for internal processing: when performing an extended filtering operation, where it is unknown whether the ranges and tags are in extlang form or not, canonicalize both to extlang form and do the extended filtering operation on that.
>> 
>> In that case you can equally canonicalize away from the extlang form as toward it.  I recommend that.
> 
> Can you?
> 
> Let's say you want to match (using extended filtering) the zh range against documents that may contain the zh-yue or yue tags (and possibly others such as zh-cmn, zh-hak, zh, zh-HK…). This could be something a typesetter wants to do in order to use a particular font and set of line breaking rules for any chunk of Chinese (in the broad sense) text.
> 
> If we canonicalize to extlang form: 
>  zh -> zh
>  zh-yue -> zh-yue
>  yue -> zh-yue
> Therefore, the zh range will match documents that contained either zh-yue or yue. This is what I want.
> 
> If we canonicalize away from extlang form: 
>  zh -> zh
>  zh-yue -> yue
>  yue -> yue
> Therefore, the zh range will match neither the documents that contained zh-yue nor those that contained yue. This is not what I want, and is worse than not canonicalizing at all, since without canonicalization zh would at least still match zh-yue.
> 
> So it seems to me that no, we cannot canonicalize away from the extlang form and get the same results.
> 
> If the extended filtering operation did something smart with macrolanguages, then I wouldn't need canonicalization at all, but it doesn't, so I feel I need to canonicalize, and as described above, only canonicalization to extlang actually seems to help.
> 
> Am I missing something?
> 
> —Florian

Sorry for not including this in the previous message. To give another example: if I want to use the no range to match any of no, no-bok, no-nyn, nb, or nn, canonicalization to extlang form works, and canonicalization away from it doesn't.
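
To make the Chinese example from the quoted message concrete, here is a rough sketch in Python. It is not taken from any existing library: the function names are made up for illustration, and the mapping table is a hypothetical, hand-written excerpt rather than something derived from the actual subtag registry. The extended_filter function is my reading of the extended filtering steps in RFC 4647, so treat it as an approximation.

```python
# Hypothetical, deliberately tiny table: bare language subtag -> extlang form.
# A real implementation would derive this from the IANA Language Subtag Registry.
TO_EXTLANG = {"yue": "zh-yue", "cmn": "zh-cmn", "hak": "zh-hak"}
# Reverse direction: "prefix-extlang" -> bare language subtag.
FROM_EXTLANG = {v: k for k, v in TO_EXTLANG.items()}


def to_extlang_form(tag):
    """Rewrite e.g. 'yue' or 'yue-HK' as 'zh-yue' or 'zh-yue-hk'."""
    subtags = tag.lower().split("-")
    head = TO_EXTLANG.get(subtags[0])
    if head is None:
        return "-".join(subtags)
    return "-".join([head] + subtags[1:])


def away_from_extlang_form(tag):
    """Rewrite e.g. 'zh-yue' or 'zh-yue-HK' as 'yue' or 'yue-hk'."""
    subtags = tag.lower().split("-")
    if len(subtags) >= 2:
        bare = FROM_EXTLANG.get("-".join(subtags[:2]))
        if bare is not None:
            return "-".join([bare] + subtags[2:])
    return "-".join(subtags)


def extended_filter(lang_range, tag):
    """Sketch of extended filtering: does lang_range match tag?"""
    r = lang_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match, unless the range starts with a wildcard.
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1            # a wildcard matches any run of subtags
        elif ti >= len(t):
            return False       # range has subtags left, tag does not
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False       # never skip past a singleton
        else:
            ti += 1            # skip a non-matching tag subtag
    return True


def matches(lang_range, tag, canonicalize):
    """Canonicalize both sides the same way, then filter."""
    return extended_filter(canonicalize(lang_range), canonicalize(tag))


if __name__ == "__main__":
    for tag in ("zh", "zh-yue", "yue", "zh-HK"):
        print(tag,
              matches("zh", tag, to_extlang_form),
              matches("zh", tag, away_from_extlang_form))
```

With this sketch, the zh range matches all four tags when both sides are canonicalized to extlang form, but only zh and zh-HK when they are canonicalized away from it.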

—Florian