Re: [Ietf-languages] Fwd: I-D Action: draft-msporny-d-langtag-ext-00.txt

Mark Davis ☕️ <mark@macchiato.com> Tue, 28 May 2019 04:05 UTC

From: Mark Davis ☕️ <mark@macchiato.com>
Date: Tue, 28 May 2019 06:05:05 +0200
Message-ID: <CAJ2xs_EeeF=S1Z3JNe55fDC2c6y7W4n97+GML365Us5V1+MmLA@mail.gmail.com>
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: Doug Ewell <doug@ewellic.org>, "Phillips, Addison" <addison@lab126.com>, "Martin J. Dürst" <duerst@it.aoyama.ac.jp>, IETF Languages Discussion <ietf-languages@iana.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf-languages/PxcYVEhtfXWsTJITUM_oaExhTXg>

Mark


On Mon, May 27, 2019 at 10:10 PM Manu Sporny <msporny@digitalbazaar.com>
wrote:

> On 5/27/19 9:52 AM, Mark Davis ☕️ wrote:
> > I think what they are trying to do is shoehorn in a parameter that
> > lets them set the paragraph embedding level
> > (https://unicode.org/reports/tr9/#BD4) for the Bidi Algorithm.
>
> Hmm, no, I don't think that's it... Here's some background, Mark:
>
> https://github.com/w3c/rdf-dir-literal/issues/3#issuecomment-496004819
>
> ... and why we're having this discussion:
>
> https://github.com/w3c/rdf-dir-literal/issues/3#issuecomment-496006350
>
> ... and what the current proposed spec text in the Verifiable
> Credentials Data Model specification regarding i18n states:
>
>
> https://pr-preview.s3.amazonaws.com/w3c/vc-data-model/pull/641.html#internationalization-considerations
>
> > 1. So from the tag "ar-Arab", we get the script "Arab". Then use
> > https://github.com/unicode-org/cldr/blob/master/common/properties/scriptMetadata.txt,
> > which has a mapping from script to direction (RTL=YES). (I'm pointing to
> > trunk, just so people can read the file easily; one would use the
> > latest release.)
>
> What about for something like this, where BiDi doesn't work?
>
> HTML و CSS: تصميم و إنشاء مواقع الويب
>

I assume that by the phrase "BiDi doesn't work" you mean that the original
author's intended display is not achieved.

It would help if you clearly provided examples of the source text, and what
is desired. While the paragraph direction is important, it is often *not*
enough to get exactly the right display, such as when there are neutral
characters on the boundaries between RTL and LTR text that shouldn't follow
the paragraph direction. For that the author needs finer-grained control
over the direction. That can be achieved using the bidi control characters
in Unicode, or by markup in higher-level languages (such as HTML/CSS). The
markup approach is cleaner, where possible. Where the environment does not
permit markup, the Unicode characters can be used to exactly control the
bidi display.
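
As a rough illustration of that last point (a sketch only; the helper
name `isolate` is made up for this example, and only the code points
themselves come from Unicode), the directional isolate characters can wrap
a run of text whose direction should be fixed independently of its
surroundings:

```python
import unicodedata

# Directional isolate controls defined by Unicode (UAX #9).
LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

OPENERS = {"ltr": LRI, "rtl": RLI, "auto": FSI}


def isolate(text: str, direction: str) -> str:
    """Wrap `text` in an isolate of the given direction (hypothetical helper)."""
    return OPENERS[direction] + text + PDI


# The title from the quoted message, given an explicit RTL base direction.
title = "HTML و CSS: تصميم و إنشاء مواقع الويب"
wrapped = isolate(title, "rtl")

# Show which control characters were added at each end.
print(unicodedata.name(wrapped[0]), "...", unicodedata.name(wrapped[-1]))
```

Wrapping the whole string in RLI ... PDI gives it an RTL base direction
regardless of its first strong character; inner runs can be isolated the
same way when neutrals at a boundary need finer control.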

It appears that your use case has extremely limited markup: just the
language tag, applied to a whole paragraph. You cannot add any further
markup, so you are driven to overloading the language tag for that purpose.
Even assuming that, you have not made it clear why the language tag "ar"
alone, *from which the paragraph direction could be derived*, is not
sufficient for your use case of setting the paragraph direction.
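
To make the derivation concrete, here is a minimal sketch. The two
tables are small hand-written subsets chosen for illustration (an
assumption, not real data files); a real implementation would take the
default script from the CLDR likely-subtags data and the direction from
scriptMetadata.txt in the latest release. The function name
`paragraph_direction` is likewise hypothetical.

```python
from typing import Optional

# Illustrative subset only: scripts written right-to-left.
RTL_SCRIPTS = {"Arab", "Hebr", "Syrc", "Thaa"}

# Illustrative subset only: default script for a bare language subtag.
DEFAULT_SCRIPT = {
    "ar": "Arab",
    "fa": "Arab",
    "ur": "Arab",
    "he": "Hebr",
    "en": "Latn",
}


def paragraph_direction(tag: str) -> Optional[str]:
    """Return "rtl" or "ltr" for a language tag, or None if unknown."""
    subtags = tag.split("-")
    language = subtags[0].lower()

    script = None
    for subtag in subtags[1:]:
        if len(subtag) == 1:
            break  # a singleton starts an extension or private-use section
        if len(subtag) == 4 and subtag.isalpha():
            script = subtag.title()  # explicit script subtag, e.g. "Arab"
            break

    if script is None:
        script = DEFAULT_SCRIPT.get(language)  # fall back to default script
    if script is None:
        return None  # not covered by this sketch's tables
    return "rtl" if script in RTL_SCRIPTS else "ltr"


for tag in ("ar", "ar-Arab", "az-Arab-IR", "en"):
    print(tag, paragraph_direction(tag))
```

With these tables, both "ar" and "ar-Arab" yield "rtl", with no extension
subtag involved.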


> > It isn't that Arabic would be displayed left to right, it is what
> > establishes the paragraph ordering. The problem arises when you have
> > mixed text. Look at the following example, using the convention that
> > lowercase = English and uppercase=Arabic. The majority of the text
> > and the first strong character are both English, but the sentence is
> >  meant to be used in an Arabic environment, so the default paragraph
> >  embedding level needs to be RTL.
>
> Yep.
>
> > 3. I also agree with Martin that the definition "automatically
> > detected" for subtag 'auto' is not adequate. How does it differ from
> > leaving off the D extension altogether?
> >
> > Agreed, not well specified. But -d- is not needed in the first place,
> > so moot.
>
> Folks have argued against `auto`, happy to remove it if that's what
> folks in this group think we should do.
>
> It was meant to achieve the same thing this achieves:
>
> https://www.w3.org/TR/string-meta/#dom-localizable-dir
>
> > While this is true, for the vast majority of cases, LTR and RTL are
> > the important issues. Most computer systems don't really handle
> > vertical natively; one needs to have more specialized text processing
> > systems, and that is not, I imagine, the target for this syntax.
>
> We're happy to add other directionalities that folks in this group think
> we should add.
>
> > 5. Given #4, the lack of a registry for the proposed extension, or
> > even the mention of one, is a significant problem. The set of exactly
> > 3 values associated with this extension ('ltr', 'rtl', and 'auto')
> > would be fixed; adding to it would require updating the RFC, which is
> > much more work than updating a registry.
> >
> > Agreed, that would be a major drawback.  But -d- is not needed in
> > the first place, so moot.
>
> Sure, we can add a registry, I can make that change in the next version
> once it becomes clear that the proposal has merit and won't be rejected
> by this or the W3C i18n community.
>
> > Without these issues being addressed in a satisfactory way, I would
> > lobby IETF not to approve this I-D.
> >
> > I don't see that there is any reason to approve it, given that it is,
> > as far as I can tell, completely unnecessary and would just
> > complicate implementers' lives to no good end.
>
> Given the new information above (links to use cases, background
> discussion), are you still of the opinion that there is no need for the
> extension?
>
> -- manu
>
> --
> Manu Sporny (skype: msporny, twitter: manusporny)
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
>