Re: [I18ndir] Writing direction

Asmus Freytag <asmusf@ix.netcom.com> Wed, 18 May 2022 05:32 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/zTrDq6H9Ur0kzD0vnnBOVrZae9Q>

On 5/17/2022 9:57 PM, Patrik Fältström wrote:
> On 18 May 2022, at 0:23, John C Klensin wrote:
>
>> But there is barely any place to put language and no place to put direction. What do you suggest we do?" Probably we need to try to answer the question and, at least so far, a directionality extension to the BCP47 code system is the least horrible possible answer I've been able to come up with, the need to climb in and have a close encounter with those rats notwithstanding.
> OK, obviously without knowing the complete context here, I would say that first of all the big problem is mixing protocol parameters with display. I call this "leakage". We see this in DNS, where a domain name is visible to the user. We see it in other parts of a URI, in an email address, etc. Oh, the email address is a perfect example. It has a "free text" name and then an address. But many people want to use their name as an email address, which leads to collisions and other things. Meanwhile, applications that show only the name carry security risks similar to those of link text whose destination is not what the end user guesses or believes.
>
> To the "free text".
>
> To me there are two issues here:
>
> 1. Display is very important to the end user. We have the context of the sender of the text and the context of the receiver of the text. If a text is to be displayed, then even without talking about general directionality (which does impact rendering) we have the issue of mixing two contexts. Even though I have some clue about i18n, I have very little to no knowledge about whether the same text, in the same script and the same language, can be displayed with different directionality. I believe some Asian scripts can do this, and, for example, Hebrew. So the first question is what problem is to be solved. I guess it is "to have the receiver understand what general directionality the sender of the text decides". The receiver can then display in whatever directionality context the sender wants.
>
> 2. The second question is whether general directionality is a degree of freedom that is really needed in this protocol. I think it is really, really, really important to agree that this *is* important. And I mean that it is much more important than deciding that "the free text in this protocol has a directionality context that is R2L", or L2R for that matter, for this protocol element (because it is a protocol element after all, even if the element contains "free text"). If the string is short, I claim one can, with the help of directionality controls, create a string that displays as if the general directionality were the opposite of what it actually is.
>
> 3. If the answer to the second question is that one absolutely cannot have a given directionality, I still think one should not give up. One can still say that "the directionality of the free text element is R2L", with the addition "if the free text element is to be an L2R context, then the first character of the element MUST be U+2066 'Left-To-Right Isolate'".
>
> 4. If 3 is too ugly, then I do not see any other solution than to carry the directionality in a separate protocol parameter instead of embedding it in the first character of the free text protocol element.
>
> You can not both throw away the butter, eat the cookie and have coffee.
>
>     Patrik
>
Patrik,

embedding bidi controls into protocol text data is ugly, because they 
end up, sooner or later, embedded in the plain-text backbone of an HTML 
page. (I'm sure that's a law that's already named by someone out there).

It's better to have out-of-band information so that the consumer can 
add either controls or markup when embedding text, depending on the 
destination (and depending, even, on whether the destination's 
directionality matches or disagrees).
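
As a rough sketch of that idea, assume a hypothetical record that carries 
the base direction out of band. The names DirectedText, to_plain_text and 
to_html are illustrative only and not taken from any protocol. A consumer 
could then choose isolate controls for a plain-text destination and 
markup for an HTML destination. Skipping the controls when the directions 
match is a simplification made for this sketch:

```python
from dataclasses import dataclass
from html import escape

LRI = "\u2066"  # LEFT-TO-RIGHT ISOLATE
RLI = "\u2067"  # RIGHT-TO-LEFT ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE


@dataclass(frozen=True)
class DirectedText:
    """Hypothetical protocol element: text plus out-of-band base direction."""
    text: str
    direction: str  # "ltr" or "rtl"


def to_plain_text(item: DirectedText, destination_direction: str) -> str:
    """Wrap the text in isolate controls only when the directions disagree."""
    if item.direction == destination_direction:
        return item.text
    opener = RLI if item.direction == "rtl" else LRI
    return f"{opener}{item.text}{PDI}"


def to_html(item: DirectedText) -> str:
    """Use markup rather than control characters for an HTML destination."""
    return f'<bdi dir="{item.direction}">{escape(item.text)}</bdi>'
```

The stored text itself never contains a bidi control; the controls or 
markup are added only at the point of embedding.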

Unlike all the other presentational markup that exists to affect 
text layout, the bidi direction is special in that it affects things 
like the order of first and last name or any other elements where "order 
in the sentence" affects the meaning (and not just the appearance) of 
the text.

Mixed Hebrew and English written LTR is effectively a different "writing 
system" from mixed Hebrew and English written RTL. It is this "writing 
system" aspect that might have a place in a language tag.

The fact that the difference 'only' shows up when displaying text to the 
user is not as strong an argument for it being presentational as one 
might think. A user can compare two strings, one presented horizontally 
and one vertically, without ambiguity. A user cannot compare two strings 
with unknown directionality settings and know that they are 
unambiguously equal. Only a computer with access to the backing store 
can do that.

What are the types of texts that show up in IETF protocols?

If there are only identifiers and no items like the typical metadata 
fields, then, yes, I don't see the use case. However, even for short 
"free-text" metadata items I can see issues with imputing directionality 
from either the code points themselves or the default direction of a 
language.
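
To illustrate the code point case, here is a minimal sketch of a 
"first strong character" guess using Python's unicodedata module. The 
function name first_strong_direction is made up for this example, and 
the sketch deliberately ignores isolates, embeddings and overrides:

```python
import unicodedata
from typing import Optional


def first_strong_direction(text: str) -> Optional[str]:
    """Guess the base direction from the first strongly directional character."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None  # no strong character, e.g. digits and punctuation only


# A Hebrew field that happens to start with a Latin acronym is guessed as
# left-to-right, whatever the author intended.
guess = first_strong_direction("IETF \u05de\u05e1\u05de\u05da")
```

A field holding only digits and punctuation yields no guess at all, which 
is another case where out-of-band information would be needed.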

A./