Re: [rfc-i] Unicode in xml2rfc v3

Marc Petit-Huguenin <marc@petit-huguenin.org> Fri, 18 December 2020 23:56 UTC

From: Marc Petit-Huguenin <marc@petit-huguenin.org>
To: Eric Rescorla <ekr@rtfm.com>
References: <20201216184835.CE1CA2ABC7A1@ary.qy> <AF7F0885-2D39-4F8D-A43B-E1D015146EAE@eggert.org> <72467617-6ca7-b2af-b826-d264c6b6380e@gmail.com> <D8AC8FA8-74DC-4B93-AB5B-73FBE1880F26@ietf.org> <162b0211-bc98-d0c8-b67f-c3068664b9f9@petit-huguenin.org> <CABcZeBPghk2uDJcAHWw6ZaWhnthmCBCpL_28-FQyUOZ4-it3CA@mail.gmail.com>
Message-ID: <1ff2777e-92a3-2d50-0363-a397800319ed@petit-huguenin.org>
Date: Fri, 18 Dec 2020 15:56:13 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.5.1
MIME-Version: 1.0
In-Reply-To: <CABcZeBPghk2uDJcAHWw6ZaWhnthmCBCpL_28-FQyUOZ4-it3CA@mail.gmail.com>
Content-Language: en-US
Subject: Re: [rfc-i] Unicode in xml2rfc v3
Cc: RFC Interest <rfc-interest@rfc-editor.org>

On 12/18/20 2:38 PM, Eric Rescorla wrote:
> On Fri, Dec 18, 2020 at 9:16 AM Marc Petit-Huguenin <marc@petit-huguenin.org>
> wrote:
> 
>> On 12/17/20 12:10 PM, Jay Daley wrote:
>>>
>>>
>>>> On 18/12/2020, at 9:00 AM, Brian E Carpenter <
>> brian.e.carpenter@gmail.com> wrote:
>>>>
>>>> On 17-Dec-20 19:57, Lars Eggert wrote:
>>>>> Hi,
>>>>>
>>>>> On 2020-12-16, at 20:48, John Levine <johnl@taugh.com> wrote:
>>>>>> In article <BB864858-1E71-45CF-9411-2ECB003B5EC0@eggert.org> you
>> write:
>>>>>>> It's ridiculous that I can't just write α when I mean α.
>>>>>>
>>>>>> I agree with you but I don't yet see how we get from here to there or
>> exactly
>>>>>> where there is.
>>>>>
>>>>> I think it was Julian who proposed to lift the restriction on Unicode
>> to only be allowed in <contact> and instead rely on the community (at the
>> I-D stage) and the IESG/ISE/RPC for when I-Ds become RFCs to check for
>> "abuse" of Unicode (which I struggle to see happening in practice.)
>>>>
>>>> Agreed. I think the original decision to be very restrictive was due to
>> general concerns about moving away from .txt as the primary format. How π,
>> still less emojis, will be rendered in ASCII remains an issue, but just as
>> RFC1119 worked out fine, I think we should get over the fact that some (or
>> most) future RFCs simply won't work in .txt.
>>>
>>> Can someone explain to me why our .txt files have to be ASCII?  I doubt
>> I have a single text tool left that can’t process UTF-8.
>>>
>>
>> Because non-ASCII characters cannot improve a well-written RFC -- or a
>> well-written program for that matter.
> 
> 
> I don't think I agree with this. For instance, while yes it's possible to
> formally specify the code points for emoji in hex or whatever, it's
> certainly easier to read the text if you can see them, I think.
> 

CLDR short names, prefixed in a similar way to Unicode code points (e.g. <N+ROMAN NUMERAL NINE> for <U+2168>), could still be translated to their graphical equivalent in HTML and PDF, but would not make me want to gouge my eyes out when implementing from the text format.

In a way, this is to programmers what the alt attribute is for the blind and visually impaired.
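
To make the idea concrete, here is a rough sketch of how an HTML or PDF renderer might expand such tokens while the text output leaves them untouched. It is only an illustration under assumptions: the function name render and the exact token grammar are made up for this example, and because the Python standard library carries no CLDR data, it looks up formal Unicode character names as a stand-in for CLDR short names.

```python
import re
import unicodedata

# Assumed token grammar for this sketch: <N+CHARACTER NAME> or <U+HEX>.
TOKEN = re.compile(r"<(N|U)\+([^<>]+)>")


def render(text: str) -> str:
    """Replace name and code point tokens with the character they denote.

    Hypothetical helper. Names are resolved with unicodedata.lookup(),
    i.e. Unicode character names, standing in for CLDR short names.
    Tokens that cannot be resolved are left exactly as written.
    """
    def expand(match: re.Match) -> str:
        kind, value = match.group(1), match.group(2)
        try:
            if kind == "N":
                return unicodedata.lookup(value)
            return chr(int(value, 16))
        except (KeyError, ValueError, OverflowError):
            # Unknown name or invalid code point: keep the token visible.
            return match.group(0)

    return TOKEN.sub(expand, text)


if __name__ == "__main__":
    print(render("Clause <N+ROMAN NUMERAL NINE> uses <N+GREEK SMALL LETTER ALPHA>."))
```

The text rendering would simply skip this step, so the reader of the .txt file sees the spelled-out name while the HTML and PDF readers see the glyph.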

(Yes, I believe that the printed, paginated text version of a standard is the proper one to implement from.  And carrying non-hypertext files over HTTP is in bad taste.  And milk should be poured into the teacup first.)

-- 
Marc Petit-Huguenin
Email: marc@petit-huguenin.org
Blog: https://marc.petit-huguenin.org
Profile: https://www.linkedin.com/in/petithug