Re: [rfc-i] Unicode names in RFCs and xml2rfc

Martin J. Dürst <duerst@it.aoyama.ac.jp> Wed, 04 December 2019 05:46 UTC

From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
To: Martin Thomson <mt@lowentropy.net>, "rfc-interest@rfc-editor.org" <rfc-interest@rfc-editor.org>
Date: Wed, 04 Dec 2019 05:45:45 +0000
Message-ID: <6b866238-61b3-b1b9-8bf3-1abcc2576f1a@it.aoyama.ac.jp>
References: <76d730cb-9fe1-4572-acbe-8db5bc0bd598@www.fastmail.com>
In-Reply-To: <76d730cb-9fe1-4572-acbe-8db5bc0bd598@www.fastmail.com>
Subject: Re: [rfc-i] Unicode names in RFCs and xml2rfc

I couldn't agree more with what Martin Thomson writes below. This must 
be an oversight somewhere, and hopefully can be fixed quickly.

Regards,   Martin.

On 04/12/2019 11:31, Martin Thomson wrote:
> I'm reading the code in xml2rfc to work out how it is intended to work and finding it extraordinarily difficult to achieve a relatively modest goal: putting a person's name into the document.
> 
> My requirements are simple: acknowledge contributions using a person's preferred name.  More concretely, I see no value in expanding ø or ü, but I would, however, like to provide ASCII analogues of the Japanese names in the list.  This goal seems consistent with the text in RFC 7997:
> 
>     Person names may appear in several places within an RFC (e.g., the
>     header, Acknowledgements, and References).  When a script outside the
>     Unicode Latin blocks [UNICODE-CHART] is used for an individual name,
>     an author-provided, ASCII-only identifier will appear immediately
>     after the non-Latin characters, surrounded by parentheses.  This will
>     improve general readability of the text.
> 
> I'm talking about acknowledgments, so the list appears in a <t> element.  The intent is to render the list of names in an ordinary paragraph, with commas separating each.
> 
> None of the elements that permit Unicode text fit in this context.  I realize that I could use <artwork> for this, but that's clearly an abuse of that element; more so because it renders very differently depending on context (I could probably do something with SVG, now that I think of it...).
> 
> <u> is singularly unsuitable for this purpose.  It insists on - at a minimum - including the U+NNNN notation for every character.  If I could use format="char" or format="char-ascii" it might be acceptable, assuming that I have properly understood the code.  The <u> element is not documented in RFC 7991.
> 
> I appreciate the value in having a clear signal from the author that a block of text is intended to include Unicode.  Unicode tends to lead to all sorts of inconvenient inconsistencies, like multiple different dash and hyphen styles, quoting variations, and other such oddities.  I can (grudgingly) accept that some sort of indication is appropriate so that what should be relatively uncommon text usage can be scrutinized additionally.
> 
> It shouldn't be this difficult to acknowledge someone using their name.
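
For illustration only, the rendering described in the RFC 7997 text quoted above (names in Latin script left as they are, names in other scripts followed by an author-provided ASCII form in parentheses, all joined with commas) could be sketched roughly as follows. This is not xml2rfc code. The function names are hypothetical, the names in the usage note are placeholders, and the test on Unicode character names starting with "LATIN" is only an assumed approximation of "the Unicode Latin blocks".

```python
import unicodedata
from typing import Iterable, Optional, Tuple


def is_latin_or_neutral(ch: str) -> bool:
    """Return True for non-letters (spaces, dots, hyphens, combining marks)
    and for letters whose Unicode character name starts with "LATIN".

    Assumption: the name prefix is a stand-in for membership in the
    Unicode Latin blocks; it is a heuristic, not a block lookup.
    """
    if not ch.isalpha():
        return True
    return unicodedata.name(ch, "").startswith("LATIN")


def format_name(name: str, ascii_form: Optional[str] = None) -> str:
    """Render one name in the style the quoted RFC 7997 paragraph describes."""
    if all(is_latin_or_neutral(ch) for ch in name):
        # Latin-script names such as those containing ø or ü are not expanded.
        return name
    if ascii_form is None or not ascii_form.isascii():
        raise ValueError("a non-Latin name needs an author-provided ASCII form")
    return f"{name} ({ascii_form})"


def format_acknowledgements(people: Iterable[Tuple[str, Optional[str]]]) -> str:
    """Join (name, ascii_form) pairs into one comma-separated paragraph."""
    return ", ".join(format_name(name, ascii_form) for name, ascii_form in people)
```

With placeholder input such as [("Jørgen Müller", None), ("山田太郎", "Taro Yamada")], format_acknowledgements would leave the first name untouched and append the ASCII form only to the second.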
