[rfc-i] Unicode in xml2rfc v3

Lars Eggert <lars@eggert.org> Tue, 01 December 2020 17:35 UTC

Hi,

We're currently revising RFC8312, the CUBIC congestion controller for TCP. There is quite a bit of math in RFC8312, and even more in the papers that describe CUBIC, some of which we're planning to roll into RFC8312bis.

That math uses a bunch of Greek letters, which we needed to ASCIIfy for RFC8312. One thing I was planning to do for the bis was to reduce the difference between the papers and the RFC by using the Greek letters in the RFC. Since, you know, xml2rfc v3 is supposed to handle Unicode.

Except, as most of you probably know but I didn't, you can't just use Unicode. You need to wrap it in a <u> tag, and the format of that tag must include at least "num".

Which means that while I can write a formula in kramdown-rfc2629:

~~~ math
W_{est} = W_{est} + α_{aimd} * \frac{segments_{acked}}{cwnd}
~~~

and have it automatically rendered both in SVG (via tex2svg) and in ASCII (via asciitex), the latter as

                       segments
                               acked
W    = W    + α     * -------------
 est    est    aimd        cwnd

but I cannot use "α<sub>aimd</sub>" in the text of the RFC - it gets rendered as "&#945;_(aimd)".

And using "<u>α</u><sub>aimd</sub>" gets rendered as the even uglier ""α" (GREEK SMALL LETTER ALPHA, U+03B1)_(aimd)". I can of course play with the format string, but since there is no way to leave out at least "num", I have basically no option here.
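
For illustration, the two text renderings quoted above can be reproduced with a few lines of Python. This is only a sketch of what the output looks like, not xml2rfc's actual code: the helper names `numeric_ref` and `expand_u` are made up for this example, and the assumption is that the default <u> expansion is simply the quoted literal, the Unicode character name, and the U+XXXX code point.

```python
import unicodedata


def numeric_ref(ch: str) -> str:
    """Decimal numeric character reference, as seen in the plain "α" case."""
    return f"&#{ord(ch)};"


def expand_u(ch: str) -> str:
    """Hypothetical stand-in for the default <u> expansion:
    quoted literal, Unicode name, and U+XXXX code point."""
    return f'"{ch}" ({unicodedata.name(ch)}, U+{ord(ch):04X})'


# The subscript is rendered as "_(aimd)" in the text output in both cases.
print(numeric_ref("α") + "_(aimd)")
print(expand_u("α") + "_(aimd)")
```

Either way, what the text reader ends up with is a long expansion in the middle of a formula rather than a single Greek letter with a subscript.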

I had really been hoping that v3 would enable better math. Unicode is not just for names...

Thanks,
Lars
