Re: [xml2rfc] [irsg] character sets, was UPDATE regarding <u>

Carsten Bormann, Sat, 04 March 2023 16:38 UTC

On 2023-03-04, at 16:46, John R Levine wrote:
> In any event, this reminds us that we need some discipline in what we allow beyond letters and punctuation.  Unicode does not make this any easier by providing so many different glyphs that look nearly or exactly the same.

Correct, except that “allow” is a bit misplaced. “Recommend”, “nudge authors towards”, “consider good style”, etc. would have worked better for me.

Anyway, that’s why there is now authoring support in kramdown-rfc for character repertoire diagnostics, initially with the tool “echars” (which doesn’t require actually using markdown).  

For those actually using markdown, I expect that the YAML header of the markdown input will eventually be able to carry a declaration of which characters beyond code points 10, 32-126, 160, 8203, 8209, and 8288 are actually desired in the input, so that warnings can be emitted if the document doesn’t stay inside those bounds.
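
To make the idea concrete, here is a rough Python sketch of such a repertoire check. It is not how echars or kramdown-rfc do it; the names `BASELINE`, `unexpected_chars`, `report`, and the `declared` parameter are made up for illustration, and `declared` merely stands in for whatever the YAML header declaration might eventually look like.

```python
import unicodedata
from collections import Counter

# Baseline code points taken from the message: LF (10), printable ASCII
# (32-126), NBSP (160), zero-width space (8203), non-breaking hyphen (8209),
# and word joiner (8288).
BASELINE = {10, 160, 8203, 8209, 8288} | set(range(32, 127))


def unexpected_chars(text, declared=()):
    """Count characters outside the baseline and the declared extras.

    `declared` is a hypothetical stand-in for a header declaration;
    here it is simply an iterable of characters the author wants to allow.
    """
    allowed = BASELINE | {ord(ch) for ch in declared}
    return Counter(ch for ch in text if ord(ch) not in allowed)


def report(text, declared=()):
    """Return one warning line per unexpected character, sorted by code point."""
    lines = []
    for ch, count in sorted(unexpected_chars(text, declared).items()):
        # Control characters have no Unicode name, hence the fallback.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"U+{ord(ch):04X} {name}: {count}")
    return lines
```

With this sketch, `report("Grüße, Carsten\n")` would flag ß and ü, while `report("Grüße, Carsten\n", declared="üß")` would come back empty, which is roughly the "warn unless declared" behaviour described above.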

Both of these would be helped by access to information about the current repertoire limitations of xml2rfc, which is why I initiated this subthread.

Grüße, Carsten