Re: [Rfc-markdown] [Tools-discuss] New xml2rfc release: v3.18.0

Kesara Rathnayake <kesara@staff.ietf.org> Wed, 09 August 2023 11:27 UTC

From: Kesara Rathnayake <kesara@staff.ietf.org>
Date: Wed, 09 Aug 2023 23:26:46 +1200
Message-ID: <CAD2=Z84ufaTai-VydeWOEXLr3vBsbv9NE5HXs_NvhQS92AXmwg@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: rfc-markdown@ietf.org, XML2RFC Interest Group <xml2rfc@ietf.org>, tools-discuss <tools-discuss@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rfc-markdown/DmbmkxTS0VXLEu00liWvZWSwPiQ>

Hi all,

Output from kramdown-rfc's echars and from xml2rfc's `--warn-bare-unicode`
option is now listed in separate sections of the validation report on
https://author-tools.ietf.org/
To see it, select the file and click the "Validate (idnits)" button.

Cheers,
Kesara

On Fri, 4 Aug 2023 at 16:26, Carsten Bormann <cabo@tzi.org> wrote:
>
> > See https://github.com/ietf-tools/xml2rfc/releases/tag/v3.18.0 for
> > release details.
> >
> > This release allows the use of Unicode characters everywhere.
>
> Wonderful!
>
> (This release allows the use of non-ASCII Unicode characters everywhere;
> Xml2rfc already allowed Unicode characters that were in its “ASCII” subset — which included a few select non-ASCII characters.)
>
> This update should not require any updates in kramdown-rfc, but of course the need for workarounds like {{{}}{{🤦‍♂️}}} is gone.
>
> > The  `--warn-bare-unicode` command line option will warn if Unicode
> > characters are present in any element except artwork, city, cityarea,
> > code, country, email, extaddr, organization, pobox, postalLine,
> > refcontent, region, sortingcode, sourcecode, street, title and u.
> > See https://github.com/ietf-tools/xml2rfc/pull/1017 for more details.
>
> We generally want a soft transition to using the full Unicode repertoire, not the least because xml2rfc’s PDF generator may need attention with new character blocks coming into use — Gurmukhi may not quite work just yet.
> RFCXML's <u> element stays useful as an easy way to fulfil RFC 7997’s requirement to fully explain non-ASCII characters when that may be needed for interchange.
>
> Non-ASCII characters sometimes sneak into drafts via copy-paste from sources that use full Unicode as a matter of course.
> Not just typographic quotes, which can be jarring when mixed with typewriter quotes, but also various invisible characters such as zero-width space and word joiners which were already part of xml2rfc’s “ASCII” repertoire.
>
> Kramdown-rfc comes with an analysis tool called “echars” (explain characters).
>
> Running this on a markdown (or XML or TXT!) file generates output such as:
>
> $ echars draft-bormann-restatement.md
> *** Latin-1 Supplement
> »: U+00BB    1 RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK (Common)
> ä: U+00E4    1 LATIN SMALL LETTER A WITH DIAERESIS (Latin)
> *** General Punctuation (Common)
> —: U+2014    5 EM DASH
> ’: U+2019    1 RIGHT SINGLE QUOTATION MARK
> ”: U+201D    2 RIGHT DOUBLE QUOTATION MARK
> …: U+2026    1 HORIZONTAL ELLIPSIS
> ⁠: U+2060    1 WORD JOINER
>
> So there is nothing strange in this document, but it is still worth knowing where these characters outside the LF + %x20-7e space are (in this case: mostly in the titles of references), so I check this now and then for my documents (*).
> Your editor might help with that, e.g. in Emacs use:
>
> M-C-s [^^J-~]
>
> (where ^J is a newline character, entered as ctrl-j, while the ^ preceding it is a caret.)
>
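For those not using Emacs, the same search can be sketched in Python. This is only an illustration: the function name `locate` and the output tuple layout are made up here, and the character class simply mirrors the Emacs regexp quoted above (everything outside the range LF through "~", so a tab is flagged too).

```python
import re

# Same character class as the Emacs regexp [^^J-~]:
# any character outside the range LF (U+000A) .. TILDE (U+007E).
OUTSIDE = re.compile(r"[^\n-~]")


def locate(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for each character outside LF..'~'.

    Lines and columns are 1-based. `locate` is a hypothetical helper name.
    """
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for m in OUTSIDE.finditer(line):
            hits.append((lineno, m.start() + 1, f"U+{ord(m.group()):04X}"))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python locate.py draft.md
    with open(sys.argv[1], encoding="utf-8") as f:
        for lineno, col, cp in locate(f.read()):
            print(f"{sys.argv[1]}:{lineno}:{col}: {cp}")
```

The positions make it easy to jump to each occurrence and decide whether it is intended.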
> Of course, in XML you might be hiding beyond-ASCII by using entity references such as &nbsp; or character references such as &#x20AC; or &#8364; — echars doesn’t show these, but then the intent should be quite obvious in the manuscript.
>
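If you do want to see what is hiding behind such references, a rough sketch follows. It is not part of echars or xml2rfc; the names `resolve` and `hidden_non_ascii` are invented for this example. One loud assumption: named entities are looked up in Python's HTML5 entity table, whereas a real XML document only knows the entities its DTD declares.

```python
import re
from html.entities import html5
from typing import Optional

# Numeric character references (&#x20AC; or &#8364;) and named entity references (&nbsp;).
REFERENCE = re.compile(r"&(#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def resolve(body: str) -> Optional[str]:
    """Resolve the text between '&' and ';'; return None for an unknown name."""
    if body[:2] in ("#x", "#X"):
        return chr(int(body[2:], 16))
    if body.startswith("#"):
        return chr(int(body[1:]))
    # Assumption: the HTML5 name table stands in for the document's DTD.
    return html5.get(body + ";")


def hidden_non_ascii(source: str) -> list[tuple[str, str]]:
    """Return (reference, code points) for references that resolve beyond ASCII."""
    found = []
    for m in REFERENCE.finditer(source):
        resolved = resolve(m.group(1))
        if resolved and any(ord(c) > 0x7E for c in resolved):
            code_points = " ".join(f"U+{ord(c):04X}" for c in resolved)
            found.append((m.group(0), code_points))
    return found
```

References that resolve to plain ASCII, such as `&amp;` or `&lt;`, are skipped, as are names the table does not know.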
> > Report any issues on https://github.com/ietf-tools/xml2rfc/issues
>
> … and any issues with kramdown-rfc on rfc-markdown@ietf.org and/or as issues in https://rfc.space
>
> Grüße, Carsten
>
> (*) I contemplate generating this report with each kramdown-rfc run, possibly modulated by declarations in the YAML header that say which characters the document author already expects.  But I’m going on a couple of vacations first now…
>
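For readers without kramdown-rfc installed, a report of roughly the shape quoted above can be approximated with Python's standard library. This is a minimal sketch, not how echars itself works: the function name `non_ascii_report` is made up, and because `unicodedata` does not expose block or script names, the "***" block headings and the "(Common)"/"(Latin)" annotations are left out.

```python
import unicodedata
from collections import Counter


def non_ascii_report(text: str) -> list[str]:
    """Return one line per distinct character above U+007E, sorted by code point.

    Each line has the form "<char>: U+XXXX <count> <Unicode name>".
    `non_ascii_report` is a hypothetical helper, not part of any IETF tool.
    """
    counts = Counter(ch for ch in text if ord(ch) > 0x7E)
    lines = []
    for ch, n in sorted(counts.items()):
        # Some code points (e.g. controls) have no name; fall back to a marker.
        name = unicodedata.name(ch, "<unnamed>")
        lines.append(f"{ch}: U+{ord(ch):04X} {n:4d} {name}")
    return lines


if __name__ == "__main__":
    import sys

    # Usage: python report.py draft.md
    with open(sys.argv[1], encoding="utf-8") as f:
        for line in non_ascii_report(f.read()):
            print(line)
```

The same function works on Markdown, XML, or TXT input, since it only looks at the decoded characters.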


-- 
Kesara Rathnayake
Senior Software Development Engineer - IETF Administration LLC
kesara@staff.ietf.org