[Rfc-markdown] 1.6.31: <nobr> hack (non-breaking words)

Carsten Bormann <cabo@tzi.org> Thu, 20 April 2023 07:03 UTC

xml2rfc has some idiosyncratic rules on word breaking that have evolved over time.
See https://github.com/ietf-tools/xml2rfc/issues/984 for an example where the results hurt:

Check out lines 316 and 317 in:
https://www.ietf.org/archive/id/draft-ietf-lamps-header-protection-14.txt

Which are:

   Privacy and security issues regarding email Header Protection in S/
   MIME and PGP/MIME have been identified for some time.  Most current

This is definitely not what one wants xml2rfc to do here.
On the other hand, breaking on “/” makes a lot of sense for most other usages.

Enter 1.6.31.  Just add anywhere:

*[S/MIME]: <nobr>
*[PGP/MIME]: <nobr>

Of course, there is no actual <nobr> in RFCXML yet, so this needs to be approximated using a hack involving Unicode word joiners (U+2060).
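
To illustrate the idea (this is not kramdown-rfc's actual code, which is written in Ruby), here is a minimal Python sketch that inserts a word joiner after each slash in the declared terms. The function name protect_slashes and the sample text are made up for this example, and only the slash case is modeled; spaces and hyphens are left out.

```python
WORD_JOINER = "\u2060"  # Unicode WORD JOINER (U+2060)

def protect_slashes(text: str, terms: list[str]) -> str:
    """Insert a word joiner after each "/" inside the listed terms.

    Hypothetical helper for illustration only; occurrences of "/"
    outside the listed terms are left untouched.
    """
    for term in terms:
        protected = term.replace("/", "/" + WORD_JOINER)
        text = text.replace(term, protected)
    return text

sample = "Header Protection in S/MIME and PGP/MIME; see also a/b."
result = protect_slashes(sample, ["S/MIME", "PGP/MIME"])

# Each declared term now carries one invisible word joiner after its slash.
print(result.count(WORD_JOINER))
```

The "a/b" in the sample stays breakable, which matches the intent that breaking on a slash remains the default for undeclared terms.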

The advantage of this simple declaration is that you have to do it once and it will be taken care of everywhere.

The disadvantage is that all instances of S/MIME that might be subject to wrapping will now actually be S/<U+2060>MIME in XML and HTML (not in plaintext, where the word joiner is consumed in formatting).
This might impede searching for this term, if the search engine is naïve to word joiners.
This disadvantage MUST be fixed in RFCXML and is therefore out of scope for kramdown-rfc (which will happily adopt a <nobr> feature once it is implemented, keeping the same syntax of course).
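
The search problem can be shown with a small Python sketch. It assumes a search that compares raw code points, and the helper name strip_word_joiners is hypothetical; a real search engine may or may not normalize U+2060 on its own.

```python
WORD_JOINER = "\u2060"  # Unicode WORD JOINER (U+2060)

def strip_word_joiners(text: str) -> str:
    """Remove word joiners so that plain substring search works again."""
    return text.replace(WORD_JOINER, "")

# Text as it might appear in the XML or HTML output after the hack.
rendered = "Header Protection in S/" + WORD_JOINER + "MIME"

# A search that is unaware of word joiners misses the term...
naive_hit = "S/MIME" in rendered

# ...while a search over normalized text finds it.
normalized_hit = "S/MIME" in strip_word_joiners(rendered)

print(naive_hit, normalized_hit)
```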

Fine print:

Using <nobr> on an abbrev title by itself switches off any other functionality of abbrevs.
Additional text in the title such as:

*[PGP/MIME]: <nobr> #sec-pgp-mime

reinstates that functionality.
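
A rough Python sketch of how such declarations could be told apart follows. The regular expression, the function name parse_abbrev, and the returned dictionary are assumptions made for illustration; they do not describe how kramdown-rfc parses abbreviation definitions internally.

```python
import re

# Matches lines such as "*[PGP/MIME]: <nobr> #sec-pgp-mime".
ABBREV_LINE = re.compile(r"^\*\[(?P<term>[^\]]+)\]:\s*(?P<title>.*)$")

def parse_abbrev(line: str):
    """Split an abbreviation definition into term, <nobr> flag, and rest.

    Hypothetical helper: returns None if the line is not a definition.
    """
    match = ABBREV_LINE.match(line.strip())
    if match is None:
        return None
    title = match.group("title").strip()
    nobr = title.startswith("<nobr>")
    # Whatever follows "<nobr>" is the additional title text, if any.
    rest = title[len("<nobr>"):].strip() if nobr else title
    return {"term": match.group("term"), "nobr": nobr, "rest": rest}

only_nobr = parse_abbrev("*[S/MIME]: <nobr>")
with_extra = parse_abbrev("*[PGP/MIME]: <nobr> #sec-pgp-mime")
print(only_nobr, with_extra)
```

In this sketch an empty "rest" corresponds to the case where <nobr> stands by itself, and a non-empty "rest" to the case where additional title text is present.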

Currently only spaces, hyphens, and slashes receive special treatment; this could be expanded to the full list in

Grüße, Carsten

PS.: I made a dumb mistake in 1.6.30, which existed briefly; please use 1.6.31 if you picked up 1.6.30 by chance.