[xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Daniel Kahn Gillmor <dkg@fifthhorseman.net> Sat, 27 February 2021 02:44 UTC

From: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
To: xml2rfc@ietf.org
Date: Fri, 26 Feb 2021 21:44:09 -0500
Message-ID: <87wnuucjra.fsf@fifthhorseman.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/6V7C7l-zxg97jRqvuaq8r1QBQqw>
Subject: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

The toolchain to build draft-ietf-openpgp-crypto-refresh produces XML
that contains:
…
          <li>PGP - Pretty Good Privacy.
PGP is a family of software systems developed by Philip R. Zimmermann from which OpenPGP is based.</li>
…

xml2rfc renders this to text as:

…
   *  PGP - Pretty Good Privacy.  PGP is a family of software systems
      developed by Philip R.  Zimmermann from which OpenPGP is based.
…

It looks like xml2rfc assumes that every period (.) ends a sentence, and
that two spaces should follow each sentence.  It is correct about the .
after "Privacy", but it is wrong about the . after "R". :(
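
To illustrate the behavior described above, here is a minimal sketch of
the kind of rule that would produce this output.  It is only a guess at
the heuristic, not xml2rfc's actual code, and the function name
naive_sentence_spacing is made up for this example.  Line wrapping is
left out.

```python
import re


def naive_sentence_spacing(text: str) -> str:
    """Hypothetical heuristic: treat every '.', '?' or '!' that is
    followed by whitespace as the end of a sentence."""
    # Collapse all runs of whitespace (including newlines) to one space.
    collapsed = " ".join(text.split())
    # Widen the single space after terminal punctuation to two spaces.
    return re.sub(r"([.?!]) (?=\S)", r"\1  ", collapsed)


sample = (
    "PGP - Pretty Good Privacy.\n"
    "PGP is a family of software systems developed by "
    "Philip R. Zimmermann from which OpenPGP is based."
)
print(naive_sentence_spacing(sample))
```

A rule like this cannot tell the initial "R." apart from a real sentence
end, so it widens the space in both places.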

This is kind of a dumb nit-pick, but it was noticed during a review on
the mailing list.

Is there a recommended way to fix this?  What XML input would produce a
.txt with only one space after the "R."?

   --dkg