Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Julian Reschke <julian.reschke@gmx.de> Sat, 27 February 2021 05:23 UTC

To: xml2rfc@ietf.org
References: <87wnuucjra.fsf@fifthhorseman.net>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <536e4424-3785-52f7-6702-df685220dec4@gmx.de>
Date: Sat, 27 Feb 2021 06:23:33 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/V5kjX6xzWpGpHafdw2cpzssYpPM>

On 27.02.2021 at 03:44, Daniel Kahn Gillmor wrote:
> The toolchain to build draft-ietf-openpgp-crypto-refresh produces XML
> that contains:
> …
>            <li>PGP - Pretty Good Privacy.
> PGP is a family of software systems developed by Philip R. Zimmermann from which OpenPGP is based.</li>
> …
>
> xml2rfc renders this to text as:
>
> …
>     *  PGP - Pretty Good Privacy.  PGP is a family of software systems
>        developed by Philip R.  Zimmermann from which OpenPGP is based.
> …
>
> It looks like it is assuming that a period (.) ends a sentence, and that
> two spaces should follow each sentence.  It is correct about the . after
> "Privacy", but it is wrong about the . after "R". :(
>
> This is kind of a dumb nit-pick, but it was noticed during a review on
> the mailing list.
>
> Is there a recommended way to fix this?  What XML input would produce a
> .txt with only one space after the "R."?
> ...

The simplest possible fix would be to remove the sentence-spacing code
from xml2rfc altogether. At some point, the former RSE announced that
this would happen for text output, but for some reason the plan was
never written down or carried out.

Changing the source to work around an issue in one specific formatter is
not cool in the first place.

Adding a non-breaking space here will have the desired effect in plain
text output, but will affect other output formats as well.
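
To illustrate why a non-breaking space sidesteps the problem, here is a
minimal Python sketch. It is not xml2rfc's actual code: the regex and the
function name double_space_after_period are hypothetical stand-ins for the
kind of "period plus space ends a sentence" rule described above.

```python
import re

# Hypothetical stand-in for a formatter's sentence-spacing rule.
# This is NOT taken from xml2rfc; it only mimics the observed behaviour.
# " +" matches ordinary ASCII spaces only, so U+00A0 never triggers it.
SENTENCE_END = re.compile(r"([.?!]) +(?=\S)")

NBSP = "\u00a0"  # non-breaking space


def double_space_after_period(text: str) -> str:
    """Put two spaces after '.', '?' or '!' when an ordinary space follows."""
    return SENTENCE_END.sub(r"\1  ", text)


plain = "developed by Philip R. Zimmermann from which OpenPGP is based."
with_nbsp = f"developed by Philip R.{NBSP}Zimmermann from which OpenPGP is based."

# The ordinary space after "R." gets doubled; the non-breaking one is untouched.
print(repr(double_space_after_period(plain)))
print(repr(double_space_after_period(with_nbsp)))
```

Under this assumed rule, the initial "R." is treated as a sentence end only
when a regular space follows it. The sketch says nothing about how the
non-breaking space is rendered in the HTML or PDF output, which is the
side effect mentioned above.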

What *might* be possible in this case is to put the author name into a
<contact> element. (I haven't tried that, though.)

Best regards, Julian