Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Julian Reschke <julian.reschke@gmx.de> Sun, 28 February 2021 08:40 UTC

To: xml2rfc@ietf.org
References: <20210227191644.165F76F105E2@ary.qy> <28B528D6-7CBA-4735-A5EE-C7061D1C1D0C@tzi.org> <3dc1abe5-24bf-3b12-7b58-d06c7cde428e@taugh.com> <BBA9B16E-5B06-419D-9ABE-BFB7E69B54C9@tzi.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <bf516fbd-09a7-3ee3-11cd-8d9e9693988b@gmx.de>
Date: Sun, 28 Feb 2021 09:40:13 +0100
In-Reply-To: <BBA9B16E-5B06-419D-9ABE-BFB7E69B54C9@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/SGh7SS7qKBFI_pI4CYeFQ8HodDo>

On 28.02.2021 at 07:32, Carsten Bormann wrote:
> On 28. Feb 2021, at 00:51, John R Levine <johnl@taugh.com> wrote:
>>
>> Having been through the publishing process in a lot of books, I can report that no matter how good your tools are, the only way to typeset stuff of professional quality is to do hand tweaks where the tools don't get it quite right.  For a bunch of reasons we have decided we're not doing that and I would prefer not to say oh, but THIS tweak is worth it.
>
> For properly doing sentence spacing, what is needed is a way to signal sentence ends.
> For 50 years, the convention in keyboarding manuscripts has been that dots at the end of the input line and dots followed by two spaces (here we are actually using two spaces — in the manuscript!) are periods (i.e., sentence ends).
> That works exceedingly well.
> Authors that keyboard carelessly don’t get proper sentence spacing, but no major disaster happens.
>
> No tweaks needed.
>
> The discussion came up because xml2rfc treated the dot in “Philip R. Zimmermann” as a sentence end.
> This is a mere bug, and bugs can be fixed.
> I’d send a pull request, but...
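
As a rough illustration of the keyboarding convention quoted above, here is
a minimal Python sketch. It is not xml2rfc code; the names SENTENCE_END and
sentence_end_offsets are hypothetical, and it assumes that only a dot
followed by two or more spaces, or a dot that is the last non-blank
character on its input line, counts as a sentence end. Dots followed by
closing quotes or parentheses are not handled.

```python
import re

# Hypothetical helper, not part of xml2rfc. Assumption: a dot ends a sentence
# only when it is followed by two or more spaces, or when nothing but
# spaces/tabs follows it on the same input line.
SENTENCE_END = re.compile(r"\.(?= {2,}|[ \t]*$)", re.MULTILINE)


def sentence_end_offsets(text: str) -> list[int]:
    """Return the offsets of the dots that the convention treats as sentence ends."""
    return [m.start() for m in SENTENCE_END.finditer(text)]


sample = (
    "Written by Philip R. Zimmermann.  It was later revised.\n"
    "See Sec. 3 for details."
)

# Show each detected sentence end with a little leading context.
for pos in sentence_end_offsets(sample):
    print(pos, repr(sample[max(0, pos - 12):pos + 1]))
```

With this rule the dots in "R." and "Sec." are followed by a single space
and so are left alone, while a carelessly keyboarded manuscript merely
loses some sentence spacing.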

I have opened
<https://trac.tools.ietf.org/tools/xml2rfc/trac/ticket/602> regarding the
discrepancy between what RFC 7322bis says (or rather, no longer says)
and what xml2rfc (in text mode) attempts to do.

Furthermore, it would be really good if there were a plan for how to progress
RFC 7322bis (while making sure it is aligned with what the grammar can
express). A place for issue tracking (maybe at
<https://github.com/rfc-format>?) would be good as well.

Best regards, Julian