Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Julian Reschke <julian.reschke@gmx.de> Sun, 28 February 2021 18:50 UTC

To: xml2rfc@ietf.org
References: <20210227191644.165F76F105E2@ary.qy> <28B528D6-7CBA-4735-A5EE-C7061D1C1D0C@tzi.org> <3dc1abe5-24bf-3b12-7b58-d06c7cde428e@taugh.com> <BBA9B16E-5B06-419D-9ABE-BFB7E69B54C9@tzi.org> <6603926-561f-c9b8-2612-2afb9847b71@taugh.com> <20210228173825.GE30153@localhost> <14ad2b3e-852a-28b1-27ae-5e25ec7823bc@taugh.com> <20210228175959.GF30153@localhost>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <631d06e1-1f33-58d6-6661-4da0fc18a2ef@gmx.de>
Date: Sun, 28 Feb 2021 19:49:56 +0100
In-Reply-To: <20210228175959.GF30153@localhost>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/FnP00KJR-52utr0Kf_s-pj9G5-4>

On 28.02.2021 at 18:59, Nico Williams wrote:
> On Sun, Feb 28, 2021 at 12:54:45PM -0500, John R Levine wrote:
>>> Provided it doesn't also lose alternative Unicode whitespace characters,
>>> using &emsp; is an option.  In a pinch we could have an element to mark
>>> the end of a sentence (<s/>).
>>
>> At the end of every sentence?  That's, uh, quite a stretch.  Are we sure
>> this problem is worth that much effort by every author?
>
> If some such markup were only needed when the sentence-ending-period
> heuristic would fail, then its usage would be very rare.  Maybe
> non-breaking spacing could be used for that?

Nope.

Please let's not leak "optimizations" for text output into the canonical
format.

Assuming we continue with that 2SP practice (in contrast to what 7322bis
says): there are other alternatives, such as a config file to be
supplied to xml2rfc which contains known exception strings.
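
A rough sketch of how such an exception list might work in a text
renderer. Everything here is an assumption made for illustration: the
file format (one string per line, "#" starts a comment), the function
names, and the regex. None of it is existing xml2rfc behavior, and it
only looks at periods, not "?" or "!".

```python
import re

# Hypothetical exceptions file content: one string per line, '#' starts a comment.
EXAMPLE_CONFIG = """
# periods after these strings do not end a sentence
e.g.
i.e.
vs.
"""


def load_exceptions(text):
    """Parse the (assumed) exceptions format into a set of strings."""
    entries = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            entries.add(line)
    return entries


def space_sentences(paragraph, exceptions):
    """Use two spaces after a period, except after a listed exception string."""

    def fix(match):
        # Last whitespace-delimited token up to and including this period.
        token = match.string[:match.end(1)].split()[-1]
        # Ignore opening punctuation such as "(" before the token.
        token = token.lstrip("([\"'")
        return ". " if token in exceptions else ".  "

    # A period, then one or more spaces, then more text on the same line.
    return re.sub(r"(\.)[ ]+(?=\S)", fix, paragraph)


if __name__ == "__main__":
    exceptions = load_exceptions(EXAMPLE_CONFIG)
    print(space_sentences("Use a list, e.g. a file. It works.", exceptions))
```

The point of the design is that the exception knowledge stays with the
tool and the output format; the XML source is not changed at all.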

Best regards, Julian