Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Julian Reschke <julian.reschke@gmx.de> Tue, 02 March 2021 09:11 UTC

To: xml2rfc@ietf.org
References: <20210227191644.165F76F105E2@ary.qy> <3dc1abe5-24bf-3b12-7b58-d06c7cde428e@taugh.com> <BBA9B16E-5B06-419D-9ABE-BFB7E69B54C9@tzi.org> <6603926-561f-c9b8-2612-2afb9847b71@taugh.com> <20210228173825.GE30153@localhost> <14ad2b3e-852a-28b1-27ae-5e25ec7823bc@taugh.com> <a7734631-a4f3-cee1-1ee7-e9e0bd3d534a@gmail.com> <d96fc964-f367-dc8f-bdf3-a76b90abd042@alum.mit.edu> <26DCBA0D-AA14-461F-9992-CC631774877E@tzi.org> <45ca32a4-65df-7eea-84f0-b5451698a27b@gmx.de> <D3D8A513-87A6-4A74-97CE-C3FA8DC36318@tzi.org> <ec03aa52-6aa1-0bd0-3638-c11bfc9d64dd@gmx.de> <9C9F3CE7-E269-4BE4-A6FB-D13101D1927D@tzi.org> <47edd9eb-6c96-aa9c-9709-73e054373d4a@gmx.de> <40CE7C2C-65E7-4A4C-B16E-BA4ED62C6FF4@tzi.org> <39aac336-0032-5b11-7d64-e73ed314b79c@gmx.de> <FBC30EFB-10ED-4682-86BD-6CE89E7CDA80@tzi.org> <bf0e22cf-2f90-18e7-cb09-c791aa872f49@gmx.de> <3F93A1FB-0475-4DC5-96B1-A832880779CD@tzi.org> <50c61df3-64e7-d81a-a955-ef2854f05919@gmx.de> <6F258208-B6A0-4F20-90CD-AF32FD361FF5@tzi.org>
From: Julian Reschke <julian.reschke@gmx.de>
Message-ID: <f81230c0-c873-8e75-ffd0-dd903f9be258@gmx.de>
Date: Tue, 2 Mar 2021 10:10:55 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Thunderbird/78.8.0
MIME-Version: 1.0
In-Reply-To: <6F258208-B6A0-4F20-90CD-AF32FD361FF5@tzi.org>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/D4cT-4ifTmV2fugSV65qhgsYC_k>
Subject: Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

On 2021-03-01 at 14:55, Carsten Bormann wrote:
> On 2021-03-01, at 14:35, Julian Reschke <julian.reschke@gmx.de> wrote:
>>
>> Also, AFAICT, when M. T. Rose developed this, he used the same
>> processing model as HTML.
>
> Right, which is fine with me.  My copy of xml2rfcv1 also does sentence spacing, pretty much on the model that I described.
>
> My point is that nothing in XML or existing specifications speaks against sentence detection and sentence spacing.  The setting of xml:space is completely irrelevant, as the handling of whitespace is left to the application in either case.
>
> (This does not prejudice whether we want to do sentence spacing in the future, of course.)
> ...

-> <https://github.com/rfc-format/draft-iab-xml2rfc-v3-bis/issues/198>

Best regards, Julian
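
The quoted point that the xml:space setting leaves whitespace handling to the application can be checked with a small sketch. It uses Python's standard xml.etree.ElementTree parser; the element name "t", the sample text, and the helper name text_and_hint are assumptions made for illustration and are not taken from xml2rfc. The sketch only shows what one XML parser hands to the application, not how any version of xml2rfc treats sentence spacing.

```python
import xml.etree.ElementTree as ET

# The "xml" prefix is predeclared by the XML Namespaces spec, so no xmlns
# declaration is needed; ElementTree exposes it under this expanded name.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def text_and_hint(fragment):
    """Parse a fragment and return (character data, xml:space value).

    Hypothetical helper for this illustration only.
    """
    elem = ET.fromstring(fragment)
    return elem.text, elem.get(XML_SPACE)


# Same character data, differing only in the xml:space attribute.
default_text, default_hint = text_and_hint(
    '<t xml:space="default">One.  Two.\n Three.</t>'
)
preserve_text, preserve_hint = text_and_hint(
    '<t xml:space="preserve">One.  Two.\n Three.</t>'
)

# The parser delivers identical whitespace in both cases; the attribute is
# only a hint that the application is free to act on.
print(repr(default_text), default_hint)
print(repr(preserve_text), preserve_hint)
```

In both cases the parser passes the double space and the line break through unchanged, so whether to collapse them or to apply sentence spacing is a decision the consuming application makes.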