Re: [xml2rfc] assuming that period (.) ends a sentence is sometimes wrong

Carsten Bormann <cabo@tzi.org> Mon, 01 March 2021 13:19 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/xml2rfc/-G7JjGPTPFVEWbkou-nAMzcHa7Y>

On 2021-03-01, at 13:57, Julian Reschke <julian.reschke@gmx.de> wrote:
> 
> On 01.03.2021 at 13:46, Carsten Bormann wrote:
>> On 2021-03-01, at 13:16, Julian Reschke <julian.reschke@gmx.de> wrote:
>>> 
>>> It means that a sequence of whitespace characters can be collapsed to a
>>> single space character.
>> 
>> That is XSLT normalize-space.
> 
> XPATH, to be pedantic.
> 
> It's not exactly the same thing in that leading / trailing space in
> content can be significant (at least if you consider it to apply to text
> child nodes of <t>). For instance,
> 
>  <t>see <xref target="foo"/></t>
> 
> is not equivalent to
> 
>  <t>see<xref target="foo"/></t>
> 
> At this point I'm not sure anymore what we're discussing here.

I’m sorry, I’m trying to elicit what you think xml:space=preserve does (or rather, what it doesn’t do that =default does).  It does not mesh with what I learned about XML.
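
To make the quoted distinction concrete, here is a minimal sketch of the two behaviours under discussion: collapsing runs of white space to a single space, versus what XPath normalize-space() does, which also strips leading and trailing white space. The function names and the sample text node are hypothetical, chosen only for illustration; this is not what xml2rfc or any XML processor actually implements.

```python
import re

# The four white space characters defined by XML 1.0.
XML_WS = " \t\r\n"


def collapse(text):
    """Collapse each run of XML white space to one space; keep the ends."""
    return re.sub(r"[ \t\r\n]+", " ", text)


def normalize_space(text):
    """Mimic XPath normalize-space(): collapse runs, then strip both ends."""
    return collapse(text).strip(XML_WS)


# Hypothetical text child of <t> that precedes an <xref/> element.
text_node = "see \n   "

print(repr(collapse(text_node)))         # 'see '
print(repr(normalize_space(text_node)))  # 'see'
```

With collapsing only, the space before the <xref/> survives; with normalize-space semantics it is lost, which is the difference between the two <t> examples quoted above.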

> If your point is that RFC 7991bis should be clearer about what XML calls
> the "default whitespace handling mode", I agree.

Well, what *DOES* XML call the default whitespace handling mode?

https://www.w3.org/TR/2008/REC-xml-20081126/#sec-white-space

> The value "default" signals that applications' default white-space processing modes are acceptable for this element; the value "preserve" indicates the intent that applications preserve all the white space.


(This is in a context of:

> In editing XML documents, it is often convenient to use "white space" (spaces, tabs, and blank lines) to set apart the markup for greater readability. Such white space is typically not intended for inclusion in the delivered version of the document. On the other hand, "significant" white space that should be preserved in the delivered version is common, for example in poetry and source code.

The intent certainly is not to get rid of all whitespace.  There is no talk about the whitespace normalization you have in mind, either.)

Regards, Carsten