Re: [Tools-discuss] [Cellar] non-ascii characters (fwd) Dave Rice: non-ascii characters

Carsten Bormann <cabo@tzi.org> Fri, 30 August 2019 05:16 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <4fa5f1ff-71f8-5303-e285-0e1f10e458c2@gmx.de>
Date: Fri, 30 Aug 2019 07:16:26 +0200
Cc: Tom Pusateri <pusateri=40bangj.com@dmarc.ietf.org>, Michael Richardson <mcr+ietf@sandelman.ca>, cellar@ietf.org, Henrik Levkowetz <henrik@levkowetz.com>, tools-discuss <tools-discuss@ietf.org>
Message-Id: <76BBD514-2BB6-4291-B80F-4C72B17E43B4@tzi.org>
References: <31591.1566930552@localhost> <c9f67a76-c0ae-6d3e-2bad-47dfdda85df6@levkowetz.com> <16112.1566934510@localhost> <5E85BF29-021D-4235-BD68-893F5729D5E6@bangj.com> <AB46FC1D-1FC9-4F3B-8DB9-0FF0E93A91CE@tzi.org> <4fa5f1ff-71f8-5303-e285-0e1f10e458c2@gmx.de>
To: Julian Reschke <julian.reschke@gmx.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/ZLAb_yXozu67rl3VT46D67DmLOY>
Subject: Re: [Tools-discuss] [Cellar] non-ascii characters (fwd) Dave Rice: non-ascii characters

On Aug 30, 2019, at 06:25, Julian Reschke <julian.reschke@gmx.de> wrote:
> 
> On 30.08.2019 06:20, Carsten Bormann wrote:
>> On Aug 27, 2019, at 23:06, Tom Pusateri <pusateri=40bangj.com@dmarc.ietf.org> wrote:
>>> 
>>> RFC 7997 is pretty clear when UTF-8 is permissible and math symbols are not in the list.
>> 
>> I believe, with a 2019 perspective on the world, this is no longer the right decision.
> 
> Did the perspective change significantly within the last 4 years??

I think so. Tipping points are always a matter of perception, but I think it would not be hard to obtain consensus today that we are past the UTF-8 tipping point. There will be people who remember that they argued otherwise a decade ago and will try to maintain that position, but outside of vintage computer museums, the reality is pretty well-defined now.

>> We are incurring the costs of supporting Unicode but not reaping its benefits.
>> Providing HTML as a main viewing format and not even being able to solve the hyphen-dash-minus problem leads to weird-looking documents.
>> How is requiring SVG display support for understanding a document better than simply requiring Unicode support?
> 
> I do prefer proper hyphens, but I also note that the proper choice can
> lead to bikeshedding.

I’m not aware of an opportunity for that (well, maybe except for the choice between en-dashes and em-dashes for the punctuation mark in the form of a horizontal line used to indicate a pause, to delimit an inserted sentence or phrase, or to indicate a deliberately omitted word). But given that this is the IETF, I’m sure we’ll find one :-)
More seriously, this is one place where copy-editing can simply solve the problem.
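
For readers unfamiliar with the hyphen-dash-minus problem, the characters usually involved can be listed with Python's standard unicodedata module. This is only an illustrative sketch: the choice of five candidate characters and the helper names describe and dash_report are assumptions made for this example, not something taken from the thread or from RFC 7997.

```python
import unicodedata

# Characters commonly lumped together as "a dash" in plain-ASCII text.
# The selection of these five is an assumption made for illustration.
CANDIDATES = ["\u002D", "\u2010", "\u2013", "\u2014", "\u2212"]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def dash_report():
    """Describe every candidate character, in order."""
    return [describe(ch) for ch in CANDIDATES]


if __name__ == "__main__":
    for line in dash_report():
        print(line)
```

In ASCII-only output all five collapse into U+002D, which is why the rendered documents cannot distinguish a hyphen from a minus sign.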

> My recollection is that we started with the use cases that seemed most
> important back then. I'll also note that Unicode math characters won't
> be sufficient in all cases, so MathML definitively is something to
> consider as well.

Having math characters available is not a valid replacement for something like MathML.
Being able to throw in some casual in-line math (that an authoring tool would convert to the right characters) is useful, though.  As in ⌈a⌉ and friends…  (Nobody would argue that I can’t use a*b in a spec to talk about the product of a and b, so why not a∉B or ∀x.x²≥0.)
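
As a rough illustration of the kind of conversion an authoring tool might perform, here is a minimal Python sketch. Everything in it is hypothetical: the LaTeX-like input notation (\ceil{...}, \notin, \forall, \geq, ^2), the table SYMBOLS, and the function to_unicode are assumptions made for this example and do not describe any existing tool.

```python
import re

# Hypothetical input notation mapped to Unicode math characters.
SYMBOLS = {
    r"\forall": "\u2200",  # FOR ALL
    r"\notin": "\u2209",   # NOT AN ELEMENT OF
    r"\geq": "\u2265",     # GREATER-THAN OR EQUAL TO
    "^2": "\u00b2",        # SUPERSCRIPT TWO
}

# \ceil{x} becomes LEFT CEILING, x, RIGHT CEILING (no nested braces).
CEIL = re.compile(r"\\ceil\{([^{}]*)\}")


def to_unicode(text):
    """Replace the hypothetical ASCII notation with Unicode characters."""
    text = CEIL.sub("\u2308\\1\u2309", text)
    for source, target in SYMBOLS.items():
        text = text.replace(source, target)
    return text


if __name__ == "__main__":
    print(to_unicode(r"\ceil{a}"))
    print(to_unicode(r"a \notin B"))
    print(to_unicode(r"\forall x. x^2 \geq 0"))
```

A plain substitution table like this covers only casual in-line math; anything with real two-dimensional layout would still need something like MathML, as noted above.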

Grüße, Carsten