Re: [I18ndir] [art] Just uploaded draft-bray-unichars-03
Asmus Freytag <asmusf@ix.netcom.com> Sun, 10 September 2023 08:43 UTC
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/VBPVNquRABnzviiH81qW-EoIu2A>
On 9/10/2023 12:05 AM, Rob Sayre wrote:
>
>
>>> And then in 8.
>>>
>>>     8. String and Character Issues
>>>     8.1. Character Encoding
>>>     JSON text exchanged between systems that are not part of a
>>>     closed ecosystem MUST be encoded using UTF-8 [RFC3629].
>>
>> As Rob Sayre said above, the proposed document probably has to
>> address the issue of JSON escapes and emphasize that they are not
>> relevant to code point subsets.
> No they are. If you are in a JSON-based environment, but have
> restricted your repertoire, then even if JSON allows an escape,
> it's invalid if it violates your restriction. And your new
> specification SHOULD require something definite and drastic to
> happen in that case.
>
>
> That is exactly the point. The current draft mentions "Transformation
> Formats", but doesn't mention that these transformation formats can
> and do further encode questionable Unicode via escape sequences.
> The draft should mention it.
>
> Unfortunate, but true. So, you can have a perfect UTF-8 document that
> represents a bunch of unpaired surrogate code points.

The term "transformation format" is not formally defined in the Unicode Standard. It is noted in chapter 3 as being a (somewhat ambiguous) alias for two other terms, encoding form and encoding scheme. ("Schemes" have a fixed byte order for code units, and "forms" are generic mappings to an integral type irrespective of byte order.)

The key is that the definitions of both schemes and forms specify them as a mapping from a single code point (actually: scalar value) to a "sequence of code units", while the term "transformation format" as defined in the Unicode glossary is a more general mapping from a character sequence to a sequence of code units.

    Transformation Format
    <https://www.unicode.org/glossary/#transformation_format>:
    A mapping from a coded character sequence to a unique sequence of
    code units (typically bytes).
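The case Rob describes above can be sketched in a few lines of Python. This is only an illustration, assuming the behavior of CPython's standard `json` module, which accepts an escaped lone surrogate; other JSON parsers may reject or replace it. The variable names are made up for the example.

```python
import json

# A JSON text made only of ASCII characters: a quote, the six-character
# escape sequence \ud800, and a closing quote.
json_text = '"\\ud800"'

# The document itself is perfectly well-formed UTF-8.
utf8_bytes = json_text.encode("utf-8")
assert utf8_bytes.decode("utf-8") == json_text

# Assumption: CPython's json module accepts the escape and yields a string
# holding one unpaired surrogate code point.
decoded = json.loads(json_text)
code_points = [hex(ord(ch)) for ch in decoded]

# The decoded string is not a sequence of scalar values, so it cannot be
# encoded in UTF-8 itself.
try:
    decoded.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(len(utf8_bytes), code_points, encodable)
```

The escape is a sequence of six characters in the document, each of them an ordinary scalar value, which is why the UTF-8 layer has nothing to object to.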
The draft currently states:

> Unicode describes a variety of "transformation formats", ways to
> encode code points in bytes of computer memory. A survey of
> transformation formats is beyond the scope of this document.

If we accept the term "transformation format" as an alias for the terms that are actually defined in the standard (encoding form or encoding scheme), then that usage of "transformation format" excludes escape sequences and similar mappings where the source of the mapping can be more than one code point.

The UTF-8, UTF-16 and UTF-32 encoding forms will be formally called out as "standard encoding forms" in the next revision of the Unicode Standard. There are non-standard ones, such as UTF-7, CESU-8 and whatnot, but none of them have escape sequences either.

While the in-memory form of a string syntax with an escape sequence is technically an example of a "transformation format" as that term is described in the glossary, such a syntax is neither an encoding form nor an encoding scheme, and those are the things the Unicode Standard actually describes. The reason is that, under those definitions, an escape sequence is itself a sequence of several code points and is therefore not a direct representation of a character under one of the standard encoding forms.

Rob's comments make clear that using the term "transformation formats" in the way quoted above can lead to misunderstandings.

I sympathize with the desire not to bring in the entire formalism, so the following suggestion might suffice to limit the sense of "transformation format" that is intended here.

> Unicode describes a variety of "transformation formats", ways to
> *uniquely* encode *each* *scalar value* into bytes of computer memory.
> A survey of transformation formats is beyond the scope of this document.

If you would like to avoid the term "scalar value", it is synonymous with "non-surrogate code point".

A./

>
> thanks,
> Rob
>
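As a rough sketch of what "uniquely encode each scalar value" means in practice, the following Python fragment tests whether a code point is a scalar value and shows the code units one scalar value maps to. The helper names `is_scalar_value` and `code_units` are hypothetical, and the big-endian codecs are chosen here only to fix a byte order (in the standard's terms they are encoding schemes).

```python
def is_scalar_value(cp: int) -> bool:
    """True for any code point outside the surrogate range D800..DFFF."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def code_units(cp: int, encoding: str) -> str:
    """Hex dump of the bytes that one scalar value maps to."""
    if not is_scalar_value(cp):
        raise ValueError(f"U+{cp:04X} is not a scalar value")
    return chr(cp).encode(encoding).hex()


# One scalar value, one code unit sequence per encoding.
for name in ("utf-8", "utf-16-be", "utf-32-be"):
    print(name, code_units(0x1F600, name))

# A surrogate code point is a code point, but not a scalar value.
print(is_scalar_value(0xD800))
```

An escape sequence does not fit this shape: its source is several scalar values, each of which is encoded separately by the functions above.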