Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-04.txt

Rob Sayre <sayrer@gmail.com> Mon, 18 September 2023 20:31 UTC

References: <169479938668.18742.9199862891950651366@ietfa.amsl.com> <CAHBU6ivzUV947N+n7AoYkCFT3ZfaLobCQ4fBXw3dvkqTT=LBAw@mail.gmail.com> <SY4PR01MB5980D8DDE229D1C57AEDFB55E5FBA@SY4PR01MB5980.ausprd01.prod.outlook.com> <CAChr6SzRa8F+OrELa8N3rAMLmxdvr-g5c0i_9ESnWnwZY-iA4A@mail.gmail.com> <CAChr6Sy05spOW9nsy36kYr8Ob6OYS7vCgrEVPhhWs9Pe4LkpNA@mail.gmail.com> <2e6c2d13-9fc9-d320-3803-2b9a4df3b042@ix.netcom.com>
In-Reply-To: <2e6c2d13-9fc9-d320-3803-2b9a4df3b042@ix.netcom.com>
From: Rob Sayre <sayrer@gmail.com>
Date: Mon, 18 Sep 2023 13:31:35 -0700
Message-ID: <CAChr6Swr5tS2-wW8dZ0A4J7_Jd+RoHZNJkzhNfcVTi84oDvOPA@mail.gmail.com>
To: Asmus Freytag <asmusf@ix.netcom.com>
Cc: "Manger, James" <James.H.Manger=40team.telstra.com@dmarc.ietf.org>, Tim Bray <tbray@textuality.com>, ART Area <art@ietf.org>, "i18ndir@ietf.org" <i18ndir@ietf.org>
Content-Type: multipart/alternative; boundary="0000000000000010e60605a808dc"
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/Sxjn0gGDatPCGJLoJ6kbKUUDKio>
Subject: Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-04.txt

On Mon, Sep 18, 2023 at 1:17 PM Asmus Freytag <asmusf@ix.netcom.com> wrote:

> On 9/18/2023 11:09 AM, Rob Sayre wrote:
>
> On Mon, Sep 18, 2023 at 10:51 AM Rob Sayre <sayrer@gmail.com> wrote:
>
>> On Mon, Sep 18, 2023 at 7:05 AM Manger, James <James.H.Manger=
>> 40team.telstra.com@dmarc.ietf.org> wrote:
>>
>>> For understandable reasons, JSON supports both *(%x0-D7FF /
>>> %xE000-10FFFF) and *(%x0-FFFF) (arbitrary 16-bit data) as models for the
>>> logical strings it can represent.
>>>
>>
>> ECMA-404 is clear: "JSON syntax describes a sequence of Unicode code
>> points." and the discrepancy between this text and RFC8259 is what
>> motivated this document. The document also seems to fairly clearly
>> recommend against using this production if you can help it.
>>
>
> Perhaps this document should reference <
> https://unicode.org/reports/tr17/#Strings> (note authors), which covers
> similar territory.
>
> Thanks for noticing.
>
> The need for transient states that are discoverable (that is, not fully
> encapsulated) is a big reason why many specs are not tighter.
>
> However, there are points in a protocol where strings aren't in a
> transient processing state, and here the full restrictions should apply
> (and be specified).
>
Yes. The problem here is that JSON can transmit strings like the ones TR17
describes: "For example, strings in Java, C#, or ECMAScript are Unicode
16-bit strings, but are not necessarily well-formed UTF-16 sequences." I
also mentioned TR17 because it says "A string data type is simply a sequence
of code units", which matches ECMA-404 pretty well.
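
To make that concrete, here is a rough sketch in Python (my choice of
language for illustration; nothing in the draft or TR17 depends on it). It
assumes the standard-library json module, which accepts an escaped lone
surrogate and hands back a string that cannot be encoded as UTF-8. The
helper name is_well_formed is hypothetical.

```python
import json

# JSON text whose string value is one escaped high surrogate with no
# low surrogate following it.
wire = '"\\ud800"'

# The standard-library parser accepts this and returns a one-element string.
s = json.loads(wire)

def is_well_formed(text: str) -> bool:
    """Hypothetical helper: True if text encodes to UTF-8 (no lone surrogates)."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

# A properly paired escape, by contrast, decodes to a single scalar value.
paired = json.loads('"\\ud83d\\ude00"')
```

Here s parses without complaint but fails is_well_formed, while paired
passes. Other parsers may reject or replace the lone surrogate instead, so
treat this as one implementation's behavior, not a rule.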

In TR17, the distinction between "string" and UTF-8/UTF-16/UTF-32 is clearly
drawn. To use James's example:

---

It does not make sense for a spec to define:
  unicode-code-point = %x0-10FFFF
  string = *unicode-code-point

---

It seems to me that TR17 defines "string" this way. That is not to say
that I recommend sending such strings over the internet, just that it can
happen. I think the draft does a decent job of discouraging this production,
but I guess it will have to be clearer still.
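
For anyone who wants to see the difference between the two ranges quoted in
this thread, here is a small Python sketch. The function names are mine and
purely illustrative; the ranges are copied from the productions James wrote
(%x0-10FFFF versus %x0-D7FF / %xE000-10FFFF).

```python
def is_code_point(cp: int) -> bool:
    # unicode-code-point = %x0-10FFFF (includes the surrogate range)
    return 0x0 <= cp <= 0x10FFFF

def is_scalar_value(cp: int) -> bool:
    # %x0-D7FF / %xE000-10FFFF (code points minus surrogates)
    return 0x0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF

def all_scalar_values(text: str) -> bool:
    # True only if every element of the string is a Unicode scalar value.
    return all(is_scalar_value(ord(ch)) for ch in text)
```

U+D800 satisfies is_code_point but not is_scalar_value, so a string
containing it is a legal "sequence of code points" that no UTF-8, UTF-16,
or UTF-32 encoder should emit.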

thanks,
Rob