Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-05.txt

Rob Sayre <sayrer@gmail.com> Wed, 20 September 2023 02:03 UTC

MIME-Version: 1.0
References: <169514412895.12827.17924518978945582691@ietfa.amsl.com> <CAHBU6iuUsa4H_9BNvf3XEuOg3ZB5qB31vQuodQhacQUMxFiUMg@mail.gmail.com> <CAChr6SxMswjKACUr3cpZymjEOqnrxTQV2hX9mpsZO1=H2TwEZg@mail.gmail.com>
In-Reply-To: <CAChr6SxMswjKACUr3cpZymjEOqnrxTQV2hX9mpsZO1=H2TwEZg@mail.gmail.com>
From: Rob Sayre <sayrer@gmail.com>
Date: Tue, 19 Sep 2023 19:02:48 -0700
Message-ID: <CAChr6SxTVx+-9WkEHLX-K7bKSiW_5iWLW4B4XRajRTQMeg9CaA@mail.gmail.com>
To: Tim Bray <tbray@textuality.com>
Cc: i18ndir@ietf.org, ART Area <art@ietf.org>
Content-Type: multipart/alternative; boundary="00000000000061be4a0605c0c6f6"
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/36MFhpQPvxInE71QtuKcfBWMHwI>
Subject: Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-05.txt

On Tue, Sep 19, 2023 at 4:04 PM Rob Sayre <sayrer@gmail.com> wrote:

> On Tue, Sep 19, 2023 at 10:29 AM Tim Bray <tbray@textuality.com> wrote:
>
>> Two big differences here:
>>
>>
>>    1. The %0-10FFFF production, and discussion of All The Code Points as
>>    a subset, is removed. Which ended up re-organizing the draft quite a bit
>>    (diffs vs -04 may not be very helpful) but we think improved the flow quite
>>    a bit; thanks to James and Carsten for arguing this point so fiercely.
>>    2. The discussion around RFC9413 is changed quite a bit based on
>>    Asmus’ input.
>>
>>
>> Unless substantial new issues are raised, we plan to consult our ADs
>> about advancing the document.
>>
>
> There are still some bugs here. Generally, I think "Abstract Character
> Repertoire" as used here is good:
> https://unicode.org/reports/tr17/#Repertoire
>
> It makes clear that the various encoding and escaping routines happen
> before or after this idea. I don't think you need to add "Abstract" as a
> qualifier. Just explain it.
>
> > The Unicode Standard's definition of "Unicode character" is conceptual.
> > However, each Unicode character is assigned a code point, used to
> > represent the characters in computer memory and storage systems and, in
> > specifications, to specify the allowed repertoires of Unicode characters.
>
> I think you want to add: "Not all code points represent characters."
>
> > In ABNF, the hexadecimal values for characters are preceded by "%x"
> > rather than "U+".
>
> But these are code points in the ABNF, right? For example:
>
> https://www.ietf.org/archive/id/draft-bray-unichars-05.html#section-4.1
>
> "; exclude surrogates"
>
> These are in the problematic code point types. They are not characters.
> So, it's probably best to go through and clean that up.
>
> I think the "Restricting Character Repertoires" section should be run
> through a grammar checker (MS Word or something). It doesn't say anything
> incorrect, but I often thought "hmm, there should be a comma there" and
> little things like that. Thank you for taking the "conforming JSON text"
> suggestion, but the capitalization differs between the two uses: "JSON
> text" vs "JSON Text".
>

Unfortunately, I also have to add that this sentence is wrong:

"Problematic code points are an example of problematic input. [RFC9413],
"Maintaining Robust Protocols", provides a thorough discussion of
error-handling options when choosing a strategy for dealing with
problematic input."

RFC9413 does not discuss "problematic input". It does discuss errors in
implementations and specifications. JSON is intentionally defined as code
points, not scalar values.

I can understand why people don't like this situation, but it is not an
error.
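
To make the code point versus scalar value distinction concrete, here is a
minimal Python sketch. It assumes the behaviour of CPython's standard json
and unicodedata modules; is_scalar_value is a hypothetical helper named for
illustration, not something taken from the draft.

```python
import json
import unicodedata


def is_scalar_value(cp: int) -> bool:
    """True for any code point except the surrogates U+D800..U+DFFF."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


# A JSON text may carry an escaped lone surrogate; Python's parser accepts it.
decoded = json.loads('"\\ud800"')
cp = ord(decoded)

print(hex(cp))                        # the code point held by the string
print(unicodedata.category(decoded))  # general category of that code point
print(is_scalar_value(cp))            # surrogates are not scalar values

# A code point that is not a scalar value cannot be encoded as UTF-8.
try:
    decoded.encode("utf-8")
    utf8_ok = True
except UnicodeEncodeError:
    utf8_ok = False
print(utf8_ok)
```

In this sketch the lone surrogate parses as a JSON string and survives a
json.dumps/json.loads round trip, even though it is not a character and
cannot be written out as UTF-8.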

thanks,
Rob
