Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

Rob Sayre <sayrer@gmail.com> Mon, 02 October 2023 16:42 UTC

References: <169566019635.41806.9804796677919971070@ietfa.amsl.com> <CAHBU6is-wU2NLXNWL56nSJ4=nKvDzGv_Aw4qJN6N2O8CuM4-yw@mail.gmail.com> <SYBPR01MB59814B3448F5754AAEDA1740E5C7A@SYBPR01MB5981.ausprd01.prod.outlook.com> <CAHBU6iueqtd5T1T-ciYUMWvmo8XqBQqO5LkWbdRaoXQzPYSQOQ@mail.gmail.com> <SY4PR01MB5980D009F1623E3694B871B7E5C5A@SY4PR01MB5980.ausprd01.prod.outlook.com> <CAChr6SzMXqmEJvwQ0Vb0+CfchBn2kMueQJ-2Th1=4Oct8b9t6A@mail.gmail.com> <E1464943-EB11-4FA4-B933-4F138C6C34A0@tzi.org>
In-Reply-To: <E1464943-EB11-4FA4-B933-4F138C6C34A0@tzi.org>
From: Rob Sayre <sayrer@gmail.com>
Date: Mon, 02 Oct 2023 09:42:29 -0700
Message-ID: <CAChr6Syn1vD9SA+XseafBfheOtyk_M=NP2Rgi2=L9e8FO7aQZQ@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: "Manger, James" <James.H.Manger=40team.telstra.com@dmarc.ietf.org>, Tim Bray <tbray@textuality.com>, "i18ndir@ietf.org" <i18ndir@ietf.org>, ART Area <art@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/eXeE9KiWNOAqjeSLIbrzS6VsAGo>
Subject: Re: [I18ndir] [art] New Version Notification for draft-bray-unichars-06.txt

On Mon, Oct 2, 2023 at 9:14 AM Carsten Bormann <cabo@tzi.org> wrote:

> >
> > I don't think anyone really disagrees with the ideals you've got here.
> > The problem is that a large amount of running code is confused in exactly
> > the way you describe. So, implementers do have to know about code points.
> > The IETF could pound its collective fist and say "all ill-formed Unicode
> > must be rejected",
>
> Yes, please.
> The fact that this is the only reasonable way forward is the point of RFC
> 9413.
>
> > but that won't work. This document is a decent attempt at gradually
> > improving things.
>
> Rob, you come from a perspective where everyone already is wearing rubber
> boots so they can slog through the toxic waste.
> This is relevant for maybe 5 % of the IETF protocols.
> This document needs to be useful for the other 95 %, without dipping those
> protocols in toxic waste, too.
>

I mean, I didn't really get my way, though. The document still doesn't
acknowledge that you can accept the toxic waste! But that's ok, I can live
with it. The repertoires it describes are really helpful, because they make
it very easy to write a test suite. Just loop through many code points
(maybe all of them), knowing which ones are not allowed.
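
To make the test-suite idea concrete, here is a minimal sketch in Python. The
function names and the particular subset are my own illustration, not text
from the draft: I'm assuming a repertoire that excludes surrogates
(U+D800..U+DFFF) and the 66 Unicode noncharacters, and comparing it against
whatever an implementation under test actually accepts.

```python
MAX_CODE_POINT = 0x10FFFF


def is_surrogate(cp):
    # Surrogates are code points but not scalar values.
    return 0xD800 <= cp <= 0xDFFF


def is_noncharacter(cp):
    # U+FDD0..U+FDEF, plus the last two code points of each of the 17 planes.
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_allowed(cp):
    # Assumed repertoire for this sketch: everything except
    # surrogates and noncharacters.
    return not is_surrogate(cp) and not is_noncharacter(cp)


def disagreements(accepts):
    """Return every code point where accepts(cp) differs from is_allowed(cp)."""
    return [cp for cp in range(MAX_CODE_POINT + 1)
            if accepts(cp) != is_allowed(cp)]


def python_utf8_accepts(cp):
    # Example implementation under test: Python's strict UTF-8 encoder,
    # which refuses surrogates but happily encodes noncharacters.
    try:
        chr(cp).encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    bad = disagreements(python_utf8_accepts)
    print(len(bad), "code points handled differently from the repertoire")
```

With this particular subset, the disagreement list for Python's encoder is
exactly the 66 noncharacters. Swap in a different `is_allowed` to test
against a different repertoire.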

As for the merits of "halt and catch fire" error handling, I'm pretty sure
the author has been around the block on that one, and he's not going for
that here. It makes sense to me.

Basically, UTF-8 is a work of genius, but it arrived 10 years too late.
JavaScript, Windows, Java, and all the rest of them were underway before we
got RFC 2277. Things would be much different if it had appeared in 1982 and
everyone then realized that you don't need Θ(1) access to characters in a
string...

thanks,
Rob