Re: [I18ndir] [art] Just uploaded draft-bray-unichars-03

Rob Sayre <sayrer@gmail.com> Sun, 10 September 2023 17:57 UTC

References: <CAHBU6is50TkpDsqXTp6WxdVSgE66j3gGHZ60ey2jFYbefaHFJw@mail.gmail.com> <ME3PR01MB59730B45D9339180AF00E941E5F3A@ME3PR01MB5973.ausprd01.prod.outlook.com> <CAHBU6ivc4W3KyYtbK2H7PQUa8C4+g=73nSTgBK+xLXnzH7V6GA@mail.gmail.com>
In-Reply-To: <CAHBU6ivc4W3KyYtbK2H7PQUa8C4+g=73nSTgBK+xLXnzH7V6GA@mail.gmail.com>
From: Rob Sayre <sayrer@gmail.com>
Date: Sun, 10 Sep 2023 10:57:39 -0700
Message-ID: <CAChr6SwNy_hUr-TA-s=qswYpr_VGyNTYe9M_8zbFtr3JEcj_Cw@mail.gmail.com>
To: Tim Bray <tbray@textuality.com>
Cc: "Manger, James" <James.H.Manger@team.telstra.com>, Asmus Freytag <asmusf@ix.netcom.com>, "i18ndir@ietf.org" <i18ndir@ietf.org>, ART Area <art@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/rLSIrftQTh9FsZ8AOPOESJxNuig>
Subject: Re: [I18ndir] [art] Just uploaded draft-bray-unichars-03

On Sun, Sep 10, 2023 at 10:52 AM Tim Bray <tbray@textuality.com> wrote:

>
> Wow. I had no idea. As with many aspects of Java+Unicode, this feels
> deeply wrong. It should either round-trip or throw a damn exception.
> Anyhow, that ship sailed a long time ago.  I think we should include the
> Java example to illustrate another way that surrogates can lead to breakage.
>

I think this one might be best kept in the basement filing cabinet next to
WTF-8*.

If you really look, you'll find stuff like this:
https://docs.rs/jni/latest/jni/strings/index.html

"Wrapper for std::ffi::CStr that also takes care of encoding between UTF-8
and Java’s Modified UTF-8."

/shudder/
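
For anyone who hasn't run into Modified UTF-8 before, here is a rough
sketch of how it differs from real UTF-8. It is Python rather than Java,
and modified_utf8() is a hypothetical helper written for illustration,
not anything from the JDK or the jni crate. It assumes the usual
description of the format: work on UTF-16 code units, encode U+0000 as
C0 80, and encode each surrogate code unit as its own three-byte
sequence. It leaves out the two-byte length prefix that Java's
DataOutputStream.writeUTF adds.

```python
def modified_utf8(s: str) -> bytes:
    """Illustrative encoder for Java-style Modified UTF-8 (no length prefix)."""
    out = bytearray()
    # Java strings are sequences of UTF-16 code units, so go through UTF-16.
    # "surrogatepass" lets unpaired surrogates through instead of raising.
    utf16 = s.encode("utf-16-be", "surrogatepass")
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        if unit == 0:
            # U+0000 becomes an overlong two-byte form, never a raw zero byte.
            out += b"\xc0\x80"
        elif unit < 0x80:
            out.append(unit)
        elif unit < 0x800:
            out += bytes([0xC0 | (unit >> 6), 0x80 | (unit & 0x3F)])
        else:
            # Includes surrogates: each half of a pair gets its own 3 bytes.
            out += bytes([
                0xE0 | (unit >> 12),
                0x80 | ((unit >> 6) & 0x3F),
                0x80 | (unit & 0x3F),
            ])
    return bytes(out)


emoji = "\U0001F600"
print(emoji.encode("utf-8").hex(" "))   # f0 9f 98 80
print(modified_utf8(emoji).hex(" "))    # ed a0 bd ed b8 80
print(modified_utf8("\x00").hex(" "))   # c0 80
print(modified_utf8("\ud800").hex(" ")) # ed a0 80
```

A strict UTF-8 encoder refuses that last input outright; in Python,
"\ud800".encode("utf-8") raises UnicodeEncodeError. The three-byte form
for an unpaired surrogate is also what WTF-8 uses, although WTF-8 still
encodes a proper surrogate pair as the normal four bytes.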

thanks,
Rob

* https://simonsapin.github.io/wtf-8/