[Jmap] Re: [art] Re: Artart telechat review of draft-ietf-jmap-contacts-09

Rob Sayre <sayrer@gmail.com> Tue, 21 May 2024 00:30 UTC

From: Rob Sayre <sayrer@gmail.com>
Date: Mon, 20 May 2024 17:29:56 -0700
Message-ID: <CAChr6SyBrrsAuxouwucLVE4CRX3gu_+CG8Z8Wx-zhODoZkqjmg@mail.gmail.com>
To: Tim Bray <tbray@textuality.com>
CC: "Dale R. Worley" <worley@ariadne.com>, art@ietf.org, draft-ietf-jmap-contacts.all@ietf.org, jmap@ietf.org, last-call@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/jmap/b8EAhoWZOyn8t8CxRhOPVhR3j0c>

>> Uh, is there a coherent explanation why RFC 8259 allows non-character
>> code points?  Or specifically why it allows surrogate code points?
>> "not-yet-assigned" code points I can see as plausibly allowing, but
>> surrogate code points will never be assigned characters.
>
(long version)

The question can be answered, not that I am interested in citing any of
these.

1) Browsers originally had no JSON parser. JSON spread quickly anyway,
because you could just use eval(). Obviously, there are security and
conformance issues with that approach, so now we have JSON.parse or an
equivalent everywhere. That initial effort would have been around the
year 2000. There were also earlier efforts that sometimes looked like JSON
(Netscape Enterprise Server, etc.).

2) Because JSON originally relied on JavaScript and eval(), it used JS
Strings. json.org had a better parser pretty soon after, but it still used
JS Strings.

3) So, why were JS Strings so awkward? It's because most GUI OS strings
were UCS-2 (not even UTF-16).
https://simonsapin.github.io/wtf-8/#motivation

4) The first Linux distribution to switch to UTF-8 by default did so in 2002:
"Red Hat Linux 8.0 (September 2002) was the first distribution to take the
leap of switching to UTF-8 as the default encoding for most locales. The
only exceptions were Chinese/Japanese/Korean locales, for which there were
at the time still too many specialized tools available that did not yet
support UTF-8"
https://www.cl.cam.ac.uk/~mgk25/unicode.html#linux

5) Then, you get to the nastier problem of escape sequences. Why have
these? They are what lets you ship Unicode when not every system supports
UTF-8. For example, Shift JIS is still used by 5.2% of sites in the .jp
domain.
https://en.wikipedia.org/wiki/Shift_JIS
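
To make point 5 concrete, here is a minimal sketch of how escape sequences let JSON travel as plain ASCII. The helper name toAsciiJson and the sample object are my own, made up for illustration; they are not from any spec or library. The sketch assumes a runtime with a standard JSON.stringify and String.prototype.padStart.

```javascript
// Hypothetical helper (not from the thread or any library): serialize to
// JSON, then replace every non-ASCII UTF-16 code unit with a \uXXXX escape.
function toAsciiJson(value) {
  return JSON.stringify(value).replace(/[\u0080-\uffff]/g, (unit) =>
    "\\u" + unit.charCodeAt(0).toString(16).padStart(4, "0")
  );
}

// Sample data, invented for this example.
const original = { name: "日本語", emoji: "😀" };
const wire = toAsciiJson(original);

// `wire` contains only ASCII characters. The emoji, which is outside the
// BMP, comes out as two escapes: one per surrogate code unit.
console.log(wire);

// A standard parser turns the escapes back into the original strings.
const roundTripped = JSON.parse(wire);
console.log(roundTripped.name === original.name);   // true
console.log(roundTripped.emoji === original.emoji); // true
```

Note that the emoji only survives because JSON escapes are defined in terms of UTF-16 code units, which is the same design choice that makes lone surrogates expressible in the first place.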

It's definitely better to use UTF-8 with no escape sequences if you're
making something new, but sometimes the task is to consume content you
don't control.
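
As a rough illustration of what "content you don't control" can look like, the sketch below shows JSON.parse accepting an escaped lone surrogate and producing a JS String that is not well-formed Unicode. The function hasLoneSurrogate is a hypothetical helper written for this example, not a standard API; a consumer that has to re-encode such data as UTF-8 would need some check along these lines.

```javascript
// The six-character escape \ud800 names a lone high surrogate. The RFC 8259
// grammar permits it, and JSON.parse yields a one-code-unit JS string.
const parsed = JSON.parse('"\\ud800"');
console.log(parsed.length);                     // 1
console.log(parsed.charCodeAt(0).toString(16)); // d800

// Hypothetical helper: true if `s` contains a surrogate code unit that is
// not part of a valid high+low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // valid pair, skip the low half
        continue;
      }
      return true; // high surrogate with no low surrogate after it
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) {
      return true; // low surrogate with no high surrogate before it
    }
  }
  return false;
}

console.log(hasLoneSurrogate(parsed));                            // true
console.log(hasLoneSurrogate(JSON.parse('"\\ud83d\\ude00"')));    // false
```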

thanks,
Rob