Re: [Json] In "praise" of UTF-16

Rob Sayre <sayrer@gmail.com> Tue, 03 September 2019 05:42 UTC

References: <cc3dc24d-3e13-e319-e48f-7b52ddd017d0@gmail.com> <00231270-86DF-4AD2-949E-25B04D518577@tzi.org> <20190902211744.GA7920@localhost> <40386571-301A-47BD-937D-55666566CFB5@tzi.org> <20190902214047.GB7920@localhost> <E387B935-8AA9-41E3-87D1-4EE72BB34BAE@tzi.org> <CAChr6SwLw9srC-9jNMp8frNbr9gSrTDDY8p-Nv9PTgQhHmTjnQ@mail.gmail.com> <3BD0DBAF-21DA-46D0-9BEB-0141FDBCCDF0@tzi.org>
In-Reply-To: <3BD0DBAF-21DA-46D0-9BEB-0141FDBCCDF0@tzi.org>
From: Rob Sayre <sayrer@gmail.com>
Date: Mon, 02 Sep 2019 22:42:06 -0700
Message-ID: <CAChr6Sz+J1EVkt+EJGSvh5bTKa+PYSvupd6cLvPX7S-iqgxJGg@mail.gmail.com>
To: Carsten Bormann <cabo@tzi.org>
Cc: Nico Williams <nico@cryptonector.com>, "json@ietf.org" <json@ietf.org>, Anders Rundgren <anders.rundgren.net@gmail.com>
Content-Type: multipart/alternative; boundary="000000000000599a6505919f8d1d"
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/mTGuvxu7mAaWBlV7QacGiALdTgo>
Subject: Re: [Json] In "praise" of UTF-16

On Mon, Sep 2, 2019 at 10:32 PM Carsten Bormann <cabo@tzi.org> wrote:

>
> (For those listening in and not understanding what the disagreement could
> possibly be here:
> It is a common misunderstanding of the cited paragraph at the start of
> page 9 of RFC 8259 that this mandates escaping astral code points.  No, it
> just says how you do it if you want to.  But you don’t want to.)
>

I don't think it mandates escaping those astral code points. It says "may".
That means encoders and decoders must deal with that UTF-16 representation
in order to interoperate. I agree that it's not necessary in an end-to-end
UTF-8 pipeline.
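
A minimal sketch of the interop point, assuming Python's standard json module (the variable names are only for illustration). It shows one astral code point, U+1F600, arriving either as the 12-character surrogate-pair escape or as the literal character, and a decoder having to accept both:

```python
import json

# U+1F600 (an astral code point) written two ways in JSON text.
escaped = '"\\ud83d\\ude00"'   # 12-character escape: a UTF-16 surrogate pair
literal = '"\U0001F600"'       # the character itself (UTF-8 once serialized to bytes)

# A decoder has to map both forms to the same one-character string.
decoded_escaped = json.loads(escaped)
decoded_literal = json.loads(literal)
same = decoded_escaped == decoded_literal

# An encoder may choose either form.
ascii_form = json.dumps("\U0001F600")                     # ensure_ascii=True is the default
utf8_form = json.dumps("\U0001F600", ensure_ascii=False)  # leaves the character unescaped
```

Here the default encoder setting produces the escaped form, so a peer that only understood the literal form would not interoperate with it.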

thanks,
Rob