Re: [I18ndir] [art] Just uploaded draft-bray-unichars-03
Steffen Nurpmeso <steffen@sdaoden.eu> Sat, 09 September 2023 17:15 UTC
Date: Sat, 09 Sep 2023 18:58:43 +0200
Author: Steffen Nurpmeso <steffen@sdaoden.eu>
From: Steffen Nurpmeso <steffen@sdaoden.eu>
To: Tim Bray <tbray@textuality.com>
Cc: i18ndir@ietf.org, ART Area <art@ietf.org>, Steffen Nurpmeso <steffen@sdaoden.eu>
Message-ID: <20230909165843.GlTJy%steffen@sdaoden.eu>
In-Reply-To: <CAHBU6is50TkpDsqXTp6WxdVSgE66j3gGHZ60ey2jFYbefaHFJw@mail.gmail.com>
References: <CAHBU6is50TkpDsqXTp6WxdVSgE66j3gGHZ60ey2jFYbefaHFJw@mail.gmail.com>
Mail-Followup-To: Tim Bray <tbray@textuality.com>, i18ndir@ietf.org, ART Area <art@ietf.org>, Steffen Nurpmeso <steffen@sdaoden.eu>
User-Agent: s-nail v14.9.24-507-g0e7e3e8c46
OpenPGP: id=EE19E1C1F2F7054F8D3954D8308964B51883A0DD; url=https://ftp.sdaoden.eu/steffen.asc; preference=signencrypt
BlahBlahBlah: Any stupid boy can crush a beetle. But all the professors in the world can make no bugs.
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/nz6mIPrK_RJNzcxtxvAV637sssg>
Subject: Re: [I18ndir] [art] Just uploaded draft-bray-unichars-03
Tim Bray wrote in
 <CAHBU6is50TkpDsqXTp6WxdVSgE66j3gGHZ60ey2jFYbefaHFJw@mail.gmail.com>:
 |See https://www.ietf.org/archive/id/draft-bray-unichars-03.html
 |
 |A bunch of minor corrections and improvements, thanks to everyone for that,
 |especially James Manger for noticing that the ABNF was entirely wrong in
 |one place.
 |
 |The word “useless” has been replaced by “legacy”.
 |
 |I think the feedback was pretty clear that the draft needed to be more
 |opinionated; just because we document the existence of the default JSON
 |repertoire (“all the code points”) doesn’t mean that anyone should use it
 |in the present or future. So, introduced a new section “Refining Character
 |Repertoires” to highlight those issues and offer a suggestion.

In 2.2 i would not give the count on code point types.
Instead i would only give the problem statement "among Unicode
code point types .. are questionable".  This seems more generic.

In 2.2.2.2 i would not say "legacy controls", and that they are
"mostly obsolete".  ECMA-48 is very alive in at least the POSIX
aka Linux world, for many purposes, for example terminal
interaction.

"Likely to occur in data as a result of a programming error"?
Any preformatted Unix manual page will come with lots of CSI
sequences, or backspace-based ones.
ASCII NUL is the base of ISO C-style strings.  In fact many
network protocols (not enough!!) still seem to use
KEY=VALUE\0KEY=VALUE\0\0 style transports.

In 5.:

  [JSON..] It cannot be serialized into legal UTF-8, but many
  libraries will silently parse this and generate an ill-formed
  UTF-8 string.

  Implementors must be prepared to deal with these sorts of
  problematic code points.

But RFC 3629 is very clear and says in 3. (being lengthy)

   The definition of UTF-8 prohibits encoding character numbers
   between U+D800 and U+DFFF, which are reserved for use with the
   UTF-16 encoding form (as surrogate pairs) and do not directly
   represent [] characters.
   When encoding in UTF-8 from UTF-16 data, it is necessary to first
   decode the UTF-16 data to obtain character numbers, which are then
   encoded in UTF-8 as described above.  This contrasts with CESU-8
   [CESU-8], which is a UTF-8-like encoding that is not meant for ...

So even the weird JSON "string" can be made valid UTF-8, one just
has to walk around the corner.  (Possibly.)
Sorry, but _I_ do not get that JSON supports _that_ "string",
RFC 8259, 7.:

   To escape an extended character that is not in the Basic
   Multilingual Plane, the character is represented as
   a 12-character sequence, encoding the UTF-16 surrogate pair.

And then in 8.

  8.  String and Character Issues
  8.1.  Character Encoding

   JSON text exchanged between systems that are not part of a closed
   ecosystem MUST be encoded using UTF-8 [RFC3629].

This is a total contradiction, sorry.  I.  Hate.  JSON.
But that does not help anyone.
So i mean, _if_ i were to write such an RFC, _i_ would not hammer
your sentence on the table, but i would then simply refer to
RFC 3629 and say that implementors shall be prepared to convert
the JSON standard (grrr) string .. to the UTF-8 standard?

5. also says

  It is unlikely that anyone specifying a new data format would
  choose to allow this character repertoire.

And

  A protocol based on JSON could be made more robust and
  implementor-friendly by requiring that the contents of member
  names and string values contain only Useful Assignables

No.  Not me.  Sorry .. we are talking string data?
I mean, with your restriction one (possibly) cannot even generate
a protocol that carries around Linux/POSIX path names?
Except by mangling them to something likely non-reproducible (by
leaving off "evil" characters, or converting them to a replacement
character; which one, the Unicode one, or question mark?  Ah, it
must be ASCII question mark because the Unicode replacement
character is of the evil sort?).
Or have i misunderstood something ... which can very well be the
truth, of course.
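To make the surrogate point above concrete, here is a minimal Python sketch (not from any of the cited RFCs). It assumes the standard-library `json` module; the helper name `json_string_to_utf8` and the sample inputs are hypothetical. It shows that a properly paired `\uXXXX\uXXXX` escape decodes to one character number and then encodes to legal UTF-8, as RFC 3629 section 3 describes, whereas an unpaired surrogate escape is parsed but cannot be encoded unless it is replaced.

```python
import json


def json_string_to_utf8(json_text: str, errors: str = "strict") -> bytes:
    """Parse one JSON string literal and encode the result as UTF-8.

    Hypothetical helper for illustration only.
    """
    # json.loads() combines a \uD83D\uDE00-style escape pair into one
    # code point; an unpaired surrogate escape is kept as a lone surrogate.
    value = json.loads(json_text)
    # The strict UTF-8 codec refuses lone surrogates (U+D800..U+DFFF).
    return value.encode("utf-8", errors=errors)


# Paired surrogates: U+1F600 comes out as a legal four-byte sequence.
print(json_string_to_utf8(r'"\ud83d\ude00"'))

# Unpaired surrogate: parsing succeeds, strict encoding does not.
try:
    json_string_to_utf8(r'"\udead"')
except UnicodeEncodeError as exc:
    print("not encodable as UTF-8:", exc.reason)

# Lossy fallback: Python's "replace" handler substitutes ASCII "?"
# when encoding, so the original string cannot be reproduced.
print(json_string_to_utf8(r'"a\udeadb"', errors="replace"))
```

Note that in this sketch the lossy path yields an ASCII question mark rather than U+FFFD, which is exactly the kind of choice a repertoire-restricting document would have to spell out.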
So, even if you wipe away all of the above, a hint on replacement
characters in a document that restricts the usable set of Unicode
characters is well worth a thought.

Thank you.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)