Re: [I18ndir] Fwd: Re: Working Group Last Call: Structured Headers for HTTP
"Asmus Freytag (c)" <asmusf@ix.netcom.com> Thu, 06 February 2020 04:18 UTC
To: John C Klensin <john-ietf@jck.com>, i18ndir@ietf.org
References: <fd66eb72-2777-3f34-026b-00f4084b88ea@ix.netcom.com> <a7652163-6815-457b-b6b4-96affe237a32@ix.netcom.com> <A942D88A37437ED525455FD6@PSB>
From: "Asmus Freytag (c)" <asmusf@ix.netcom.com>
Message-ID: <caa945d8-e8e6-b206-710d-732b0e944c02@ix.netcom.com>
Date: Wed, 05 Feb 2020 20:18:06 -0800
In-Reply-To: <A942D88A37437ED525455FD6@PSB>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/Ma7GrB41dL0-9YL7B445JjxKRYs>
Subject: Re: [I18ndir] Fwd: Re: Working Group Last Call: Structured Headers for HTTP

On 2/5/2020 5:23 PM, John C Klensin wrote:
> Asmus,
>
> In at least my case and I assume in Patrik's and John Levine's,
> when I said "ASCII", I might better have said "the ASCII graphic
> subset of Unicode", or "the Basic Latin repertoire of Unicode",
> or "Unicode code points in the range U+0030 (or perhaps U+0020)
> through U+007E (or maybe U+007A)", i.e., a repertoire, not a
> specific coding standard.

Right; so far I'm with you. I do agree that, as far as repertoires go, it has some simplicity to it, although even with "ASCII" you run into questions of how you want to handle case.

> Given that all of the documents to
> this particular discussion are pushing UTF-8 and that a Unicode
> string from that repertoire in UTF-8 is indistinguishable at the
> octet level from an ASCII string encoded by right-justifying each
> seven-bit ASCII code point in an eight bit byte (octet) with a
> leading zero bit, the shortcut should be obvious. But you are
> correct in that we were talking about repertoire, not encoding
> or choice of character set.

I also agree that the ASCII repertoire has the nice property that it maps to the same octets whether in UTF-8, Latin-1, or many other character sets.

Whether a protocol needs to support a CCS declaration, beyond explicitly identifying data as UTF-8 (rather than implicitly via choice of protocol), is a separate question.

Allowing the protocol to be expressed in any CCS other than UTF-8 is something that, in today's environment, should have to clear a substantial hurdle: it would need to promise a really clear win in compatibility with key installed technology, one that is not outweighed by the costs it imposes downstream.

>
> That said, it may be worth remembering that, independent of what
> operating systems may or may not do, early Web specifications
> were written assuming ISO 8859-1 and that there are almost
> certainly some applications out there that assume that CCS when
> they see octets with the leading bit on. The ASCII repertoire
> as described above is still a proper subset, but, because 8859-1
> (and 8859-x more generally) are not UTF-8-compatible, the
> ability to define the CCS and encoding in use may still be
> necessary even if the necessity is waning.

That is precisely the reason to make sure new protocols don't add to this problem, other than with a gun to their heads, so to speak. Definitely not on a "there might be" level of reasoning, I'd think.

A./

>
> best,
> john
>
>
>
> --On Wednesday, February 5, 2020 14:50 -0800 Asmus Freytag
> <asmusf@ix.netcom.com> wrote:
>
>> (didn't go out to the list when I first sent it)
>>
>> When I read the word "exception" I always think of this:
>>
>> When we built the first Unicode-enabled OS (Windows NT), we
>> had a long discussion of which "strings" in the OS needed to
>> be Unicode.
>>
>> Some thought that there was a clear dividing line between data
>> and what would be called "protocol values" in another context.
>>
>> Some of the latter did look like they were easily limited to
>> ASCII; but everywhere we found "exceptions". There might be a
>> set of enumerable tokens, but it allowed extended values that
>> were network or file identifiers.
>>
>> After exhaustively researching everything, the conclusion was
>> that every single string in the OS had to be Unicode (and
>> making any exceptions was either not possible, or not worth
>> the effort).
>>
>> However, while all strings were encoded in Unicode, not all
>> string values were allowed. While file names could be
>> localized (within the limits of file system syntax), some of
>> the enumerated strings were left limited to the ASCII set in
>> repertoire (even if encoded in Unicode).
>>
>> Reading this discussion (and I'm sorry I don't have the time
>> right now to properly delve into the details) it seems that a
>> natural recommendation would be to require Unicode for any
>> native representation and, if necessary (or possible), limit
>> the repertoire.
>>
>> This also requires a definition of the matching protocol for
>> all strings that are to be matched as part of the protocol (or
>> should be searchable). For any format, that would cover issues
>> of casing, white space handling etc., but for Unicode, by
>> necessity, that also requires defining the normalization form
>> to be used.
>>
>> A./
>>
>> PS: given how few systems these days natively operate in any
>> character set other than Unicode, I am always astonished at
>> the lengths to which people go to justify not making something
>> native Unicode. They just pick up conversion issues when they
>> use platform libraries to do any work or display.
>>
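
A minimal Python sketch of the repertoire-versus-encoding points made in this message. It shows that strings from the ASCII repertoire yield the same octets in ASCII, ISO 8859-1 and UTF-8, that a repertoire limit can be checked on Unicode strings independently of encoding, and that matching needs an agreed normalization and case rule. The function names are hypothetical, and the choice of NFC plus case folding as the matching rule is an assumption for illustration only; the thread does not prescribe a particular normalization form or case policy.

```python
import unicodedata


def encodes_identically(text: str) -> bool:
    """Return True if text yields the same octets in ASCII, ISO 8859-1 and UTF-8."""
    try:
        ascii_octets = text.encode("ascii")
    except UnicodeEncodeError:
        # Outside the ASCII repertoire, so the three encodings cannot all agree.
        return False
    return ascii_octets == text.encode("latin-1") == text.encode("utf-8")


def in_ascii_graphic_repertoire(text: str) -> bool:
    """Return True if every code point lies in U+0020..U+007E.

    The string is still a Unicode string; only the repertoire is limited.
    """
    return all(0x20 <= ord(ch) <= 0x7E for ch in text)


def match_key(text: str) -> str:
    """Build a comparison key: NFC, then case folding, then NFC again.

    This is an assumed matching policy, not one taken from the thread.
    """
    folded = unicodedata.normalize("NFC", text).casefold()
    return unicodedata.normalize("NFC", folded)


# ASCII-repertoire text: one octet sequence regardless of the three encodings.
print(encodes_identically("Content-Type"))

# U+00E9 is in both ISO 8859-1 and Unicode, but the octets differ.
print("\u00e9".encode("latin-1"))  # b'\xe9'
print("\u00e9".encode("utf-8"))    # b'\xc3\xa9'

# Precomposed and decomposed spellings match only once normalization is defined.
print(match_key("Caf\u00e9") == match_key("CAFE\u0301"))
```

The second and third print calls show why a receiver that assumes ISO 8859-1 misreads UTF-8 octets with the leading bit set, while anything inside the ASCII repertoire is unaffected.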