Re: revised "generic syntax" internet draft

John C Klensin <klensin@mci.net> Wed, 16 April 1997 15:12 UTC

Received: from cnri by ietf.org id aa28924; 16 Apr 97 11:12 EDT
Received: from services.Bunyip.Com by CNRI.Reston.VA.US id aa13306; 16 Apr 97 11:12 EDT
Received: (from daemon@localhost) by services.bunyip.com (8.8.5/8.8.5) id KAA04636 for uri-out; Wed, 16 Apr 1997 10:48:19 -0400 (EDT)
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by services.bunyip.com (8.8.5/8.8.5) with SMTP id KAA04627 for <uri@services.bunyip.com>; Wed, 16 Apr 1997 10:48:15 -0400 (EDT)
Received: from ns.jck.com by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA22557 (mail destined for uri@services.bunyip.com); Wed, 16 Apr 97 10:48:13 -0400
Received: from tp7.Jck.com ("port 2172"@tp7.jck.com) by a4.jck.com (PMDF V5.1-8 #21705) with SMTP id <0E8QJS92S00KI2@a4.jck.com> for uri@bunyip.com; Wed, 16 Apr 1997 10:48:10 -0400 (EDT)
Date: Wed, 16 Apr 1997 10:48:08 -0400
From: John C Klensin <klensin@mci.net>
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <20902.861179716@munken.uninett.no>
To: Harald.T.Alvestrand@uninett.no
Cc: fielding@kiwi.ics.uci.edu, uri@bunyip.com, Dan Oscarsson <Dan.Oscarsson@trab.se>
Reply-To: John C Klensin <klensin@mci.net>
Message-Id: <SIMEON.9704161008.G@tp7.Jck.com>
Mime-Version: 1.0
X-Mailer: Simeon for Win32 Version 4.1.1 Build (14)
Content-Type: TEXT/PLAIN; CHARSET="US-ASCII"
Priority: NORMAL
X-Authentication: none
Sender: owner-uri@bunyip.com
Precedence: bulk

On Wed, 16 Apr 1997 10:35:16 +0200 
Harald.T.Alvestrand@uninett.no wrote:

> Factoid:
> 
> UTF-8 is not user-friendly in 8859-1; the standard coding octets for
> putting the 8859-1 charset into UTF-8 insert one character in front of
> each character, and also change the last character for the 4 uppermost
> columns of the 8859-1 character table.

My apologies.  I should have said something more like "more 
user-friendly for Latin-1 than it is for upper-end 
ideographic characters, where it deteriorates even more 
severely" :-(
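
To make the factoid concrete, here is a minimal sketch, assuming 
Python 3 and its built-in "utf-8" codec.  The helper name 
utf8_octets and the sample characters are illustrative choices, 
not anything from the draft.  It prints the UTF-8 octets for an 
ASCII letter, two characters from the upper half of 8859-1, and 
one ideograph.

```python
def utf8_octets(ch: str) -> list[str]:
    """Return the UTF-8 encoding of one character as hex octets."""
    return [f"{b:02X}" for b in ch.encode("utf-8")]


# Hypothetical sample set, chosen only to illustrate the cases discussed.
samples = [
    ("ASCII letter A (8859-1 0x41)", "\u0041"),
    ("no-break space (8859-1 0xA0)", "\u00a0"),
    ("e with acute (8859-1 0xE9)", "\u00e9"),
    ("CJK ideograph U+4E2D", "\u4e2d"),
]

for label, ch in samples:
    octets = utf8_octets(ch)
    print(f"{label}: {' '.join(octets)} ({len(octets)} octets)")
```

Under these assumptions, 0xA0 comes out as C2 A0 (one octet 
inserted in front, last octet unchanged), 0xE9 comes out as 
C3 A9 (one octet inserted and the last octet changed, since 0xE9 
sits in the four uppermost columns), and the ideograph takes 
three octets.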

Given the bad behavior *even* for 8859-1, could someone 
please remind me why we are pushing the thing rather than a 
straight 16- or 32-bit encoding, with compression if needed?  
(Please, that is a rhetorical question only -- we don't 
need another flaming chain on the subject).

    john