Re: revised "generic syntax" internet draft

Harald.T.Alvestrand@uninett.no Wed, 16 April 1997 09:12 UTC

X-Mailer: exmh version 1.6.7 5/3/96
From: Harald.T.Alvestrand@uninett.no
To: John C Klensin <klensin@mci.net>
Cc: Dan Oscarsson <Dan.Oscarsson@trab.se>, uri@bunyip.com, fielding@kiwi.ics.uci.edu
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: Your message of "Tue, 15 Apr 1997 11:55:43 EDT." <SIMEON.9704151143.E@tp7.Jck.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Date: Wed, 16 Apr 1997 10:35:16 +0200
Message-Id: <20902.861179716@munken.uninett.no>
X-MIME-Autoconverted: from quoted-printable to 8bit by services.bunyip.com id EAA26609
Sender: owner-uri@bunyip.com
Precedence: bulk
Content-Transfer-Encoding: quoted-printable
X-MIME-Autoconverted: from 8bit to quoted-printable by services.bunyip.com id EAA26613

Factoid:

UTF-8 is not user-friendly in 8859-1; the standard encoding for
putting the 8859-1 charset into UTF-8 inserts one octet (C2 or C3) in
front of each non-ASCII character, and also changes the last octet for
the 4 uppermost columns (C0-FF) of the 8859-1 character table.

So "Grøtavær" (my wife's hometown) becomes "GrÃ¸tavÃ¦r" if you "forget"
to put the UTF-8 back into 8859-1 and just dump it to an 8859-1 screen.

ø=F8 = 1111 1000 -> 11000011 10111000 = C3 B8 = Ã¸ (A with tilde + cedilla)
æ=E6 = 1110 0110 -> 11000011 10100110 = C3 A6 = Ã¦ (A with tilde + broken bar)
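
For anyone who wants to check the octets above, here is a minimal Python
sketch. The function names are made up for illustration; the only assumption
is that "dumping to an 8859-1 screen" amounts to decoding the UTF-8 octets
as latin-1.

```python
def utf8_octets(ch: str) -> str:
    """Return the UTF-8 octets of a character as hex, e.g. 'C3 B8'."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


def utf8_seen_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then read the octets back as if they were 8859-1."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for ch in "øæ":
        print(ch, f"{ord(ch):02X}", "->", utf8_octets(ch))
    print(utf8_seen_as_latin1("Grøtavær"))
```

Characters below 0x80 pass through unchanged, so only the upper half of
the 8859-1 table picks up the extra octet.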

(If some text has > 5% A-with-accents, it's probably UTF-8 encoded 8859-1....)
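
The rule of thumb above could be sketched roughly like this in Python. The
5% threshold comes from the remark; counting only the two lead characters
(A-circumflex and A-tilde, i.e. octets C2 and C3) and the function names are
my assumptions, so treat it as a toy heuristic rather than a real detector.

```python
def looks_like_misread_utf8(text: str, threshold: float = 0.05) -> bool:
    """Guess whether text is UTF-8 that was displayed as 8859-1.

    Counts A-circumflex (C2) and A-tilde (C3), the two lead octets that
    UTF-8 puts in front of every 8859-1 character above 0x7F.
    """
    if not text:
        return False
    suspects = sum(1 for ch in text if ch in "\u00c2\u00c3")
    return suspects / len(text) > threshold


def repair(text: str) -> str:
    """Try to undo the misreading; return the input unchanged if that fails."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text
```

Text that really is 8859-1 usually fails the UTF-8 decode step, so
repair() leaves it alone.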

                           Harald A