Re: revised "generic syntax" internet draft

Keld Jørn Simonsen <> Tue, 22 April 1997 11:07 UTC

Message-Id: <>
From: Keld Jørn Simonsen <>
Date: Tue, 22 Apr 1997 12:41:33 +0200
In-Reply-To: "Martin J. Duerst" <> "Re: revised "generic syntax" internet draft" (Apr 19, 18:54)
X-Charset: ISO-8859-1
X-Char-Esc: 29
Mime-Version: 1.0
Content-Type: Text/Plain; Charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
Mnemonic-Intro: 29
X-Mailer: Mail User's Shell (7.2.2 4/12/91)
To: "Martin J. Duerst" <>, John C Klensin <>
Subject: Re: revised "generic syntax" internet draft
Cc: Dan Oscarsson <>
Precedence: bulk

"Martin J. Duerst" writes:

> You might come to the state where you have to view UTF-8 with
> a terminal emulator or editor not set to view it, where the
> above effects are occurring, but this should actually be rare.
> And it wouldn't be better if you looked at ideographic characters
> with an 8859-1 editor or so.
> First, we don't want to have UTF-8 and 8859-1 (or any other legacy
> coding) mixed in the same document. Once everything is working as
> envisioned, if you transport a Western European URL in 8859-1,
> you transport the characters, as 8859-1. It's only when this is
> changed to %HH, or to binary 8-bit URLs as such which lack any
> information on character encoding, that you change to UTF-8.

Pardon me, but should the %HH notation not be transparent,
in the sense of a MIME transfer encoding? It should not depend
on whether the character encoding is 8859-1, UTF-8, SJIS or whatever.
%HH encodes bytes, independent of the character encoding.
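
To illustrate what I mean by transparent, here is a minimal sketch in
Python. It is only an illustration, not anything taken from the draft;
the helper name percent_encode and the sample string are my own
assumptions. The same characters give different %HH sequences depending
on which bytes the sender produced, and decoding gives back exactly
those bytes without knowing the charset.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def percent_encode(raw: bytes) -> str:
    """Escape bytes as %HH (hypothetical helper for this illustration).

    safe="" means only the unreserved characters stay literal;
    every other byte becomes %HH, whatever charset it came from.
    """
    return quote_from_bytes(raw, safe="")


text = "Jørn"  # sample string, chosen only for illustration

# Same characters, two different byte sequences, two different escapes.
latin1_escaped = percent_encode(text.encode("iso-8859-1"))
utf8_escaped = percent_encode(text.encode("utf-8"))

print(latin1_escaped)  # J%F8rn
print(utf8_escaped)    # J%C3%B8rn

# Decoding %HH returns the original bytes; the charset is never consulted.
for charset in ("iso-8859-1", "utf-8", "shift_jis"):
    raw = "日本".encode(charset) if charset == "shift_jis" else text.encode(charset)
    assert unquote_to_bytes(percent_encode(raw)) == raw
```

So the receiver of "J%F8rn" gets the byte F8 back, but nothing in the
%HH notation itself says whether that byte is 8859-1 or part of
something else. That information has to come from elsewhere.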