Re: http charset labelling

Keld Jørn Simonsen <keld@dkuug.dk> Mon, 12 February 1996 21:40 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa25330; 12 Feb 96 16:40 EST
Received: from CNRI.Reston.VA.US by IETF.CNRI.Reston.VA.US id aa25326; 12 Feb 96 16:40 EST
Received: from services.Bunyip.COM by CNRI.Reston.VA.US id aa15776; 12 Feb 96 16:40 EST
Received: (from daemon@localhost) by services.bunyip.com (8.6.10/8.6.9) id PAA00946 for uri-out; Mon, 12 Feb 1996 15:45:25 -0500
Received: from mocha.bunyip.com (mocha.Bunyip.Com [192.197.208.1]) by services.bunyip.com (8.6.10/8.6.9) with SMTP id PAA00941 for <uri@services.bunyip.com>; Mon, 12 Feb 1996 15:45:22 -0500
Received: from dkuug.dk by mocha.bunyip.com with SMTP (5.65a/IDA-1.4.2b/CC-Guru-2b) id AA11775 (mail destined for uri@services.bunyip.com); Mon, 12 Feb 96 15:45:16 -0500
Received: (from keld@localhost) by dkuug.dk (8.6.12/8.6.12) id VAA17228; Mon, 12 Feb 1996 21:44:09 +0100
Message-Id: <199602122044.VAA17228@dkuug.dk>
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keld Jørn Simonsen <keld@dkuug.dk>
Date: Mon, 12 Feb 1996 21:44:08 +0100
In-Reply-To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp> "Re: http charset labelling" (Feb 7, 5:45)
X-Charset: ISO-8859-1
X-Char-Esc: 29
Mime-Version: 1.0
Content-Type: Text/Plain; Charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Mnemonic-Intro: 29
X-Mailer: Mail User's Shell (7.2.2 4/12/91)
To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>, Gavin Nicol <gtn@ebt.com>
Subject: Re: http charset labelling
Cc: masinter@parc.xerox.com, uri@bunyip.com
X-Orig-Sender: owner-uri@bunyip.com
Precedence: bulk

Masataka Ohta writes:

> > I guess you, I, and a lot of other people think that if people really
> > want to be global, they should avoid using kanji, or whatever, in
> > URLs. However, as a person at Astec said, and I agree, people *will*
> > put kanji into resource names, and they *will* expect it to work. As
> > such, I think it better to design a system that can handle *all*
> > cases, as users expect them to be handled.
> 
> Just make viewers bounce any URL with the 8th bit set or, at least,
> mask the bit. '%' notation should still be accepted.
> 
> It is also a good idea to state the same thing at the protocol
> specification level:
> 
> 	8th bit of URL MUST be 0. Should a malformed URL be found,
> 	its 8th bit MAY be masked to be 0. Otherwise the URL MUST
> 	be rejected.
> 
> Then, non-ASCII URLs will disappear.
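
The proposed rule can be sketched in a few lines of Python. This is
only an illustration of the wording quoted above, not part of any
specification: the function name enforce_seven_bit and the mask
parameter are hypothetical, and the URL is assumed to arrive as raw
octets.

```python
def enforce_seven_bit(url: bytes, mask: bool = False) -> bytes:
    """Apply the quoted rule to the raw octets of a URL (hypothetical helper)."""
    if all(octet < 0x80 for octet in url):
        # Well-formed: the 8th bit of every octet is already 0.
        # '%' notation is plain ASCII, so it passes untouched.
        return url
    if mask:
        # "MAY be masked": clear the 8th bit of every octet.
        return bytes(octet & 0x7F for octet in url)
    # "Otherwise the URL MUST be rejected."
    raise ValueError("URL contains an octet with the 8th bit set")
```

Note that masking silently turns the octet 0xE9 into 0x69 ("i"), so
a masked URL names a different resource than the one that was typed.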

Well, URLs do not have a charset per se; they are abstract.
So URLs with % escapes in them may actually carry more than ASCII.
In fact they could carry anything at all, such as a UTF-8 URL.
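
A small Python sketch may make this concrete. The escaped path below
is a made-up example, and decode_escapes is a hypothetical helper;
the point is only that the same ASCII string yields octets that can
be read under whatever charset the two ends agree on.

```python
from urllib.parse import unquote_to_bytes


def decode_escapes(url: str, charset: str) -> str:
    """Undo the % escapes, then read the octets under an assumed charset."""
    octets = unquote_to_bytes(url)
    return octets.decode(charset)


wire = "/%E6%97%A5%E6%9C%AC"      # pure ASCII, as seen on the wire
octets = unquote_to_bytes(wire)   # the six escaped octets plus "/"

as_utf8 = decode_escapes(wire, "utf-8")         # two kanji after the "/"
as_latin1 = decode_escapes(wire, "iso-8859-1")  # six unrelated characters
```

Nothing in the URL itself says which reading is intended, which is
the sense in which the URL has no charset of its own.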

I do not care how URLs look at the HTTP level;
they may have as many % escapes in them as needed, as long as the
URLs we write on business cards, in magazines, etc. can be
natural, that is, in every language and script of the world.
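
As a rough illustration of that split between the written form and
the HTTP-level form, here is a Python sketch. It assumes UTF-8 as the
charset behind the escapes, which is only one possibility; the helper
names to_wire and to_written and the Danish path are invented for the
example.

```python
from urllib.parse import quote, unquote


def to_wire(written_path: str) -> str:
    """Written form -> ASCII-only form with % escapes (UTF-8 assumed)."""
    return quote(written_path, safe="/", encoding="utf-8")


def to_written(wire_path: str) -> str:
    """ASCII-only form -> written form (UTF-8 assumed)."""
    return unquote(wire_path, encoding="utf-8")


written = "/blåbærgrød"    # what might be printed on a business card
wire = to_wire(written)    # "/bl%C3%A5b%C3%A6rgr%C3%B8d"
```

The person reading the card only ever sees the first form; the
second is what software would exchange.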

Keld