Re: revised "generic syntax" internet draft

Keld Jørn Simonsen <keld@dkuug.dk> Tue, 15 April 1997 23:00 UTC

Message-Id: <199704152232.AAA29896@dkuug.dk>
From: Keld Jørn Simonsen <keld@dkuug.dk>
Date: Wed, 16 Apr 1997 00:32:31 +0200
In-Reply-To: John C Klensin <klensin@mci.net> "Re: revised "generic syntax" internet draft" (Apr 15, 19:09)
X-Charset: ISO-8859-1
X-Char-Esc: 29
Mime-Version: 1.0
Content-Type: Text/Plain; Charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
Mnemonic-Intro: 29
X-Mailer: Mail User's Shell (7.2.2 4/12/91)
To: John C Klensin <klensin@mci.net>, Dan Oscarsson <Dan.Oscarsson@trab.se>
Subject: Re: revised "generic syntax" internet draft
Cc: Harald.T.Alvestrand@uninett.no, uri@bunyip.com, fielding@kiwi.ics.uci.edu
Sender: owner-uri@bunyip.com
Precedence: bulk

John Klensin writes about the use of UTF-8 and its penalties in size
and readability for various user communities. Some remarks:

I think the size issue is not important. Consider how many
bytes there are in a packet, and the typical round-trip latencies:
adding, say, 5-50 bytes to a URL because of UTF-8 expansion is not
significant, considering also how often URLs occur in the normal
retrieval of web pages. The performance penalty would be close
to unnoticeable, IMHO.
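
To put a rough number on the expansion, here is a minimal Python sketch. The
path is a made-up example and the helper name `sizes` is my own; it compares
ISO-8859-1 with UTF-8, both as raw bytes and in %-escaped form. Other
scripts will expand differently, so treat this as an illustration only.

```python
from urllib.parse import quote

def sizes(path: str, charset: str) -> tuple[int, int]:
    """Return (raw byte length, %-escaped length) of `path` in `charset`."""
    raw = len(path.encode(charset))
    # quote() encodes the string in `charset` and %-escapes every
    # byte that is not an unreserved ASCII character or "/".
    escaped = len(quote(path, safe="/", encoding=charset))
    return raw, escaped

# Hypothetical path with three non-ASCII letters.
path = "/smørrebrød/Jørn"

latin1_raw, latin1_escaped = sizes(path, "iso-8859-1")
utf8_raw, utf8_escaped = sizes(path, "utf-8")

print("raw bytes:  ", latin1_raw, "->", utf8_raw)
print("%-escaped:  ", latin1_escaped, "->", utf8_escaped)
```

For this path the growth is 3 bytes raw and 9 bytes escaped, which is at the
low end of the 5-50 byte range mentioned above.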

You must also weigh this against the advantages of UTF-8,
namely a clear and easy migration path for the majority of URLs
today, which are encoded in US-ASCII: the migration is simply no change,
since every US-ASCII string is already valid UTF-8.
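
The "no change" point can be checked directly. A small Python sketch, using
a placeholder URL of my own choosing and a hypothetical helper name:

```python
def is_unchanged_under_utf8(url: str) -> bool:
    """True if `url` is pure US-ASCII and its UTF-8 bytes are identical."""
    try:
        ascii_bytes = url.encode("ascii")
    except UnicodeEncodeError:
        # Not a US-ASCII URL, so the claim does not apply to it.
        return False
    return ascii_bytes == url.encode("utf-8")

# Placeholder URL, not taken from the discussion.
url = "http://example.org/docs/index.html?lang=da"
print(is_unchanged_under_utf8(url))
```

Any URL made only of US-ASCII characters gives the same octets either way,
so existing software and existing URLs need no conversion.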

Maybe John wants to be able to use other charsets for encoding
a URL. Some time ago I actually proposed a solution: labelling
the encoding of the URL in a "URL-charset:" header, with
UTF-8 as the default. I remember somebody else also proposing
charset labelling, on the URL line. I have not yet evaluated
such proposals against Martin and François's proposals, but it
is clear that the intended functionality is the same - and my old
proposal could be seen as an extension to Martin/François's - but I
am not sure it is necessary.
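
A minimal Python sketch of how a receiver might apply such a label. Only
the header name "URL-charset:" and the UTF-8 default come from the proposal
as described above; the function name, the dictionary of headers, and the
exact lookup are assumptions for illustration, not a specification.

```python
from typing import Mapping
from urllib.parse import unquote_to_bytes

DEFAULT_URL_CHARSET = "utf-8"  # default when no label is present

def decode_url(escaped: str, headers: Mapping[str, str]) -> str:
    """Decode a %-escaped URL using the charset named in a URL-charset header.

    Assumption: `headers` maps header names to values with the exact
    spelling "URL-charset"; a real implementation would need
    case-insensitive header matching.
    """
    charset = headers.get("URL-charset", DEFAULT_URL_CHARSET)
    # Turn %XX escapes back into octets, then interpret them in `charset`.
    return unquote_to_bytes(escaped).decode(charset)

# Unlabelled URL: octets are taken as UTF-8.
print(decode_url("/J%C3%B8rn", {}))
# Labelled URL: octets are taken as ISO-8859-1.
print(decode_url("/J%F8rn", {"URL-charset": "iso-8859-1"}))
```

Both calls yield the same text, which is the sense in which labelling would
be an extension: unlabelled URLs behave exactly as in the UTF-8-only scheme.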

Keld