Re: http charset labelling

Keld Jørn Simonsen <keld@dkuug.dk> Thu, 01 February 1996 23:49 UTC

Message-Id: <199602011650.RAA22955@dkuug.dk>
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keld Jørn Simonsen <keld@dkuug.dk>
Date: Thu, 01 Feb 1996 17:50:26 +0100
In-Reply-To: Larry Masinter <masinter@parc.xerox.com> "Re: http charset labelling" (Feb 1, 3:44)
X-Charset: ISO-8859-1
X-Char-Esc: 29
Mime-Version: 1.0
Content-Type: Text/Plain; Charset="ISO-8859-1"
Content-Transfer-Encoding: 8bit
Mnemonic-Intro: 29
X-Mailer: Mail User's Shell (7.2.2 4/12/91)
To: Larry Masinter <masinter@parc.xerox.com>
Subject: Re: http charset labelling
Cc: uri@bunyip.com
X-Orig-Sender: owner-uri@bunyip.com
Precedence: bulk

Larry Masinter writes:

> It would not require changing any existing specification for someone
> to create a web server that interpreted
> 
>    http://host.dom/encoding/selector
> 
> to mean that 'encoding' was a particular encoding of the given
> selector. The web server could even return a 'Location:' header in the
> results that would give a canonical encoding, just so as not to
> confuse caches.
> 
> 'encoding' could even be UTF7, for example.

So an example could be:

     http://host.dom/UTF7/index.html   ?

The server should then recognize the first part of the
locator as a charset name, and translate the rest of the locator
into the server's own charset. This should only be done when
the first part is one of a set of recognized charsets.
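
A minimal sketch of that server-side step, in Python. The label table,
the server charset and the function name translate_locator are all
assumptions made for illustration; nothing in this thread or in any
specification defines them. Decoding errors are not handled.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical table: charset labels the server agrees to recognize,
# mapped to the matching Python codec names.
RECOGNIZED = {"UTF7": "utf-7", "UTF8": "utf-8"}

# Assumed native charset of the server.
SERVER_CHARSET = "iso-8859-1"


def translate_locator(path):
    """Translate '/<charset>/<selector>' into the server charset.

    The path is returned unchanged when the first part is not one of
    the recognized charset labels.
    """
    first, sep, rest = path.lstrip("/").partition("/")
    codec = RECOGNIZED.get(first.upper())
    if codec is None or not sep:
        return path
    # Undo %XX escapes, decode with the labelled charset, then
    # re-encode in the server charset and escape again.
    text = unquote_to_bytes(rest).decode(codec)
    return "/" + quote(text.encode(SERVER_CHARSET))
```

With these assumptions, "/UTF7/index.html" becomes "/index.html", and
a path such as "/docs/index.html" is left alone because "docs" is not
a recognized charset label.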

This notation should not appear on business cards etc.; I think we
all agree on this. Nor should it appear in URLs in HTML documents;
I think we all agree on that too.

Would there not then be a problem if the charset is automatically
inserted by the browser? The browser would not know which
servers understand the new convention. So a browser enhanced
in this way would create a lot of havoc.

I think there is a clear migration path stipulated in the HTTP spec,
and that is via the major and minor numbers in the HTTP/M.N
version notation and the rules laid down there, which say that
within the same major version a server should just respond
as normal, ignoring the headers that it does not understand.
That's why I advocate a header-based solution.
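
A rough sketch of how a header-based variant could behave, in Python.
The header name "uri-charset" is purely hypothetical (no such header
is named in this thread), as are the charset set, the server charset
and the function name. Header names are assumed to be lowercased
before lookup.

```python
from urllib.parse import quote, unquote_to_bytes

# Assumed native charset of the server.
SERVER_CHARSET = "iso-8859-1"

# Hypothetical set of charset labels this server understands.
KNOWN_CHARSETS = {"utf-8", "iso-8859-1"}


def resolve_selector(path, headers):
    """Return the request path translated into the server charset.

    'headers' maps lowercased header names to values. When the
    hypothetical 'uri-charset' header is absent or names an unknown
    charset, the path is used as-is, which is also what a server
    that ignores the header would do.
    """
    label = headers.get("uri-charset", "").strip().lower()
    if label not in KNOWN_CHARSETS:
        return path
    # Undo %XX escapes, decode with the labelled charset, then
    # re-encode in the server charset and escape again.
    text = unquote_to_bytes(path).decode(label)
    return quote(text.encode(SERVER_CHARSET))
```

The point of the sketch is that the locator itself is untouched: a
server that does not know the header simply sees the same path it
would have seen from an unenhanced browser.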

keld