Re: http charset labelling
Keld Jørn Simonsen <keld@dkuug.dk> Tue, 13 February 1996 00:13 UTC
Message-Id: <199602122335.AAA22028@dkuug.dk>
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Keld Jørn Simonsen <keld@dkuug.dk>
Date: Tue, 13 Feb 1996 00:35:21 +0100
In-Reply-To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
"Re: http charset labelling" (Feb 7, 5:16)
X-Charset: ISO-8859-1
X-Char-Esc: 29
Mime-Version: 1.0
Content-Type: Text/Plain; Charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Mnemonic-Intro: 29
X-Mailer: Mail User's Shell (7.2.2 4/12/91)
To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>, Gavin Nicol <gtn@ebt.com>
Subject: Re: http charset labelling
Cc: masinter@parc.xerox.com, uri@bunyip.com
X-Orig-Sender: owner-uri@bunyip.com
Precedence: bulk
Masataka Ohta writes:

> > The results might
> > vary widely depending on whether the data was transmitted as SJIS,
> > EUC or UTF-8, if there is no encoding information.
>
> Because of duplicated shape of 'A' for Latin and Greek capital
> letter 'A' and alpha, and because of duplicated encoding of Big5,
> encoding information, in general, is no fix for unique conversion
> from shape on a paper to internal code.
>
> Don't try to do something proven to be impossible.

Well, Ohta, there are a number of ways to do it. One is to treat
Greek capital letter alpha, Latin capital letter A, and the Cyrillic
letter A as equivalent for matching; similar equivalence specifications
may be available for other characters. Narrow and full-width letters
may also be treated as equivalent.

Anyway, it should be clear from the context which "A" is meant: if it
appears together with Greek characters it is most likely an alpha, if
with Latin characters it is most likely a Latin letter, and so on.
It is up to the maker of the URL to ensure that the intended audience
will get the message, and some careful choices can be made there.

keld
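As a rough illustration of the equivalence idea above, here is a minimal
sketch in Python. It is not anything proposed in this thread: the table
LOOKALIKES and the functions match_key, equivalent and likely_script are
hypothetical names, and the table covers only the three "A" letters
mentioned above. Full-width folding is delegated to Unicode NFKC
normalization, which is an assumption about how one might implement it.

```python
import unicodedata
from collections import Counter

# Hypothetical, deliberately tiny equivalence table: look-alike capitals
# are folded onto the Latin letter. A usable table would be far larger.
LOOKALIKES = {
    "\u0391": "A",  # GREEK CAPITAL LETTER ALPHA
    "\u0410": "A",  # CYRILLIC CAPITAL LETTER A
}


def match_key(text: str) -> str:
    """Return a key under which look-alike spellings compare equal."""
    # NFKC folds compatibility variants, e.g. fullwidth U+FF21 to 'A'.
    folded = unicodedata.normalize("NFKC", text)
    return "".join(LOOKALIKES.get(ch, ch) for ch in folded)


def equivalent(a: str, b: str) -> bool:
    """True if the two strings match once look-alikes are folded."""
    return match_key(a) == match_key(b)


def likely_script(neighbours: str):
    """Guess the script of an ambiguous letter from the letters around it.

    Uses the first word of each neighbour's Unicode character name
    (e.g. 'GREEK', 'LATIN', 'CYRILLIC') and returns the most common one,
    or None if there are no letters to go by.
    """
    counts = Counter(
        unicodedata.name(ch, "").split(" ")[0]
        for ch in neighbours
        if ch.isalpha()
    )
    return counts.most_common(1)[0][0] if counts else None
```

With this sketch, equivalent("A", "\u0391") holds, and
likely_script applied to a run of Greek letters yields "GREEK", which is
the context rule described above in its simplest form.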