Re: revised "generic syntax" internet draft

Jon Knight <> Wed, 16 April 1997 10:56 UTC

Date: Wed, 16 Apr 1997 11:04:23 +0100
From: Jon Knight <>
To: Gary Adams - Sun Microsystems Labs BOS <>
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <libSDtMail.9704151153.13046.gra@zeppo>
Message-Id: <>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Precedence: bulk

On Tue, 15 Apr 1997, Gary Adams - Sun Microsystems Labs BOS wrote:
> Using the HotJava browser yesterday to view
> I was able to manually select the "View"->"Character Set" -> "Other" -> UTF8   
> and see the accented characters in the document text as well as in the 
> presentation of the URL. This worked for the 8bit UTF8 bytes, but was 
> not implemented for the %HH escaped characters. This would be a very
> useful feature to support in an I18N browser.
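
For illustration, the feature asked for above amounts to decoding the %HH
escapes back to bytes and then interpreting those bytes as UTF-8 before
showing the URL.  A minimal Python sketch follows; the path is a made-up
example, not the URL under discussion, and it assumes the escapes really do
carry UTF-8 bytes:

```python
from urllib.parse import quote, unquote

# Hypothetical path with accented characters (an assumption for this sketch,
# not the URL from the thread).
path = "/résumé/naïve.html"

# Percent-encode the UTF-8 bytes of every non-ASCII character as %HH.
escaped = quote(path, safe="/", encoding="utf-8")

# Turn the %HH escapes back into bytes and read them as UTF-8 for display.
# errors="replace" keeps the display step from failing on invalid sequences.
displayed = unquote(escaped, encoding="utf-8", errors="replace")

print(escaped)
print(displayed)
```

Both forms name the same resource; only the presentation to the user differs.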

A few more datapoints on the above URL:

* Netscape Navigator 3.01 for X11 running under SunOS 4.1.4 (as
  are all the tests below) displays both that page and the two pages
  linked to from it (or is it one page with two different URLs?  Whatever
  - they both get displayed).  One of the URLs has lots of accented
  characters in it, which get displayed in the URL window, in the above
  document's text and in the bottom left hand corner when the cursor is
  over the appropriate URL in the above document (Netscape is set to have
  a document encoding of "Japanese (auto-detect)" by the way). 

* X Mosaic 2.7b5 doesn't work with the above page or the pages linked to
  from it.  As far as I can tell, this is because there is a charset
  attribute following the "text/html" on the Content-Type header; I think 
  this is confusing it.

* Telnet (yes, I use telnet to get HTML pages once in a while) can
  retrieve the page linked to above.  However cut'n'paste under X11R6
  doesn't cut'n'paste the non-ASCII characters for me so the I18N'ed URL
  can't be cut'n'pasted (either from Netscape's URL window or from the
  document that telnet returned in an xterm).  I notice that the web
  server is returning the charset attribute even though I'm making an
  HTTP/1.0 request.  Is that right?  I thought things like charset were an
  HTTP/1.1 thing?

* Lynx version 2.7.1 blows up spectacularly on the above URL, most likely
  because of the charset parameter on the Content-Type header again (it
  complains that "Start file could not be found or is not text/html or
  text/plain" after dumping the raw HTML out to the screen).  The
  document with the %-escaped URL suffers the same fate but the I18N
  version can't even be cut'n'pasted and I've no idea how to generate all
  the accented characters on my keyboard.

* The CERN line mode browser v3.0 blew up on the above URL with a failed
  system call after complaining that it couldn't display it.
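
If the guess above about X Mosaic and Lynx is right, the failure would come
from comparing the whole Content-Type value against "text/html" instead of
splitting off the parameters first.  This is only a sketch of that idea in
Python; parse_content_type and is_html are hypothetical helper names, and
nothing here is taken from any of the browsers' actual code:

```python
def parse_content_type(value):
    """Split a Content-Type value into (media_type, params)."""
    media_type, *raw_params = value.split(";")
    params = {}
    for item in raw_params:
        name, sep, val = item.partition("=")
        if sep:
            # Parameter names are case-insensitive; values may be quoted.
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type.strip().lower(), params


def is_html(value):
    """Decide on the media type alone, ignoring any parameters."""
    media_type, _ = parse_content_type(value)
    return media_type == "text/html"


header = "text/html; charset=UTF-8"

# A naive whole-string comparison rejects the header ...
print(header == "text/html")
# ... while a parameter-aware check accepts it and still exposes the charset.
print(is_html(header), parse_content_type(header)[1].get("charset"))
```

The simple split on ";" does not handle a quoted parameter value that itself
contains a semicolon, which is acceptable for a charset parameter.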

As I say folks, just some more datapoints, interpret as you will.

Tatty bye,


Jon "Jim'll" Knight, Researcher, Sysop and General Dogsbody, Dept. Computer
Studies, Loughborough University of Technology, Leics., ENGLAND.  LE11 3TU.
* I've found I now dream in Perl.  More worryingly, I enjoy those dreams. *