Re: Thoughts about characters transmission

Rick Troth <TROTH@ricevm1.rice.edu> Sat, 10 July 1993 17:37 UTC

Message-Id: <9307101724.AA15643@dimacs.rutgers.edu>
Mime-Version: 1.0
Content-Type: text/plain
Date: Sat, 10 Jul 1993 12:02:24 -0500
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Rick Troth <TROTH@ricevm1.rice.edu>
Subject: Re: Thoughts about characters transmission
To: Keld Jørn Simonsen <keld@dkuug.dk>, André PIRARD <PIRARD@vm1.ulg.ac.be>, "Robert G. Moskowitz" <0003858921@mcimail.com>, ietf-charsets@innosoft.com, ietf-822@dimacs.rutgers.edu, ietf@CNRI.Reston.VA.US, WG-CHAR@rare.nl, Multi-byte Code Issues <ISO10646@jhuvm.rare.nl>
In-Reply-To: Message of Sat, 10 Jul 1993 10:31:56 +0200 from <keld@dkuug.dk>

        [please excuse this cross-post;  I am following a thread]

>> The _most_important_point_ is that a single common representation code
>> be defined _for_the_line_ (suiting the purpose, namely to cover all national
>> languages in one single way) and that people be instructed that every bit
>> of text should travel in that code on the wire, whatever_the_protocol_is.
>
>I agree with most of what André is saying and I have an additional
>point here: that the single common representation code should be something
>that can be handled by existing software and hardware,   ...

        I agree with most of what André said,  and agree with you on
this one point.   But ...

>will take a long time before the conversion software is installed
>on all machines, or even a large share of the installed base.
>Also I would like to emphasize the need for world-wide solutions.
>This would mean that ISO 8859-1 would not be a good candidate,
>we need something ASCII based (or even with a smaller repertoire
>than ASCII to cover the problems with EBCDIC and national ISO 646
>variants).

        I don't understand the warrant here,  Keld.   You're right that
we need world-wide solutions and you're right that we should have
something ASCII based.   How do these make ISO 8859-1 a bad choice?

        I've spent a significant part of *my* life working with others
toward a true solution to the  ASCII <---> EBCDIC  problem.   Some form
of consensus was reached a long time ago and folks have successfully
"beat IBM over the head"  with it,  and IBM has finally acknowledged a
"de facto network EBCDIC"  [my term]  which they call CodePage 1047.
CP 1047 maps one-for-one with ISO 8859-1.   The 1047/8859-1 mapping
is the one most palatable to the most sites on the Internet.
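
        To make "maps one-for-one" concrete,  here is a minimal sketch
of what such a mapping buys you:  every EBCDIC octet goes to exactly one
ISO 8859-1 octet and comes back unchanged.   As far as I know,  Python's
standard library ships no cp1047 codec,  so this uses the bundled cp037
codec as a stand-in (an assumption on my part;  CP 037 differs from
CP 1047 in a handful of positions but is likewise a permutation of the
8859-1 repertoire).   The function names are made up for illustration.

```python
# Assumption: "cp037" stands in for CP 1047, which is not assumed to be
# available as a built-in Python codec.
STAND_IN_EBCDIC = "cp037"


def ebcdic_to_wire(data: bytes) -> bytes:
    """Translate EBCDIC octets to ISO 8859-1 octets for the wire."""
    return data.decode(STAND_IN_EBCDIC).encode("latin-1")


def wire_to_ebcdic(data: bytes) -> bytes:
    """Translate ISO 8859-1 octets from the wire back to EBCDIC."""
    return data.decode("latin-1").encode(STAND_IN_EBCDIC)


# Push all 256 possible octets through the mapping.
all_octets = bytes(range(256))
wire = ebcdic_to_wire(all_octets)

# One-for-one: every 8859-1 value is hit exactly once ...
assert sorted(wire) == list(range(256))
# ... and the round trip loses nothing.
assert wire_to_ebcdic(wire) == all_octets
```

        With a one-for-one table there is no "unmappable character"
case to handle in either direction,  which is the whole point.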

        I see the common code André mentions.   I see ISO 8859-1
"on the wire".   I see some  greater-than-8-bit  code in the future
that is a superset of  8859-1.   (Whether TCP has been superseded
or whether we "tag" things,  I am NOT addressing here.)
What's the problem?
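
        A rough sketch of why a wider superset is painless:  if the
first 256 values of the wider code are the 8859-1 values (as they are in
ISO 10646),  then widening is just zero-extension of each octet.   The
helper name and the choice of big-endian fixed-width units are my own
assumptions for illustration;  Python's "utf-32-be" and "utf-16-be"
codecs serve here as the 32-bit and 16-bit forms to compare against.

```python
def widen_latin1(data: bytes, width: int = 4) -> bytes:
    """Zero-extend each ISO 8859-1 octet to a `width`-octet big-endian unit."""
    return b"".join(octet.to_bytes(width, "big") for octet in data)


text = "Andr\u00e9"                  # "André"
octets = text.encode("latin-1")      # what travels on the wire today

# Widening the 8-bit form gives the same bytes as encoding the text
# directly in the 32-bit (or 16-bit) form.
assert widen_latin1(octets, 4) == text.encode("utf-32-be")
assert widen_latin1(octets, 2) == text.encode("utf-16-be")
```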

        [I think it was Nathaniel who said,  "memory is cheap and
bandwidth is cheaper".   In agreement,  I say we scrap the 16-bit
stop-gap solution and go directly to 32-bit and then start looking
toward bit-unconstrained (bit-free?) representations.   Just my opinion]

>Keld

--
Rick Troth <troth@rice.edu>,  Rice University,  Information Systems