Re: Thoughts about characters transmission

Rick Troth <TROTH@ricevm1.rice.edu> Sat, 10 July 1993 17:26 UTC

MIME-Version: 1.0
Content-Type: text/plain
Date: Sat, 10 Jul 1993 12:02:24 -0500
X-Orig-Sender: ietf-request@IETF.CNRI.Reston.VA.US
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Rick Troth <TROTH@ricevm1.rice.edu>
Subject: Re: Thoughts about characters transmission
To: Keld J|rn Simonsen <keld@dkuug.dk>, Andr'e PIRARD <PIRARD@vm1.ulg.ac.be>, "Robert G. Moskowitz" <0003858921@mcimail.com>, ietf-charsets@innosoft.com, ietf-822@dimacs.rutgers.edu, ietf@CNRI.Reston.VA.US, WG-CHAR@rare.nl, Multi-byte Code Issues <ISO10646@jhuvm.rare.nl>
In-Reply-To: Message of Sat, 10 Jul 1993 10:31:56 +0200 from <keld@dkuug.dk>
Message-ID: <9307101324.aa10733@CNRI.Reston.VA.US>

        [please excuse this cross-post;  I am following a thread]

>> The _most_important_point_ is that a single common representation code
>> be defined _for_the_line_ (suiting the purpose, namely to cover all national
>> languages in one single way) and that people be instructed that every bit
>> of text should travel in that code on the wire, whatever_the_protocol_is.
>
>I agree with most of what André is saying and I have an additional
>point here: that the single common representation code should be something
>that can be handled by existing software and hardware,   ...

        I agree with most of what André said,  and agree with you on
this one point.   But ...

>will take a long time before the conversion software is installed
>on all machines, or even a large share of the installed base.
>Also I would like to emphasize the need for world-wide solutions.
>This would mean that ISO 8859-1 would not be a good candidate,
>we need something ASCII based (or even with a smaller repertoire
>than ASCII to cover the problems with EBCDIC and national ISO 646
>variants).

        I don't understand the warrant here,  Keld.   You're right that
we need world-wide solutions and you're right that we should have
something ASCII based.   How do these make ISO 8859-1 a bad choice?

        I've spent a significant part of *my* life working with others
toward a true solution to the  ASCII <---> EBCDIC  problem.   Some form
of consensus was reached a long time ago and folks have successfully
"beat IBM over the head"  with it,  and IBM has finally acknowledged a
"de facto network EBCDIC"  [my term]  which they call CodePage 1047.
CP 1047 maps one-for-one with ISO 8859-1.   The mapping of 1047/8859-1
is the most palatable mapping to the most sites on the Internet.
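
        A minimal sketch of what "maps one-for-one" buys you:  every
EBCDIC byte goes to exactly one ISO 8859-1 byte and back,  so nothing
is lost in either direction.   This is an illustration only.   It
assumes Python,  and because Python's standard library is not assumed
to ship a CP 1047 codec,  it uses cp037 (a closely related EBCDIC
code page) as a stand-in.   The function names are hypothetical.

```python
# Stand-in codec (assumption): cp037 is used here because a cp1047 codec
# is not assumed to be available in the Python standard library.
EBCDIC_CODEC = "cp037"


def is_one_for_one(codec: str) -> bool:
    """Return True if all 256 byte values decode to 256 distinct
    characters that together are exactly the ISO 8859-1 repertoire."""
    decoded = bytes(range(256)).decode(codec)
    latin1_repertoire = {chr(i) for i in range(256)}
    return len(decoded) == 256 and set(decoded) == latin1_repertoire


def ebcdic_to_wire(data: bytes, codec: str = EBCDIC_CODEC) -> bytes:
    """Translate host EBCDIC bytes into ISO 8859-1 bytes for the wire."""
    return data.decode(codec).encode("latin-1")


def wire_to_ebcdic(data: bytes, codec: str = EBCDIC_CODEC) -> bytes:
    """Translate ISO 8859-1 bytes from the wire into host EBCDIC bytes."""
    return data.decode("latin-1").encode(codec)


if __name__ == "__main__":
    every_byte = bytes(range(256))
    print(is_one_for_one(EBCDIC_CODEC))
    print(wire_to_ebcdic(ebcdic_to_wire(every_byte)) == every_byte)
```

Because the table is a permutation of the 256 byte values,  the round
trip works for arbitrary data,  not just for printable text.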

        I see the common code André mentions.   I see ISO 8859-1
"on the wire".   I see some  greater-than-8-bit  code in the future
that is a superset of  8859-1.   (Whether TCP has been superseded
or whether we "tag" things,  I am NOT addressing here.)
What's the problem?
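
        To make the "superset" point concrete,  here is a hedged sketch.
It assumes the wider code is laid out the way ISO 10646 is,  with its
first 256 positions identical to ISO 8859-1,  so widening is plain
zero extension.   The 32-bit big-endian form and the function name are
my assumptions for illustration,  not a proposal for a wire format.

```python
import struct


def widen_to_32bit(latin1: bytes) -> bytes:
    """Zero-extend each ISO 8859-1 byte to a 32-bit big-endian unit.

    Assumption: the wider code keeps 8859-1 in its first 256 positions,
    so the code value of each character is unchanged by widening.
    """
    return b"".join(struct.pack(">I", value) for value in latin1)


if __name__ == "__main__":
    wire_text = "na\xefve".encode("latin-1")
    # Under the stated assumption this matches Python's UTF-32 (big-endian).
    print(widen_to_32bit(wire_text) == "na\xefve".encode("utf-32-be"))
```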

        [I think it was Nathaniel who said,  "memory is cheap and
bandwidth is cheaper".   In agreement,  I say we scrap the 16-bit
stop-gap solution and go directly to 32-bit and then start looking
toward bit-unconstrained (bit-free?) representations.   Just my opinion]

>Keld

--
Rick Troth <troth@rice.edu>,  Rice University,  Information Systems