Re: iso-2022-jp

Olle Jarnefors <ojarnef@admin.kth.se> Thu, 27 August 1992 13:30 UTC

From: Olle Jarnefors <ojarnef@admin.kth.se>
Date: Thu, 27 Aug 1992 14:51:14 +0200
Message-Id: <9208271251.AA22754@mercutio.admin.kth.se>
To: ietf-822@dimacs.rutgers.edu
Cc: Olle Jarnefors <ojarnef@admin.kth.se>
In-Reply-To: <9208261333.AA22464@samrat.poel.juice.or.jp> "Wed, 26 Aug 92 22:33:36 +0900" (From: erik@poel.juice.or.jp (Erik M. van der Poel))
Subject: Re: iso-2022-jp

> So please review the document (if you are interested :-) and send
> comments to this list within the next three weeks.

The following excerpts are from: draft-ietf-822ext-iso2022jp-00.txt

: Network Working Group                                          Jun Murai
: Internet Draft                                              Mark Crispin
:                                                        Erik van der Poel
:                                                         25th August 1992
: 
: 
:            Japanese Character Encoding for Internet Messages
: 
: 
: - - -
:
: Introduction
: 
:    This document describes the encoding used in plain text electronic
:    mail and network news in several Japanese networks.

This should be "... used in the plain text parts of electronic
mail and network news messages in several ...". There isn't any
special form of electronic mail that can be characterized as
"plain text mail", I think.

The restriction to plain text parts means that this draft says
nothing about the use of JUNET encoding in header fields. Is the
intention that JUNET encoding shall be allowed there too, in
accordance with the rules of RFC 1342? Either way, it may be
useful to include some text about the use or non-use of JUNET
encoding in message headers.

: Formal Description
: 
:    This section provides a formal description of the JUNET encoding. In
:    the event that this description is not consistent with the above
:    informal description, this formal description shall take precedence.

The formal description is not as complete as the informal
description: It only specifies the syntax -

   which octet sequences are allowed

not the semantics -

   which character set shall be used to interpret a certain
   <segment> (or *<text> outside <segment>s)

The semantics is only specified in the informal description.

I think it would be nice if the formal description were complete.
The semantics is very simple. Maybe it can be specified right in
the BNF, by a comment at the point where the character set
switch takes place, i.e. at the end of <single-byte-seq> and
<double-byte-seq>. There should also be a character set
specification at the start of the encoded data. Therefore it
would be useful to add the definition

   ISO-2022-JP-encoded-text =    ; current charset := ASCII
                              *line
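
To make the intended semantics concrete, here is a minimal Python
sketch (an illustration only, not text from the draft) of a reader
that starts with current charset := ASCII and switches at the end of
each escape sequence. The name split_by_charset and the charset
labels in ESCAPES are hypothetical shorthand based on my reading of
the informal description; the four escape sequences are the ones the
draft's BNF uses.

```python
ESC = b"\x1b"

# Escape sequences from the draft's BNF. The labels on the right are
# shorthand assumed for this sketch, not names defined by the draft.
ESCAPES = {
    ESC + b"(B": "ASCII",
    ESC + b"(J": "JIS-Roman",
    ESC + b"$@": "JIS-X-0208-1978",
    ESC + b"$B": "JIS-X-0208-1983",
}


def split_by_charset(data: bytes):
    """Return a list of (charset, octets) runs for the encoded data."""
    current = "ASCII"            # current charset := ASCII at the start
    runs = []
    start = 0
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        if seq in ESCAPES:
            if i > start:
                runs.append((current, data[start:i]))
            current = ESCAPES[seq]   # the charset switch takes place here
            i += 3
            start = i
        else:
            i += 1
    if start < len(data):
        runs.append((current, data[start:]))
    return runs


if __name__ == "__main__":
    for charset, octets in split_by_charset(b"abc\x1b$B8l\x1b(Bxyz"):
        print(charset, octets)
```

The sketch does not validate anything; it only shows that one state
variable, updated at each escape sequence, is enough to say which
character set applies to every octet.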


:    CHAR                = <any ASCII character>      ; ( 0-177,  0.-127.)
: 
:    text                = <any CHAR, including bare
: 			    CR & bare LF, but NOT
: 			    including CRLF>

I see two problems with the definition of <text>, one formal and
one substantial.

Formal problem: <text> as defined is one single character. How
can the definition then say that <text> is "any CHAR ... but not
including CRLF"? <CRLF> is a _sequence_ of two characters.

It is possible to give a formally satisfying definition. Since
<text> occurs in the other definitions only as "*text", that
subexpression may very well be replaced by "text-part", defined by

   text-part = *( other-char / LF / ( 1*CR other-char ) ) *CR

where

   other-char = <any CHAR except CR and LF>
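
As a quick check that the text-part definition above accepts exactly
the ASCII strings that contain no CRLF, it can be transcribed into a
regular expression. This is only a sketch: the names OTHER_CHAR,
TEXT_PART and is_text_part are hypothetical, and the CR alternative
is written as "one or more CR followed by other-char", which matches
the same strings while keeping the pattern unambiguous.

```python
import re

# other-char = any ASCII character (0-127) except CR and LF
OTHER_CHAR = rb"[\x00-\x09\x0b\x0c\x0e-\x7f]"

# text-part = *( other-char / LF / ( 1*CR other-char ) ) *CR
TEXT_PART = re.compile(
    rb"(?:" + OTHER_CHAR + rb"|\n|\r+" + OTHER_CHAR + rb")*\r*"
)


def is_text_part(data: bytes) -> bool:
    """True if the whole of data matches the text-part rule."""
    return TEXT_PART.fullmatch(data) is not None


if __name__ == "__main__":
    print(is_text_part(b"abc\rdef\nghi\r\r"))  # bare CR and bare LF
    print(is_text_part(b"abc\r\ndef"))         # contains CRLF
```

A CR can only be consumed when it is followed by another CR, by an
other-char, or by the end of the text, so a CR directly followed by
LF never matches.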

Substantial problem: Is <ESC> allowed as <text>? The present
definition implies that, but obviously at least <ESC> can't be
allowed in the context

     ESC "(" ( "B" / "J" )
   / ESC "$" ( "@" / "B" )

I guess that other similar escape sequences, which would indicate
a switch to other character sets according to ISO 2022, are
disallowed too. Maybe <ESC> should be excluded from <CHAR>?

:    Additional restrictions that are difficult to describe in the above
                ^^^^^^^^^^^^
:    are as follows.
: 
:    Adjacent segments should have different escape sequences. For
                       ^^^^^^
:    example, the following is not recommended:
: 
: 	     ESC $ B .... ESC $ B ....

The use of "should" indicates that this rule isn't a
"restriction" but rather a recommendation.

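Whether it is a restriction or a recommendation, the rule is easy to
check mechanically. Here is a minimal Python sketch that reports an
escape sequence repeating the one immediately before it, as in the
draft's "ESC $ B .... ESC $ B ...." example. The names ESCAPE and
repeated_escapes are hypothetical, and the sketch compares only
consecutive escape sequences; it does not consider the initial ASCII
state.

```python
import re

# Matches the four escape sequences from the draft's BNF.
ESCAPE = re.compile(rb"\x1b(?:\([BJ]|\$[@B])")


def repeated_escapes(data: bytes):
    """Return offsets of escape sequences equal to the preceding one."""
    offsets = []
    previous = None
    for match in ESCAPE.finditer(data):
        if match.group() == previous:
            offsets.append(match.start())
        previous = match.group()
    return offsets


if __name__ == "__main__":
    print(repeated_escapes(b"\x1b$B8l\x1b$B8l\x1b(B"))
    print(repeated_escapes(b"\x1b$B8l\x1b(Bab\x1b$B8l"))
```
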