8bit & i18n (was Re: ietf-nntp My notes ...)

Chris Newman <Chris.Newman@innosoft.com> Thu, 19 December 1996 00:45 UTC

Received: from cnri by ietf.org id aa14198; 18 Dec 96 19:45 EST
Received: from ACADEM2.ACADEM.COM by CNRI.Reston.VA.US id aa29075; 18 Dec 96 19:45 EST
Received: (from majordomo@localhost) by academ2.academ.com (8.8.3/8.7.3) id SAA11279 for ietf-nntp-outgoing; Wed, 18 Dec 1996 18:42:54 -0600 (CST)
X-Authentication-Warning: academ2.academ.com: majordomo set sender to owner-ietf-nntp using -f
Received: from academ.com (root@ACADEM.COM [198.137.249.2]) by academ2.academ.com (8.8.3/8.7.3) with ESMTP id SAA11274 for <ietf-nntp@ACADEM2.ACADEM.COM>; Wed, 18 Dec 1996 18:42:52 -0600 (CST)
Received: from THOR.INNOSOFT.COM (THOR.INNOSOFT.COM [192.160.253.66]) by academ.com (8.8.3/8.7.1) with ESMTP id SAA05743 for <ietf-nntp@academ.com>; Wed, 18 Dec 1996 18:42:50 -0600 (CST)
Received: from eleanor.innosoft.com by INNOSOFT.COM (PMDF V5.0-8 #8694) id <01ID5UP5C1YAA8CSPC@INNOSOFT.COM>; Wed, 18 Dec 1996 16:41:51 -0800 (PST)
Date: Wed, 18 Dec 1996 16:42:38 -0800 (PST)
From: Chris Newman <Chris.Newman@innosoft.com>
Subject: 8bit & i18n (was Re: ietf-nntp My notes ...)
In-reply-to: <32B83BF5.6C48@netscape.com>
To: Brian Hernacki <bhern@netscape.com>
Cc: Chris Lewis <clewis@nortel.ca>, ietf-nntp@academ.com
Message-id: <Pine.SOL.3.95.961218161124.14276O-100000@eleanor.innosoft.com>
MIME-version: 1.0
Content-type: TEXT/PLAIN; charset=US-ASCII
Content-transfer-encoding: 7BIT
Sender: owner-ietf-nntp@academ.com
Precedence: bulk

On Wed, 18 Dec 1996, Brian Hernacki wrote:

> Chris Lewis wrote:
> > The de facto standards (INN + Cnews/Reference NNTP implementation) transport
> > 8-bit article bodies unmolested.  I see no reason to continue to enforce
> > this _unnecessary_ archaism which has already been purposefully abandoned.
> > 
> > We don't need to deal with charsets - NNTP is a transport mechanism, and
> > isn't involved in display issues.  The only place where it matters is in
> > the headers - where we may simply wish to take a similar bailout as,
> > say, Posix C did, and insist that the headers are, say, UTF8 (where can
> > I find a listing of this?) or Latin-1.  Indeed, we may well be able
> > to get away with insisting that the keywords are the current ASCII
> > encodings, and most of/all of the keyword values are 8-bit.
> 
> OK...my bad. I thought you were proposing adding a lot of charset and
> other i18n support stuff to NNTP. I'm all for at least clarifying the use
> of 8-bit vs 7-bit. I think this does fall under documenting current
> de facto standards and would go a long way toward i18n support.

We have a number of choices on this front:

1) Declare the protocol 7-bit and ignore i18n issues.  This doesn't
reflect reality and probably won't pass IETF scrutiny.

2) Declare the protocol 7-bit and use MIME for i18n.  This would probably
work fine.

3) Declare the protocol 8-bit and ignore i18n issues.  This would reflect
reality -- a non-interoperable one.  I'd certainly object and I suspect
many others would.

4) Declare the protocol 8-bit and use MIME for i18n, possibly allowing
8-bit MIME.  Disallow unlabelled localized charsets because they don't
interoperate. This is the best choice, IMHO.
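
To make option 4 concrete, here is a minimal sketch of the check it implies
for article bodies: 8-bit octets are acceptable only when a MIME charset
label says what they mean. The function name classify_body and the result
strings are hypothetical, invented for illustration; they are not part of
NNTP or of any existing server.

```python
from email.parser import BytesParser


def classify_body(raw: bytes) -> str:
    """Classify an article body as 7-bit, labelled 8-bit, or unlabelled 8-bit.

    Assumption for this sketch: headers and body are separated by the
    first blank line, and only a charset label on Content-Type counts.
    """
    # Normalize CRLF so the header/body split works for either line ending.
    raw = raw.replace(b"\r\n", b"\n")
    head, _, body = raw.partition(b"\n\n")

    # Parse only the header block; the body is inspected as raw octets.
    msg = BytesParser().parsebytes(head + b"\n\n", headersonly=True)

    if not any(octet > 0x7F for octet in body):
        return "ok: 7-bit body"

    charset = msg.get_content_charset()  # lowercased, or None if absent
    if msg.get("MIME-Version") and charset and charset != "us-ascii":
        return "ok: labelled 8-bit body"
    return "reject: unlabelled 8-bit body"


labelled = (
    b"MIME-Version: 1.0\n"
    b"Content-Type: text/plain; charset=ISO-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)
unlabelled = b"Subject: test\n\ncaf\xe9\n"

print(classify_body(labelled))
print(classify_body(unlabelled))
```

The same octet 0xE9 appears in both sample articles. Only the labelled one
tells a reader in another locale which character was meant.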

On the issue of 8-bit headers -- this is definitely a bad idea.  The
current installed base is completely non-interoperable, as clients all
over the world use different 8-bit charsets.  In addition, it would make
gatewaying between email and news a nightmare.  We're stuck with MIME
header encodings for i18n headers thanks to the installed base.  Also,
Latin-1 is obviously a lose -- it's not i18n and it creates more problems
than it solves.
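
As an illustration of the MIME header encoding mentioned above, the sketch
below uses Python's standard email.header module to turn a non-ASCII
header value into a 7-bit encoded-word (the RFC 2047 form) and back. The
helper names and the sample subject are hypothetical. The point is that
the wire form stays pure ASCII and carries its own charset label.

```python
from email.header import Header, decode_header


def encode_header_value(text: str, charset: str) -> str:
    """Return a 7-bit wire form of text, labelled with charset."""
    return Header(text, charset=charset).encode()


def decode_header_value(wire: str) -> str:
    """Decode a header value that may contain MIME encoded-words."""
    parts = []
    for chunk, charset in decode_header(wire):
        if isinstance(chunk, bytes):
            # Encoded-words come back as bytes plus their charset label.
            parts.append(chunk.decode(charset or "ascii"))
        else:
            # Plain ASCII values are returned as str, unchanged.
            parts.append(chunk)
    return "".join(parts)


subject = "Grüße aus München"
wire = encode_header_value(subject, "iso-8859-1")

print(wire)                       # ASCII-only encoded-word
print(decode_header_value(wire))  # original text recovered
```

Because the label travels inside the header itself, a mail/news gateway can
pass the value through untouched, which is not true of raw 8-bit headers in
an unknown local charset.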

See RFC 2044 for a description of UTF-8.  UTF-8 would be a good
choice for new protocol elements which need to be internationalized,
such as pretty names.
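
A short sketch of why UTF-8 suits new protocol elements: ASCII characters
keep their single-octet values, and every octet of a non-ASCII character
has the high bit set, so it cannot be mistaken for ASCII. The sample
"pretty name" and the helper name utf8_octets are hypothetical, made up
for this example.

```python
def utf8_octets(text: str) -> list[tuple[str, str]]:
    """Pair each character with the hex form of its UTF-8 octets."""
    return [(ch, ch.encode("utf-8").hex(" ")) for ch in text]


pretty_name = "Café news"  # hypothetical human-readable group name

for ch, octets in utf8_octets(pretty_name):
    print(repr(ch), octets)

encoded = pretty_name.encode("utf-8")

# The ASCII characters survive as identical single octets ...
ascii_part = bytes(octet for octet in encoded if octet < 0x80)

# ... while the one non-ASCII character becomes a multi-octet sequence
# in which every octet is 0x80 or above.
non_ascii_part = bytes(octet for octet in encoded if octet >= 0x80)

print(ascii_part)
print(non_ascii_part.hex(" "))
```

A 7-bit-only parser that looks for ASCII delimiters such as space or CRLF
will therefore never find a false one inside a UTF-8 encoded name, provided
the transport itself is 8-bit clean.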