Extending news to EAI

"Charles Lindsey" <chl@clerew.man.ac.uk> Tue, 02 February 2010 05:13 UTC

X-Amavis-Alert: BAD HEADER, Non-encoded 8-bit data (char C3 hex): Xref: clerew dk.test.utf8-\303\246\303\270\303\245:468 local[...]
To: ietf-usefor@imc.org
Xref: clerew dk.test.utf8-æøå:468 local.usefor:25236
Path: clerew!chl
From: "Charles Lindsey" <chl@clerew.man.ac.uk>
Subject: Extending news to EAI
Content-Type: text/plain; charset=iso-8859-1
Message-ID: <Kx6CzM.12F@clerew.man.ac.uk>
Content-Transfer-Encoding: 8bit
X-Newsreader: NN version 6.5.2 (NOV)
Mime-Version: 1.0
Date: Mon, 1 Feb 2010 18:06:57 GMT
Lines: 75
Sender: owner-ietf-usefor@mail.imc.org
Precedence: bulk
List-Archive: <http://www.imc.org/ietf-usefor/mail-archive/>
List-Unsubscribe: <mailto:ietf-usefor-request@imc.org?body=unsubscribe>
List-ID: <ietf-usefor.imc.org>

There is now an experimental protocol for UTF-8 headers in Email (RFC5335
and its relations). This was the product of the IMA WG. There has been
recent discussion of applying this to Netnews, and the conclusion seems to
be that the IMA WG is not the place to do this, and that a private draft
would be the way forward. However, this list would be a reasonable
place to discuss it.

Essentially, under this protocol, UTF-8 may be freely used in Email
headers, but a downgrading mechanism is needed whenever mail passes to a
server that does not advertise the UTF8SMTP capability.
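
As a rough illustration of that rule, the following Python sketch decides
whether a message would need downgrading before being handed to the next
server. The function names, and the idea of passing the advertised
extensions in as a set, are assumptions made for this example; the actual
downgrading procedure is defined by the EAI documents and is not
reproduced here.

```python
def has_non_ascii(headers):
    """Return True if any header name or value contains a non-ASCII character."""
    return any(not (name + value).isascii() for name, value in headers)


def needs_downgrade(headers, advertised_extensions):
    """Decide whether the headers must be downgraded for the next hop.

    advertised_extensions is the set of capability keywords the next
    server announced (a hypothetical input for this sketch).
    """
    supports_utf8 = "UTF8SMTP" in {ext.upper() for ext in advertised_extensions}
    return has_non_ascii(headers) and not supports_utf8


# Example headers invented for illustration.
headers = [("From", "Bj\u00f8rn <bjorn@example.org>"), ("Subject", "Test")]
print(needs_downgrade(headers, {"8BITMIME", "UTF8SMTP"}))
print(needs_downgrade(headers, {"8BITMIME"}))
```

A message whose headers are pure ASCII never needs downgrading under this
sketch, whatever the next server advertises.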

This is much what the USEFOR WG wanted to do in its earlier days, but the
decision was then taken to postpone it until the base documents were
complete, and then to bring it up again as an Experimental protocol. So
maybe now is the time to embark on it.

It is much easier with Netnews than with Email, since the underlying
transport (whether NNTP or UUCP) is already 8-bit clean. I would not
expect it to become the norm on the Big-8 groups for quite some time, but
it would be very useful for National hierarchies, such as the Scandinavian
ones where the inability to have Newsgroup Names with their own special
characters in them is a right pain (apparently).

So the experimental protocol would start off with the extensions allowed
by RFC5335, and then add UTF-8 in the Newsgroups header. It would be up to
individual hierarchies to encourage deployment of the experiment within
their groups.

It has already been established that the existing transport mechanisms
will move such articles around without problem. No downgrading is
envisaged except at gateways to email (at which point the mechanisms
already agreed for EAI/IMA would apply). But existing servers would cope
fairly well without modification, at least until UTF-8 newsgroup-names
were introduced.

Clearly, anyone expecting to read such articles would need a suitable
client. Some clients will already display them (Opera, for example).
Otherwise, it would be up to people to install suitable user agents if
they wanted to see these articles properly; existing agents might display
such headers in a garbled form (which might be good enough in languages
which are based on Latin alphabets). Bodies would still be expected to be
covered by a Content-Type: text/plain; charset=utf-8.
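
To give a feel for what "garbled" might look like, here is a small Python
sketch of an agent that misreads UTF-8 header bytes as ISO-8859-1. The
function name and the sample header text are invented for this example,
and real agents may of course misbehave in other ways.

```python
def as_seen_by_latin1_agent(header_value):
    """Show how UTF-8 header bytes look when misread as ISO-8859-1."""
    return header_value.encode("utf-8").decode("iso-8859-1")


# ASCII-only text passes through unchanged.
print(as_seen_by_latin1_agent("Subject: test"))

# Each non-ASCII letter turns into two unrelated characters, while the
# surrounding ASCII letters stay readable.
print(as_seen_by_latin1_agent("Subject: bl\u00e5b\u00e6r"))
```

Only the accented letters are damaged, which is why text in a mostly
Latin alphabet can remain legible in an agent that has not been upgraded.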

With UTF-8 newsgroup-names, again people who wanted to subscribe to such
groups would need suitable clients, but existing servers would serve them
once they had been persuaded to store them in their active lists. That
would require control messages that created such groups to be accepted,
and also articles submitted to moderated groups to be forwarded, so in
practice people who wanted to subscribe to such groups would need to
connect to servers which had been upgraded to cope. But the important
point is that articles would still propagate correctly through
non-upgraded servers.

Some early USEFOR drafts show how the Newsgroups header was to be
extended. In particular, it required some very strict normalization, so
that a simple byte-by-byte comparison of newsgroup-names would always

Note that an experimental group dk.test.utf8-æøå (which should show up
in UTF-8 clients properly, although this message is composed in iso-8859-1)
already exists on several servers, notably on news.dotsrc.org, and this
message is crossposted there, and if a thread develops there, then well
and good. But people on that list might do better to subscribe to the
usefor mailing list (see http://www.imc.org/ietf-usefor/index.html).

Charles H. Lindsey ---------At Home, doing my own thing------------------------
Tel: +44 161 436 6131            Web: http://www.cs.man.ac.uk/~chl
Email: chl@clerew.man.ac.uk      Snail: 5 Clerewood Ave, CHEADLE, SK8 3JU, U.K.
PGP: 2C15F1A9      Fingerprint: 73 6D C2 51 93 A0 01 E7 65 E8 64 7E 14 A4 AB A5