Re: Why the normative form of IETF Standards is ASCII

John Levine <johnl@iecc.com> Fri, 12 March 2010 05:58 UTC


>> PDF/A is a deliberately-limited format designed specifically for
>> archival purposes.
>
>And is clearly a non-starter because I have no idea how to produce PDF
>so limited, no idea how to test a PDF to see if it's "PDF/A", etc.

There are certainly arguments against PDF/A, but this doesn't strike
me as a very strong one.  You know how to produce an ASCII I-D, which
looks somewhat like an RFC, but I doubt that you know how to produce
an RFC.  (I don't think anyone does other than the handful of people
who have actually done so, since the list of rules and formatting
twiddles for the RFC style is not perfectly documented.)  Were we to
adopt PDF/A as a format for RFCs, what would matter is that the RFC
production house knew how to create PDF/A files.  Document authors
would continue to send in I-Ds in whatever form they send them in,
with some extensions for figures and non-ASCII characters.

Indeed, I know plenty of people these days who have no idea how to
produce an ASCII file with only tab, CR, and LF formatting
characters.  This does not mean they are morons, it means that the
text processing tools that people use today are different from the
ones we used in 1973.  If someone writes an I-D using xxe to produce
XML which xml2rfc turns into the text form that idnits wants, that
doesn't make him less manly than someone who edits with teco and codes
the nroff commands by hand.  (I had enough of that in my thesis in the
1970s.)
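
For concreteness, the "ASCII with only tab, CR, and LF formatting
characters" rule can be checked mechanically. The following is only a
minimal sketch of that rule as stated above; the function names are
made up for illustration, and it is not what idnits or the RFC
production process actually runs.

```python
# Illustrative check: is this byte string plain ASCII whose only
# control characters are tab, CR, and LF?  (Hypothetical helper names.)
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}  # tab, LF, CR


def find_bad_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte outside the allowed set."""
    bad = []
    for offset, value in enumerate(data):
        printable = 0x20 <= value <= 0x7E  # space through tilde
        if not printable and value not in ALLOWED_CONTROLS:
            bad.append((offset, value))
    return bad


def is_plain_ascii(data: bytes) -> bool:
    """True when every byte is printable ASCII or an allowed control."""
    return not find_bad_bytes(data)
```

Calling find_bad_bytes() on the raw bytes of a draft gives the offsets
of anything a modern editor slipped in, such as UTF-8 curly quotes.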

A major reason that the discussion of RFC formats never gets anywhere
is that it is really a discussion of process more than of particular
formats, and we don't do process very well.  The current
process uses input and output formats that are similar enough that
people wrongly think they're the same, even though of course they are
not.  Many people seem to assume that if we picked a new output
format, we would necessarily change the input format to be "the same"
as the output format, which I think would be a terrible idea.  The
input formats need to be reasonably easy for non-experts to create,
and to be structured enough so that validation tools can work with
them.  The output format basically needs to be displayable, printable
and searchable.  There is no reason they have to be at all similar.

If I were tsar, I would probably leave the input format as xml2rfc,
give or take tweaks for figures and a broader character set, but make
the output format a more rigidly structured XML that can be
mechanically and consistently transformed into a variety of display
formats.  If you want nroff-style RFCs, that's a display format.
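
As a rough sketch of what "mechanically transformed into a variety of
display formats" could mean, here is one structured source rendered
two ways.  The element names (doc, title, section, heading, para) are
invented for this example and are not the xml2rfc vocabulary, and the
layout rules are assumptions, not the RFC style.

```python
import textwrap
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical structured source; the schema is made up for illustration.
SOURCE = """\
<doc>
  <title>Example Document</title>
  <section>
    <heading>Introduction</heading>
    <para>The canonical file carries structure, not layout.</para>
  </section>
</doc>
"""


def to_text(root, width=72):
    """Render as fixed-width plain text with numbered sections."""
    lines = [root.findtext("title", default="").center(width).rstrip(), ""]
    for number, section in enumerate(root.findall("section"), start=1):
        lines.append(f"{number}.  {section.findtext('heading', default='')}")
        lines.append("")
        for para in section.findall("para"):
            lines.extend(textwrap.wrap(para.text or "", width=width,
                                       initial_indent="   ",
                                       subsequent_indent="   "))
            lines.append("")
    return "\n".join(lines)


def to_html(root):
    """Render the same tree as a minimal HTML fragment."""
    parts = [f"<h1>{escape(root.findtext('title', default=''))}</h1>"]
    for number, section in enumerate(root.findall("section"), start=1):
        heading = escape(section.findtext("heading", default=""))
        parts.append(f"<h2>{number}. {heading}</h2>")
        for para in section.findall("para"):
            parts.append(f"<p>{escape(para.text or '')}</p>")
    return "\n".join(parts)


root = ET.fromstring(SOURCE)
```

Both to_text(root) and to_html(root) read the same tree, so adding
another display format means adding another renderer, not touching
the source.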

R's,
John