Re: So do both [was Re: Should the IETF be condoning, even promoting, BOM pollution?]

"Joel M. Halpern" <jmh@joelhalpern.com> Wed, 11 October 2017 14:45 UTC

Subject: Re: So do both [was Re: Should the IETF be condoning, even promoting, BOM pollution?]
To: Stewart Bryant <stewart.bryant@gmail.com>
Cc: "Heather Flanagan (RFC Series Editor)" <rse@rfc-editor.org>, IETF Discussion <ietf@ietf.org>
References: <09b0ed8b-c47a-83c3-9174-cce990bdb145@rfc-editor.org> <C38DD5C1-4B67-4C22-9BF8-FBC67AD1B90E@fugue.com> <51d7731c-caf0-c3d1-1f94-8095ea62bed0@nostrum.com> <1818E75B-86D4-4B90-A2A6-D0CD128034A0@fugue.com> <d0f9ce6a-b7ba-6205-c542-3f07c4e56b49@nostrum.com> <65BDB20CE6CCE5E7F4B40252@PSB> <15a84eff-57df-f4ad-96e3-c6c3cd8b133f@gmx.de> <A3DA53FD-3DBF-4B89-918C-A55DC7715FF1@tzi.org> <bf0323cc-f1e9-9f32-ac0d-fe59c9c51721@gmx.de> <599A5DC9-03B4-4234-9824-E0A337D9F871@tzi.org> <cd19b902-dde8-a3fb-132c-27eccdc66a26@gmx.de> <2C5631A0-60B4-4C62-9E74-CAA2A091D75C@tzi.org> <012701d336ea$7b1817a0$4001a8c0@gateway.2wire.net> <9d5162c1-c73f-7aac-5b5a-2289ecb793be@gmail.com> <FF078F6953E5EBF095529D06@PSB> <5e45a160-aa91-1d06-3e33-a60870926f71@rfc-editor.org> <5FD7556A2031BF3C232754CA@PSB> <87B5BD3B-FEA0-4547-B3BC-A0012B4AE1F5@gmail.com> <CAA=duU1AcxH4WfVTqFVR-00yr=i2JZ2dz4GimPs6nxeN4nF=HQ@mail.gmail.com> <9d25f04c-e7e2-827a-4404-d0cca1bd080e@gmail.com>
From: "Joel M. Halpern" <jmh@joelhalpern.com>
Message-ID: <227fbea9-4fbb-2a62-f1df-9301ee3dc394@joelhalpern.com>
Date: Wed, 11 Oct 2017 10:45:52 -0400
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.3.0
MIME-Version: 1.0
In-Reply-To: <9d25f04c-e7e2-827a-4404-d0cca1bd080e@gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/fkTS8NhGM4SK4Bgnv2-b6JO-kXQ>

Personally, I find page references to be really confusing when people 
talk about RFCs.  But personal preference is not what matters.

We have already agreed that within the near future the pagination of 
RFCs will differ in different representations, and some may not have 
pagination at all.
As such, worrying about keeping page numbers working for the next few 
years while we transition seems a less than effective use of our resources.

Yours,
Joel

On 10/11/17 5:12 AM, Stewart Bryant wrote:
> That is popular in legal circles and I find it so distracting. In those 
> circles distraction of the reader may be good for the author. However 
> our goal is clarity of message to both so we can write and implement 
> unambiguous specifications.
> 
> When doing reviews I also find it useful to have page numbers as a quick 
> way of knowing how much text I need to make time to read.
> 
> - Stewart
> 
> 
> On 10/10/2017 21:06, Andrew G. Malis wrote:
>> We could, of course, introduce paragraph numbering in the left margin, 
>> which would alleviate the page number reference problem since as Yoav 
>> notes, we’ll no longer have constant page numbers in the various 
>> display versions of an RFC.
>>
>> Cheers,
>> Andy
>>
>>
>> On Tue, Oct 10, 2017 at 2:48 PM, Yoav Nir <ynir.ietf@gmail.com> wrote:
>>
>>
>>>     On 10 Oct 2017, at 5:29, John C Klensin <john-ietf@jck.com> wrote:
>>>
>>>
>>>
>>>     --On Monday, October 9, 2017 16:36 -0700 "Heather Flanagan (RFC
>>>     Series Editor)" <rse@rfc-editor.org> wrote:
>>>
>>>>     On 10/9/17 10:14 AM, John C Klensin wrote:
>>>>>     --On Wednesday, September 27, 2017 08:38 +1300 Brian E
>>>>>     Carpenter <brian.e.carpenter@gmail.com> wrote:
>>>>>
>>>>>>     So why don't we, the Internet standards people who believe in
>>>>>>     rough consensus and running code, request the RFC Editor (a
>>>>>>     friend of ours) to supply two text versions of each RFC, like
>>>>>>
>>>>>>     https://www.rfc-editor.org/rfc/rfc8187.txt
>>>>>>     as today, with BOM if relevant
>>>>>>     https://www.rfc-editor.org/rfc/rfc8187.ut8
>>>>>>     containing pure UTF-8 with no BOM ever
>>>>>     If one were really going to do that, one would need three
>>>>>     representations (pick your own three-character suffixes for
>>>>>     the first two):
>>>>>
>>>>>     rfc8176.utf8   (standard/normal Unicode in UTF-8, no BOM)
>>>>>     rfc8176.utf8-with-BOM (as above, but...)
>>>>>     rfc8176.txt    (ASCII, with characters outside the ASCII
>>>>>     repertoire expressed as \u'[N[N]]NNNN' (see RFC 5137) or
>>>>>     another escaping system of the RFC Editor's choice).
>>>>
>>>>
>>>>     A few points to consider. First, the RFC Editor will review,
>>>>     at least to some extent, every file we produce, and our tools
>>>>     will need to be modified to create the additional formats;
>>>>     that complexity would then need to be maintained going
>>>>     forward. The more files added, the more resources it will take
>>>>     to produce. This has implications for either the time it takes
>>>>     to publish or the cost it takes to publish. Second, there have
>>>>     also been some discussions about creating separate files for
>>>>     paginated versus unpaginated text files. That would take us up
>>>>     to six files just for the plain-text outputs (noting the RFC
>>>>     Editor also has the PDF/A-3 and HTML to review).
>>>>
>>>>     Alternatively, the IETF community that prefers plain text can
>>>>     develop tools that take the one file created by the RFC
>>>>     Editor and strip the BOM, add pagination, or run it through a
>>>>     translation tool to get it in their native language--these
>>>>     will not be produced or reviewed by the RFC Editor, but will
>>>>     perhaps meet the individual desires here. Given the number of
>>>>     options, opinions, and resources involved, I think this makes
>>>>     the most sense.
>>>
>>>     Up to a point, yes.  On the other hand, unless the RFC Editor
>>>     intends to make a rule requiring either that sections (or
>>>     subsections) not extend over circa a page, or numbering lines,
>>>     or doing something else that facilitates references into a
>>>     document, I think you'd best retain a canonical / distributed
>>>     version with page numbers, headers, and footers.
>>
>>     In that case we’d all have to look up that version whenever we
>>     received a reference to something in RFC xxxx page 7. So even if
>>     it’s more comfortable for us to read the RFC in a browser or on a
>>     phone, we’d need access to this canonical version.
>>
>>     IMO it’s far easier to reference section and paragraph number, as
>>     in “the formula in RFC 6962, section 2.1.2, paragraph 3”. This
>>     works with any format, paginated or not.
>>
>>     This gets clunky if people have 4-page long paragraphs or
>>     50-paragraph sections, but that kind of badness can and should be
>>     caught by working groups, shepherd reviews or if all else fails,
>>     gen-art reviews. This one is not up to the RFC editor to make
>>     rules against.
>>
>>     Yoav
>>
>>
>>
>
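
For readers wondering what the community-side tooling suggested in the
thread might look like, here is a minimal Python sketch. It is not an
RFC Editor tool, and the function names strip_utf8_bom and
escape_non_ascii are hypothetical, chosen only for this illustration.
The first function drops a leading UTF-8 BOM if one is present; the
second rewrites characters outside ASCII in the \u'[N[N]]NNNN' form
mentioned above (see RFC 5137), assuming uppercase hex digits and a
minimum of four digits.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with a \\u'NNNN' style escape.

    Assumption for this sketch: 4 to 6 uppercase hex digits, padded to
    at least four, one escape per code point.
    """
    return "".join(
        ch if ord(ch) < 128 else "\\u'{:04X}'".format(ord(ch))
        for ch in text
    )


if __name__ == "__main__":
    raw = codecs.BOM_UTF8 + "Caf\u00e9 in RFC text".encode("utf-8")
    text = strip_utf8_bom(raw).decode("utf-8")
    print(escape_non_ascii(text))
```

Bytes without a BOM pass through strip_utf8_bom unchanged, so the same
script could be run over either of the proposed plain-text variants.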