Re: [xml2rfc] [rfc-i] v3imp #A Convert to PDF with a quality tool

Tony Hansen <tony@att.com> Fri, 23 January 2015 22:55 UTC

Message-ID: <54C2D165.1020508@att.com>
Date: Fri, 23 Jan 2015 17:55:33 -0500
From: Tony Hansen <tony@att.com>
To: "rfc-interest@rfc-editor.org" <rfc-interest@rfc-editor.org>
References: <54C20FFF.2040708@seantek.com>
In-Reply-To: <54C20FFF.2040708@seantek.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/xml2rfc/d4sKotE7obKO5ntgzBHd3vFROZQ>
Cc: xml2rfc@ietf.org
Subject: Re: [xml2rfc] [rfc-i] v3imp #A Convert to PDF with a quality tool

On 1/23/15 4:10 AM, Sean Leonard wrote:
> Tool Request
> #A Convert to PDF with a quality tool
>
> Despite various pronouncements that pagination doesn't matter, etc., 
> the fact is that people will be using paged media for a long time to 
> come. Pretty much every other SDO on the planet (except perhaps W3C) 
> issues its standards in a paginated form, including physical book or 
> electronic PDF options.
>
> To make IETF docs look good, it would be nice to have a quality tool 
> that captures all of the nuances of the vocabulary in PDF format. 
> Desired features include:
> • bookmarks for sections
> • observing pagination controls (see Improvement #2)
> • observing standardized headers and footers (compare with Tool 
> Request #B, forthcoming)
> • preserving intra-document and extra-document hyperlinks
> • formatting choices that allow documents to be printed on Letter or 
> A4 page sizes at 100% resolution
> • including comments and other annotations in the native PDF format
> • observing whitespace and line break preservation as directed by the 
> input (e.g., NBSP, NBHYPHEN, don't break this range of text, don't 
> collapse multiple spaces)
> • vector artwork
> • preserving text flow for accessibility purposes
> • font embedding
> • preserving "files" and other incorporated blobs as document-level or 
> page-level "File Attachments"
> • metadata preservation

Please review draft-hansen-rfc-use-of-pdf. I think every single item 
you mention above is discussed there to one degree or another.

> Short of developing a custom tool, the off-the-shelf standard that I 
> have found to work is xml2rfc -> HTML -> hand-tooling the HTML to look 
> "nice" -> Prince XML -> PDF. Prince XML is CSS aware and therefore 
> gets a lot of the formatting right, in a way that no other layout 
> engine has been able to handle.
>
> For draft-josefsson-pkix-textual-10 I believe that I used the 
> Chrome/Chromium rendering engine to PDF on Mac OS X, as it preserved 
> the no-break and (manually inserted) pagination control properties 
> correctly. Unfortunately, neither Chrome/WebKit nor Firefox/Gecko 
> rendering engines preserve hyperlinks when saving as PDF using the 
> print subsystem.
>
> Prince XML is commercial software but it is cheap enough for a site 
> license that I think the IETF/RFC Editor should just get a license and 
> make it available for online I-D and RFC conversion. Other than this, 
> consider converting xml2rfc v3 directly to PDF, in conjunction with 
> some style sheet input. (Upon writing that sentence, I think that 
> defining the style sheet input is significantly more complex than 
> writing the tool itself...which is why I think that going the (X)HTML 
> route offers a lot more flexibility and commercially maintained options.)

Thanks for the pointers. I'm sure those who eventually work on the v3 
tools will find them useful.
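
For anyone who wants to try the route quoted above, here is a rough 
Python sketch of its two automatable steps (xml2rfc to HTML, then 
Prince to PDF). It only builds the command lines; the hand-tooling of 
the HTML still happens in between. The helper name build_pipeline, the 
file names, and the print.css style sheet are hypothetical, and the 
flags shown (xml2rfc --html and -o, prince -o and -s) are assumed to 
match the installed versions, so check each tool's --help first.

```python
import shlex
from pathlib import Path


def build_pipeline(draft_xml, stylesheet=None):
    """Return the two command lines for the XML -> HTML -> PDF route.

    Nothing is executed here; the caller decides when to run each step,
    since the HTML is meant to be hand-edited between the two.
    """
    src = Path(draft_xml)
    html = src.with_suffix(".html")
    pdf = src.with_suffix(".pdf")

    # Step 1: render the xml2rfc source as HTML (flags assumed, see above).
    to_html = ["xml2rfc", "--html", "-o", str(html), str(src)]

    # Step 2: lay out the (hand-tooled) HTML as PDF with Prince.
    to_pdf = ["prince", str(html), "-o", str(pdf)]
    if stylesheet:
        # Optional extra CSS for print layout (hypothetical file name).
        to_pdf += ["-s", str(stylesheet)]

    return [to_html, to_pdf]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be reviewed.
    for cmd in build_pipeline("draft-example-00.xml", "print.css"):
        print(shlex.join(cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) once the 
commands have been checked against the local installation.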

     Tony Hansen