[rfc-i] v3imp #A Convert to PDF with a quality tool

tony at att.com (Tony Hansen) Fri, 23 January 2015 22:55 UTC

From: "tony at att.com"
Date: Fri, 23 Jan 2015 17:55:33 -0500
Subject: [rfc-i] v3imp #A Convert to PDF with a quality tool
In-Reply-To: <54C20FFF.2040708@seantek.com>
References: <54C20FFF.2040708@seantek.com>
Message-ID: <54C2D165.1020508@att.com>

On 1/23/15 4:10 AM, Sean Leonard wrote:
> Tool Request
> #A Convert to PDF with a quality tool
>
> Despite various pronouncements that pagination doesn't matter, etc., 
> the fact is that people will be using paged media for a long time to 
> come. Pretty much every other SDO on the planet (except perhaps W3C) 
> issues its standards in a paginated form, including physical book or 
> electronic PDF options.
>
> To make IETF docs look good, it would be nice to have a quality tool 
> that captures all of the nuances of the vocabulary in PDF format. 
> Desired features include:
> * bookmarks for sections
> * observing pagination controls (see Improvement #2)
> * observing standardized headers and footers (compare with Tool 
> Request #B, forthcoming)
> * preserving intra-document and extra-document hyperlinks
> * formatting choices that allow documents to be printed on Letter or 
> A4 page sizes at 100% resolution
> * including comments and other annotations in the native PDF format
> * observing whitespace and line break preservation as directed by the 
> input (e.g., NBSP, NBHYPHEN, don't break this range of text, don't 
> collapse multiple spaces)
> * vector artwork
> * preserving text flow for accessibility purposes
> * font embedding
> * preserving "files" and other incorporated blobs as document-level or 
> page-level "File Attachments"
> * metadata preservation

Please review draft-hansen-rfc-use-of-pdf. I think every single item 
you mention above is discussed there to one degree or another.

> Short of developing a custom tool, the off-the-shelf standard that I 
> have found to work is xml2rfc -> HTML -> hand-tooling the HTML to look 
> "nice" -> Prince XML -> PDF. Prince XML is CSS aware and therefore 
> gets a lot of the formatting right, in a way that no other layout 
> engine has been able to handle.
>
> For draft-josefsson-pkix-textual-10 I believe that I used the 
> Chrome/Chromium rendering engine to PDF on Mac OS X, as it preserved 
> the no-break and (manually inserted) pagination control properties 
> correctly. Unfortunately, neither Chrome/WebKit nor Firefox/Gecko 
> rendering engines preserve hyperlinks when saving as PDF using the 
> print subsystem.
>
> Prince XML is commercial software but it is cheap enough for a site 
> license that I think the IETF/RFC Editor should just get a license and 
> make it available for online I-D and RFC conversion. Other than this, 
> consider converting xml2rfc v3 directly to PDF, in conjunction with 
> some style sheet input. (Upon writing that sentence, I think that 
> defining the style sheet input is significantly more complex than 
> writing the tool itself...which is why I think that going the (X)HTML 
> route offers a lot more flexibility and commercially maintained options.)
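
For readers who want to try the xml2rfc -> HTML -> Prince XML -> PDF route 
described above, here is a minimal sketch of the two automated steps. It is 
an illustration, not a tested production script: the helper name 
build_pipeline and the file names draft-example.xml and print.css are 
hypothetical, the manual "hand-tooling the HTML" step is left out, and it 
assumes both the xml2rfc and prince command-line tools are installed and 
on the PATH.

```python
import os
import shlex


def build_pipeline(xml_path, css_path=None):
    """Return the two command lines for XML -> HTML -> PDF (hypothetical helper).

    Step 1 renders the xml2rfc source to HTML.
    Step 2 lets Prince lay out that HTML as PDF, optionally with a
    user-supplied print style sheet.
    """
    stem, _ = os.path.splitext(xml_path)
    html_path = stem + ".html"
    pdf_path = stem + ".pdf"

    # xml2rfc: --html selects HTML output, -o names the output file.
    to_html = ["xml2rfc", "--html", xml_path, "-o", html_path]

    # prince: positional input, -o names the PDF, -s adds a style sheet.
    to_pdf = ["prince", html_path, "-o", pdf_path]
    if css_path:
        to_pdf += ["-s", css_path]

    return [to_html, to_pdf]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch works
    # even where the tools are not installed.
    for cmd in build_pipeline("draft-example.xml", "print.css"):
        print(shlex.join(cmd))
```

To actually run the conversion, each command list can be passed to 
subprocess.run(cmd, check=True) in order, with any manual HTML edits made 
between the two steps.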

Thanks for the pointers. I'm sure those who eventually work on the v3 
tools will find them useful.

     Tony Hansen