[Tools-discuss] Re: PDF [Re: Tools team meeting tomorrow 9 July 2024 1800 UTC]

Carsten Bormann <cabo@tzi.org> Tue, 09 July 2024 03:16 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <eb2d896c-d246-42c7-97c6-a4b3c7151cd9@nostrum.com>
Date: Tue, 09 Jul 2024 05:16:29 +0200
Message-Id: <A0EA194A-556C-421E-9358-39B0885597D1@tzi.org>
References: <336856cc-9986-43ba-bc26-f5c96aaa9521@nostrum.com> <A4157066-C7BA-4560-812A-21DB8A063AC4@tzi.org> <eb2d896c-d246-42c7-97c6-a4b3c7151cd9@nostrum.com>
To: Robert Sparks <rjsparks@nostrum.com>
CC: tools-discuss <tools-discuss@ietf.org>, Working Chairs <wgchairs@ietf.org>
Subject: [Tools-discuss] Re: PDF [Re: Tools team meeting tomorrow 9 July 2024 1800 UTC]
List-Id: IETF Tools Discussion <tools-discuss.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/zk5XJE3BSOcdVWmlxWEWJ0ZExC8>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tools-discuss>

Hi Robert,

Thank you for responding to this.

On 8. Jul 2024, at 23:29, Robert Sparks <rjsparks@nostrum.com> wrote:
> 
> 
> On 7/8/24 1:34 PM, Carsten Bormann wrote:
>> On 8. Jul 2024, at 20:27, Robert Sparks <rjsparks@nostrum.com> wrote:
>>> The agenda and the beginnings of notes for the meeting are available at https://notes.ietf.org/tools-team-20240709. Details will continue to be added up to and during the meeting.
>> This discusses expensive resources (“endpoints”) for building PDF forms of documents.
>> 
>> Could we maybe stop pdfizing things in weird ways and simply build the correct PDF form on I-D submission?
>> The current state is just such a waste of time.
> 
> This glosses over a lot, so let's dig a little:
> 
> - by "correct PDF form" I assume you mean what xml2rfc would produce given xml input.

Yes.
Something good enough to judge what the RFC editor will ultimately publish.

> What about drafts that are still being submitted as text only

I personally don’t care.  OK, you have to do *something*.

> - do we stop providing any pdf form of those? What about drafts and RFCs from more than 5 years ago?

So do something.  PDF-printing the .TXT seems to have worked so far.

> - The expensive part is running the pdf creating software (currently weasyprint) and moving that expense to draft posting time vs the first time someone accesses the pdf document just moves the expense around.

The expense is trivial (*).  The difference is whether I have to wait while the PDF is generated on demand or it is available immediately (or is already on my laptop via rsync).  I sure value my time more than that trivial expense.

> It might even make the total expense higher if no one ever asks for the pdf of a particular version of a draft.

That is true, but probably not relevant (*).

> - One sharp edge is running the pdf creating software over provided svg (even with our badly constructed restrictions on the SVG). We know of extant submissions that cause the pdf generation call to fail to return.

I can imagine that.  No soup for these drafts then (fall back to PDF-printing the TXT).
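
A minimal sketch of that fallback, assuming the PDF build runs as an external command at submission time. The function name `build_pdf`, the timeout value, and the placeholder commands are hypothetical; they are not the datatracker's actual code. The point is only that a call that fails to return can be bounded by a timeout, after which the TXT-printing path takes over.

```python
import subprocess
import sys


def build_pdf(primary_cmd, fallback_cmd, timeout_s):
    """Run primary_cmd; if it hangs or fails, run fallback_cmd instead.

    Returns "primary" or "fallback" to say which path produced the PDF.
    Both commands are argument lists as accepted by subprocess.run.
    """
    try:
        # timeout= kills the child and raises TimeoutExpired if it never returns.
        subprocess.run(primary_cmd, check=True, timeout=timeout_s)
        return "primary"
    except (subprocess.TimeoutExpired, subprocess.CalledProcessError):
        # "No soup": fall back to the cheaper path (e.g. PDF-printing the TXT).
        subprocess.run(fallback_cmd, check=True)
        return "fallback"


if __name__ == "__main__":
    # Placeholder commands standing in for the real renderers (assumption).
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    trivial = [sys.executable, "-c", "pass"]
    print(build_pdf(hanging, trivial, timeout_s=0.5))
```

In a real deployment the two argument lists would be the actual renderer invocations, and the timeout would be chosen well above the typical build time.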

Grüße, Carsten

(*) conservative estimate: 2000 active I-Ds, each submitted 10 times per year, at 20 s of CPU per ingestion.
That is about 111 CPU hours per year, or < $10/yr on AWS at retail prices?
You writing your response already cost more...
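
The estimate in (*) can be checked with a few lines of arithmetic. The three input figures come from the footnote; the price per CPU hour is a hypothetical placeholder, not a quoted AWS rate, so only the order of magnitude of the cost is meaningful.

```python
# Inputs taken from the footnote.
ACTIVE_DRAFTS = 2000
SUBMISSIONS_PER_DRAFT_PER_YEAR = 10
CPU_SECONDS_PER_SUBMISSION = 20

# Assumption: an illustrative retail price, not an actual AWS quote.
ASSUMED_USD_PER_CPU_HOUR = 0.05

cpu_seconds = (
    ACTIVE_DRAFTS * SUBMISSIONS_PER_DRAFT_PER_YEAR * CPU_SECONDS_PER_SUBMISSION
)
cpu_hours = cpu_seconds / 3600
annual_cost = cpu_hours * ASSUMED_USD_PER_CPU_HOUR

print(f"{cpu_hours:.0f} CPU hours/yr, about ${annual_cost:.2f}/yr")
```

Any price below roughly $0.09 per CPU hour keeps the total under $10 per year.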