Re: [Tools-discuss] Why do we even have text formats any more?
Robert Sparks <rjsparks@nostrum.com> Wed, 28 July 2021 02:22 UTC
To: tools-discuss@ietf.org
References: <4d70a1ac-a275-420a-83f6-99dfd5b5385c@www.fastmail.com>
From: Robert Sparks <rjsparks@nostrum.com>
Message-ID: <14bd112c-fd34-44ce-dcbc-9f3b989cdd7d@nostrum.com>
Date: Tue, 27 Jul 2021 21:22:30 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:78.0) Gecko/20100101 Thunderbird/78.12.0
MIME-Version: 1.0
In-Reply-To: <4d70a1ac-a275-420a-83f6-99dfd5b5385c@www.fastmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: quoted-printable
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/tools-discuss/XV1RacdP3X0AEi3uTScowi4kaSM>
Subject: Re: [Tools-discuss] Why do we even have text formats any more?
This is worth exploring, but a few things:

First thought - we have some thousands of RFCs that only exist as txt, so I'm reading your argument as "for new things", but keep in mind we often have to process both old and new things (think diff).

Second thought - diff. You and I have discussed some potential candidates for html-diff that would rival the text diff for visual inspection, and they're still not-quite there. And when it comes down to some things, diffing text is still going to be the best generalizable tool. XML-diffing continues to be more of a stretch than it intuitively appears. (I think there's an argument here for keeping as much work that would involve diffing as we can in a language like markdown, but...)

Third thought - an alternative already brought up (Carsten I think was first) to html-ization for the things we have v3 xml for is to create a writer for it that builds it from the xml source rather than trying to pull things by heuristics from the text. Maybe where you're pointing would obviate that, but there may be different decisions to make in that writer that would be advantageous.

And finally, to your footnote, raising "why aren't people submitting XML?" - I've seen recently that there is fear from some seasoned submitters that the processor at the datatracker will get the references wrong. This is tied up with working in v2 and the issues we are working to correct with bibxml generation. Mitigating that fear will have an impact on the xml submission rate, I think.

RjS

On 7/27/21 8:53 PM, Martin Thomson wrote:
> I realize that this might be a little inflammatory as far as subjects go, but bear with me.
>
> There are probably a few narrow cases where rendering plain text is better than HTML. But what we've been doing for years (thanks to Henrik's great tool) is take text and turn it into HTML using the power of regular expressions. That's been good, but it's not always reliable (how many errata mention that "Section X of [FOO]" links to Section X of this document?). It's also been lagging as the text format changed (case in point: lack of a table of contents).
>
> Here's an alternative: style the HTML so that it looks like the text. I tried this and it worked shockingly well.
>
> Repo: https://github.com/martinthomson/rfc-txt-html
> Demo: https://martinthomson.github.io/rfc-txt-html/diff.html
>
> This isn't perfect, but it seems pretty good to me. Keep in mind that this took only a little bit of time to sketch out. No doubt it can be improved. The readme has a bunch of things I found, all minor.
>
> I don't think that this is the end of text, but a possible way to limit our use of the htmlizer[1]. People who need to automate access to content might still use text, though I will argue that XML is superior in that regard. The other thing that comes to mind is diffs: HTML-native diff tools are somewhat less than ideal. Either way, serving HTML is just better.
>
> Enjoy,
> Martin
>
>
> [1] Though I still see a shocking number of people authoring in XML (or XML-capable input formats) and submitting in text. But I think we have plans to limit that.
>
> ___________________________________________________________
> Tools-discuss mailing list - Tools-discuss@ietf.org
> This list is for discussion, not for action requests or bug reports.
> * Report datatracker and mailarchive bugs to: datatracker-project@ietf.org
> * Report tools.ietf.org bugs to: webmaster@tools.ietf.org
> * Report all other bugs or issues to: ietf-action@ietf.org
> List info (including how to Unsubscribe): https://www.ietf.org/mailman/listinfo/tools-discuss
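
The misfire mentioned in the quoted message, where "Section X of [FOO]" ends up linked to Section X of the current document, is easy to reproduce. The sketch below is only an illustration and is not the htmlizer's actual code: the two patterns and the function names `naive_linkify` and `careful_linkify` are hypothetical, and the anchor format `#section-N` is an assumption.

```python
import re

# Hypothetical pattern: every "Section N.N" is assumed to be a local reference.
SECTION = re.compile(r"Section (\d+(?:\.\d+)*)")

# Same pattern, but it declines to match when the section number is followed
# by " of [" (a reference to another document). The first lookahead stops the
# regex from backtracking to a shorter number such as "3" in "3.2".
LOCAL_SECTION = re.compile(r"Section (\d+(?:\.\d+)*)(?!\.?\d)(?! of \[)")

LINK = r'<a href="#section-\1">Section \1</a>'


def naive_linkify(text: str) -> str:
    """Link every section mention to an anchor in the current document."""
    return SECTION.sub(LINK, text)


def careful_linkify(text: str) -> str:
    """Link only section mentions not followed by ' of [REF]'."""
    return LOCAL_SECTION.sub(LINK, text)


sample = "See Section 3.2 of [FOO] and Section 4.1."
print(naive_linkify(sample))    # links both mentions, the first one wrongly
print(careful_linkify(sample))  # links only "Section 4.1"
```

Even the more careful pattern is still a heuristic over rendered text (it would miss a reference that wraps across a line break, for example), which is the argument for generating HTML from the XML source instead.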