Re: [Json] [rfc-i] sourcecode type="json"

Carsten Bormann <cabo@tzi.org> Wed, 27 October 2021 06:21 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 14.0 \(3654.120.0.1.13\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <20211027003607.5C4472E60559@ary.qy>
Date: Wed, 27 Oct 2021 08:21:00 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <86844357-A8C7-4590-B8DC-D801E223A60A@tzi.org>
References: <20211027003607.5C4472E60559@ary.qy>
To: rfc-interest <rfc-interest@rfc-editor.org>, json@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/json/_FgQkgb9_3T1QUVGXXBuLbJ7xec>
Subject: Re: [Json] [rfc-i] sourcecode type="json"
Precedence: list

Thank you all for your great feedback.

> On 27. Oct 2021, at 02:36, John Levine <johnl@taugh.com> wrote:
> 
> Right.  We have lots of sourcecode which is a chunk of a program, not a full program.
> 
> If it's a chunk of JSON, call it JSON.

That is certainly true of, e.g., C: there is no expectation that C code in a sourcecode block is a complete program.

The real question is why we mark up sourcecode with a type in the first place.

One need would be for rendering.
But amazingly, it’s 2021 and we don’t yet have syntax coloring in RFCs.
This, however, could be added from the information that is in the XML today; full JSON texts and JSON fragments probably can be handled by the same coloring engine.
(The coloring would be heuristic during rendering, not embedded in the XML; only the type information would be needed to select the appropriate heuristics.)

With FDT languages such as ABNF or CDDL, extraction is needed. Since some ABNF/CDDL is there only for exposition, extraction has to distinguish it from the normative ABNF/CDDL; in recent RFCs that has sometimes been done by using the @name attribute alongside @type.
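
A minimal sketch of such an extraction step, in Python. The sample XML, the file name and the function names are made up for illustration, and the rule "only blocks that carry @name are meant for extraction" is an assumed convention, not something the XML format guarantees:

```python
import xml.etree.ElementTree as ET

# Hypothetical RFCXML snippet; names and contents are made up for illustration.
SAMPLE = """<rfc>
  <sourcecode type="abnf" name="example.abnf">rule = "a" / "b"
</sourcecode>
  <sourcecode type="abnf">fragment = "shown for exposition only"
</sourcecode>
  <sourcecode type="json">{"a": 1}
</sourcecode>
</rfc>"""


def extract_blocks(xml_text, wanted_type):
    """Return (name, text) for every <sourcecode> element of the given type."""
    root = ET.fromstring(xml_text)
    return [
        (el.get("name"), el.text or "")
        for el in root.iter("sourcecode")
        if el.get("type") == wanted_type
    ]


def extract_named(xml_text, wanted_type):
    """Assumed convention: only blocks that carry a name are meant for extraction."""
    return {name: text for name, text in extract_blocks(xml_text, wanted_type) if name}
```

Here extract_blocks(SAMPLE, "abnf") sees both ABNF blocks, while extract_named(SAMPLE, "abnf") keeps only the one with a @name.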

Apart from that, the needs I’m more interested in are those of authoring support.
This is mostly based on extraction, but also on the ability to run CI processes on the extracted sourcecode (per block and across blocks).
Additional metadata may be required as input to e.g. a validation process; YANG puts that into the “<CODE BEGINS>” line.
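
As an illustration of metadata carried in-band, here is a hedged Python sketch that reads such a marker line. It assumes the marker has the form <CODE BEGINS> file "name", with the file part optional; the function name and the example file name are hypothetical:

```python
import re

# Assumed marker shape: <CODE BEGINS> optionally followed by file "NAME".
CODE_BEGINS = re.compile(r'^<CODE BEGINS>(?:\s+file\s+"(?P<file>[^"]+)")?\s*$')


def parse_code_begins(line):
    """Return the metadata found on a marker line, or None if it is not a marker."""
    m = CODE_BEGINS.match(line.strip())
    if m is None:
        return None
    return {"file": m.group("file")}
```

A validation step could call parse_code_begins('<CODE BEGINS> file "example-module@2021-10-27.yang"') and use the returned file name to decide where to write the extracted module.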

The specific question came up because I was adding some CI to the SDF spec repo.
Inevitably, that turned up one typo which, although it did not obscure the intention, would sooner or later have led to an errata report had it gone undetected.
I expect authoring tools like kramdown-rfc will provide some of this validation as a matter of course, but that requires metadata.
Of course, this metadata doesn’t *have* to be saved in the XML, but for practical reasons much of the CI processing operates on the XML output.

So I’d rather have a way to embed metadata that is required for automatic processing (extraction, validation) in the XML.
This would also help with keeping the validation in place throughout the RFC production process.

My current CI code has to guess whether type=json means JSON text or JSON fragment (where the latter is well-defined for SDF, but maybe not in general).
My concern is not so much that this adds 5 lines of code to the now 19-line validation script.
Rather, a validation step should not have to guess the author’s intention.
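
To make the guessing concrete, here is a minimal Python sketch; it is not the validation script mentioned above. It assumes that a "fragment" is a run of object members without the enclosing braces, which is only one possible definition, and the function name is hypothetical:

```python
import json


def classify_json(block):
    """Guess whether a type=json block is a full JSON text or a fragment."""
    try:
        json.loads(block)
        return "text"
    except json.JSONDecodeError:
        pass
    # Assumption: a fragment is a run of object members without the outer braces.
    try:
        json.loads("{" + block + "}")
        return "fragment"
    except json.JSONDecodeError:
        return "invalid"
```

Under this heuristic a block that contains only a quoted string is classified as a full JSON text even if the author meant it as a fragment, which is the kind of ambiguity that explicit metadata would remove.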

(The next step then is to add CDDL validation for the JSON texts and fragments.
Additional metadata needed here is: which specific CDDL rule is intended to govern that text/fragment.
I’m not too happy about having to cram that into @name, but it looks that way.)

Grüße, Carsten
