Content-Type: text/plain;
	charset=utf-8
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\))
From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: 
 <CAKoiRuZZ4UCjUwwUbVuM0_JqXefmrU23YG_3d-JmJEznh7ASQw@mail.gmail.com>
Date: Thu, 25 Jul 2024 21:05:41 +0200
Content-Transfer-Encoding: quoted-printable
Message-Id: <7D42D73E-0DDE-491C-A9AA-BCA6B51773EC@tzi.org>
References: <ZpxlWGAC9UMLqd9c@hephaistos.amsuess.com>
 <CAN40gSudKn5NyD+5J5j59V1fvt2e+f_iAXO9FmmH6Mu8Q823RA@mail.gmail.com>
 <CAKoiRuZZ4UCjUwwUbVuM0_JqXefmrU23YG_3d-JmJEznh7ASQw@mail.gmail.com>
To: Rohan Mahy <rohan.mahy@gmail.com>
CC: Ira McDonald <blueroofmusic@gmail.com>,
 Christian Amsüss <christian@amsuess.com>,
 CBOR <cbor@ietf.org>
Subject: [Cbor] Re: Consensus call on EDN literals single ABNF
List-Id: "Concise Binary Object Representation (CBOR)" <cbor.ietf.org>
Archived-At: 
 <https://mailarchive.ietf.org/arch/msg/cbor/qlnzZNhFLxHube09Ip_lCkypxo0>
List-Archive: <https://mailarchive.ietf.org/arch/browse/cbor>
List-Help: <mailto:cbor-request@ietf.org?subject=help>
List-Owner: <mailto:cbor-owner@ietf.org>
List-Post: <mailto:cbor@ietf.org>
List-Subscribe: <mailto:cbor-join@ietf.org>
List-Unsubscribe: <mailto:cbor-leave@ietf.org>

On 21. Jul 2024, at 18:25, Rohan Mahy <rohan.mahy@gmail.com> wrote:
>
> Hi Ira,
> Could you please give your reasoning in a few words?

I’m also not Ira.

What Christian said (606 words).

I have written up my own view below (716 words), if you care.

Grüße, Carsten


There are strong technical reasons for the two-layer model.

The original focus of the edn-literal proposal was to have a pluggable syntax for application-specific literals.
This idea is quite successful and has already been embraced to “plug” a number of holes in EDN.
From an architectural point of view, a pluggable literal syntax fits exceedingly well with the whole idea of CBOR tags, which are a pluggable extension mechanism for the data model behind the interchange format.
(Side observation: It took us a lot of energy in 2013 to get the concept of tags accepted, because the concept was rather innovative at the time.)

A pluggable literal syntax can only be done cleanly when it is based on a common base syntax for all those pluggables.
This becomes clear when you look at section 3.1.
The functionality to handle unknown application-extension identifiers is rather important for deployability, to minimize friction when introducing new identifiers.
A lot of experience went into getting this right.
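
To make the two-layer idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the grammar or the unknown-identifier handling from the draft: the regular expression, the `EXTENSIONS` registry, `UnknownLiteral`, and `parse_literal` are all hypothetical names, and the escape handling is deliberately simplified. The base layer only recognizes the shape prefix'content'; a second layer interprets the content per identifier, and an unregistered identifier is preserved rather than rejected.

```python
import base64
import re
from dataclasses import dataclass

# Layer 1 (base syntax): any literal of the form  prefix'content'.
# The content is captured as an opaque string. Simplifying assumption:
# a backslash escapes the next character, and nothing else is special.
BASE_LITERAL = re.compile(r"([A-Za-z][A-Za-z0-9]*)'((?:[^'\\]|\\.)*)'\Z")


@dataclass
class UnknownLiteral:
    """Hypothetical placeholder for a literal with no registered handler."""
    prefix: str
    text: str


# Layer 2 (pluggables): one small parser per application-extension
# identifier. Adding an identifier never touches BASE_LITERAL.
EXTENSIONS = {
    "h": bytes.fromhex,
    "b64": base64.b64decode,
}


def parse_literal(source: str):
    match = BASE_LITERAL.match(source)
    if match is None:
        raise ValueError(f"not a prefixed literal: {source!r}")
    prefix, raw = match.groups()
    text = re.sub(r"\\(.)", r"\1", raw)  # drop the escaping backslashes
    handler = EXTENSIONS.get(prefix)
    if handler is None:
        # Unknown identifier: keep prefix and text so a tool can pass it on.
        return UnknownLiteral(prefix, text)
    return handler(text)
```

In this sketch, `parse_literal("h'48656c6c6f'")` goes through the hex handler, while `parse_literal("foo'bar'")` comes back as `UnknownLiteral("foo", "bar")` without any change to the base layer.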

Increasing the coupling between the base ABNF and each application-extension by moving to a one-layer model would damage the proposal by making the base ABNF unstable and by requiring it to change for each new application-extension.

I don't see a point in mashing up the ABNF grammar in the specification to make it single-level.
The argument that this has always been done this way in production parser implementations is not at all compelling for our case.
(The intuition here appears to come from SIP, which does require production-quality text-based parsing, exactly what we are trying to get rid of with CBOR.  OBTW, SIP/SDP is two-layer; you wouldn't munch the SDP syntax into the SIP syntax.)
EDN is not about production parsing; it is a tools and documents syntax.

More importantly, an EDN implementation does not have to be built directly from the ABNF the document uses for defining the syntax.

The next little innovation was recognizing that we could treat hex/base64 literals the same way as the pluggables; this was just an obvious simplifying step.
Actually, the various text-based representations for binary data do make great pluggables: there really is no good reason to be limited to base16/32/64, and the clear separation even allows us to cleanly defer defining base32 because we currently have no implementation experience.
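
As a rough illustration of why binary text encodings make good pluggables, the following Python sketch treats each encoding as nothing more than a function from literal text to bytes. The names `BYTE_STRING_DECODERS`, `register`, and `decode` are hypothetical, and the "b32" entry is only an assumption used to show a later addition; it does not reflect any defined base32 literal.

```python
import base64

# Each entry maps an identifier to a function from literal text to bytes.
# "h" and "b64" stand in for the hex and base64 literals discussed above.
BYTE_STRING_DECODERS = {
    "h": bytes.fromhex,
    "b64": base64.b64decode,
}


def register(prefix, decoder):
    """Add a new encoding without modifying the existing entries."""
    if prefix in BYTE_STRING_DECODERS:
        raise ValueError(f"identifier already registered: {prefix}")
    BYTE_STRING_DECODERS[prefix] = decoder


def decode(prefix, text):
    """Look up the decoder for this identifier and apply it to the text."""
    return BYTE_STRING_DECODERS[prefix](text)


# Hypothetical later addition: deferred until there is experience with it.
register("b32", base64.b32decode)
```

Because an encoding is just a registry entry here, leaving one out (or adding it later) does not disturb the others.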

Again, this clean model can be broken up and special-cased and be made more complicated in the document, but there would need to be a really good reason to do so.

The objective of including ABNF was being able to explain the syntax, and, if possible, even to generate code from that.
(Parser combinators (nom) were mentioned; I don't see a reason why nom cannot be used for directly implementing the two-layer approach, by the way.)
Doing code generation directly from the ABNF is not the single, normative way to do an implementation.
There is nothing wrong with an implementer who comes up with their own grammar, and I wouldn’t mind cultivating a single-level ABNF as a separate project.
Putting this into the main document (and replacing the clean syntax) is just premature optimization, as if TCP and IP had been described in a single document.
This was certainly possible (and was done this way in 1974), but the invention of layers and the separation of functionalities was what made IP so powerful.

EDN-literals as defined today has a clear mental model for the pluggable part.
It is eminently easy to check that the syntax makes sense.
I'd say that swapping out a major part of the grammar is a rather risky late change.
There are a few other CBOR and CDDL drafts that are really waiting for our attention, and I really don’t want to waste more time on this subject now that we have a stable basis.

If there were any merit to making this change, I might be more open to it.
But the upside is very limited, and it also would be a big regression.

Of course, what we really want is for ABNF itself to specify how the two-layer approach works (using the first layer to transform, then parse using the second layer).  Stay tuned...

