[Cbor] draft-ietf-cbor-edn-literals-10 implementation notes

Joe Hildebrand <hildjj@cursive.net> Fri, 16 August 2024 19:41 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/NGs91wolsq-8PeCK9s3g3sdKQoM>

I'm taking a pass over implementing draft-ietf-cbor-edn-literals-10.  Here are some comments:

- Overall, this spec is in pretty good shape.  Aside from the app-strings, I was able to implement directly from the ABNF using my standard tooling.  Reading the actual text of the spec answered most of my questions.

- Like others, I don't like the two-level ABNF.  I understand that you're going for extensibility, but you can still leave an extension point in place while having a single grammar.

- The doc structure is quite odd in that it presents the extension points before the main format.  I understand that's an outgrowth of how the doc grew over time, but it needs a small refactor before publishing.  I'm willing to provide more suggestions or help, if the authors want.

- I wish that floating point encodings had a separate spec syntax from integers, rather than relying on getting a decimal point or "e" into the output somehow (e.g. JS doesn't have good float formatting built in).  For example, 2_f1 could mean 0xf94000, while 2_1 would mean 0x190002.
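
For reference, the two byte sequences in that example can be reproduced with a short Python sketch. The helper names encode_float16 and encode_uint16 are hypothetical, and the sketch only covers the two cases mentioned (a half-precision float and an unsigned integer carried in a 2-byte argument); it is not an EDN parser.

```python
import struct


def encode_float16(value: float) -> bytes:
    """Encode a float as CBOR half precision (hypothetical helper).

    Initial byte 0xf9 is major type 7 with additional info 25,
    followed by the IEEE 754 binary16 value in big-endian order.
    """
    return b"\xf9" + struct.pack(">e", value)


def encode_uint16(value: int) -> bytes:
    """Encode an unsigned integer with a 2-byte argument (hypothetical helper).

    Initial byte 0x19 is major type 0 with additional info 25,
    followed by the value as a big-endian 16-bit integer.
    """
    return b"\x19" + struct.pack(">H", value)


print(encode_float16(2.0).hex())  # f94000
print(encode_uint16(2).hex())     # 190002
```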

- s3 could be moved after the ABNF section and the (possibly new) app-string section and make more sense.

- s3.2, "Herewith I buy" /.../ "gned: Alice & Bob" doesn't match the grammar.  I think it needs a +.

- s4.1 It should be an error to mix app-strings and strings, or app-strings of multiple types.  The text at the end of the section starting with "Some of the strings may be app-strings..." is not prescriptive enough.

- It's not clear to me what happens if there are multiple items in the sequence inside <<>>.  I assume their encodings are concatenated together, even though that's a little odd unless you are generating CBOR streams.  I would have expected the production to be:

    embedded = "<<" one-item ">>"

You would still get concatenation with << 1 >> + << 2 >> in the unlikely event you need it.
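
A minimal Python sketch of that equivalence, under the assumption stated above that the items inside <<>> are encoded back to back and wrapped in a byte string, and that + joins the contents of two byte strings. The helper names are hypothetical, and only integers 0..23 and payloads of up to 23 bytes are handled.

```python
def encode_small_uint(n: int) -> bytes:
    """Encode an unsigned integer 0..23 as a single CBOR byte."""
    if not 0 <= n <= 23:
        raise ValueError("sketch only handles 0..23")
    return bytes([n])


def embedded_payload(*items: int) -> bytes:
    """Contents of << item, item, ... >>: the encoded items back to back.

    Assumption: multiple items are simply concatenated.
    """
    return b"".join(encode_small_uint(i) for i in items)


def encode_short_bstr(payload: bytes) -> bytes:
    """Wrap a payload of up to 23 bytes in a CBOR byte string (major type 2)."""
    if len(payload) > 23:
        raise ValueError("sketch only handles short payloads")
    return bytes([0x40 | len(payload)]) + payload


# << 1, 2 >> as a single embedded sequence
as_sequence = encode_short_bstr(embedded_payload(1, 2))
# << 1 >> + << 2 >>: two one-item embeddings whose contents are joined
as_concatenation = encode_short_bstr(embedded_payload(1) + embedded_payload(2))

print(as_sequence.hex())                # 420102
print(as_sequence == as_concatenation)  # True
```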

- (nit) I'd prefer basenumber to be split into 3 or 4 rules, one for each base, since each needs special processing.

- s4.2 could be outdented to s5, containing both ABNF and the descriptions of the app-string formats from s2.  Having to go back and forth made reading more difficult than it needed to be.

- s4.2.1, h'/head/ 63 /contents/ 66 6f 6f' should become << "cfoo" >>, not << "foo" >> if I'm understanding correctly.
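
To make the two readings concrete, here is a small Python sketch that interprets the four bytes both as a CBOR data item and as raw ASCII, and also shows what the text string "cfoo" would encode to. The helper names are hypothetical, and only text strings of up to 23 bytes are handled.

```python
def decode_short_tstr(data: bytes) -> str:
    """Decode a CBOR text string whose length fits in the initial byte."""
    head = data[0]
    major, info = head >> 5, head & 0x1F
    if major != 3 or info > 23:
        raise ValueError("sketch only handles short text strings")
    if len(data) != 1 + info:
        raise ValueError("length in head does not match the data")
    return data[1 : 1 + info].decode("utf-8")


def encode_short_tstr(text: str) -> bytes:
    """Encode a text string of up to 23 UTF-8 bytes (major type 3)."""
    raw = text.encode("utf-8")
    if len(raw) > 23:
        raise ValueError("sketch only handles short text strings")
    return bytes([0x60 | len(raw)]) + raw


data = bytes.fromhex("63 66 6f 6f")

print(decode_short_tstr(data))          # foo   (0x63 read as a CBOR head)
print(data.decode("ascii"))             # cfoo  (0x63 read as the letter 'c')
print(encode_short_tstr("cfoo").hex())  # 6463666f6f
```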

- s6 Security Considerations seems like it could use some more text about how this format isn't intended for interchange.

I'm building up a large-ish set of test vectors.  I'm willing to put those into a separate repo for sharing if anyone is interested in collaborating.

I'm not quite done with the implementation, so there are likely to be a few more comments as I continue to dig.

— 
Joe Hildebrand