[core] Re: WG Last Call for draft-ietf-core-href-18 (2nd WGLC)

Christian Amsüss <christian@amsuess.com> Wed, 19 February 2025 14:24 UTC

Date: Wed, 19 Feb 2025 15:24:40 +0100
From: Christian Amsüss <christian@amsuess.com>
To: "core@ietf.org" <core@ietf.org>
Message-ID: <Z7XpqMiWhB-Eo6hf@hephaistos.amsuess.com>
References: <aebdceb8-2025-43a5-a065-19e9715f4451@ri.se>
In-Reply-To: <aebdceb8-2025-43a5-a065-19e9715f4451@ri.se>
Subject: [core] Re: WG Last Call for draft-ietf-core-href-18 (2nd WGLC)
List-Id: "Constrained RESTful Environments (CoRE) Working Group list" <core.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/nteOp0w6Cj9ZWe52qqD1_lXIeTo>

Hello authors and group,

sorry for being late to the party, here are my comments on the document.
Nits are being collected into PRs where I think writing them down here
would just be wasting readers' time.

(This is part 1 of a multi-part review.)

* 2: Those design considerations are an undeclared mix of observations
  on URIs in general, properties that are true after normalization, and
  limitations introduced.

  (For example, C1 is a URI property up to "to ASCII" followed by
  consequences of syntax-based normalization. C4 is a limitation of CRIs
  with an escape hatch for extensions. C5's "joined with dots" is merely
  a convention that CRIs choose to embody in their modelling without
  constraining values.)

  I think it is important to show the real constraints (like that the
  port is in 0..=65535, or not supporting IPvFuture without a per-future
  extension), and not water that down with properties of
  syntax-normalized URIs.

  I didn't even read through all the C items; only those that caught my
  eye.

  As a whole, they do seem to also serve the extra purpose of
  introducing the components of the data model; it is even explicit in
  that, but the main info (a CRI consists of those five) gets completely
  lost under the details. I think this would be more useful if it was
  named "Data Model", had bullets actually named after its components,
  maybe structured (collecting user info up to port into the authority
  group), and maybe already had the CDDL to go with it.

  Technically, that section also talks about CRIs and not CRI
  references. That is fine, but if it is used as the initial definition
  of the data model, it should also describe that for CRI references --
  otherwise, the first mention of CRI references and in particular
  "discard" hits the reader unprepared in 2.1.
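
  To illustrate what a component-first "Data Model" presentation could
  look like, here is a minimal sketch. All class and field names are
  hypothetical and not taken from the draft's CDDL; I am assuming the
  five components are scheme, authority, path, query and fragment, with
  userinfo, host and port grouped under the authority. The only hard
  constraint enforced is the port range mentioned above.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


# Hypothetical names throughout; this is not the draft's CDDL.
@dataclass
class Authority:
    host: Union[str, bytes]          # registered name, or raw IP address bytes
    userinfo: Optional[str] = None   # absent unless explicitly given
    port: Optional[int] = None       # None means "no port given"

    def __post_init__(self) -> None:
        # A real constraint of the model: the port must be in 0..=65535.
        if self.port is not None and not (0 <= self.port <= 65535):
            raise ValueError(f"port {self.port} outside 0..=65535")


@dataclass
class Cri:
    scheme: str
    authority: Optional[Authority] = None
    path: List[str] = field(default_factory=list)
    query: Optional[List[str]] = None
    fragment: Optional[str] = None


example = Cri("coap", Authority("example.com", port=61616), ["sensors", "temp"])
```

  Laid out like this, the five components are visible at a glance, and
  the actual limitations (such as the port range) sit next to the field
  they constrain.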

* C7 (and its later enactment) is a deviation from what the document
  does in general with regard to basing on syntax-based normalization.
  This can technically be done here, because it builds on registered
  schemes, but that assumes a perfect registry -- and I don't think
  there's anything in the registration process that keeps scheme
  registrants from doing "stupid" things like altering default ports
  over time.

  This came from the determinism goals (distinct CRIs should not yield
  the same URI), but a) that is now described as a goal and not a
  definite property, and b) the URIs produced by CRIs with and without
  ports *are* different under the desired syntax-based normalization.

  It's a good *recommendation* to elide the port, but making it a
  mandate pushes load on tools to keep default ports in the scheme list,
  adds complexity, and that's even before trouble arises from fluent
  scheme specifications. I think the spec would be better off without
  mandatory default port elision.
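
  A rough sketch of what mandatory elision asks of a tool, under the
  assumption that it keeps a per-scheme table (the table, function
  names and URI builder below are illustrative only, not from the
  draft):

```python
from typing import Optional

# Illustrative subset; a real tool would need an entry for every scheme
# it handles, and would have to track any change to those defaults.
DEFAULT_PORTS = {"coap": 5683, "coaps": 5684, "http": 80, "https": 443}


def elide_default_port(scheme: str, port: Optional[int]) -> Optional[int]:
    """Return None when the port equals the scheme's known default."""
    if port is not None and DEFAULT_PORTS.get(scheme) == port:
        return None
    return port


def to_uri(scheme: str, host: str, port: Optional[int], path: str) -> str:
    """Very small URI builder, enough to compare the two forms."""
    hostport = host if port is None else f"{host}:{port}"
    return f"{scheme}://{hostport}{path}"


with_port = to_uri("coap", "example.com", 5683, "/x")
without_port = to_uri("coap", "example.com", None, "/x")
# The two strings differ, which is all syntax-based normalization looks at.
differ = with_port != without_port
# A scheme missing from the table cannot be elided at all.
unknown = elide_default_port("example-scheme", 1234)
```

  Without the mandate, `elide_default_port` and its table become an
  optional optimization rather than something every converter has to
  carry.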

* 2.1: Agree with Thomas on picking up the reader where they are.

  The two examples of where it can legally do more than URIs are
  quite distinct in what they mean to the user: "append a slash and then
  add" is a feature that is useful for constrained devices (no more "but
  we need the trailing slash to make relative references possible"),
  whereas unsetting the query is more of a convenient consequence of the
  algorithm. (Also, that's not really a constraint.)

  The actual constraints are fine contentually, but I think they go 

* Some constraints talk of NFC. That term is only later used as a "MAY"
  for converting from user input -- but then it's not a constraint at
  all.

  (And please don't resolve this into an actual requirement: the
  stability guarantee on normalization[1] only makes promises on
  characters from a given version of Unicode, so a URI-to-CRI converter
  that sees %f0%9f%ab%be%f0%9f%ab%bf may not have that in its table and
  express it as "\u{1fafe}\u{1faff}", while another converter may know
  some more Unicode and would then either convert it to
  "\u{whateverthatcombinesto}" or express it as PET.)
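
  A small sketch of why the outcome depends on the converter's Unicode
  tables, using Python's standard library (the helper name is made up
  for this example). It decodes the percent-encoded bytes above and
  reports which of the resulting code points are unassigned in the
  Unicode version this particular interpreter ships:

```python
import unicodedata
from urllib.parse import unquote


def decode_and_inspect(pct: str):
    """Percent-decode as UTF-8 and list code points unknown to this runtime."""
    text = unquote(pct)
    # "Cn" is the general category reported for unassigned code points.
    unassigned = [
        f"U+{ord(ch):04X}" for ch in text if unicodedata.category(ch) == "Cn"
    ]
    return text, unassigned, unicodedata.unidata_version


text, unassigned, version = decode_and_inspect("%f0%9f%ab%be%f0%9f%ab%bf")
nfc = unicodedata.normalize("NFC", text)
```

  The contents of `unassigned` vary with `version`; the normalization
  stability policy makes no promise about code points that a converter
  built on older tables reports there.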

* The rules for encoding userinfo into a URI are clear ("any character
  [...] that is not in the set of unreserved characters [...] or
  "sub-delims" [...] MUST be percent-encoded"), but are somewhat
  inconsistent with the rules for path: For path, that set is extended
  with characters that can legally appear (":", "@", consistent with the
  extension of pchar) and MUST NOT be percent-encoded (leaving the
  option of using PET). If the same rationale was applied to userinfo
  (where just as with pchar, ":" is allowed) and ":" was mapped to a
  colon, then the deprecated URIs containing user and password could be
  converted (albeit into a semantically opaque colon-separated form),
  and PET could be used to preserve a percent-encoded colon.

  My main motivation here is consistency (a field has an addition to
  unreserved/pct-encoded/sub-delims in its ABNF).

  Also, reading 3986 3.2.1 carefully, it is deprecated to use the format
  "user:password", but not to have a colon in there. (Even the
  recommendation to mask the data after the first colon is not normative).
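
  The difference between the two rules can be sketched as follows. The
  function name and the `keep_colon` switch are hypothetical; the
  sub-delims set is the one from RFC 3986, and Python's `quote` always
  leaves the unreserved characters alone:

```python
from urllib.parse import quote

SUB_DELIMS = "!$&'()*+,;="  # RFC 3986 sub-delims


def encode_userinfo(userinfo: str, keep_colon: bool) -> str:
    """Percent-encode everything outside unreserved / sub-delims.

    With keep_colon=True, ":" is passed through as well, mirroring how
    the path rules extend the set with characters legal in pchar.
    """
    safe = SUB_DELIMS + (":" if keep_colon else "")
    return quote(userinfo, safe=safe)


current_rule = encode_userinfo("user:secret", keep_colon=False)
suggested_rule = encode_userinfo("user:secret", keep_colon=True)
```

  Under the suggested rule a literal colon in the text maps to ":" in
  the URI, and a percent-encoded colon from the original URI would be
  what PET is left to preserve.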

* SP2: All examples that are given (after applying PR [170]) are covered
  by CRI extensions. Those are valuable to drive home that people are
  not generally expected to implement all the extensions, but as there
  are URIs left that even the extensions can't cover, it'd be good to
  list those too.

  In particular, if the userinfo comment above is addressed, I'm not
  aware of anything that remains outside of IPvFuture.

* Now that we have that big number table of schemes and the update to
  7595, why is scheme-name even still in there? (Or conversely, if we
  admit scheme-name, why go through the hassle of registering every
  registered or otherwise known scheme?)

* CDDL: The .feature for userinfo is in the group; does having the
  .feature inside the parentheses really affect the whole line? (As
  someone who doesn't have a CDDL parser built in, it reads like
  it would be `userinfo = (false)` for those who don't have that feature
  on).

* Section 3: This reads like it's up to the server to ensure that any
  resources it offers are also valid CRIs (and if it's not the server
  that issues the CRIs, a SHOULD is already violated). That makes life
  easy, but can just as well be read to discourage using CRIs to talk
  about systems that issued URIs (as the currently deployed HTTP and
  CoAP systems do).

  That "SHOULD" could be qualified with "unless a URI has been created
  that can be converted to a CRI" -- but considering the set of URIs
  that are not mappable, that exception is really the regular case.

(I'll continue in a follow-up starting at section 4, given there is
ongoing activity processing reviews).

BR
Christian


[1]: https://www.unicode.org/policies/stability_policy.html#Normalization
[170]: https://github.com/core-wg/href/pull/107

-- 
To use raw power is to make yourself infinitely vulnerable to greater powers.
  -- Bene Gesserit axiom