[core] Review of CIRIs (was: Review of CoRAL)

Klaus Hartke <hartke@projectcool.de> Tue, 22 January 2019 14:54 UTC

References: <20181031164534.GA4995@hephaistos.amsuess.com>
In-Reply-To: <20181031164534.GA4995@hephaistos.amsuess.com>
From: Klaus Hartke <hartke@projectcool.de>
Date: Tue, 22 Jan 2019 15:53:59 +0100
Message-ID: <CAAzbHvaBZLDyU2nZ726ZXDddDqq8Fg0ObyPAAx3aQOhH4yuDfw@mail.gmail.com>
To: =?UTF-8?Q?Christian_Ams=C3=BCss?= <christian@amsuess.com>
Cc: T2TRG@irtf.org, draft-hartke-t2trg-coral@ietf.org, "core@ietf.org WG" <core@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/ITSAbkgxMhlgzwn-20Jtk6mfN04>
Subject: [core] Review of CIRIs (was: Review of CoRAL)

Hi Christian,

thanks again for your review. Here's the next batch of replies, this
time related to CIRIs.

Christian Amsüss wrote:
> * Appendix C: Do I understand correctly that there are some CBOR encoded
>   relative references that can not (even when knowing the rel in
>   append-relation) be expressed in a relative-ref? In particular, those
>   would be the append-* types. If so, it would be prudent to point out
>   right next to where it says that "CBOR-encoded IRI References are not
>   capable of expressing all IRI references".

In principle, it is always possible to use IRIs everywhere.
Relative references are more like a compression mechanism that allows
us to write IRIs more efficiently when they share a common base. In
that sense, there are no CIRI references that cannot be expressed by
an IRI reference:

    ciri_ref_to_iri_ref(ciri, base_ciri) = make_iri_relative(
                    ciri_to_iri(resolve_ciri(ciri, base_ciri)),
                    ciri_to_iri(base_ciri))

> * Most of the URI resolution gets away without string comparisons. I'd
>   like to suggest making "../" an explicit URI option (say, a `*(parent:
>   5.5)` that'd go between path.type and path. References with
>   dot-segments inside them are probably safe to put in the "not capable
>   of expressing all" category.

Good idea. My proposal would be to use the path.type option for this:

   path.type
      Specifies the type of the URI path for reference resolution.  The
      option value is an integer in the range 0 to 127, named as
      follows:

         0 - absolute-path
         1 - append-relation
         2 - append-path
         3 - relative-path
         4 - relative-path-1up
         5 - relative-path-2up
         6 - relative-path-3up
         7 - relative-path-4up
         ...

   path
      Specifies one segment of the URI path.  The option value can be
      any Unicode string with the exception of the strings "." and "..".
      This option can occur more than once.
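
To make the proposal concrete, here is a minimal sketch of how a path
could be resolved against a base path under these path.type values. It
is an illustration, not draft text. The function name resolve_path and
the list-of-segments representation are hypothetical. It assumes that
relative-path drops the last base segment (as the merge step in RFC 3986
does), that each "-Nup" variant drops N further segments, and that
append-path keeps the whole base path. append-relation is left out
because its resolution rule is not restated in this mail.

```python
ABSOLUTE_PATH = 0
APPEND_RELATION = 1
APPEND_PATH = 2
RELATIVE_PATH = 3  # 4, 5, 6, ... are read as relative-path-1up, -2up, -3up, ...


def resolve_path(base_segments, path_type, segments):
    """Resolve path segments against a base path (both lists of strings).

    Assumed semantics, see the accompanying text; not taken from the draft.
    """
    if not 0 <= path_type <= 127:
        raise ValueError("path.type must be in the range 0 to 127")
    for segment in segments:
        # The proposal excludes "." and ".." as path option values.
        if segment in (".", ".."):
            raise ValueError("dot-segments are not valid path option values")

    if path_type == ABSOLUTE_PATH:
        return list(segments)
    if path_type == APPEND_RELATION:
        # Deliberately not modelled: the rule is not given in this thread.
        raise NotImplementedError("append-relation is outside this sketch")
    if path_type == APPEND_PATH:
        return list(base_segments) + list(segments)

    # relative-path and relative-path-Nup: drop the last base segment,
    # then one more segment per level up, never going above the root.
    levels_up = path_type - RELATIVE_PATH
    keep = max(len(base_segments) - 1 - levels_up, 0)
    return list(base_segments[:keep]) + list(segments)
```

With a base path of /a/b/c, path.type 3 with the segment "x" would give
/a/b/x under these assumptions, and path.type 4 would give /a/x, which
matches what "../x" resolves to in RFC 3986.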

>   As I understand URI resolution, we don't need a "./" because that only
>   needs to be explicit when starting with something that has a colon in
>   it that's not the scheme's colon, otherwise "./x/y" is always
>   equivalent to "x/y".

Good point. Removed with the above proposal.
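
The equivalence Christian describes can be checked on the URI side with
Python's standard urllib.parse.urljoin. This is only a sketch; the base
URI and the variable names are made up for illustration.

```python
from urllib.parse import urljoin

BASE = "http://example.com/a/b"  # hypothetical base URI

# Without a colon in the first segment, the leading "./" is redundant:
# both references resolve to the same target.
plain = urljoin(BASE, "x/y")
dotted = urljoin(BASE, "./x/y")

# With a colon in the first segment, the "./" prefix is what keeps the
# reference from being parsed as a URI with scheme "x".
colon_plain = urljoin(BASE, "x:y")
colon_dotted = urljoin(BASE, "./x:y")

print(plain == dotted)
print(colon_plain, colon_dotted)
```

Since a path option carries each segment as a separate value, the
ambiguity with the scheme colon does not arise in the option sequence,
which is why no "./" equivalent is needed there.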

> * Speaking of expressiveness of the encoded URIs: (Especially as urn: is
>   used in the document,) would it hurt to grow the encoded references
>   such that they can encode urns? It might be sufficient to allow scheme
>   without a host/port but a path.

Makes sense. It would be good to know exactly which set of URIs we want
to support and what the scheme-specific normalization rules are for
these (see also the discussion on comparisons in [1]).

>   Apropos port: Why must this be present after a host name? This
>   complicates round-tripping to URIs (because the algorithm has to know
>   to fill in default ports of all protocols), limits applicability to
>   portless protocols and adds some bytes -- and in the embedded
>   implementation, adding a default port can be as easy as just setting
>   it at initialization time before parsing.

Ari also brought this up in his review. See the discussion in [1].

> * Appendix C: A state transition diagram like the ones on
>   http://json.org/ could be as expressive as the list in C.3 while being
>   easier to grasp.

Added in -00.

> * How are compressed URIs expressed in CBOR? Flat [6, "foo", 6, "bar"]
>   or nested [[6, "foo"], [6, "bar"]]? My first reading (and current
>   implementation) resulted in "nested", supported by the `(option,
>   value) = href[0]` line in C4, but reading C.1 with the CDDL spec next
>   to it (and running the cddl tool) rather indicates "flat" to me.

Flat [6, "foo", 6, "bar"]. I'm adding a clarification and some
examples in the next draft revision.
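
Until then, a minimal sketch of reading the flat form: the decoded CBOR
array is taken as a plain Python list and walked two items at a time.
The helper name iter_options is hypothetical, and the option number 6 is
simply the one from the example above.

```python
def iter_options(href):
    """Yield (option, value) pairs from a flat [opt, val, opt, val, ...] list."""
    if len(href) % 2 != 0:
        raise ValueError("flat option sequence must have an even length")
    for i in range(0, len(href), 2):
        yield href[i], href[i + 1]


flat = [6, "foo", 6, "bar"]
pairs = list(iter_options(flat))
print(pairs)
```

In the nested form, by contrast, each array element would already be an
(option, value) pair, which is what the `(option, value) = href[0]` line
suggested.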

>   Either way, some examples would be great.

Agreed. Working on it.

> * Nit: calling a sequence of options _absolute_ brings up the
>   association of the "absolute URI", which is not the same as "URI"
>   (which is by definition not a reference, but sometimes it helps to say
>   "full URI"). An Absolute URI would not have a fragment identifier (and
>   is defined to ease defining protocols where the fragment is never
>   sent), whereas the non-relative or full option sequence described here
>   can easily (and should be able to) contain a fragment identifier.

I see the potential confusion, but I don't have a good idea how to
phrase it in a better way. Any suggestions?

Klaus

[1] https://mailarchive.ietf.org/arch/msg/t2trg/7RIEEHql5Jm70TAzlOuQeXAwNfc