Re: [T2TRG] draft-hartke-t2trg-ciri-00 review

Klaus Hartke <hartke@projectcool.de> Tue, 22 January 2019 12:02 UTC

References: <58aa0ae4-b3fe-abf7-9bda-4908ef0b3fd7@ericsson.com>
In-Reply-To: <58aa0ae4-b3fe-abf7-9bda-4908ef0b3fd7@ericsson.com>
From: Klaus Hartke <hartke@projectcool.de>
Date: Tue, 22 Jan 2019 13:01:47 +0100
X-Gmail-Original-Message-ID: <CAAzbHvZ4FyKErmS8i6_bWPgvXUoZbc+ZX28n1WsMa1tWGbvkzg@mail.gmail.com>
Message-ID: <CAAzbHvZ4FyKErmS8i6_bWPgvXUoZbc+ZX28n1WsMa1tWGbvkzg@mail.gmail.com>
To: Ari Keränen <ari.keranen@ericsson.com>
Cc: "draft-hartke-t2trg-ciri@ietf.org" <draft-hartke-t2trg-ciri@ietf.org>, "T2TRG@irtf.org" <T2TRG@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/t2trg/7RIEEHql5Jm70TAzlOuQeXAwNfc>
Subject: Re: [T2TRG] draft-hartke-t2trg-ciri-00 review

Hi Ari,

thanks a lot for your review!

Ari Keränen wrote:
> 2.1.  Options
>
>     host.ip
>       Specifies the host of the IRI authority as an IPv4 address
>       (4 bytes) or an IPv6 address (16 bytes).
>
> Do we need endianness considerations here? Or does CBOR take care of that?

Is there any protocol or format that does not encode 192.0.2.42 as
h'C000022A' or [2001:db8::42] as h'20010DB8000000000000000000000042'?
I always thought that IP addresses do not have an endianness.
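
As a quick check of the byte strings above, here is a minimal sketch
using Python's standard "ipaddress" module. The helper name
host_ip_bytes is made up for illustration and is not from the draft.

```python
import ipaddress


def host_ip_bytes(address: str) -> bytes:
    """Return the packed form of an IP address: 4 bytes for IPv4,
    16 bytes for IPv6, most significant byte first."""
    return ipaddress.ip_address(address).packed


# Print the hex form of the two example addresses.
print(host_ip_bytes("192.0.2.42").hex().upper())
print(host_ip_bytes("2001:db8::42").hex().upper())
```

The packed form is the same on every platform, which is why no
separate endianness rule seems necessary for "host.ip".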

> 2.2.  Option Sequences
>
>      A sequence of options is considered _well-formed_ if:
>
> Do we need the _emphasis_ here? Maybe in quotes since it's a defined
> term? Unless we want to follow the CDDL convention -- which perhaps
> makes sense. But then it's probably good to have a note along the lines
> of "New terms are introduced in _cursive_." in the terminology section.

Yes, the idea was to write terms defined by the document in cursive
where they are introduced. I'll add a note to the terminology section.

>      o  a "host.ip" option is followed by a "port" option;
>
> Why is the port option mandatory? No default ports allowed?

This relates to a tricky question regarding design goals.

As defined by RFC 3986, URIs can often be written in a number of ways:

* Schemes are case-insensitive, so they can be written in upper- or lowercase.
* Registered names are case-insensitive as well.
* Hex digits in percent-encodings are case-insensitive as well.
* Schemes can define scheme-specific rules. E.g., in the case of "coap":
    * The port 5683 and an omitted port are equivalent.
    * The path "" and "/" are equivalent.
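
To illustrate what these equivalences mean for a consumer, here is a
rough sketch of normalizing "coap" URIs before comparison, using
Python's urllib.parse. The function name normalize_coap_uri is an
assumption for illustration, and the sketch deliberately ignores
percent-encoding case and other details of RFC 3986.

```python
from urllib.parse import urlsplit

COAP_DEFAULT_PORT = 5683  # default port of the "coap" scheme


def normalize_coap_uri(uri: str) -> tuple:
    """Reduce a coap URI to a comparable tuple (illustrative only)."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()              # schemes are case-insensitive
    host = (parts.hostname or "").lower()      # so are registered names
    # An omitted port and the default port are equivalent.
    port = parts.port if parts.port is not None else COAP_DEFAULT_PORT
    # The paths "" and "/" are equivalent.
    path = parts.path or "/"
    return (scheme, host, port, path, parts.query)


same = (normalize_coap_uri("COAP://Example.com")
        == normalize_coap_uri("coap://example.com:5683/"))
print(same)
```

Note that the sketch only works because the default port of "coap" is
hard-coded; for a scheme the implementation does not recognize, that
value would be missing.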

The primary function of CIRI References is to dereference them (where
RFC 3986 defines "dereferencing a URI" as using an access mechanism
determined by "URI resolution" to perform an action on the URI's
resource). For that, a client only needs to compare URI schemes, pass
registered names to a name resolution service, and know the default
ports of the protocols it actually implements.

The question is whether we also want, as a secondary goal, to be able
to compare CIRIs. Because of all these normalization rules, comparison
is normally quite tricky to do for all URI schemes (in particular if a
URI scheme has scheme-specific rules and is not recognized by an
implementation). However, if we restrict CIRIs to follow the
normalization rules of CoAP, then we can easily support any URI scheme
with the same normalization rules (such as HTTP). The only thing we
need to know to successfully compare two CIRIs with an unrecognized
scheme is the default port.

So, by always including the port, at the cost of two bytes in a CIRI
with a default port, we can compare any supported CIRI even when the
scheme is unrecognized. Is it worth it? In my opinion, yes, if we want
to be able to compare CIRIs. Whether we want that, I don't have a firm
opinion on yet. What do you think?
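
To make the trade-off concrete, here is a small sketch. It assumes a
CIRI is modelled as a Python list of (option number, value) pairs, with
the option numbers taken from the CDDL quoted below; the function names
and the table of known default ports are made up for illustration.

```python
SCHEME, HOST_NAME, HOST_IP, PORT, PATH_TYPE, PATH, QUERY, FRAGMENT = range(1, 9)

# Default ports this hypothetical implementation happens to know.
KNOWN_DEFAULT_PORTS = {"coap": 5683, "http": 80}


def equal_with_mandatory_port(a, b) -> bool:
    """With an explicit port in every CIRI, comparison is plain equality."""
    return list(a) == list(b)


def with_port(ciri):
    """Fill in an omitted port; return None if the default is unknown."""
    opts = list(ciri)
    if any(num == PORT for num, _ in opts):
        return opts
    host_index = next((i for i, (num, _) in enumerate(opts)
                       if num in (HOST_NAME, HOST_IP)), None)
    if host_index is None:
        return opts  # no authority, so there is no port to fill in
    scheme = next((val for num, val in opts if num == SCHEME), None)
    default = KNOWN_DEFAULT_PORTS.get(scheme)
    if default is None:
        return None
    opts.insert(host_index + 1, (PORT, default))
    return opts


def equal_with_optional_port(a, b):
    """Return True or False, or None if a port had to be filled in
    for an unrecognized scheme."""
    if list(a) == list(b):
        return True
    filled_a, filled_b = with_port(a), with_port(b)
    if filled_a is None or filled_b is None:
        return None
    return filled_a == filled_b


explicit = [(SCHEME, "foo"), (HOST_NAME, "example.com"), (PORT, 1234), (PATH, "x")]
implicit = [(SCHEME, "foo"), (HOST_NAME, "example.com"), (PATH, "x")]
print(equal_with_mandatory_port(explicit, explicit))
print(equal_with_optional_port(explicit, implicit))
```

With a mandatory port, the first function is all a client needs, even
for the unrecognized scheme "foo". With an optional port, the second
comparison cannot be decided without knowing the default port of "foo".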

> 3.  CBOR
>
>        ciri = [?(scheme:    1, text),
>                ?(host.name: 2, text //
>                  host.ip:   3, bytes .size 4 / bytes .size 16),
>                ?(port:      4, uint .size 2),
>                ?(path.type: 5, path-type),
>                *(path:      6, text),
>                *(query:     7, text),
>                ?(fragment:  8, text)]
>
> Isn't "text" too permissive type for most (all?) of the components?

RFC 3986 actually makes no restrictions at all on what characters can
be used in registered names, path segments, queries and fragment
identifiers. (The only requirement is that some of them need to be
percent-encoded.) So, as I understand it, "text" (a.k.a. "tstr") is
the right match here.

The scheme is restricted with a regex in -00:

      ciri = [?(scheme:    1, text .regexp "[A-Za-z][A-Za-z0-9+.-]*"),
              ...
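
For illustration, the same restriction can be checked in Python roughly
as follows. This is only a sketch: the helper name is_valid_scheme is
made up, and re.fullmatch is used on the assumption that the CDDL
.regexp control has to match the whole string.

```python
import re

# Same pattern as in the CDDL above.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def is_valid_scheme(scheme: str) -> bool:
    """True if the whole string matches the scheme pattern."""
    return SCHEME_RE.fullmatch(scheme) is not None


for candidate in ("coap", "coap+tcp", "1http", ""):
    print(repr(candidate), is_valid_scheme(candidate))
```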

Klaus