[T2TRG] IRIs in CoRAL (was: draft-hartke-t2trg-ciri-00 review)
Klaus Hartke <hartke@projectcool.de> Mon, 04 February 2019 11:46 UTC
References: <58aa0ae4-b3fe-abf7-9bda-4908ef0b3fd7@ericsson.com> <CY4PR21MB0168C83AF295761F73FCDF7FA39F0@CY4PR21MB0168.namprd21.prod.outlook.com> <A0D234F0-51D8-4543-9344-43999C304D73@tzi.org> <CY4PR21MB016884C73B7F842FFF5A53C1A39F0@CY4PR21MB0168.namprd21.prod.outlook.com> <CAAzbHva=YjK5j=W9aFDikYrLLJQ+pDcRy2HV71e0JbyHu_1BBw@mail.gmail.com> <CY4PR21MB0168CEDC3F1EB41FCD21AD28A39F0@CY4PR21MB0168.namprd21.prod.outlook.com>
In-Reply-To: <CY4PR21MB0168CEDC3F1EB41FCD21AD28A39F0@CY4PR21MB0168.namprd21.prod.outlook.com>
From: Klaus Hartke <hartke@projectcool.de>
Date: Mon, 04 Feb 2019 12:46:07 +0100
X-Gmail-Original-Message-ID: <CAAzbHvbUvoqGrAoR_MOkMb_89U-4dQZQusqA+qCQabQX-N-yeA@mail.gmail.com>
Message-ID: <CAAzbHvbUvoqGrAoR_MOkMb_89U-4dQZQusqA+qCQabQX-N-yeA@mail.gmail.com>
To: Dave Thaler <dthaler@microsoft.com>
Cc: "T2TRG@irtf.org" <T2TRG@irtf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/t2trg/FyLCuSRKpXrttcsDFGHHwYlM7ko>
Subject: [T2TRG] IRIs in CoRAL (was: draft-hartke-t2trg-ciri-00 review)

> And I also agree that draft-hartke-t2trg-coral-06 likely has the same
> issues because it uses IRIs instead of URIs.

Some further thoughts:

* In CoAP, the request URI is transported as a sequence of CoAP
  options that contain the different parts of a URI without
  percent-encoding. For example, the URI
  <http://example.com/city/Montr%C3%A9al> in a request to
  (http, example.com, 80) would be encoded as a Uri-Path option
  containing utf8(decode-percent-encodings("city")) = h'63697479'
  followed by a Uri-Path option containing
  utf8(decode-percent-encodings("Montr%C3%A9al")) =
  h'4d6f6e7472c3a9616c'.

  CoAP does not require any Unicode normalization be performed, so if
  a client happens to make a request with a Uri-Path option with
  utf8(nfd("Montréal")) = h'4d6f6e7472_65cc81_616c' where the server
  expects a Uri-Path option with utf8(nfc("Montréal")) =
  h'4d6f6e7472_c3a9_616c' (or vice versa), then the client will get a
  4.04 Not Found error.

  CoAP defines a conversion from CoAP options to URIs (and vice
  versa). This conversion is purely syntactic, so a Uri-Path option
  with h'4d6f6e7472_65cc81_616c' in the request URI would become
  <http://example.com/city/Montre%CC%81al> (the "e" is an unreserved
  character and is therefore not percent-encoded).

  CoRAL, in the binary format, does exactly the same for link targets
  (except that the conversion of CIRI options is currently defined to
  be to IRIs, which I'll replace with URIs in the upcoming
  draft-hartke-t2trg-ciri-01).

* In Web Linking [RFC8288], the context and the target of a link are
  IRIs. However, these are serialized as URIs on the wire in the
  "Link" header field.

  CoRAL, in the binary format, does exactly the same for link targets
  (except that it uses CBOR instead of ASCII characters to delimit the
  URI components on the wire).

* In RDF, concepts are named with globally unique Unicode strings. To
  make the minting of these strings painless, they are restricted to
  the syntax of an IRI.
  These IRIs are used purely as identity tokens (in RFC3987 lingo) and
  are therefore compared character-by-character. RDF recommends [1]
  that these IRIs avoid non-normalized forms such as uppercase
  characters in scheme names, an explicitly stated HTTP default port,
  percent-encoding of characters where it is not required by IRI
  syntax, and IRIs that are not in NFC. So for example the concept
  identified by the IRI <http://example.com/city/Montréal> is not the
  same as the concept identified by the IRI
  <http://example.com/city/Montréal> if one of them isn't in NFC.

  CoRAL, in the binary format, does exactly the same for link relation
  types.

Now, in M2M communication, I think we avoid all problems with
normalization etc.:

Servers authoritatively manage the namespace of their resources. If a
client asks a server "Hey, server, what resources do you have?" and
the server responds with "I have a resource at [6, h'63697479', 6,
h'4d6f6e7472c3a9616c'].", then the client can simply copy those bytes
into its next request without ever decoding them. As long as the
server accepts its own output as input, everything works.

When a client compares link relation types to locally stored strings,
it can use byte-for-byte comparison (as suggested by RFC3987) as long
as both the server and the client store the link relation types
exactly as they are defined.

The only issue left is when human users input IRIs as link relation
types or as link targets. In CoRAL, this happens in the textual
format.

Interestingly, Turtle [2] doesn't seem to perform any kind of input
normalization, so human users are expected to write in perfect NFC
when they identify a concept by the IRI. (Maybe I'm missing
something?) If true, this seems like a bad user experience. However, I
think it would be a bad user experience as well if one has to write
<%E3%81%93%E3%82%93%E3%81%AB%E3%81%A1%E3%81%AF> instead of
<こんにちは>.

I would prefer to not invent anything new here and to just follow the
consensus if possible.
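
For anyone who wants to reproduce the byte values discussed in this
message, here is a minimal Python sketch. It is only an illustration
under stated assumptions: the helper names uri_path_options and
options_to_path are hypothetical (they are not part of CoAP, CoRAL or
any library), and the set of characters left unescaped only
approximates the option-to-URI conversion rules in RFC 7252.

```python
import unicodedata
from urllib.parse import quote, unquote_to_bytes

# Assumption: besides "unreserved" characters, a path segment keeps
# sub-delims, ":" and "@" unescaped (approximating RFC 7252, Section 6.5).
SEGMENT_SAFE = "!$&'()*+,;=:@"


def uri_path_options(path: str) -> list[bytes]:
    """Hypothetical helper: one percent-decoded value per path segment."""
    return [unquote_to_bytes(seg) for seg in path.lstrip("/").split("/")]


def options_to_path(options: list[bytes]) -> str:
    """Hypothetical helper: purely syntactic reverse conversion.

    No Unicode normalization is applied; bytes are only percent-encoded.
    """
    return "".join("/" + quote(opt, safe=SEGMENT_SAFE) for opt in options)


# The same visible string in the two normalization forms, as UTF-8 bytes.
nfc = unicodedata.normalize("NFC", "Montr\u00e9al").encode("utf-8")
nfd = unicodedata.normalize("NFD", "Montr\u00e9al").encode("utf-8")

print(uri_path_options("/city/Montr%C3%A9al")[1].hex())  # 4d6f6e7472c3a9616c
print(nfd.hex())                                         # 4d6f6e747265cc81616c
print(nfc == nfd)                                        # False: byte comparison fails
# "e" is unreserved and stays literal; only the combining accent is escaped.
print(options_to_path([b"city", nfd]))                   # /city/Montre%CC%81al
# What a human would have to type if the textual format allowed URIs only.
print(quote("こんにちは"))
```

A server that stores the NFC bytes and compares byte-for-byte would not
match the NFD request above, which is the 4.04 case; a client that only
copies the server's bytes back never hits it.
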

Klaus

[1] https://www.w3.org/TR/rdf11-concepts/#note-iris
[2] https://www.w3.org/TR/turtle/