Re: [art] Artart last call review of draft-ietf-core-links-json-07

Carsten Bormann <cabo@tzi.org> Tue, 25 April 2017 21:26 UTC

From: Carsten Bormann <cabo@tzi.org>
In-Reply-To: <A43ECEE0-47C8-485C-A9AC-E7890B0A6AA4@gbiv.com>
Date: Tue, 25 Apr 2017 23:25:48 +0200
Cc: IETF <ietf@ietf.org>, Julian Reschke <julian.reschke@gmx.de>, art@ietf.org, Herbert Van de Sompel <hvdsomp@gmail.com>, "core@ietf.org WG" <core@ietf.org>, Erik Wilde <erik.wilde@dret.net>, draft-ietf-core-links-json.all@ietf.org
Message-Id: <26C26E7B-24E1-4982-B3D8-9991AA1CC6DF@tzi.org>
References: <149188258769.15738.17473942496982365590@ietfa.amsl.com> <A12A8CB3-F756-4790-806A-A67AA8CE1D78@tzi.org> <CAOywMHdqitw-uN09p11j2xkBK6TO8y3wjAWipK7vhqbTWp0T1w@mail.gmail.com> <a2350664-05a7-8909-4cf4-5b765e09f9e7@dret.net> <027F2C41-E498-4801-86E2-047771E10545@tzi.org> <4cd01462-2a0f-803e-df10-e68b3eed0226@dret.net> <B04F33DD-51C1-4545-AD59-2F1A3AF14FF6@tzi.org> <feee7d84-263a-49e4-d95e-09ab8526b703@dret.net> <CAOywMHfJpYB6u7BFVf10Gf=Nxk0E1h5iEvyVX5VeAW0UKQOSzQ@mail.gmail.com> <5EB045F7-09FA-4EE8-844A-5AC0E3BF5C1E@tzi.org> <f1b9f42f-559d-d146-e355-c3e2ba31cb01@gmx.de> <23DDC7F2-D46F-4C19-AEA8-C71187099414@tzi.org> <A43ECEE0-47C8-485C-A9AC-E7890B0A6AA4@gbiv.com>
To: "Roy T. Fielding" <fielding@gbiv.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/UogIo8QgKzJ2dkcQeAxjV6HyajU>

>> RFC 6690 says:
>> 
>>  In
>>  order to convert an HTTP Link Header field to this link format, first
>>  the "Link:" HTTP header is removed, any linear whitespace (LWS) is
>>  removed, the header value is converted to UTF-8, and any percent-
>>  encodings are decoded.
> 
> Well, that's broken.

OK, let me start typing that errata report then.

>> coap://example.com?stupid%3Dkey=4711
>> 
>> is not distinguishable from
>> 
>> coap://example.com?stupid=key=4711
>> 
>> (The typical reaction of an implementer is “then don’t do that!” [1,2].)
> 
> That isn't a "limitation".

For RFC 6690 users, it pretty much is, because certain URIs don’t work.
They tend to design their URIs so that they do work, probably more because these designs come naturally to them than because they are fully aware of that limitation.
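
To make the collision concrete, here is a minimal Python sketch. It is not taken from RFC 6690 or from any implementation: the helper names are made up for illustration, and treating the query as '&'-separated key=value pairs is an assumption about the application, not something the URI syntax mandates. It compares decoding the whole URI before parsing (the order the cited RFC 6690 passage describes) with parsing first and decoding each part afterwards.

```python
from urllib.parse import unquote, urlsplit


def split_pairs(query):
    """Split a query into (key, value) pairs at '&' and at the first '='.

    Assumption for this sketch: the application uses key=value queries.
    """
    return [tuple(item.partition("=")[::2]) for item in query.split("&") if item]


def decode_first(uri):
    """Percent-decode the whole URI, then parse it."""
    return split_pairs(urlsplit(unquote(uri)).query)


def split_first(uri):
    """Parse the URI first, then percent-decode each key and value."""
    return [(unquote(k), unquote(v)) for k, v in split_pairs(urlsplit(uri).query)]


a = "coap://example.com?stupid%3Dkey=4711"
b = "coap://example.com?stupid=key=4711"

# Decoding first erases the difference between the two URIs.
print(decode_first(a) == decode_first(b))  # True

# Splitting first keeps them apart.
print(split_first(a))  # [('stupid=key', '4711')]
print(split_first(b))  # [('stupid', 'key=4711')]
```

With decode-first processing, the key "stupid=key" can no longer be expressed at all, which is the limitation discussed here.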

> It's a bug to decode pct-encoded octets in
> a URI before decomposing the reference into its parts.  

Well, percent-encoding plays two roles in RFC 3986: hiding characters within syntactic elements from their delimiter roles, and encoding non-ASCII (and C0 etc.) characters.
The passage I cited from RFC 6690 neatly got rid of the latter, and broke the former (*).
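
The two roles can be separated mechanically. The following Python sketch is only an illustration under stated assumptions, not a proposal from this thread and not the RFC 6690 procedure: the function name is hypothetical, and the rule it applies (decode percent-encoded runs that form valid non-ASCII UTF-8, keep every percent-encoded ASCII octet escaped) is one possible way to drop the second role without touching the first.

```python
import re

# One or more consecutive %XX triplets.
_PCT_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def decode_non_ascii_only(uri):
    """Decode percent-encoded non-ASCII text; keep ASCII octets escaped.

    Hypothetical helper: ASCII characters (delimiters, C0 controls, ...)
    stay percent-encoded, so hidden delimiters remain hidden.
    """

    def fix(match):
        raw = bytes.fromhex(match.group(0).replace("%", ""))
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            return match.group(0)  # not valid UTF-8: leave the run untouched
        # Re-escape ASCII characters (hex digits come out in upper case).
        return "".join(
            ch if ord(ch) > 0x7F else "%{:02X}".format(ord(ch)) for ch in text
        )

    return _PCT_RUN.sub(fix, uri)


print(decode_non_ascii_only("coap://example.com?gr%C3%BC%C3%9Fe%3Dkey=4711"))
# coap://example.com?grüße%3Dkey=4711
```

The result is no longer a pure-ASCII URI, so a scheme like this only makes sense where the carrying format is defined to hold UTF-8 text.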

> ASCII is already
> in UTF-8.  Decoding a pct-encoding doesn't make it "more UTF-8"; it just
> means the string is no longer a URI reference.  That's broken.  So utterly
> broken that it obviously wasn't reviewed by the right people.

So what should I write into the errata report?

Or more generally speaking, how should we fix RFC 6690, without creating a need for constrained nodes to do full URI processing?

Maybe it is sufficient to document the limitation in the errata, for now?

And, more to the point of the subject line, how should we handle this on the JSON/CBOR level?

There will definitely be a round-tripping problem with RFC 6690 if the URIs run into the above limitation.  But that’s OK, because that limitation defines the subset.

More generally, not doing any percent-decoding of URIs when creating JSON/CBOR from scratch is probably the easy way, but it adds complexity when we want to phase out RFC 6690 on the constrained level by replacing it with JSON/CBOR.  Horribile dictu, but maybe IRIs are the right thing to do here.

Grüße, Carsten

(*) It may be worth pointing out that the amount of breakage here is much larger than for CoAP itself, which does the percent-decoding only after decomposing a URI into what CoAP considers to be its components, so the URI parsing works properly: coap://example.com/foo%2fbar has one path segment, “foo/bar”.
But the application semantics of hiding application delimiters, which my example above breaks, are not supported in CoAP either.
Some people think that URIs should be carried around in that decomposed form throughout the constrained space, and I can’t blame them.
I don’t have data on how many URI libraries in active use in the non-constrained space get this particular detail right, either.
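
The path example in this footnote can be reproduced with a short Python sketch. It is a simplification for illustration only: the function names are hypothetical, and the actual CoAP decomposition into options involves more steps than splitting the path at "/".

```python
from urllib.parse import unquote, urlsplit


def segments_split_first(uri):
    """Decompose the path into segments, then percent-decode each segment."""
    path = urlsplit(uri).path
    return [unquote(segment) for segment in path.split("/")[1:]]


def segments_decode_first(uri):
    """Percent-decode the whole URI, then decompose the path into segments."""
    path = urlsplit(unquote(uri)).path
    return path.split("/")[1:]


uri = "coap://example.com/foo%2fbar"
print(segments_split_first(uri))   # ['foo/bar']
print(segments_decode_first(uri))  # ['foo', 'bar']
```

Decomposing first yields the single segment “foo/bar”; decoding first turns the hidden slash into a real delimiter and yields two segments.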
