Re: #227: Encoding advice for new headers and parameters

Mark Nottingham <mnot@mnot.net> Tue, 04 October 2016 02:03 UTC

> On 29 Sep. 2016, at 11:48 pm, Julian Reschke <julian.reschke@gmx.de> wrote:
> 
> On 2016-09-28 04:29, Mark Nottingham wrote:
>> [ "just me" hat on ]
>> 
>> <https://github.com/httpwg/http-extensions/issues/227>
>> 
>> After some discussion in Berlin and Stockholm, as well as experience with dealing with i18n in parameters for the Link header (see <https://github.com/mnot/I-D/issues/180>), I think we should give more definite advice about when RFC5987(bis) encoding should and should not be used.
>> 
>> In particular, flagging encoding by using a parameter name complicates extension processing (see the issue referenced above), and causes a lot of uncertainty about precedence, etc.
> 
> It complicates processing *slightly*.

I'm not talking about parsing overhead, I'm talking about complication to the model for headers that use it -- as we saw in Link.


> The issue of parameters potentially repeating, and the fact that you need to define what to do in that case, exists either way. It is inherent in any format that supports name/value pairs.

Yes, but RFC5987 encoding complicates it further, because now you have the possibility of multiple parameters in two different encodings, and potentially different rules / precedences for each encoding.
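
To make the precedence question concrete, here is a minimal sketch of a consumer that has to choose one value when both "title" and "title*" show up, possibly more than once. The function name select_param, the prefer_extended flag, and the "first occurrence wins" policy are all assumptions for illustration; they are not taken from any specification or from the Link header work.

```python
from urllib.parse import unquote


def select_param(params, name, prefer_extended=True):
    """Pick one value for `name` from a list of (name, value) pairs.

    Assumed policy (illustrative only): the first occurrence wins within
    each form, and `prefer_extended` decides between `name` and `name*`.
    """
    plain = None
    extended = None
    for key, value in params:
        key = key.lower()
        if key == name and plain is None:
            plain = value
        elif key == name + "*" and extended is None:
            # ext-value layout: charset ' [language] ' pct-encoded-octets
            charset, _language, encoded = value.split("'", 2)
            extended = unquote(encoded, encoding=charset)
    if prefer_extended:
        return extended if extended is not None else plain
    return plain if plain is not None else extended


# Hypothetical, already-tokenised parameters from one header field.
params = [
    ("title", "Euro rates"),
    ("title*", "UTF-8''%e2%82%ac%20rates"),
    ("title", "Other"),
]

print(select_param(params, "title"))
print(select_param(params, "title", prefer_extended=False))
```

Every branch above is a decision that a header definition has to spell out once two encodings of the same parameter can coexist.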


>> I think it would be much simpler and more reliable to advise people minting new HTTP headers to *not* use RFC5987(bis) encoding, but instead advise that they mandate use of an encoding on the field (or a specified portion thereof).
> 
> RFC 5987 defines a way to deal with non-ASCII. It's not pretty, has a slightly bizarre syntax, but at least it's there, and it has been implemented successfully in all widely deployed user agents.

I agree with all of that.
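
For readers who have not looked at the syntax, a rough sketch of the RFC 5987 ext-value form (charset, optional language tag, then percent-encoded octets) might look like the following. The helper names are hypothetical, the encoder always emits UTF-8 with an empty language tag, and error handling is deliberately minimal.

```python
from urllib.parse import quote, unquote

# attr-char punctuation from RFC 5987 that quote() does not already treat as safe
_ATTR_CHAR_EXTRA = "!#$&+^`|"


def encode_ext_value(text: str) -> str:
    """Encode text as an ext-value using UTF-8 and no language tag."""
    return "UTF-8''" + quote(text, safe=_ATTR_CHAR_EXTRA, encoding="utf-8")


def decode_ext_value(ext_value: str) -> str:
    """Decode charset'language'pct-encoded; the language tag is ignored here."""
    charset, _language, encoded = ext_value.split("'", 2)
    return unquote(encoded, encoding=charset, errors="strict")


print(encode_ext_value("\u20ac rates"))
print(decode_ext_value("UTF-8''%e2%82%ac%20rates"))
```

Note that the producer has to opt in by emitting the "name*" form with this preamble, which is the "conscious decision" property raised later in the thread.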


> Defining *another* way to achieve this seems like a bad idea to me (insert XKCD reference here...).

I didn't say we needed to define another one; I just think we should stop promoting this one.


> (And yes, I'm all for working on a new common field syntax, which, as side effect, addresses non-ASCII, but that's a separate discussion)
> 
>> E.g., if the "foo" parameter on the "bar" header field might need to accept non-ASCII content, it MUST be generated with those characters encoded, and MUST be parsed by first decoding that portion of the header.
> 
> ...which essentially *is* the format used by RFC 5987, minus parameter naming and preamble.
> 
> Requiring its use sounds attractive, but I have my doubts that the typical "producer" of field values will get it right; thus we might see "%" characters which are not meant to be percent-escapes in the wild.
> 
> The RFC 5987 format, as ugly as it might be, at least has the property that the producer needs to make a conscious decision to choose the format, and thus hopefully will get it right according to spec.
> 
>> The actual encoding to be used need not be specified, ...
> 
> Here I disagree even more. Telling people not to use a standard format, but *not* to tell them what to use instead is strange.
> 
>> ...but the simplest approach would probably be to use RFC3986 %-encoding over a UTF-8 string.
> 
>> A more aggressive approach would be to also recommend that new parameters on existing fields (even if they specify use of 5987) SHOULD use such encoding.
> 
> -1 to mixing different escaping rules in the same field.

I tend to agree, although in the case of Link, it might be best to go that way, since we can't change title*.
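
As a sketch of the alternative quoted above (RFC 3986 %-encoding over a UTF-8 string, mandated for the whole parameter value), along with the stray "%" risk raised in the reply, something like the following could apply. The function names are hypothetical, and escaping every reserved character (safe="") is an assumption; a real field definition would need to say which characters may appear unescaped.

```python
from urllib.parse import quote, unquote


def encode_param(text: str) -> str:
    """Percent-encode the UTF-8 bytes of text, escaping everything but unreserved chars."""
    return quote(text, safe="", encoding="utf-8")


def decode_param(raw: str) -> str:
    """Decode a value that the field definition says is always percent-encoded."""
    return unquote(raw, encoding="utf-8", errors="strict")


value = "100% \u20ac"
wire = encode_param(value)
print(wire)
print(decode_param(wire) == value)

# A producer that skips the encoding step and sends the literal text "%41"
# is silently misread, because the consumer always decodes.
print(decode_param("%41"))
print(encode_param("%41"))
```

Since there is no "name*" marker to signal that decoding applies, correctness depends entirely on every producer encoding "%" itself.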


>> Thoughts? I'm not going to lie down in the road for this, in that I suspect that most people will gravitate towards this kind of solution naturally, rather than use 5987, but it'd be nice to put clear advice out there.
> 
> I'm opposed to discouraging use of RFC 5987 encoding until we have something better to recommend (and that includes a specification for it).

Fair enough. I look forward to bisbis, when we get to do that.

<chair hat> it sounds like we can close this issue with no action -- anyone else have thoughts?


> I'll also point out that in the meantime, at least one more specification uses this format (<https://tools.ietf.org/html/draft-ietf-httpauth-extension-09#section-4>), so if you are serious about discouraging its use, you really should comment on that spec right now.

*sigh* Another mailing list.

Thanks,



--
Mark Nottingham   https://www.mnot.net/