Re: #227: Encoding advice for new headers and parameters

Willy Tarreau <w@1wt.eu> Sat, 01 October 2016 06:32 UTC

Date: Sat, 1 Oct 2016 08:26:32 +0200
From: Willy Tarreau <w@1wt.eu>
To: Julian Reschke <julian.reschke@gmx.de>
Cc: Mark Nottingham <mnot@mnot.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <20161001062632.GB31660@1wt.eu>
References: <0168B53E-A4CB-41BA-B371-7499837A327E@mnot.net> <be65e9b9-2211-314d-8f85-2db0cc87e2eb@gmx.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <be65e9b9-2211-314d-8f85-2db0cc87e2eb@gmx.de>
User-Agent: Mutt/1.6.0 (2016-04-01)
Subject: Re: #227: Encoding advice for new headers and parameters
Archived-At: <http://www.w3.org/mid/20161001062632.GB31660@1wt.eu>

On Thu, Sep 29, 2016 at 03:48:29PM +0200, Julian Reschke wrote:
> > A more aggressive approach would be to also recommend that new parameters on existing fields (even if they specify use of 5987) SHOULD use such encoding.
> 
> -1 to mixing different escaping rules in the same field.

That reminds me that there are some applications where you can specify a
field name for something, and the value will be passed there. What
applications do in practice is emit "$name: $value" without checking
*anything* about $name (e.g. content-length will work), applying the
same encoding to $value whatever the name is. So indeed, having possibly
confusing encoding rules is a bit tricky. BTW, I used to work on a project
where we had two different encodings depending on the field name, and one
or two fields that had to carry the same value for legacy reasons, but with
a different encoding, and several times we found code like this:

           value = legacy_header_encode(value);
           ...
           req_ptr += sprintf(req_ptr, "legacy_name: %s\n", value);
           ...
           req_ptr += sprintf(req_ptr, "new_name: %s\n", value);

Here new_name should have been encoded with new_header_encode(). And that's
not to mention the number of times "value" is passed unencoded, possibly
carrying escape characters... Of course it seems obvious in the snippet
above, but when reading such code in context it is much less so. Thus it's
important to be very careful about this.
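
One way to make this class of mistake harder to write is to tie the encoder
to the field definition, so that callers only ever hand over the raw value.
The following is a minimal sketch under stated assumptions, not the code from
that project: struct field_def and emit_field() are hypothetical names, and
the two encoder bodies are stand-ins (backslash escaping for the legacy
field, percent-encoding for the new one), since the real encodings are not
described here.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoder signature: writes the encoded form of `in` into
 * `out`; returns 0 on success, -1 on rejection or if `out` is too small. */
typedef int (*encode_fn)(const char *in, char *out, size_t outlen);

/* Stand-in for legacy_header_encode(): backslash-escapes '"' and '\\',
 * and refuses control characters such as CR and LF. */
static int legacy_header_encode(const char *in, char *out, size_t outlen)
{
    size_t o = 0;
    if (outlen == 0)
        return -1;
    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        if (c < 0x20 || c == 0x7f)
            return -1;
        if (c == '"' || c == '\\') {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = '\\';
        }
        if (o + 1 >= outlen)
            return -1;
        out[o++] = (char)c;
    }
    out[o] = '\0';
    return 0;
}

/* Stand-in for new_header_encode(): percent-encodes every byte that is
 * not alphanumeric or one of "-._~". */
static int new_header_encode(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;
    if (outlen == 0)
        return -1;
    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || strchr("-._~", c)) {
            if (o + 1 >= outlen)
                return -1;
            out[o++] = (char)c;
        } else {
            if (o + 3 >= outlen)
                return -1;
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0f];
        }
    }
    out[o] = '\0';
    return 0;
}

/* A field name always travels together with its own encoder. */
struct field_def {
    const char *name;
    encode_fn encode;
};

static const struct field_def legacy_field = { "legacy_name", legacy_header_encode };
static const struct field_def new_field    = { "new_name",    new_header_encode };

/* Writes "name: <encoded value>\n" into buf, mirroring the sprintf() calls
 * above. Returns the number of bytes written, or -1 on any failure. */
static int emit_field(char *buf, size_t buflen,
                      const struct field_def *f, const char *raw_value)
{
    char enc[256];
    int n;

    if (f->encode(raw_value, enc, sizeof(enc)) != 0)
        return -1;
    n = snprintf(buf, buflen, "%s: %s\n", f->name, enc);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}
```

With this layout the two sprintf() lines become emit_field(req_ptr, room,
&legacy_field, value) and emit_field(req_ptr, room, &new_field, value), where
"value" stays raw throughout and "room" is whatever space is left in the
request buffer. There is then no already-encoded variable lying around to be
reused with the wrong field.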

Cheers,
Willy