Re: draft-montenegro-httpbis-uri-encoding

"Nicolas Mailhot" <nicolas.mailhot@laposte.net> Mon, 24 March 2014 09:23 UTC

In-Reply-To: <380F3763-CAC9-45AF-A5D5-5B9AA9E2D977@mnot.net>
References: <F7DFCF7F-8958-462C-BA97-FBBC96BBEE7D@mnot.net> <CACuKZqFjHXxzmO8onrggPDn7V18DRsMap2USsxPFA8KHDGYjig@mail.gmail.com> <532C6089.4090307@gmx.de> <CACuKZqFYG9HAp+1b0aVcjbRLO7tApVfAuBq0wxxvcB-2oz9U8Q@mail.gmail.com> <380F3763-CAC9-45AF-A5D5-5B9AA9E2D977@mnot.net>
Date: Mon, 24 Mar 2014 10:20:15 +0100
From: Nicolas Mailhot <nicolas.mailhot@laposte.net>
To: Mark Nottingham <mnot@mnot.net>
Cc: Zhong Yu <zhong.j.yu@gmail.com>, "Julian F. Reschke" <julian.reschke@gmx.de>, Gabriel Montenegro <gabriel.montenegro@microsoft.com>, HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: draft-montenegro-httpbis-uri-encoding
Archived-At: <http://www.w3.org/mid/545475f4642c9923d2ab076798a24ad6.squirrel@arekh.dyndns.org>

On Sat, 22 March 2014 at 00:30, Mark Nottingham wrote:

> In particular, this seems like something that needs to be coupled to
> *where* the link originates; e.g., a browser’s behaviour for a link from
> an address bar is likely to be different from that for an ‘a’ tag, and
> different again from a JavaScript-generated link.

True.
What the last decade's Unicode migration taught us is that once you allow
text with an undefined encoding into the pipeline, things are going to fail.
Encoding really wants to be end-to-end.
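
To make the failure mode concrete, here is a minimal Python sketch. The
path "/café" and the variable names are illustrative assumptions, not taken
from the draft. It shows the same visible path producing two different
percent-encoded forms, and a receiver that guesses the wrong charset
getting either garbage or an outright decoding error.

```python
from urllib.parse import quote, unquote_to_bytes

# Illustrative path only; any non-ASCII character shows the same effect.
path = "/café"

# The same visible path, percent-encoded under two different charsets.
as_utf8 = quote(path, encoding="utf-8")
as_latin1 = quote(path, encoding="latin-1")

# A receiver that assumes Latin-1 for UTF-8 bytes silently gets mojibake.
wrong_guess = unquote_to_bytes(as_utf8).decode("latin-1")


def decodes_as_utf8(percent_encoded: str) -> bool:
    """Return True if the percent-decoded bytes are valid UTF-8."""
    try:
        unquote_to_bytes(percent_encoded).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(as_utf8, as_latin1, wrong_guess, decodes_as_utf8(as_latin1))
```

Nothing on the wire tells the receiver which of the two forms it is looking
at, which is the gap an explicit encoding declaration would close.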

Still, this is little different from MIME types: everyone relies on
MIME type hints for things to work well, but servers and web clients
sometimes have to infer them in less-than-optimal ways to fill in the HTTP
headers. Asking for URL encoding to be defined at the HTTP level (with a
fallback to a clear default encoding otherwise) is similar. Detection is
pushed to servers and web clients, which will presumably push for better
behaviour elsewhere to avoid being saddled with the heuristics currently
imposed on network nodes.

I'll add that we are living in an insecure world right now. To mitigate
security problems, everyone has been adding blacklists of known-bad URLs
(from browsers to Java to Adobe Reader to various security extensions),
but those blacklists rely on someone being able to review and curate them.
That is obviously a problem when you start to accept URLs you're not able
to decode or display in any reasonable manner.
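
As a rough illustration of the blacklist point, the sketch below assumes a
hypothetical blacklist curated as readable text and a naive checker that
decodes incoming paths as UTF-8. The entry, the function name and the
fail-open behaviour are all assumptions made for the example, not a
description of any real product.

```python
from urllib.parse import quote, unquote_to_bytes

# Hypothetical entry, curated by a human as readable text.
BAD_PATH = "/téléchargement/malware.exe"
BLACKLIST = {BAD_PATH}


def is_blacklisted(percent_encoded_path: str, assumed_encoding: str = "utf-8") -> bool:
    """Naive check: decode with one assumed charset, then compare as text."""
    raw = unquote_to_bytes(percent_encoded_path)
    try:
        text = raw.decode(assumed_encoding)
    except UnicodeDecodeError:
        # The checker cannot read the URL, so it cannot match it.
        # Failing open here is the weakness being illustrated, not advice.
        return False
    return text in BLACKLIST


utf8_form = quote(BAD_PATH, encoding="utf-8")
latin1_form = quote(BAD_PATH, encoding="latin-1")

print(is_blacklisted(utf8_form))               # matched
print(is_blacklisted(latin1_form))             # missed under the UTF-8 assumption
print(is_blacklisted(latin1_form, "latin-1"))  # matched once the encoding is known
```

The same resource is caught or missed depending only on which encoding the
sender happened to use, and a reviewer looking at the raw form has no
reliable way to display it either.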

Regards,

-- 
Nicolas Mailhot