Re: URL, URI and the w3c

Stefan Eissing <stefan@eissing.org> Wed, 15 June 2022 09:26 UTC

From: Stefan Eissing <stefan@eissing.org>
In-Reply-To: <CAHBU6it+PjWcnQ9xHisw3YpzVy_Yyk1QNQR+GhOT80AZOxdrQg@mail.gmail.com>
Date: Wed, 15 Jun 2022 11:23:02 +0200
Cc: Stenberg Daniel <daniel@haxx.se>, Roberto Polli <robipolli@gmail.com>, Carsten Bormann <cabo@tzi.org>, HTTP Working Group <ietf-http-wg@w3.org>, Giuseppe De Marco <giuseppe.demarco@teamdigitale.governo.it>, Ted Hardie <ted.ietf@gmail.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <4A17C4C4-92FC-4963-8CD2-F9732157C481@eissing.org>
References: <CAP9qbHVQ7B423jc7tHQo70ZAeXmFHdZo-=JvTSj5L2D6uTTQ9A@mail.gmail.com> <3n57428s-3052-66q7-prp8-118s19q41461@unkk.fr> <A90EB729-EA13-42E1-94F1-4410334E907E@tzi.org> <CAP9qbHUDSGDWyDUzLykFkE_C+GKZGy5SYxNsX=feEnrAmkE1Bw@mail.gmail.com> <CAHBU6iuxgqbkC3nLUi7w+o4JcSMPkW3FmC1b1C4=qRM-ZxUWiQ@mail.gmail.com> <895n295o-175q-ppn7-1472-p4np728p70o1@unkk.fr> <CAHBU6it+PjWcnQ9xHisw3YpzVy_Yyk1QNQR+GhOT80AZOxdrQg@mail.gmail.com>
To: Bray Tim <tbray@textuality.com>
X-Mailer: Apple Mail (2.3696.100.31)
Subject: Re: URL, URI and the w3c
Archived-At: <https://www.w3.org/mid/4A17C4C4-92FC-4963-8CD2-F9732157C481@eissing.org>


> On 14.06.2022 at 23:46, Tim Bray <tbray@textuality.com> wrote:
> 
> I was saying something more specific. AFAIK, a 3986-conforming URL will generally be interpreted by browsers in the way its author intended.
> 
> I went through https://github.com/bagder/docs/blob/master/URL-interop.md and let me revise that slightly: Any 3986-conforming URL that is plausible for inclusion in an Internet-Draft will generally be interpreted by browsers in the way its author intended.  I was having trouble finding counter-examples but am prepared to believe that I'm missing something obvious.

The direction 3986 -> TWUS is ok. The reverse is the raison d'etre of the 'living' TWUS: how to transform the junk that people enter in the address bar, and the junk that people/software put into HTML's href attributes, into a 3986 URL when talking to servers. No RFC addresses that problem, and user agents have a real need to have that defined in a common way for interop between browsers. No browser in the past could reject even part of the junk on its own: users would always raise issues with "But it works in XXX!".
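To make the "junk in, 3986 out" direction concrete, here is a rough Python sketch. It is NOT the TWUS parsing algorithm, only a loose illustration of the kind of cleanup involved; the function name `browserish_cleanup` and the exact set of repairs (trim whitespace, drop tabs/newlines, turn backslashes into slashes, percent-encode what 3986 does not allow) are my assumptions for the example.

```python
from urllib.parse import quote

# Characters RFC 3986 allows unescaped besides letters, digits and "-._~"
# (which quote() never escapes). "%" is kept so existing escapes survive.
RESERVED_AND_PERCENT = "%:/?#[]@!$&'()*+,;="


def browserish_cleanup(raw: str) -> str:
    """Loosely imitate address-bar leniency; hypothetical helper, not TWUS."""
    cleaned = raw.strip()
    # Drop tabs and newlines that were pasted into the middle of the input.
    cleaned = cleaned.translate({ord(c): None for c in "\t\n\r"})
    # Treat backslashes as forward slashes (applied everywhere, for brevity).
    cleaned = cleaned.replace("\\", "/")
    # Percent-encode everything else that 3986 does not permit literally.
    return quote(cleaned, safe=RESERVED_AND_PERCENT)
```

With this sketch, the input `http:\\example.org\a b` (with surrounding whitespace) comes out as `http://example.org/a%20b`. The real rules are more involved and depend on the URL component and scheme.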


This process of defining common ground on what to do with the surprises users/software feed us was not really fit for a standard. The surprises always continued. Maybe it's more settled now, and the "living" part of TWUS fades away. I don't know.


The problem looking ahead is that the technical term "URL" has been polluted: it is no longer clear which standard it refers to. Also, usage of TWUS URLs is leaking from browsers into other software. HTTP clients, like curl, are expected to work with URLs "just as the browsers do". This is a not unreasonable demand from users who are unaware of the finer details. Once you start accepting TWUS URLs on the command line, where does it stop? If you see a non-3986 'Location' header, what do you do with it? Should you follow its TWUS interpretation or fail the request?
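For the 'Location' question, a client first has to notice that a value is not 3986 at all. A minimal Python sketch of such a check follows. It only tests the character repertoire (unreserved, reserved, and well-formed percent-escapes), which is a necessary but not sufficient condition for 3986 conformance; the function name is made up for this example.

```python
import re

# Any mix of RFC 3986 unreserved/reserved characters and "%XX" escapes.
_RFC3986_CHARS = re.compile(
    r"(?:[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=]|%[0-9A-Fa-f]{2})*"
)


def uses_only_3986_characters(value: str) -> bool:
    """Character-level check only; it does not validate the full grammar."""
    return _RFC3986_CHARS.fullmatch(value) is not None
```

A value such as `https://example.org/a b` or `https://example.org\path` fails this check. Whether the client then fails the request or applies a TWUS-style interpretation is exactly the policy decision that is left open.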


All this mess could still be a happy place if it weren't for SECURITY. Authorization likes its rules to be clean and simple. If its checks assume 3986 and other parts of a system allow TWUS (or vice versa), this becomes exploitable. (And 3986 alone is already tricky for programmers who use terms like "percent-decoded URLs" as if those really existed!)
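A small Python sketch of why a "percent-decoded URL" is not a thing: decoding before splitting changes the structure, and a check done on one form can disagree with a component acting on the other. The URLs, the "/public/" prefix rule and all function names are invented for illustration.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def segments_split_then_decode(url: str) -> list[str]:
    """Split the path on "/" first, then decode each segment."""
    raw_path = urlsplit(url).path
    return [unquote(seg) for seg in raw_path.split("/") if seg]


def segments_decode_then_split(url: str) -> list[str]:
    """Decode the whole URL first: "%2F" turns into a path separator."""
    return segments_split_then_decode(unquote(url))


def naive_prefix_check(url: str, prefix: str = "/public/") -> bool:
    """Hypothetical authorization rule applied to the undecoded path."""
    return urlsplit(url).path.startswith(prefix)


def path_seen_by_decoding_backend(url: str) -> str:
    """Hypothetical backend that decodes and then resolves dot segments."""
    return posixpath.normpath(unquote(urlsplit(url).path))
```

For `https://example.org/files/a%2Fb` the first function yields two segments and the second yields three. For `https://example.org/public/%2e%2e/admin` the prefix check passes while the decoding backend ends up at `/admin`.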



> On 14.06.2022 at 11:55, Roberto Polli <robipolli@gmail.com> wrote:
> 
> Which spec should I reference in an HTTP derived protocol
> such as OAuth2 ?

I can see only drawbacks in using TWUS for REST APIs or other software-generated documents/requests/responses. Where user input is involved, TWUS offers a definition of how to go from there to 3986, if you cannot enforce 3986 straight away.

Kind Regards,
Stefan