Re: [apps-discuss] I-D Action: draft-ietf-appsawg-file-scheme-05.txt (relative URIs)

Matthew Kerwin <matthew@kerwin.net.au> Wed, 16 March 2016 22:05 UTC

In-Reply-To: <56E96C95.1020106@rename-it.nl>
References: <20151201004350.6084.94598.idtracker@ietfa.amsl.com> <CACweHNDzUSpZYgR6qabNCPrL_x4HNTAmP-aSTJK+2Jr4txNuVA@mail.gmail.com> <56E96C95.1020106@rename-it.nl>
Date: Thu, 17 Mar 2016 08:04:51 +1000
X-Google-Sender-Auth: njg9d-UWxaSmOAs2zSustK47tM8
Message-ID: <CACweHNDwsNfxsfwSQJNKGkG5k59+_vtou=fQumQWhKqEUfhiog@mail.gmail.com>
From: Matthew Kerwin <matthew@kerwin.net.au>
To: Stephan Bosch <stephan@rename-it.nl>
Content-Type: multipart/alternative; boundary="001a113f9232fa5cb8052e31b347"
Archived-At: <http://mailarchive.ietf.org/arch/msg/apps-discuss/EWSZfRNHdAqMQupuBFEAi-TfvuE>
Cc: IETF Apps Discuss <apps-discuss@ietf.org>
Subject: Re: [apps-discuss] I-D Action: draft-ietf-appsawg-file-scheme-05.txt (relative URIs)

On 17/03/2016 12:25 AM, "Stephan Bosch" <stephan@rename-it.nl> wrote:
>
> Hi,
>
> I have a question about relative URI resolution with respect to the file
> URI. I know I am a bit late to the party, but this work hasn't caught my
> eye before; I only read this draft in a moment of boredom in my coffee
> break. I quickly skimmed the mailing list, but I haven't seen any recent
> discussions on this topic. There is only a very specific Windows case
> described in Appendix C of the draft.
>
> I am not an expert, but I find RFC 3986 Section 5 rather confusing. When
> is reference resolution applied? According to Section 5.2.2, absolute URIs
> can be subject to the resolution algorithm as well. The main effect of that
> is the application of the remove_dot_segments() algorithm, which removes
> all "../" and "./" instances from the path. So, is this always applied
> before a URI is interpreted? Does it apply to an absolute file URI?
>

Quoting from RFC 3986:

§4.1 "A URI-reference is either a URI or a relative reference." This is a
confusing and somewhat isolated statement, but it suggests to me that all
URIs can be subject to normalisation as part of resolution.

---8<---
5.2.4.  Remove Dot Segments

   The pseudocode also refers to a "remove_dot_segments" routine for
   interpreting and removing the special "." and ".." complete path
   segments from a referenced path.  This is done after the path is
   extracted from a reference, **whether or not the path was relative**, in
   order to remove any invalid or extraneous dot-segments prior to
   forming the target URI.
--->8---

(My emphasis added.) Note that it says "from a reference", which includes
'URIs' per §4.1.
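
For concreteness, here is a rough Python sketch of the routine as I read
RFC 3986 §5.2.4 (steps A to E of the pseudocode there). The function name
follows the RFC; everything else is my own transcription, not text from
the draft or from any library.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of RFC 3986 section 5.2.4; each output item is "/segment"."""
    output: list[str] = []
    while path:
        # Step A: drop a leading "../" or "./"
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        # Step B: "/./" or a trailing "/." collapses to "/"
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        # Step C: "/../" or a trailing "/.." collapses to "/" and
        # removes the last segment already moved to the output
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        # Step D: a lone "." or ".." is simply discarded
        elif path in (".", ".."):
            path = ""
        # Step E: move the first segment (with its leading "/") to the output
        else:
            cut = path.find("/", 1)
            if cut == -1:
                cut = len(path)
            output.append(path[:cut])
            path = path[cut:]
    return "".join(output)
```

Note that it is purely lexical: it never looks at a file system, and it is
applied to whatever path came out of the reference, relative or not.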

---8<---
6.2.2.3.  Path Segment Normalization

   The complete path segments "." and ".." are intended only for use
   within relative references (Section 4.1) and are removed as part of
   the reference resolution process (Section 5.2).  However, some
   deployed implementations incorrectly assume that reference resolution
   is not necessary when the reference is already a URI and thus fail to
   remove dot-segments when they occur in non-relative paths.  URI
   normalizers should remove dot-segments by applying the
   remove_dot_segments algorithm to the path, as described in
   Section 5.2.4.
--->8---

This is the bit that gets me. A relative reference is clearly defined in
the spec as: not starting with a scheme. The term 'URI' is fuzzy, but seems
to be the complement (i.e. it does start with a scheme). So perhaps what we
think of as an "absolute URI" is actually just a "URI" until it's been
through remove_dot_segments and co.

The quoted section also says "intended only for use within relative
references", but doesn't outright forbid use elsewhere (i.e. in "URIs").

<confused-shrug.gif>

> This leads to the following concrete questions:
>
> - Does the OS file system path extracted from an absolute file URI like
> "file:/frop/./friep/../frml" include the dot segments? Or is the
> remove_dot_segments() normalization always applied? So, is "/frop/frml" or
> "/frop/./friep/../frml" going to be passed to e.g. the open() system call
> on a POSIX system?
>

According to §6.2.2.3, if you normalise the URI then the dots are removed
before the URI is converted to a file system path. It doesn't mandate said
normalisation, so I guess that's between you and the OS.
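
To illustrate why the choice matters on a POSIX system, here is a small
sketch. The helper names (candidate_paths, can_open, demo) are made up for
this example, and posixpath.normpath is only a stand-in for
remove_dot_segments (it also collapses repeated slashes and drops trailing
ones). The paths are re-rooted under a temporary directory so nothing real
is touched.

```python
import os
import posixpath
import tempfile
from urllib.parse import unquote, urlsplit


def candidate_paths(file_uri: str) -> tuple[str, str]:
    """Return (raw, lexically normalised) POSIX paths for a file URI."""
    parts = urlsplit(file_uri)
    if parts.scheme != "file":
        raise ValueError("not a file URI")
    raw = unquote(parts.path)
    # Assumption: normpath approximates remove_dot_segments for this input.
    return raw, posixpath.normpath(raw)


def can_open(path: str) -> bool:
    """True if the OS is able to open the path for reading."""
    try:
        with open(path, "rb"):
            return True
    except OSError:
        return False


def demo() -> tuple[bool, bool]:
    """Create <root>/frop/frml (but no 'friep') and try both candidates."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "frop"))
        with open(os.path.join(root, "frop", "frml"), "w") as fh:
            fh.write("x")
        raw, normalised = candidate_paths("file:/frop/./friep/../frml")
        # POSIX resolves ".." by walking the real directories, so the raw
        # path needs "friep" to exist; the normalised path does not.
        return can_open(root + raw), can_open(root + normalised)
```

On a POSIX system I would expect the raw path to fail (there is no
"friep" directory to step through) and the normalised one to succeed, so
the two candidates are not interchangeable once symlinks or missing
directories are involved.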

> [snip]
>
> - In that regard, how are relative URIs handled when the base is a file
> URI?
>

No different from any other URI scheme, except that in Windows/DOS you have
this weird sub-authority/drive letter issue (C:/a/../../b => C:/b), which
is quirky and hard to reconcile, hence the appendix.
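
A quick sketch of that mismatch, using only the Python standard library.
The helper names (resolve_path, windows_normalise) and the base URI
"file:///C:/a/" are my own choices for illustration; ntpath.normpath is
used as an approximation of how a DOS-style path is treated, not as a
statement about any particular Windows API.

```python
import ntpath
from urllib.parse import urljoin, urlsplit


def resolve_path(base: str, reference: str) -> str:
    """Resolve a reference against a base URI and return the resulting path."""
    return urlsplit(urljoin(base, reference)).path


def windows_normalise(path: str) -> str:
    """Normalise a DOS-style path, reporting it with forward slashes."""
    return ntpath.normpath(path).replace("\\", "/")


# Generic URI resolution sees "C:" as just another path segment,
# so ".." is free to climb over it.
generic = resolve_path("file:///C:/a/", "../../b")

# DOS-style normalisation treats "C:" as a drive that ".." cannot leave.
dos = windows_normalise("C:/a/../../b")
```

Here generic comes out as "/b" while dos comes out as "C:/b": the drive
letter acts like a second authority that generic resolution knows nothing
about.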

> In any case, I think these concerns need to be addressed explicitly in
> the document, so that the semantics are always as clear as possible.
>

Maybe. I think I might be missing some text in the appendix saying that a
DOS-based OS, when handed a non-normalised URI, is likely to treat the
dot-segments in a particular way.

Maybe it's time to update RFC 3986 itself, although I don't know that
there's enough energy in the community to undertake such a feat.

> Regards,
>
> Stephan.
>

Cheers