Re: [websec] RFC6454 (Origin) vs URI schemes unlike "http"

Adam Barth <ietf@adambarth.com> Wed, 13 February 2013 20:45 UTC

From: Adam Barth <ietf@adambarth.com>
Date: Wed, 13 Feb 2013 12:45:21 -0800
Message-ID: <CAJE5ia9ZL_H4wzOQqo=p7zcNo-NXwr+_i9SMit3j78RdwNQg=Q@mail.gmail.com>
To: Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: websec <websec@ietf.org>
Subject: Re: [websec] RFC6454 (Origin) vs URI schemes unlike "http"

Yeah, the next time we update the spec, we'll probably need to
reference http://url.spec.whatwg.org/, which has the needed
definitions.

Adam


On Wed, Feb 13, 2013 at 6:09 AM, Bjoern Hoehrmann <derhoermi@gmx.net> wrote:
> Hi,
>
>   https://tools.ietf.org/html/rfc6454 fails to properly account for a
> number of cases where URIs and URI schemes are slightly unusual, e.g.
>
>    The origin of a URI is the value computed by the following algorithm:
>
>    1.  If the URI does not use a hierarchical element as a naming
>        authority (see [RFC3986], Section 3.2) or if the URI is not an
>        absolute URI, then generate a fresh globally unique identifier
>        and return that value.
>    ...
>    2.  Let uri-scheme be the scheme component of the URI, converted to
>        lowercase.
>
>    3.  If the implementation doesn't support the protocol given by uri-
>        scheme, then generate a fresh globally unique identifier and
>        return that value.
>
> Consider `javascript://example.org`. In order to make the determination
> whether "the URI" uses "a hierarchical element as a naming authority"
> you have to know the scheme, but the scheme is not mentioned until after
> the first step, which may lead one to believe that you can make this
> determination without knowing the scheme.
>
> For 'javascript' in particular there is no "over the wire" "protocol",
> so it's not clear what to do in the third step. Consider this from the
> perspective of someone making a generic URI library and giving URI
> objects some `.origin` property: how would that work? A browser might
> support "ftp" but a user might disable loading resources over FTP in the
> browser; or it might phase out FTP support but keep supporting 'ftp'
> URIs (like by still knowing the default port). What is the "Origin of a
> URI" then? Does it matter if you do not actually load content from such
> a URI, or don't do it in a web-browser-like fashion? I am not sure...
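
The `.origin` question above can be made concrete with a minimal Python
sketch of the quoted steps, written for a generic URI library. The names
`origin_of` and `DEFAULT_PORTS` are hypothetical, and treating "supported"
as "listed in a table" is an assumption; RFC 6454 does not define it. Steps
1 and 3 are checked together because the authority test is not meaningful
without the scheme.

```python
import uuid
from urllib.parse import urlsplit

# Hypothetical table: the schemes this sketch treats as "supported",
# mapped to their default ports. Not taken from RFC 6454.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def origin_of(uri, supported=DEFAULT_PORTS):
    """Return a (scheme, host, port) triple or a fresh unique identifier."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()  # step 2: scheme in lowercase

    # Steps 1 and 3 combined: relative references, unsupported schemes and
    # URIs with an absent or empty authority all get a fresh identifier.
    if scheme not in supported or not parts.netloc:
        return uuid.uuid4().hex

    host = (parts.hostname or "").lower()  # step 5
    # Step 6: fall back to the default port when no port is given.
    port = parts.port if parts.port is not None else supported[scheme]
    return (scheme, host, port)
```

With this shape, a user disabling FTP amounts to passing a `supported`
table without "ftp", which changes the origin of every `ftp` URI to a
fresh identifier. Whether that is the intended reading is exactly the open
question.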
>
> Further down there is
>
>    5.  Let uri-host be the host component of the URI, converted to lower
>        case (using the i;ascii-casemap collation defined in [RFC4790]).
>
> What if there is no `host` component? `news:de.comp.text.xml` does not
> have one, even though the scheme does use "a hierarchical element as a
> naming authority" and the URI is valid. For that matter, what if there
> is such a component but it's the empty string (like in `file:///`, if
> you ignore the specific provision for 'file')? It seems the empty string
> would pass through the "algorithm", but it's unclear if that is
> intentional and what the security considerations are in this regard.
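
The difference between an absent and an empty authority can be shown with
the regular expression from RFC 3986, Appendix B. This is only a sketch;
the helper name `authority_of` is hypothetical, and it returns the whole
authority rather than the host alone.

```python
import re

# Regular expression from RFC 3986, Appendix B. Group 4 is the authority.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def authority_of(uri):
    """Return None if the URI has no authority, '' if it is present but empty."""
    # Every part of the pattern is optional, so match() always succeeds.
    return URI_RE.match(uri).group(4)


print(authority_of("news:de.comp.text.xml"))  # None: no "//" at all
print(repr(authority_of("file:///")))         # '': authority present but empty
print(authority_of("http://example.org/"))    # example.org
```

An implementation of step 5 has to decide what to do with both `None` and
`''`; the quoted text only covers the third case.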
>
>    6.  If there is no port component of the URI:
>
>        1.  Let uri-port be the default port for the protocol given by
>            uri-scheme.
>
>        Otherwise:
>
>        2.  Let uri-port be the port component of the URI.
>
> Per RFC 3986 schemes may define a default port but do not have to. What
> if a scheme does not define a default port? Also, what if the component
> is present, but is the empty string? In section 6.1 I'm told
>
>        1.  Append a U+003A COLON code point (":") and the given port, in
>            base ten, to result.
>
> I can't give the empty string in base ten. Per RFC 3986 the port
> component should be omitted when it is the empty string, which would lead
> to use of the default port if any, but there is no provision in RFC 6454
> for normalising URIs and it's valid to use the empty string as value so
> that is valid input into the "Origin of a URI" "algorithm".
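
The three port cases (absent, empty, numeric) can be sketched in Python as
follows. The helper names are hypothetical. Treating an empty port like an
absent one follows the RFC 3986 normalisation mentioned above and is an
assumption here, since RFC 6454 says nothing about it. Returning `None` for
a scheme without a default port is likewise just a placeholder for the
unanswered question.

```python
def port_component(authority):
    """Return None if there is no port, '' if it is empty, else its text."""
    hostport = authority.rpartition("@")[2]    # drop any userinfo
    if hostport.startswith("["):               # skip over an IP literal
        hostport = hostport.partition("]")[2]
    _, colon, port = hostport.partition(":")
    return port if colon else None


def effective_port(scheme, authority, default_ports):
    """Pick the port for step 6; None means the scheme has no default."""
    port = port_component(authority)
    if port:                                   # non-empty port text
        return int(port)                       # assumes digits only (RFC 3986 *DIGIT)
    # Assumption: an empty port is handled like an absent one.
    return default_ports.get(scheme.lower())
```

For example, `effective_port("http", "example.org:", {"http": 80})` falls
back to 80, while a scheme missing from the table yields `None`, which
cannot be serialised "in base ten" either.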
>
> regards,
> --
> Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
> Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
> 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/