Re: [decade] FW: Last Call: <draft-farrell-decade-ni-07.txt> (Naming Things with Hashes) to Proposed Standard

Jonathan A Rees <rees@mumble.net> Fri, 08 June 2012 00:35 UTC

In-Reply-To: <4FD0FDC2.2090802@cs.tcd.ie>
References: <E33E01DFD5BEA24B9F3F18671078951F23A88B0C@szxeml534-mbx.china.huawei.com> <CAGnGFMKjv2QR+ebynnC2GktpYf2QEx73n+0_ZZeyJrAmTYDnjg@mail.gmail.com> <4FD0FDC2.2090802@cs.tcd.ie>
Date: Thu, 07 Jun 2012 20:35:43 -0400
X-Google-Sender-Auth: sryjDKvUVfGY_ReNIplStaxtDDE
Message-ID: <CAGnGFMLc15-ZkD_Of7k0zK7ef-i6HN2LsPCfbW3pkwFkpSiOfA@mail.gmail.com>
Subject: Re: [decade] FW: Last Call: <draft-farrell-decade-ni-07.txt> (Naming Things with Hashes) to Proposed Standard
From: Jonathan A Rees <rees@mumble.net>
To: Stephen Farrell <stephen.farrell@cs.tcd.ie>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Mailman-Approved-At: Mon, 11 Jun 2012 07:33:25 -0700
Cc: ietf@ietf.org

On Thu, Jun 7, 2012 at 3:15 PM, Stephen Farrell
<stephen.farrell@cs.tcd.ie> wrote:
>
> On 06/06/2012 09:33 PM, Jonathan A Rees wrote:
>> As requested I am sending comments on this last call draft to
>> ietf@ietf.org. I sent them to the authors on 6 May but received no
>> reply.
>
> Once again, sorry about that. No idea why I missed responding,
> your mail is in my client even. Ah well.
>
>>
>> Jonathan Rees
>>
>>
>> ---------- Forwarded message ----------
>> From: Jonathan A Rees <rees@mumble.net>
>> Date: Sun, May 6, 2012 at 7:57 PM
>> Subject: comments on http://tools.ietf.org/html/draft-farrell-decade-ni-06
>> To: Alexey Melnikov <alexey.melnikov@isode.com>, Barry Leiba
>> <barryleiba@computer.org>, "S. Farrell" <stephen.farrell@cs.tcd.ie>,
>> "P. Hallam-Baker" <pbaker@verisign.com>
>>
>> Here are some opinions on
>> http://tools.ietf.org/html/draft-farrell-decade-ni-06 :
>>
>> I think this URI scheme would be a welcome addition to web
>> architecture. Wide review should be sought, because this might become
>> quite important and if there are problems they will be very difficult
>> to fix later.
>>
>> I think using .well-known is a good idea.
>>
>> I think integration into the ecosystem, such as browser support,
>> should be anticipated; for this reason I think content type should be
>> elevated from an 'optional feature' to a 'required feature'.
>>
>> [i.e. conformant implementations must support it, even if providing
>> the content type in the URI is itself optional.]
>
> I could certainly live with that, and I suspect my co-authors too,
> but I'd need to ask 'em. However, we'd like to see more support for
> it before doing that. If we only hear from you on this, then I
> think leaving it in the other draft would be right. (See below
> for why we want to keep that draft.) I guess others have a few
> weeks to chime in on this.

OK, I suspect I can find other W3C TAG members who would agree; will circulate.

It is certainly strongly promoted by the W3C web architecture document, see
http://www.w3.org/TR/webarch/#internet-media-type
so I think I have W3C consensus (as of 2005) behind my claim.

>> If you
>> don't do this, you are just encouraging sniffing and privilege
>> escalation attacks. Sniffing would be a big step backwards. Better to
>> do what the data: scheme does and say that there is a default content
>> type of, say, text/plain, and that otherwise the content type ought to
>> be specified in the URI.
>>
>> Content-type privilege escalation risk (and incorrect sniffing risk)
>> should be mentioned in the security considerations section in any
>> case.
>
> Would appreciate text if you can offer some. (Always happy to
> make the sec. cons. bit better.)

Hmm. Hot potato.
Maybe something can be derived from Adam's draft
http://tools.ietf.org/html/draft-abarth-mime-sniff-06
which I assume you've seen...  obviously I'd prefer you do the
drafting, but will consider myself pressured.

>> Maybe the risk that the host used for retrieval might be spoofing the
>> content-type (by providing a bogus content-type in an HTTP response)
>> should also be mentioned.
>
> Good one. Yep, we should mention this wherever ct= is described

No, this is a threat even if ct= isn't used. Suppose that the document
is intended as an unscriptable media type, but the (malicious) server
sets the content-type: header in the response to a scriptable type.
Not good. I think this threat is described in Adam's draft.
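
A minimal sketch of the client-side check this threat suggests, in Python. It assumes the client knows the intended media type from somewhere (a ct= parameter or out of band) and otherwise falls back to text/plain, as proposed earlier in this thread by analogy with the data: scheme. The function names and the fallback policy are hypothetical; neither draft specifies them.

```python
from typing import Optional

# Assumed fallback, mirroring the data: scheme suggestion in this thread.
DEFAULT_TYPE = "text/plain"


def bare_media_type(header_value: str) -> str:
    """Drop parameters such as charset and normalise case."""
    return header_value.split(";", 1)[0].strip().lower()


def effective_media_type(expected: Optional[str],
                         response_header: Optional[str]) -> str:
    """Return the media type the client should use for the content.

    The server's Content-Type header is never allowed to override the
    expected type; a mismatch is treated as an error, not as a hint.
    """
    wanted = bare_media_type(expected) if expected else DEFAULT_TYPE
    if response_header is not None:
        declared = bare_media_type(response_header)
        if declared != wanted:
            raise ValueError(
                "server declared %r but %r was expected" % (declared, wanted)
            )
    return wanted
```

With this policy a server that relabels an image as text/html is rejected rather than rendered, whether or not ct= appeared in the name.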

>> (A possible design would be to put the
>> content-type (and maybe other headers like Expires:?) in the hashed
>> content, to be pulled out into the HTTP response when the content is
>> served by an http server and then checked by the client, but I
>> understand that this would be a tooling headache.)
>
> Yep. We thought about it but agree that it gets too complex too
> quickly. Maybe with a bit of experience...
>
>> (I don't understand why you want to separate the 'optional' features
>> into a separate spec. This made me miss the ct= feature entirely at
>> first.)
>
> The intent is to put stuff there if we're not sure if it's
> ready or needed everywhere. ct= is definitely the main
> candidate feature for moving to the "base" spec though.
> Some other things (e.g. handling dynamic content) are
> way more experimental and should definitely not move
> and maybe need more time before we want them in an RFC.
>
> If all this does get popular then the RFCs can be revised later
> based on experience in any case. (If nobody cares, then it
> won't be a problem:-) So I don't think where things are
> documented now is hugely important in the long term.

Hmm... if this spec takes off like I hope it will, you will have more
influence than you think now, so the implicit messages will be
important... by putting ct= in a separate document you're saying that
it's not important, and you'll encourage implementations that use
sniffing instead of some rational way to convey the media type.

>> I think the documentation should say that the hash and content type
>> together identify the resource,
>
> Well, IMO the hash identifies the resource (if name-data integrity
> is verified) so I don't think I agree that the content type is
> key for identification. It is for interpretation (or whatever
> the right term there is, maybe rendering?) and probably other
> things.

Well maybe we're quibbling over terminology, but in the HTTP spec and
in all the W3C documents, the media type is part of the resource (or
"representation" in HTTPbis; the distinction is immaterial for your
URI scheme), and therefore part of its identity. It would be
unfortunate to introduce an incompatible world view into the RFC
corpus.
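
To make "hash and content type together" concrete, here is a rough Python sketch. It assumes names of the form ni:///sha-256;<base64url digest, no padding> with an optional ct= query parameter; make_name and check are hypothetical helpers, and the exact syntax should be taken from the drafts, not from this sketch.

```python
import base64
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit


def _b64url_sha256(content: bytes) -> str:
    """SHA-256 digest as unpadded base64url text."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_name(content: bytes, content_type: Optional[str] = None) -> str:
    """Build a name from the content and, optionally, its media type."""
    name = "ni:///sha-256;" + _b64url_sha256(content)
    if content_type:
        # Percent-encode so that '+' in types like image/svg+xml survives.
        name += "?ct=" + quote(content_type, safe="/")
    return name


def check(name: str, content: bytes) -> Optional[str]:
    """Verify content against the name; return the ct= value, if any."""
    parts = urlsplit(name)
    algorithm, _, value = parts.path.lstrip("/").partition(";")
    if parts.scheme != "ni" or algorithm != "sha-256":
        raise ValueError("unsupported name: " + name)
    expected = _b64url_sha256(content).encode("ascii")
    if not hmac.compare_digest(value.encode("utf-8"), expected):
        raise ValueError("content does not match the hash in the name")
    return parse_qs(parts.query).get("ct", [None])[0]
```

Under this reading, make_name(data, "text/plain") and make_name(data, "text/html") are different names for different resources even though the bytes, and hence the hashes, are identical.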

>> and that because the content can be
>> verified, the resource can be sought (using the .well-known path, or
>> any other path for that matter) from any source that the client thinks
>> might have it.
>
> Absolutely. That's our primary motivation for all this.
>
>> The primary and alternate domain name(s), and 'wrapped'
>> URLs, are only provided as hints.
>
> Yes.
>
>> I agree with other commenters on the peculiarity of using // to
>> provide the location hint since the named host is not being trusted as
>> an authority. I don't understand why the 'primary' location isn't just
>> encoded in the query, just like the alternate domain(s) and "wrapped
>> URL(s)".  This would have the nice property that you can put the
>> identifying parts (i.e. hash and content type) first, and the less important
>> location hints parts all together after the identification. The various
>> location hints (whether primary or secondary) would go together and
>> their similarity would be clearer.
>>
>> (Unless I'm misunderstanding something and the part after the //
>> actually has status other than a hint?  That would seem to defeat
>> the purpose.)
>
> I think we could argue this (and we did already between the authors;-)
> and it'd come down to "pick a way." We did already and wrote code
> for that, so we'd prefer to stick with it as-is, especially if there's
> no compelling reason to change. I think it's likely we can agree that
> there's no compelling differences here in whether we use "//" or
> a "?loc=" or whatever.

Well, you have a chance to fix this now, and it will be impossible to
fix later. Using // is contrary to RFC 3986, which very clearly says
"governance of the name space defined by the remainder of the URI is
delegated to that authority". This is certainly not what this URI
scheme does, so use of // is contrary to the applicable normative spec.
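
A small Python illustration of the practical difference, using the standard library's generic RFC 3986-style splitter. The loc= parameter name is only the placeholder floated above, the digest value is illustrative, and split_name is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit

# Illustrative digest value only; any unpadded base64url string would do.
DIGEST = "f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk"


def split_name(name: str) -> dict:
    """Show how a generic URI parser classifies the parts of a name."""
    parts = urlsplit(name)
    query = parse_qs(parts.query)
    return {
        "authority": parts.netloc,              # what follows "//"
        "identity": parts.path.lstrip("/"),     # algorithm;digest
        "ct": query.get("ct", [None])[0],
        "loc_hints": query.get("loc", []),      # hypothetical parameter
    }


with_authority = split_name(
    "ni://example.com/sha-256;" + DIGEST + "?ct=text/plain"
)
with_query_hints = split_name(
    "ni:///sha-256;" + DIGEST
    + "?ct=text/plain&loc=example.com&loc=mirror.example.net"
)
```

In the first form generic tooling reports example.com as the authority component, which is exactly the role the host does not have here. In the second form the authority is empty, the identifying part is unchanged, and all location hints sit together in the query.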

Best
Jonathan

> Cheers,
> Stephen.
>
>
>>
>> Jonathan
>>
>>