Re: [apps-discuss] Indicating hash size in 'ni' URIs

Stephen Farrell <stephen.farrell@cs.tcd.ie> Wed, 09 May 2012 11:57 UTC

Hiya,

I think the main thing to keep in mind is that we don't
want people to make the mistake of thinking two names are
the same when they aren't. In other words a name with a
32 bit truncated sha-256 hash is not the same as a name
with the full 256 bits of hash output, even if the hash
value for one is a prefix of that for the other.

The reason being that if an application treats those the
same then that'd open up a bunch of attacks. For example,
if I publish an object with the full hash, then I probably
(in general) don't want some other application to treat a
name with just the first 32 bits of that as referring to
the same thing, since the 32 bit name will have lots of
colliding objects. If this stuff gets used, there will
be many cases where names will occur more than once in
application protocols, and it'll be unpredictable which
instance of the name will be used for name-data integrity
checking. (Maybe we ought to add a bit more security
considerations text to explain that better; suggestions
welcome.)
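
To give a feel for how cheap a match on a short name is, here is
a minimal sketch (not taken from the NetInf code; the function
names are made up for illustration). It uses a 16 bit truncation
so that it finishes quickly. A 32 bit truncation needs about 2^32
attempts on average, which is still well within reach.

```python
import hashlib
from itertools import count


def truncated_digest(data: bytes, bits: int) -> bytes:
    """Return the first `bits` bits (a multiple of 8) of SHA-256(data)."""
    return hashlib.sha256(data).digest()[: bits // 8]


def find_second_object(original: bytes, bits: int) -> bytes:
    """Brute-force a different input whose truncated digest matches."""
    target = truncated_digest(original, bits)
    for i in count():
        candidate = b"forged-%d" % i
        if candidate != original and truncated_digest(candidate, bits) == target:
            return candidate


original = b"the object I actually published"
forged = find_second_object(original, 16)

# Same truncated name, different object: the full digests still differ.
print(truncated_digest(original, 16) == truncated_digest(forged, 16))
print(hashlib.sha256(original).digest() == hashlib.sha256(forged).digest())
```

An application that accepted the short name in place of the full
one would accept the forged object as well.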

Now, IMO, the syntax we have makes that mistake less likely,
whereas your proposal makes it more likely to happen, due
to developer carelessness or misunderstanding.

We also have code [1] that implements the current syntax,
so for us at least, we'd want to see a benefit to switching,
and I only see a downside right now. That code is very
early, so it's not a major major deal, but I'd not want
to break stuff for fun, and given that we've implemented
this in a bunch of languages, it'd be a chunk of work.
(Developer enthusiasm, after many months of discussion
and ppt engineering, led to a burst of coding activity
resulting in C, Ruby, Java, PHP, Python and Clojure code
and a few application examples, including wget and curl,
all of which do the ni URI as per the current draft :-)

On 05/09/2012 02:02 AM, Manger, James H wrote:
>>>> I would still rather ditch the truncation length from the alg names.
>  
>>> I'm with James on this one.
>>> ... The example of a truncated stream cipher seems contrived to me.
> 
>> Not to me.
>>
>> I think that omitting the size would mean that anyone using
>> these would need to go think about potential truncation attacks,
>> and having to do that is bad, since most times they won't do
>> it at all and even if they do, those attacks, if they apply,
>> will likely be very subtle. So even if you try to figure it
>> out, you may not get the analysis right. Avoiding all that
>> is just better IMO.
> 
> I am trying to understand the truncation attacks you are concerned about. Would an example be an HTTPS web connection that abruptly stops mid stream? All the received TLS records are properly encrypted and have valid MACs. However, only half of the HTML page has been delivered: it might end "<script src='ni://example.com/sha-256;f4OxZX". Somewhere in the receiver's stack it knows truncation has occurred (the TLS and HTTP layers didn't close properly), but the higher layer ignores that and interprets the content anyway. In this example, a browser (which is very lenient) accepts the incomplete <script> tag and acts on the wrong (truncated) URI.
> 
> The fault is not with the 'ni' URI. Solving this situation would require all protocols (not just 'ni' identifiers) to be "prefix-free" (eg no prefix of a protocol can be meaningful). That is not realistic.

Right. I did say it's a corner case at best, but we don't know
what protocols will use these names. And maybe one of those will
allow truncation attacks. I'd really rather not have to think
about it. (Mind you, if the name is just sent encrypted in a
stream cipher without integrity, then swapping bits can also
switch sha-256-96 to sha-256-32 easily enough;-)
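
As a rough illustration of that bit swapping, the sketch below
uses a random keystream XORed with the name as a stand-in for a
real stream cipher (an assumption; no particular cipher is
implied). The attacker never sees the key, only the ciphertext,
and relies on the name format being predictable.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


name = b"ni:///sha-256-96;abcdefghijklmnop"
keystream = secrets.token_bytes(len(name))  # stand-in for a stream cipher
ciphertext = xor_bytes(name, keystream)

# The attacker guesses where the length suffix sits from the fixed
# layout of the name, then flips exactly the bits that turn "96" into "32".
offset = len(b"ni:///sha-256-")
delta = xor_bytes(b"96", b"32")
tampered = bytearray(ciphertext)
for i, d in enumerate(delta):
    tampered[offset + i] ^= d

# The legitimate receiver decrypts with the real keystream.
decrypted = xor_bytes(bytes(tampered), keystream)
print(decrypted)
```

Without an integrity check the receiver has no way to notice the
change, whichever syntax is used for the length.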

> 
> An 'ni' URI can have query parameters. Couldn't a truncation attack strip those regardless of whether the hash length is in the alg name? Why isn't that a problem?

Whether that matters is down to the application protocol I
think. For the base spec, we just say that the names are
the same, iff the hash alg, length and value are the same.
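
A minimal sketch of that comparison rule, assuming the length is
carried in the alg name as in the current syntax, and assuming
(my reading of the above) that authority and query string are
left to the application. The helper names are hypothetical.

```python
from urllib.parse import urlsplit


def ni_key(uri: str) -> tuple[str, str]:
    """Return (alg-with-length, value), the parts that decide sameness."""
    parts = urlsplit(uri)
    if parts.scheme != "ni":
        raise ValueError("not an ni URI: %r" % uri)
    alg, sep, value = parts.path.lstrip("/").partition(";")
    if not sep or not alg or not value:
        raise ValueError("missing alg or hash value: %r" % uri)
    return alg, value


def same_name(a: str, b: str) -> bool:
    """Names match only if alg (including length) and value both match."""
    return ni_key(a) == ni_key(b)


print(same_name("ni:///sha-256;abc", "ni://example.com/sha-256;abc?ct=text/plain"))
print(same_name("ni:///sha-256-32;abcdef", "ni:///sha-256;abcdef"))
```

The first pair differs only in authority and query, so it compares
as the same name; the second pair differs in the alg string and so
does not, even though the hash values are identical.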

> 
>> Additionally, if we took out the length then we'd have to
>> figure whether or not (or when, yuk) ni:///sha-256;abc is the
>> "same" as ni:///sha-256;abcdef and even if we declare that its
>> never the same, that's something people are liable to get
>> wrong, since sometimes both would actually refer to the same
>> thing. Yuk yuk yuk;-)
> 
> Implementations are just as likely to consider that the following are the "same":
>   ni:///sha-256-32;abcdef
>   ni:///sha-256-64;abcdefghijk

I guess I just disagree about that.

strncmp(a,b,min(len_a,len_b)) would not treat them as equal
with the current syntax but would with your suggestion.
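
To make that concrete, here is a small Python stand-in for the
careless strncmp() comparison (prefix_equal is a made-up helper,
not something from our code):

```python
def prefix_equal(a: str, b: str) -> bool:
    """Mimic strncmp(a, b, min(len_a, len_b)) == 0."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


# Current syntax: the length suffix differs before the values are reached.
current = prefix_equal("ni:///sha-256-32;abcdef",
                       "ni:///sha-256-64;abcdefghijk")

# Suggested syntax: no length in the alg name, so the shorter name
# is a plain prefix of the longer one.
suggested = prefix_equal("ni:///sha-256;abcdef",
                         "ni:///sha-256;abcdefghijk")

print(current, suggested)
```

With the current syntax the comparison fails at the "32" versus
"64"; without the length the sloppy comparison reports a match.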

> If a truncation attack really needs to be defended against, we should specify that the output length (and algorithm) are inputs to the hash. For example, Truncate_to_n(Hash(alg || n || content)).

We thought about that but didn't go for it, because it'd
easily get overcomplicated, e.g. you'd maybe also want to
include the query string, authority etc., and there didn't
seem to be much benefit from that level of complexity.
It'd also mean that you couldn't just create these names
for (large sets of) things for which you've already
calculated the hashes. For example, if you had a large image
library already using sha-256 hashes in its URLs, then with
the above it'd not be possible to use the .well-known/ni
URL (and presumably an HTTP 30x redirect) to dereference
the name. Allowing that seems like a good thing.
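
A rough sketch of the difference, assuming sha-256 and a made-up
byte encoding for "alg || n" (the suggestion didn't specify one;
the 2-byte big-endian length here is purely hypothetical):

```python
import hashlib


def plain_truncated(full_digest: bytes, bits: int) -> bytes:
    """Current approach: cut down a digest you may already have stored."""
    return full_digest[: bits // 8]


def length_bound(alg: str, bits: int, content: bytes) -> bytes:
    """Suggested approach: Truncate_to_n(Hash(alg || n || content))."""
    prefix = alg.encode("ascii") + bits.to_bytes(2, "big")  # hypothetical encoding
    return hashlib.sha256(prefix + content).digest()[: bits // 8]


content = b"bytes of some image in the library"
stored = hashlib.sha256(content).digest()  # already sitting in a database

# Plain truncation only needs the stored digest.
short_name = plain_truncated(stored, 32)

# The length-bound variant has to re-read and re-hash the content,
# once for every length you want to publish.
bound_32 = length_bound("sha-256", 32, content)
bound_64 = length_bound("sha-256", 64, content)

print(short_name.hex(), bound_32.hex(), bound_64.hex())
```

The length-bound values for different lengths are unrelated to
each other, which is the point of the suggestion, but none of
them can be derived from the digests you already have.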

Having said all that, if it turns out that you're right
and we're wrong, then it'd not be too hard to define and
register a new hash function that'd work as you suggest,
but I just don't see it being more useful right now and
do worry about it being less safe.

Cheers,
S.

[1] http://sourceforge.net/projects/netinf/

> --
> James Manger
>