Re: [DNSOP] Whiskey Tango Foxtrot on key lengths...

Nicholas Weaver <nweaver@icsi.berkeley.edu> Wed, 02 April 2014 04:11 UTC

From: Nicholas Weaver <nweaver@icsi.berkeley.edu>
In-Reply-To: <CFA0ED6F-6800-4638-90B0-CD414301C501@ogud.com>
Date: Tue, 01 Apr 2014 21:10:54 -0700
Message-Id: <C83A771F-CE75-46F7-914E-3D8859CDF4FD@icsi.berkeley.edu>
References: <0EA28BE8-E872-46BA-85FD-7333A1E13172@icsi.berkeley.edu> <53345C77.8040603@uni-due.de> <B7893984-2FAD-472D-9A4E-766A5C212132@pch.net> <102C13BE-E45E-437A-A592-FA373FF5C8F0@ogud.com> <474B0834-C16B-4843-AA0A-FC2A2085FEFB@icsi.berkeley.edu> <CFA0ED6F-6800-4638-90B0-CD414301C501@ogud.com>
To: Ólafur Guðmundsson <ogud@ogud.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/dnsop/nTgAxv7jvsam5o6EXNCseU36iAQ
Cc: dnsop WG <dnsop@ietf.org>, Nicholas Weaver <nweaver@icsi.berkeley.edu>
Subject: Re: [DNSOP] Whiskey Tango Foxtrot on key lengths...

On Apr 1, 2014, at 7:37 PM, Olafur Gudmundsson <ogud@ogud.com> wrote:

Pardon my language, but you are repeating the same bogus performance arguments that have hurt crypto for years.  Having heard the same thing over and over and over again for the better part of a decade and change, I find it really does get to me.


Symmetric-key crypto is effectively free on a modern computer (even when AES first came out you could do nearly a gigabit per second on a single core; try getting a gigabit of I/O into or out of a circa-2001 PC), and even slow old RSA, especially RSA verification, is close enough to free for practical purposes.
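If you want to put numbers behind that on your own hardware, here is a rough benchmark sketch; it assumes the third-party pyca/cryptography package is installed, and the absolute figures will of course vary by CPU:

# Rough benchmark sketch: how many RSA signature verifications one core can
# do per second at 1024 and 2048 bits (assumes pyca/cryptography is installed).
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

def verifies_per_second(key_size, iterations=5000):
    key = rsa.generate_private_key(public_exponent=65537, key_size=key_size)
    pub = key.public_key()
    msg = b"example.com. IN A 192.0.2.1"
    sig = key.sign(msg, padding.PKCS1v15(), hashes.SHA256())
    start = time.perf_counter()
    for _ in range(iterations):
        pub.verify(sig, msg, padding.PKCS1v15(), hashes.SHA256())
    return iterations / (time.perf_counter() - start)

for bits in (1024, 2048):
    print(f"RSA-{bits}: ~{verifies_per_second(bits):,.0f} verifies/sec/core")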

And when you do have a big load on a server serving many clients, in the very rare case where the crypto IS the critical bottleneck, it is an obscenely parallel task... 



>> There is far far far too much worrying about "performance" of crypto, in cases like this where the performance just doesn't matter!
>> 
> 
> disagree strongly, you are only looking at a part of the whole picture. 
> Verification adds resolution latency + verification adds extra queries which is more latency
> 	latency == un-happy eye-balls.  

You are talking about 50 microseconds of additional latency for using a real key length instead of 1024-bit keys.  Not milliseconds.  Microseconds.

3e8 m/s * 1e-6 s/us is 300 m/us.  So that's 15 kilometers of network travel for 50 microseconds, even if you remove all the switches and lay down a vacuum cable.
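Spelled out as a back-of-the-envelope check (the 18K-verifies/sec figure for RSA-2048 is the one quoted later in this message):

# Back-of-the-envelope check of the 50-microsecond / 15-kilometer claim.
verifies_per_sec = 18_000             # RSA-2048 verifies per core, from this thread
per_verify_us = 1e6 / verifies_per_sec
c = 3e8                               # speed of light in a vacuum, m/s
km = c * (per_verify_us * 1e-6) / 1000
print(f"{per_verify_us:.0f} us per verify ~= {km:.1f} km at light speed")
# roughly 56 us per verify, i.e. about 17 km; ~50 us works out to ~15 km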


And remember, it's not on the critical path except for the final validation, because you can use glue before it validates to do your lookups and validate the upper pieces of the hierarchy while waiting for the lower pieces to arrive; and it is going to take a lot longer than 50 microseconds for that data to arrive.
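A minimal sketch of that pipelining, with hypothetical fetch and validate helpers standing in for the real resolver machinery:

# Sketch: validation stays off the critical path because the resolver chases
# the delegation chain with unvalidated glue while signatures for the zones
# above are checked concurrently; only the final answer waits on the chain.
import asyncio

async def fetch(zone):                 # hypothetical: one network round trip
    await asyncio.sleep(0.030)         # ~30 ms of RTT dwarfs ~50 us of RSA verify
    return f"rrset({zone})"

async def validate(rrset):             # hypothetical: one RSA-2048 verification
    await asyncio.sleep(0.000050)      # ~50 microseconds of CPU
    return True

async def resolve(zones):
    checks, answer = [], None
    for zone in zones:                 # ".", "com.", "example.com.", ...
        answer = await fetch(zone)     # keep descending on unvalidated glue
        checks.append(asyncio.create_task(validate(answer)))
    ok = all(await asyncio.gather(*checks))   # only the final answer waits here
    return answer if ok else None

print(asyncio.run(resolve([".", "com.", "example.com."])))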

I don't see HFT systems doing DNSSEC validation...



>> Yes, you can only do 18K verifies per CPU per second for 2048b keys.  Cry me a river.  Bite the bullet, go to 2048 bits NOW, especially since the servers do NOT have resistance to roll-back-the-clock attacks.
> 
> Why not go to a good ECC instead ? (not sure which one, but not P256 or P384) 
> 
> 18K answers/second ==> is a fraction of what larger resolver servers do today during peak times, yes not all answers need validation.
> BUT you need to take into account that in some cases there is going to be lots of redundancy in verification in large resolver clusters, thus
> if your query stream hits 5 different hosts all of them may end up doing almost 5x of the work, thus adding servers does not scale. 
> Yes people can create anycast clusters in depth where only the front end ones do verification and the back end ones only answer queries, but
> that has different implications. 

If the crypto is too much time in your cluster resolver, hash(name) to find the cluster node that already has the answer cached.  These are known, off-the-shelf solutions, in the unlikely event that such a key-length jump makes a difference.
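A minimal sketch of that idea (the node names and qname normalization are illustrative; a real cluster would use consistent or rendezvous hashing so adding a node does not reshuffle everything):

# Sketch: hash-based query routing in a resolver cluster.  Queries for the
# same name always land on the same node, so each name is validated (and
# cached) on one node instead of redundantly on all of them.
import hashlib

NODES = ["resolver-0", "resolver-1", "resolver-2", "resolver-3"]  # illustrative

def node_for(qname):
    digest = hashlib.sha256(qname.lower().rstrip(".").encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

print(node_for("www.example.com."))    # same node every time for this name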


> Remember it is not average load that matters it is peak load, even if the peak for 30 minutes on one day of the year. 

And even in that universal-deployment case, at 10x the average load, that is a whopping 10 CPU cores to do the crypto for a 1B-lookup/day resolver.  That's less than a single server these days.

1B uncached lookups is a lot.  It is a big number for traffic that is largely driven by human activity, that caches (even in the nightmare that is CDNs and 10-second TTLs), and that ISPs serve from resolvers placed close to the end user to save latency.


Oh, and unless the CDN is doing dynamic signing, the RRSIGs stay valid much longer than the record TTLs, so the results of the crypto operations can be cached for a lot longer.
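A sketch of what caching those results looks like (the verify callback is a hypothetical stand-in for the real DNSSEC validation routine):

# Sketch: even with 10-second record TTLs, the same RRSIG is typically valid
# for days, so the expensive RSA verify only has to happen once per
# (rrset, signature) pair within that window.
import time

_verified = {}   # (rrset_bytes, rrsig_bytes) -> signature expiration (epoch secs)

def is_valid(rrset, rrsig, sig_expiration, do_rsa_verify):
    key = (rrset, rrsig)
    if _verified.get(key, 0) > time.time():
        return True                        # cache hit: no crypto needed
    if do_rsa_verify(rrset, rrsig):        # hypothetical verification callback
        _verified[key] = sig_expiration
        return True
    return False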

> Over the years I have been saying use keys that are appropriate, thus for someone like Paypal it makes sense to have strong keys,
> but for my private domain does it matter what key size I use? 
> I do not buy the theory that one size fits all model for crypto, people should not use unsafe crypto, but the one size fits all is not the
> right answer, just like not every zone needs a KSK and ZSK split. ( I use single 1280 bit RSA key with NSEC) 

> A real world analogy is that not everyone needs the same kind of car, some people need big cars, others small ones or even no car. 

But at the same time, we mandate a certain level of crash resistance.  RSA keys of less than 2048b should be deprecated, and DNSSEC software MUST NOT generate them.

We do not sell cars with bodies made of Silly Putty and airbags made from party balloons; we SHOULD NOT deploy software with known-unsafe key lengths.


> Furthermore using larger keys than your parents is non-sensical as that moves the cheap point of key cracking attack. 

And why do you think I'm so pissed at the root, and com, and everyone else using 1024b keys?

That's a joke, and a bad one.  It needs to be fixed, and it needs to be fixed to a real value; and the real value, based on real cryptographers' recommendations dating back the better part of a decade, is 2048b.

>> In a major cluster validating recursive resolver, like what Comcast runs with Nominum or Google uses with Public DNS, the question is not how many verifies it can do per second per CPU core, but how many verifies it needs to do per second per CPU core.
> 
> I have no doubt that CPU's can keep up but the point I was trying to make is increasing the key sizes by this big jump 
> ==> invalidates peoples assumptions on what the load is going to be in the near term. 

How many resolvers are spending more than 10% of CPU doing DNSSEC validation right now?  Any above 1%?

>> And at the same time, this is a problem we already know how to parallelize, and which is obscenely parallel, and which also caches…
> 
> Do we? Some high performance DNS software is still un-threaded, many resolvers are run in VMs with a low number of cores 
> exported to the VM. 

Who is running a major centralized ISP resolver (1M+ customers on a single resolver instance, 1B+ external lookups/day) on an un-threaded, low-powered system for DNS, yet doing DNSSEC validation?

The systems you are talking about are in the range of 100M external lookups/day or less, and at that scale the crypto really doesn't matter to the performance.  Oh, and they aren't validating anyway, so who cares?


>> Let's assume a typical day of 1 billion external lookups for a major ISP centralized resolver, and that all are verified.  That's less than 1 CPU core-day to validate every DNSSEC lookup that day at 2048b keys.
> 
> 1B is low due to low TTL's and synthetic names used for transactions, and as I said before it is peak load that matters not average. 
> DNSSEC processing is just a part of the whole processing model.

1B external lookups is a number taken from SIE data (which captures the queries from the participating recursive resolvers) that I was looking at about a year ago to detect covert channels.

There were ~2B/day lookups in the data set, but multiple major cluster resolvers belonging to a single top-10 US ISP dominated that dataset.


Even assuming a peak load of 10x the average (an unrealistic amount of peakiness if you've ever looked at network traffic graphs), and that every answer has to be validated once (yes, you may have a few more, but in practice this caches as well, since all 1B lookups are not going to be to 1B different domains...), that's still just 10 CPU cores.
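Spelling that arithmetic out with the figures from this thread:

# The core-count arithmetic, using the 18K verifies/sec/core figure above.
lookups_per_day = 1e9
verifies_per_sec_per_core = 18_000        # RSA-2048, one core
peak_factor = 10                          # generous peak-to-average ratio

avg_qps = lookups_per_day / 86_400        # ~11,600 queries/sec on average
peak_qps = avg_qps * peak_factor          # ~116,000 queries/sec at peak
cores = peak_qps / verifies_per_sec_per_core
print(f"avg {avg_qps:,.0f} q/s, peak {peak_qps:,.0f} q/s -> {cores:.1f} cores")
# about 6.4 cores even at peak; "10 CPU cores" is a comfortable round-up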

The nearly 2-year-old Mac mini on my desktop is a quad-core.  If your billion-lookup-a-day DNS resolver is validating DNSSEC, everything is signed, and yet you can't afford to run it on a single modern dual-CPU eight-core system, you get no sympathy from me.


>> And yeah, DNS is peaky, but that's also why this task is being run on a cluster already, and each cluster node has a lot of CPUs.
> 
> that costs money, and effort to operate.

Effort?  How?  The assumption is you are already going through the hassle of validation.

Money?  You're talking peanuts.

--
Nicholas Weaver                  it is a tale, told by an idiot,
nweaver@icsi.berkeley.edu                full of sound and fury,
510-666-2903                                 .signifying nothing
PGP: http://www1.icsi.berkeley.edu/~nweaver/data/nweaver_pub.asc