Re: [babel] [Babel-users] key rotation take #2

Dave Taht <dave@taht.net> Wed, 28 November 2018 16:31 UTC

From: Dave Taht <dave@taht.net>
To: Toke Høiland-Jørgensen <toke@toke.dk>
Cc: babel@ietf.org, babel-users@lists.alioth.debian.org
References: <87in0h1ppd.fsf@taht.net> <87efb5v1y6.fsf@toke.dk>
Date: Wed, 28 Nov 2018 08:31:05 -0800
In-Reply-To: <87efb5v1y6.fsf@toke.dk> ("Toke Høiland-Jørgensen"'s message of "Wed, 28 Nov 2018 13:09:05 +0100")
Message-ID: <877egx17w6.fsf@taht.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/babel/vQxIaOXVLfHygjWLJTTuZhradcA>

Toke Høiland-Jørgensen <toke@toke.dk> writes:

> Dave Taht <dave@taht.net> writes:
>
>> so we invent a new keyword "serial".
>
> So what you're trying to express here is the notion of a "receive-only"
> key that is not used for signing outgoing packets, right?


No... the old key is retired from active use in the protocol after
consensus is achieved on the new key by the protocol, and not used again
unless a router comes up with an unreadable HMAC. In that case we go
back to at least trying to verify (periodically?) that it's not using
the old key (if we still have it around), and if it is using the old key,
we go back to signing stuff with that key.

Does that concept need to be in the protocol spec?
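
To make the fallback concrete, here is a rough Python sketch. It is
not babeld code and does not follow any Babel wire format: the KeyRing
class, its method names and the choice of HMAC-SHA256 over the bare
packet are all assumptions made up for illustration. It only shows the
"retire, but fall back if a peer still signs with the old key" idea.

```python
import hashlib
import hmac


class KeyRing:
    """Hypothetical key state: one active signing key plus retired keys."""

    def __init__(self, active, retired=()):
        self.active = active           # key currently used for signing
        self.retired = list(retired)   # old keys we still have around

    @staticmethod
    def _mac(key, packet):
        # Assumption: plain HMAC-SHA256 over the packet bytes.
        return hmac.new(key, packet, hashlib.sha256).digest()

    def sign(self, packet):
        return self._mac(self.active, packet)

    def receive(self, packet, mac):
        """Return "ok", "fallback" or "reject" for an incoming packet."""
        if hmac.compare_digest(self._mac(self.active, packet), mac):
            return "ok"
        # Unreadable HMAC: check whether the sender is still on an old key.
        for old in self.retired:
            if hmac.compare_digest(self._mac(old, packet), mac):
                # It is, so go back to signing with that key.
                self.retired.remove(old)
                self.retired.append(self.active)
                self.active = old
                return "fallback"
        return "reject"
```

A real implementation would probably rate-limit the fallback check
(the "periodically?" above) rather than run it on every bad packet.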

> If so, I think

Mmmm... I would say that I'm trying to institute key rollover on an
unreliable, not-always-fully-connected network that may or may not be
under attack.

As one example, I have a rogue AP on the network right now. I can't
get into it, nor can I physically find it, but it's exchanging routes. I
can block its routerid by changing the conf file on the devices that can
hear it, but it's "loud", so as fast as I block it on the closer
machines that can hear it, others pick it up.

Another, worst-case scenario is that it's announcing routes to 8.8.8.8
with a metric of zero, and also ping flooding the network, probing for
every address in the 10.0.0.0/8 range, and every time I try to block the
routerid, it changes. In this case you'd want to add a new key and
retire the old one as fast as possible, but that's only "as possible".

...

I don't know if you've been in a world where every keystroke takes 10
seconds to transmit, but I have some horror stories that oft require
beer to tell. One attack (in 2004) literally melted down the
switch... the QA team deployed 60? Windows virtual machines - fresh
installs - virgins - no anti-virus or upgrades installed - and unbeknownst
to my team a worm was already loose on one machine on that
network. Inside of 10 minutes it had grabbed all 60 VMs, and it was on a
10.0.0.0/8 network, and thus every machine started probing the entire /8... and
doing a DoS on every machine it could find - which included those VMs -
the carnage proceeded until the switch overheated and failed
overnight. At 3AM, we blindly replaced the switch only to see all
the lights go solid again. It's not obvious you're under attack at that
point (things like broadcast storms we explored first), and it wasn't
until the QA guy responsible wandered in at noon wondering why it was
taking 10 seconds for each keystroke to get through to his shiny new QA
network that light dawned on us... "You did *what*???"

> it would be better to express that explicitly as a property of the key
> config that can be changed on a per-key basis. For one thing, 'serial'
> is misleading as it sounds like something that affects the wire
> format,

OK. How about "new" and "old" as keywords? That implies two states and
two states only. I liked 0 and X as numbers, so long as the ascending
property is maintained. As for why not 0 and 1, see below.

Totally open to bikeshedding the name. :) babeltowerno? 

> and for another with your proposal it becomes difficult to re-instate a
> previously retired key (say, if you want to restore connectivity to an
> old router that dropped off while you were changing keys).

No, it doesn't. You would just reinstate the prior key with the smaller
serial number. Or if you've gone back to 0 for it, you increment the
current key to 1 and then put in the 0th one.
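
Roughly, in Python, under the assumption (mine, for illustration) that
keys live in a table indexed by serial number, that the key with the
highest serial is the one used for signing, and that lower serials are
only accepted for verification. The function names are made up:

```python
def signing_key(keys):
    """Assumption: the key with the highest serial number signs."""
    return keys[max(keys)]


def reinstate(keys, old_key):
    """Return a new serial -> key table with old_key added below the current key."""
    keys = dict(keys)
    current = max(keys)
    # Serial numbers below the current key that are still unused.
    free = [s for s in range(current) if s not in keys]
    if free:
        # Reinstate the prior key with a smaller serial number.
        keys[max(free)] = old_key
    else:
        # No smaller serial left (e.g. the current key is 0): bump the
        # existing keys up by one and put the old key in as the 0th one.
        keys = {s + 1: k for s, k in keys.items()}
        keys[0] = old_key
    return keys
```

For example, reinstate({0: b"new"}, b"old") gives {1: b"new", 0: b"old"},
and the signing key is still b"new".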

Another use case:

I have 6 campus routers in the lab that are not powered on right now,
pending a reflash with the latest stuff. Worse, they used to be
connected to a far-off portion of the network and when I bring them up,
they start announcing routes to that bit (and the longer prefixes
overall mean they end up being my default route to the rest of the
campus network, over a really crappy link going from the basement to the
roof).

I do a key upgrade throughout the net and they can no longer do any
harm, and I can deal with each via a dedicated reconfiguration router
that filters them out before they hit the rest of the net. Yea!

Over time I'll end up with a file full of retired keys.

(there are still 20+ machines not powered up in the lab and the rest are
being donated to charity)

> -Toke