Re: [Sidrops] what to do when the CRL is hosed?

Tim Bruijnzeels <tim@nlnetlabs.nl> Tue, 24 March 2020 13:42 UTC

From: Tim Bruijnzeels <tim@nlnetlabs.nl>
In-Reply-To: <4fe26a30-4a08-41a5-be7f-0c5997230d0a@www.fastmail.com>
Date: Tue, 24 Mar 2020 10:42:37 -0300
Cc: SIDR Operations WG <sidrops@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <3B072025-68C5-4E62-9466-5122D483F691@nlnetlabs.nl>
References: <20200224151532.GD19221@vurt.meerval.net> <20200224211531.GB60925@vurt.meerval.net> <20200225090338.10464b1a@glaurung.nlnetlabs.nl> <9cc3a6a5-f9c8-23df-588e-48dee5db62d4@verizon.net> <3B7006DE-5366-47E7-9CD6-AF392F9ED0CC@nlnetlabs.nl> <6602d1a7-ecbf-73a0-21d8-1254fb2aff97@verizon.net> <253D1ED7-52D8-4A00-9D69-095E61D09C9F@nlnetlabs.nl> <db920115-e188-700f-ceb2-08cd2996046a@verizon.net> <3a683da4-42f9-28c6-f0dd-4d11d3c67857@ripe.net> <4fe26a30-4a08-41a5-be7f-0c5997230d0a@www.fastmail.com>
To: Job Snijders <job@ntt.net>
X-Mailer: Apple Mail (2.3608.60.0.2.5)
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/mDbjP9AYWBu4SGQ_mry-8J2K88M>
Subject: Re: [Sidrops] what to do when the CRL is hosed?


> On 23 Mar 2020, at 12:27, Job Snijders <job@ntt.net> wrote:
> 
> Dear all,
> 
> I'm observing the discussions in sidrops with increasing alarm. Taking stock of the current situation I see:
> 
> 1) failure to produce consistent output across multiple validators based on the current specs
> 2) failure to produce software updates which address the reported MITM scenarios
> 
> One can argue "well, the specs did not explicitly mention what to do in scenario X Y Z", but even if the specs don't give precise guidance on what to do, I do not believe that the specs anywhere specify that insecure behaviour is acceptable. And even /if/ the specs granted "permission" for sloppy validation strategies, I'd expect any software developer producing 'security software' to deviate from such specs and implement validation strategies which are as secure as possible.
> 
> On Mon, Mar 23, 2020, at 15:23, Robert Kisteleki wrote:
>> On 2020-03-23 14:33, Stephen Kent wrote:
>>>> What I am suggesting is that we *could* update 6486 and make
>>>> validation more restrictive regarding manifests:
>>>> - all objects on a manifest must be present and accounted for (I agree
>>>> with Job regarding partial withhold attacks)
>>>> - all objects on a manifest need to be validated
>>>> - objects that are not on a manifest can be considered invalid
>>>> 
>>>> This is in line with the specifications defined in RFC 6481 (A Profile
>>>> for Resource Certificate Repository Structure), which essentially says
>>>> that all current objects must be published, and that no invalid
>>>> objects may be published.
>> 
>>> agree.
>> 
>> As I wrote before, I believe that a single mistake (withheld /
>> unpublished object or even a bit flip) invalidating a whole repository,
> 
> Some nuance could be added: it depends on where in the tree the error exists. If a top-level manifest or CRL is hosed, indeed everything underneath it should be tossed. But when, for instance, a manifest 'at 2 levels deep' references a file that doesn't exist, you only need to toss that specific manifest (at least). So, yes, errors in the repository should result in the repository (or parts of it) being considered inadmissible.
> 
> RPKI started out on the premise that an unencrypted transport (rsync) was acceptable, because everything was signed and cryptographically verifiable. Now we are in a situation where the transport channel *is* insecure, and the data transmitted across it is not properly verified. Validators are *knowingly* proceeding to produce VRPs with incomplete, expired, and/or corrupted data.

With RRDP over HTTPS a lot of in-transit / MITM issues are mitigated. The RRDP RFC (RFC 8182) currently still says that invalid TLS certificates should be accepted, though. The thought at the time was that we have object security and people make too many mistakes configuring HTTPS. With Let's Encrypt I believe this argument no longer holds.
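
To illustrate what stricter behaviour could look like on the RP side, here is a minimal Python sketch that refuses to fetch RRDP data unless the TLS certificate and host name verify. It is not taken from any existing validator; the function names strict_tls_context and fetch_rrdp are made up for this example, and it uses only the Python standard library.

```python
import ssl
import urllib.request
from urllib.parse import urlparse


def strict_tls_context() -> ssl.SSLContext:
    """Return a context that verifies the certificate chain and host name."""
    ctx = ssl.create_default_context()
    # Both settings are already the defaults; they are set explicitly
    # here only to make the intent obvious.
    ctx.check_hostname = True
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def fetch_rrdp(url: str, timeout: float = 30.0) -> bytes:
    """Fetch an RRDP file, refusing anything that is not verified HTTPS."""
    if urlparse(url).scheme != "https":
        raise ValueError(f"refusing non-HTTPS RRDP URL: {url}")
    # With this context an invalid certificate makes urlopen raise
    # urllib.error.URLError instead of the fetch silently continuing.
    with urllib.request.urlopen(
        url, timeout=timeout, context=strict_tls_context()
    ) as resp:
        return resp.read()
```

The design choice here is simply "fail closed": a certificate problem becomes a fetch failure, and the RP can then fall back on whatever it still holds in its local cache.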


> 
>> and as a consequence everything that could be validated under it, is a
>> way too high price to pay. It also makes attacks much easier: withholding
>> one single object from a repo and poof, a whole subtree (forest...) is gone?
> 
> I'd be very hesitant to use absolutes such as "the price is way too high". The alternative is to allow for a number of MITM attack vectors. What is the cost of those? I have MITM examples running in my lab with Routinator, rpki-client, FORT, and the RIPE NCC's validator where partial withholding attacks are trivial and successful. The result of those MITM attacks is hard-to-troubleshoot network outages.
> 
> Strategically hiding a select few ROAs can easily result in large scale outages because the attacker can make "RPKI Valid" announcements flip to "RPKI Invalid": this is worse than if the validator would've thrown out the manifest (things would've gone from "Invalid" to "Not-Found").
> 
> Performing the RFC 6811 Origin Validation procedure on an *incomplete* set of data is worse than not doing RFC 6811 at all. If RPKI validators allow flipping announcements from Valid to Invalid, that goes against the premise under which we designed RPKI to be incrementally deployable. We taught this industry that deploying RPKI would be safe.

First off, I agree with Job that it would be better to disregard information that is known to be a partial version of the truth.

As said above, RRDP with HTTPS + real certificates will help. At least it will make it clear if data is being tampered with.

Finally, I did not say that data must be disregarded if things cannot be fetched. If the RP has all the objects in its local cache it can use them. If the MFT and CRL are still current and valid, and local copies exist for all objects as identified by their hashes (such objects change far less frequently), then one can just use them.
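
As a rough sketch of that check, assuming the manifest has already been parsed and validated into a mapping of file names to SHA-256 hex digests: the function name usable_from_cache and its arguments are hypothetical, and only the manifest's validity window is checked here (the CRL's window would be checked the same way).

```python
import hashlib
from datetime import datetime, timezone


def usable_from_cache(manifest, cache, this_update, next_update, now=None):
    """Return (ok, problems) for one publication point.

    manifest: dict of file name -> expected SHA-256 hex digest (from the MFT)
    cache:    dict of file name -> locally stored object bytes
    """
    now = now or datetime.now(timezone.utc)
    problems = []
    # A stale manifest cannot vouch for the completeness of the set.
    if not (this_update <= now <= next_update):
        problems.append("manifest is not current")
    # Every listed object must be present locally with a matching hash.
    for name, expected in manifest.items():
        data = cache.get(name)
        if data is None:
            problems.append(f"missing: {name}")
        elif hashlib.sha256(data).hexdigest() != expected.lower():
            problems.append(f"hash mismatch: {name}")
    return (not problems, problems)
```

If the result is not ok, the RP knows it holds only a partial version of the truth for this publication point and can disregard it rather than validate what is left.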


> 
>> I believe a more nuanced approach is needed, like if there's a problem
>> on a particular validation path (a cert is missing or has an error) then
>> invalidate that path, if a CRL is missing then warn but use a stale one,
>> but leave the otherwise unaffected bits validated.
> 
> I'm not sure you weigh the power of partial withholding attacks as severely as I do. As mentioned before: stricter validation on the RPKI Cache Validator side of the house will put the onus on RPKI CA Publication points to run a tight ship. I believe this to be appropriate because there are far fewer CA operators than there are RP operators.
> 
> If the validators are not going to be strict, what motivation exists for the CA operators to clean up their act?
> 
> If the validators are not going to be strict, are network operators supposed to just accept that the VRPs they feed into "invalid == reject" policies on their EBGP routers potentially are tainted? How will that convince anyone to deploy RPKI Origin Validation?

I agree that a partial withhold, whether an involuntary incident or an attack, should lead to disregarding the affected data once it is detected. Otherwise one can cause invalids with serious impact, whereas disregarding the data would also be painful but would lead to 'not found' instead.
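
A toy example of the difference, as a simplified Python sketch of the RFC 6811 procedure. The prefixes and AS numbers are documentation values chosen for illustration, the function name origin_validation is made up, and the "discarded" case assumes both ROAs sit under the same publication point.

```python
import ipaddress


def origin_validation(route_prefix, origin_asn, vrps):
    """Simplified RFC 6811 state: 'valid', 'invalid' or 'not-found'.

    vrps: iterable of (prefix, max_length, asn) tuples.
    """
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for prefix, max_length, asn in vrps:
        vrp_net = ipaddress.ip_network(prefix)
        # Skip VRPs of the other address family or that do not cover the route.
        if vrp_net.version != route.version or not vrp_net.supernet_of(route):
            continue
        covered = True
        if asn == origin_asn and route.prefixlen <= max_length:
            return "valid"
    return "invalid" if covered else "not-found"


full = [("203.0.113.0/24", 24, 64496), ("203.0.113.0/25", 25, 64497)]
withheld = full[:1]  # the /25 ROA never reached the validator
discarded = []       # the whole publication point was thrown out

for label, vrps in [("full", full), ("withheld", withheld), ("discarded", discarded)]:
    print(label, origin_validation("203.0.113.0/25", 64497, vrps))
# prints:
# full valid
# withheld invalid
# discarded not-found
```

With an "invalid == reject" policy the withheld case drops the route, while the discarded case leaves it in place as 'not found'.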

> 
> Kind regards,
> 
> Job
> 