Re: [Sidrops] 6486bis: referenced object validation

Martin Hoffmann <martin@opennetlabs.com> Fri, 04 December 2020 10:17 UTC

Date: Fri, 04 Dec 2020 11:16:51 +0100
From: Martin Hoffmann <martin@opennetlabs.com>
To: Ben Maddison <benm=40workonline.africa@dmarc.ietf.org>
Cc: sidrops@ietf.org
Message-ID: <20201204111651.4e865d7d@glaurung.nlnetlabs.nl>
In-Reply-To: <20201203224213.gnb2nawujxm7a32q@benm-laptop>
References: <20201203224213.gnb2nawujxm7a32q@benm-laptop>
Organization: NLnet Labs
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/AroXtmTb4xSWoo-gV3AnwXJ3F44>
Subject: Re: [Sidrops] 6486bis: referenced object validation

Ben Maddison wrote:
> 
> As a result of the incident last night that caused the number of VRPs
> output by some RPs (certainly routinator, possibly others too) to drop
> dramatically, a couple of us have been going over the observed
> behaviour in comparison with the current text of 6486-bis.
> 
> It appears from Job's analysis [1] that the incident was triggered
> when several resource certs that were listed on an APNIC issued
> manifest became invalid due to their 'notAfter' time expiring, which
> in turn caused the affected RPs to consider the manifest invalid and
> fail the fetch.
> 
> It should be perfectly legitimate for a manifest to list objects which
> will become invalid during the lifetime of the manifest, or which are
> not yet valid when the manifest is generated (for pre-staging).
> The presence of such objects should not invalidate the manifest.

I think the incident shows a general flaw in the approach that
anything being wrong with any of the objects published by the CA
should lead to all the CA’s objects being invalidated: It prohibits a
CA from publishing invalid objects on purpose.

The validity time is only one such case. Another one that we already
know about is changing resource entitlements in the CA certificate.
This case is actually worse since the CA itself has no influence on
this happening and, as the protocols are now, there is no automated
way of agreeing on a timeline for the process.

It is not entirely unlikely that we will encounter more such issues in
the future.

So perhaps instead of trying to work around each of these issues as
they arise, we should go back and look at what started this revision in
the first place. That was an observation that by manipulating a
publication point’s objects, an attacker can force a route announcement
to become RPKI invalid.

The current approach does solve this problem but it also attempts to
solve a different problem: to protect a CA from accidentally publishing
invalid data. But as we found out, that requires anticipating the
intentions of a CA. Perhaps it is better to trust a CA and accept that
if it fails there will be problems for its resources.

Under this approach, the manifest expresses which objects the CA
intended to publish. If all the objects listed on the manifest are
present with a matching hash, the publication point reflects the intent
of the CA and can be processed. If it contains invalid objects, these
can be discarded individually.
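
As a rough illustration of this flow, here is a minimal sketch. It is
not taken from any RP implementation: the function name
`process_publication_point` and the `is_valid` callback are
hypothetical, and SHA-256 is assumed as the manifest hash algorithm.

```python
import hashlib
from typing import Callable, Dict, List, Optional


def process_publication_point(
    file_list: Dict[str, bytes],
    fetched: Dict[str, bytes],
    is_valid: Callable[[str, bytes], bool],
) -> Optional[List[str]]:
    """Return the names of usable objects, or None if the fetch failed.

    file_list: manifest fileList, file name -> expected SHA-256 digest.
    fetched:   objects retrieved from the publication point, name -> content.
    is_valid:  hypothetical per-object check (validity time, resources, ...).
    """
    # Step 1: every listed file must be present with a matching hash.
    for name, expected in file_list.items():
        content = fetched.get(name)
        if content is None or hashlib.sha256(content).digest() != expected:
            return None  # publication point does not reflect the CA's intent

    # Step 2: invalid objects are discarded individually; the rest is kept.
    return [name for name in file_list if is_valid(name, fetched[name])]
```

With this split, an expired certificate only removes itself from the
result, while a missing or altered file still fails the whole fetch.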

This approach will be simpler and easier to implement. It will avoid
the invalidation of the entire CA because of expired or not yet valid
CA certificates or changing resource entitlements. It may, however,
lead to a partial set of objects applying to a particular resource.
That may be intentional or accidental. Whether it is better to reject
the partial set, accepting that an intentionally partial set is rejected
as well, or to include it, accepting that an accidentally partial set
is used, depends on the particular application of RPKI. For ROV,
rejecting the set may be
better (although there are consequences of that, too) whereas for
router keys or Ghostbuster records including it would seem to be fine.

In other words, the proposal is to simplify the text for section 6.4
to say:

      The RP MUST acquire all of the files enumerated in the manifest
      (fileList) from the publication point.  If there are files listed
      in the manifest that cannot be retrieved from the publication
      point, the fetch has failed and the RP MUST proceed to Section 6.7;
      otherwise, proceed to Section 6.5.

In addition, section 6.6 can be repurposed to specify the behaviour in
case of invalid objects. As a proposal:

   6.6. Invalid Files

      If files listed on the manifest fail the validity tests
      specified in [RFC6487] and [RFC6488], the fetch has not
      necessarily failed. However, applications of the RPKI may define
      specific consequences if one or more files are invalid.

Alternatively, this section can provide the rules for existing
applications of the RPKI, i.e., ROV, BGPsec, and GBR as well as
delegation to child CAs.
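
To make the application-specific part more concrete, such rules could
look roughly like the sketch below. The policy table is purely an
assumption for illustration, following the tentative preferences
mentioned earlier (rejecting for ROV, including for router keys and
Ghostbuster records). It is not proposed text, and the names
`REJECT_PARTIAL_SET` and `usable_objects` are hypothetical.

```python
from typing import Dict, List

# Hypothetical policy: does one invalid object cause the whole set for this
# application to be rejected? The values are illustrative assumptions only.
REJECT_PARTIAL_SET: Dict[str, bool] = {
    "rov": True,
    "router_keys": False,
    "gbr": False,
}


def usable_objects(application: str, validity: Dict[str, bool]) -> List[str]:
    """Apply the per-application policy to individually validated objects.

    validity maps object name -> result of its individual validity check.
    """
    valid = [name for name, ok in validity.items() if ok]
    if REJECT_PARTIAL_SET[application] and len(valid) != len(validity):
        return []  # reject the partial set entirely
    return valid
```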

I understand that this proposal is a departure from our current
approach, but I would like to ask the working group to consider it
nonetheless as it keeps validation relatively simple and, by separating
out the individual steps of the overall process, should limit the
number of unforeseen consequences.

Kind regards,
Martin