[Sidrops] notes on rsync --delete (rrdp withdraw), and garbage collection

Job Snijders <job@ntt.net> Fri, 30 October 2020 16:38 UTC

Hello SIDROPS,

I noticed that most validator implementations accept what are
essentially unauthenticated instructions via an untrusted network input
channel on what files to delete in the local cache.

The below suggestions are *on top of* this year's manifest handling
improvements, so in most implementations I think the below suggestions
will leverage existing code. In other words: any coding efforts
hopefully are modest.

A quick survey on how common implementations are doing things:

Routinator: 'rsync -rltz --delete $source $dest' [1]
                         ^^^^^^^^

OctoRPKI: 'rsync -var $source $dest' [2], and then takes rsync's
    verbose output for a file listing of what exists on the server side,
    then proceeds to delete what is not on that list. (basically the
    same as running 'rsync --delete')

RIPE NCC Validator: 'rsync -u -rt --delete $source $dest' [3]
                                  ^^^^^^^^

FORT: 'rsync -r --delete -t $source $dest' [4]
                ^^^^^^^^

rpki-client: 'rsync -rt $source $dest' [5] <-- THE ODD ONE OUT!

I imagine implementers came to rely on '--delete' as a quick and easy
way to do garbage collection. This is operating under the assumption
that if a server says a file is not present, it probably is safe to
delete from local disk as well!

However one has to realize that such 'delete' instructions are entirely
unauthenticated: there is no cryptographic proof a file actually was
deleted from the publication point. This conceptual issue is present in
BOTH rsync and RRDP. In the RRDP protocol the 'withdraw' elements are
not authenticated against anything in the X.509 RPKI data. In RRDP the
'withdraw' elements are channeled through a TLS transport (which is
deemed secure), but the validator can't know if the TLS endpoint is the
same entity as the issuing CA. Unauthenticated instructions are - even
when transported through a secure channel - still unauthenticated.

The way rpki-client does garbage collection is to make a list of all
files that appear on valid current manifests and match the associated
SHA-256 hash. Any files which do *not* appear on valid current manifests
are then deleted from the local cache on disk. This is a strategy I
would advise others to consider as well.
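
A minimal sketch of that strategy in Python, for illustration only. It
is not rpki-client's actual code. The function names and the 'listed'
mapping (relative path -> hex SHA-256 taken from already validated
manifests) are assumptions, and so is the choice to also remove files
whose hash does not match. A real RP would additionally have to retain
the manifests themselves and any other objects it relied on.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect_garbage(cache_root: Path, listed: dict[str, str]) -> list[str]:
    """Delete cached files that no valid current manifest vouches for.

    `listed` maps a path relative to `cache_root` to the hex SHA-256
    hash from a validated manifest (hypothetical input; manifest
    validation itself is out of scope here).
    Returns the relative paths that were removed, in sorted order.
    """
    removed = []
    # Materialise the listing first so deletions do not disturb the walk.
    for path in sorted(cache_root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(cache_root).as_posix()
        # Keep a file only if it is listed AND its content hash matches.
        if listed.get(rel) != sha256_of(path):
            path.unlink()
            removed.append(rel)
    return removed
```

The point of the design is that the only input deciding what gets
removed is data that has already passed validation. Nothing the rsync
or RRDP server said about deletions is consulted.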

'Deletion after validation' leads to the curious situation that the
*coldest* files (aka files that are not referenced anywhere) become the
most transferred files: they are removed after every validation run and
fetched again on the next synchronization. It is incumbent upon the CA
operator to not leave unreferenced files lying around to avoid frequent
transfers.

I believe a filename's absence from any valid current manifest is
superior information about whether to delete a file compared to trusting
a rsync server's file listing or 'withdraw' elements in RRDP. Deleting
what's not present on any manifest makes the garbage collection
dependent on a cryptographically verifiable chain all the way up to the
Trust Anchor, whereas executing 'rsync --delete' (or similar) is akin to
taking orders which are *not* backed by the RPKI data itself.

draft-ymbk-sidrops-6486bis-00 section 8 states:
    """However, a manifest enables an RP to determine if a locally
    maintained copy of a repository is a complete and up-to-date copy,
    even when the repository retrieval operation is conducted over an
    insecure channel."""

The insecure channel should be as narrow as possible. The fewer
facilities exist through the channel, the better!

The locally maintained copy can only exist if the validator software
does *not* perform garbage collection based on unauthenticated
instructions from the insecure channel (aka 'rsync --delete').

My recommendation: implement the RP without rsync's '--delete' option,
and ignore withdraws in RRDP. Ignoring 'withdraws' in RRDP will need to
be backed by updates to the RRDP RFC, but that can happen later. OpenBSD
is not implementing the 'withdraw' element in its RRDP implementation.
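
To make the RRDP half of this concrete, here is a hedged Python sketch
of applying a delta file while ignoring 'withdraw' elements. The
element names and the namespace are those of the RRDP delta format.
The function name and the in-memory 'cache' dict (URI -> object bytes)
are hypothetical stand-ins for an on-disk cache. Checks a real RP
would perform (session and serial handling, verifying the 'hash'
attribute on replacements) are left out.

```python
import base64
import xml.etree.ElementTree as ET

RRDP_NS = "{http://www.ripe.net/rpki/rrdp}"


def apply_delta(delta_xml: str, cache: dict[str, bytes]) -> list[str]:
    """Apply the <publish> elements of an RRDP delta to `cache`.

    <withdraw> elements are deliberately not acted upon; their URIs
    are only collected and returned so the caller can log them.
    """
    ignored = []
    for element in ET.fromstring(delta_xml):
        uri = element.get("uri")
        if element.tag == RRDP_NS + "publish":
            # Object content is carried as base64 text in the element.
            cache[uri] = base64.b64decode(element.text or "")
        elif element.tag == RRDP_NS + "withdraw":
            # Unauthenticated delete instruction: note it, do nothing.
            ignored.append(uri)
    return ignored
```

Withdrawn objects then stay in the cache until garbage collection
based on validated manifests decides their fate.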

If garbage collection is done *after* validating the tree, the removal
of files is anchored in the cryptographically signed reality, which I
think is safer.

Note: this is not a fire drill like the one 8 months ago related to
incomplete manifest handling; this is merely a small suggestion to
optimise a bit.
Consider this an attempt to translate the Internet-Draft's prose about
maintaining a local cache into simpler English with concrete suggestions
for coding.

Keep in mind: falling back to locally cached files is an *optional*
feature. If an implementation doesn't fall back it is akin to booting
with an empty file cache, which is a valid behavior and common case too!
The draft wording is """Termination of processing means that the RP
SHOULD continue to use cached versions of the objects associated with
this CA instance""". I believe a SHOULD indeed is appropriate here.

Is this real? I think so. There already are several large production
networks where the validators run without '--delete', instead
manipulating the local file cache based on delisting via validated RPKI
manifests. It appears to have no negative effects. :-) On the contrary:
leveraging the local cache to supplement what was fetched via the
network appears to increase robustness. Incomplete fetches can be
'repaired' in some situations. Object security truly is fascinating.
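
One way to picture such a 'repair', as a Python sketch under stated
assumptions rather than a description of any particular validator: for
each file listed on a valid manifest, prefer the freshly fetched copy,
and fall back to the locally cached copy when the fetched one is
missing or does not match the manifest hash. The function name and
parameters are hypothetical.

```python
import hashlib
from typing import Optional


def pick_object(expected_sha256: str,
                fetched: Optional[bytes],
                cached: Optional[bytes]) -> Optional[bytes]:
    """Return the first candidate whose SHA-256 matches the manifest.

    `fetched` is what the network delivered (None if absent), `cached`
    is the copy kept from an earlier run (None if absent). Returns
    None when neither copy matches the manifest entry.
    """
    for candidate in (fetched, cached):
        if candidate is None:
            continue
        if hashlib.sha256(candidate).hexdigest() == expected_sha256:
            return candidate
    return None
```

This fallback is only available if the cached copy was not already
removed on the say-so of the server, which is the argument against
'--delete' in a nutshell.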

Kind regards,

Job

[1]: https://github.com/NLnetLabs/routinator/blob/404e84f7f13fd12789f4f951e8d638799cd6ab28/src/rsync.rs#L423-L426
[2]: https://github.com/cloudflare/cfrpki/blob/051e8d99151e55a8efc5516fd2b26fae7f7a064b/sync/lib/rsync.go#L44
[3]: https://github.com/RIPE-NCC/rpki-validator-3/blob/6d53f8c6adc0f5340cfbeb76af848c1b1925e416/rpki-validator/src/main/java/net/ripe/rpki/validator3/util/RsyncFactory.java#L46
[4]: https://github.com/NICMx/FORT-validator/blob/940e79057dd27895a65917eb5febe9cc23c90fbd/src/config.c#L851-L859
[5]: Line 322 at http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.sbin/rpki-client/rsync.c?annotate=1.9