Re: [Sidrops] trying to limit RP processing variability

Martin Hoffmann <martin@opennetlabs.com> Thu, 09 April 2020 12:06 UTC

Date: Thu, 09 Apr 2020 14:06:54 +0200
From: Martin Hoffmann <martin@opennetlabs.com>
To: Robert Kisteleki <robert@ripe.net>
Cc: Stephen Kent <stkent=40verizon.net@dmarc.ietf.org>, "sidrops@ietf.org" <sidrops@ietf.org>
Message-ID: <20200409140654.2804a85f@glaurung.nlnetlabs.nl>
In-Reply-To: <63c18696-fe3b-c66f-d8ae-fb132f78ee9f@ripe.net>
References: <a9448e54-320f-300c-d4f9-d01aca2b6ef4.ref@verizon.net> <a9448e54-320f-300c-d4f9-d01aca2b6ef4@verizon.net> <63c18696-fe3b-c66f-d8ae-fb132f78ee9f@ripe.net>
Organization: Open Netlabs
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/mv8xiDZK_BWLkKLb5c8g72FL0FQ>

Robert Kisteleki wrote:
> 
> IMO an "RP has no obvious way to acquire missing objects" is not
> entirely true.
> 
> If, at the previous run, the RP fetched the relevant (now missing)
> object, then I see no reason to not use it again. Think of the
> previous run as an object cache if you will: if you're looking for
> an object mentioned in the manifest, and you have it already (hash /
> name / etc. matches) then you can reuse it.

That is theoretically possible, but in practice you treat synchronizing
and validating repository content as two separate steps. I.e., before
you even start looking at a CA’s repository, you synchronize its
content. This is enshrined in the way both rsync and RRDP work: They
don’t update single files but entire directory trees all at once. This
step includes deleting objects that have been deleted on the server.

Since the complete RPKI repository has a hierarchical structure
following the rsync URIs of objects, many RP implementations keep the
objects in the file system only. This is particularly useful for
rsync: Just let rsync update the directory in place. An additional
bonus of this strategy is that you don’t need a fancy database.
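
As a rough illustration of that layout, here is a minimal Python sketch
that maps an rsync URI to a path below a local cache directory. The
function name, the cache root, and the example host are made up for
illustration and are not taken from any particular RP implementation.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def local_path(cache_root: str, rsync_uri: str) -> PurePosixPath:
    """Map an rsync URI to a file path below cache_root (hypothetical layout)."""
    parts = urlsplit(rsync_uri)
    if parts.scheme != "rsync" or not parts.hostname:
        raise ValueError(f"not an rsync URI: {rsync_uri!r}")
    relative = PurePosixPath(parts.path.lstrip("/"))
    # Refuse URIs that try to escape the cache directory.
    if ".." in relative.parts:
        raise ValueError(f"unsafe path in URI: {rsync_uri!r}")
    return PurePosixPath(cache_root) / parts.hostname / relative


print(local_path("/var/cache/rpki", "rsync://rpki.example.net/repo/ca1/roa.roa"))
```

With a layout like this, rsync can update the directory for a
repository in place and validation simply reads the files from disk.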

You could, of course, concoct a mechanism that marks files for deletion
and only deletes them if they aren’t actually used in the next
validation run. But, considering that the subject of this thread is
“trying to limit RP processing variability,” I am not sure
this is a good idea. There is a strong likelihood that different
strategies will behave slightly differently. If we really want to
come to a point where every RP implementation produces the same output
from a given input, we need to define simple rules that are easy to
implement in a wide range of circumstances.

Another consequence of doing this is that validation on a newly
deployed RP differs from validation on one that has been running for a
while. As a result, the datasets from two different caches
configured in routers can differ. So now you even have differences
between caches running the same software.[0]
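
To make the difference concrete, here is a toy Python model. It is only
a sketch under simplifying assumptions: a repository is a dict of file
name to content, a manifest is a dict of file name to SHA-256 hash, and
all function names are hypothetical. It compares strict mirroring with
a strategy that keeps files the server has deleted.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sync_strict(local: dict, server: dict) -> dict:
    # Mirror the server exactly, as "rsync --delete" would: anything the
    # server no longer has disappears locally too.
    return dict(server)


def sync_keep_stale(local: dict, server: dict) -> dict:
    # Hypothetical alternative: keep files the server deleted so that a
    # later validation run may still pick them up.
    merged = dict(local)
    merged.update(server)
    return merged


def usable(local: dict, manifest: dict) -> list:
    # Names listed on the manifest that are present with a matching hash.
    return sorted(
        name for name, digest in manifest.items()
        if name in local and sha256(local[name]) == digest
    )


# Run 1: the server publishes two objects.
server_1 = {"a.roa": b"A", "b.roa": b"B"}
# Run 2: b.roa vanished from the server, yet the manifest still lists it.
server_2 = {"a.roa": b"A"}
manifest_2 = {"a.roa": sha256(b"A"), "b.roa": sha256(b"B")}

old_cache = sync_keep_stale(sync_keep_stale({}, server_1), server_2)
new_cache = sync_keep_stale({}, server_2)
strict_cache = sync_strict(sync_strict({}, server_1), server_2)

print(usable(old_cache, manifest_2))     # ['a.roa', 'b.roa']
print(usable(new_cache, manifest_2))     # ['a.roa']
print(usable(strict_cache, manifest_2))  # ['a.roa']
```

In this model the long-running cache that keeps stale files finds both
objects, while a freshly deployed cache using the same strategy finds
only one. With strict mirroring, the result does not depend on how
long the cache has been running.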

Kind regards,
Martin 

[0] Yes, with broken RRDP servers where snapshots differ from a
    sequence of deltas, this can happen too. We should perhaps also
    look into improving the robustness of RRDP.