Re: [Sidrops] Publication Point -> RP synchronization in bandwidth constrained environments (note for RRDP v2)

Ties de Kock <tdekock@ripe.net> Thu, 08 June 2023 12:55 UTC

From: Ties de Kock <tdekock@ripe.net>
In-Reply-To: <ZIHLBTZJ06/J3bCa@diehard.n-r-g.com>
Date: Thu, 08 Jun 2023 14:55:14 +0200
Cc: Mikhail Puzanov <mpuzanov@ripe.net>, Job Snijders <job=40fastly.com@dmarc.ietf.org>, sidrops@ietf.org
Content-Transfer-Encoding: quoted-printable
Message-Id: <461BCBA1-19FD-4E6C-83C3-584D1597B2E9@ripe.net>
References: <ZHYYt77xdtrkNV1a@snel> <955CFF67-8D19-4B38-8585-3754F3119EDF@ripe.net> <ZIHLBTZJ06/J3bCa@diehard.n-r-g.com>
To: Claudio Jeker <cjeker@diehard.n-r-g.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/Qn_g77JKZhv3nqs8QMJJj9I-VxQ>
Subject: Re: [Sidrops] Publication Point -> RP synchronization in bandwidth constrained environments (note for RRDP v2)
List-Id: A list for the SIDR Operations WG <sidrops.ietf.org>


> On 8 Jun 2023, at 14:35, Claudio Jeker <cjeker@diehard.n-r-g.com> wrote:
> 
> On Thu, Jun 08, 2023 at 01:45:46PM +0200, Mikhail Puzanov wrote:
>> Hi Job, all,
>> 
>> I think compression is probably the quickest way of mitigating the size problem:
>> 
>> Repeating some of your experiments:
>> 
>> tmp> ls -lh snapshot.xml*
>> -rw-r--r--  1 mpuzanov  staff   202M Jun  8 12:44 snapshot.xml
>> -rw-r--r--  1 mpuzanov  staff    89M Jun  8 12:58 snapshot.xml.bz2
>> -rw-r--r--  1 mpuzanov  staff    90M Jun  8 13:01 snapshot.xml.gz
>> -rw-r--r--  1 mpuzanov  staff   124M Jun  8 12:57 snapshot.xml.lz4
>> -rw-r--r--  1 mpuzanov  staff    83M Jun  8 12:58 snapshot.xml.lzma
>> 
>> Even the good ol’ gzip can shrink RIPE NCC’s snapshot more than twofold.
>> Also compression should take care of the repetitive parts, i.e. 
>> XML tags, newlines, etc.
>> 
>> Extra 2 cents as an RP implementer: rpki-prover fetches every repository 
>> in a separate OS process with constrained heap, constrained download size 
>> and a timeout, so it pretty much ignores all CVE-2021-43174-related 
>> considerations and compression was never disabled in it. If the fetching 
>> process crashes or runs out of some resource the impact is limited to 
>> the specific repository. So there is a way to have compression and 
>> tolerate crashes.
> 
> The inherent problem of RRDP is that when a CA is forced to issue a new
> RRDP session (or all RPs lose their RRDP state for the CA) every RP system
> will grab a snapshot. So you end up with a thundering herd and I doubt a
> reduction by half is enough to make that effect go away.
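
On Mikhail's quoted point that compression can stay enabled as long as each fetch is bounded: a minimal sketch of one such bound, a cap on decompressed size, using Python's standard zlib module. The function name, the exception, and the limit are hypothetical and chosen for illustration; this is not how rpki-prover is implemented (it isolates each fetch in a separate OS process, as described above).

```python
import gzip
import zlib


class DecompressedSizeExceeded(Exception):
    """Raised when a gzip stream expands beyond the configured limit."""


def bounded_gunzip(chunks, max_output_bytes):
    """Decompress gzip data given as an iterable of byte chunks.

    Stops with DecompressedSizeExceeded as soon as the output would be
    larger than max_output_bytes (hypothetical limit set by the caller).
    """
    decomp = zlib.decompressobj(wbits=31)  # 16 + 15: expect a gzip header
    out = bytearray()
    for chunk in chunks:
        data = chunk
        while data:
            remaining = max_output_bytes - len(out)
            # Ask for one byte more than allowed so an overflow is visible.
            out += decomp.decompress(data, remaining + 1)
            if len(out) > max_output_bytes:
                raise DecompressedSizeExceeded(
                    f"output exceeds {max_output_bytes} bytes")
            data = decomp.unconsumed_tail
    out += decomp.flush()
    if len(out) > max_output_bytes:
        raise DecompressedSizeExceeded(
            f"output exceeds {max_output_bytes} bytes")
    return bytes(out)


if __name__ == "__main__":
    blob = gzip.compress(b"<publish/>" * 1000)
    print(len(bounded_gunzip([blob], 1_000_000)))
```

The same idea applies to download size and wall-clock time: the fetch is abandoned once a limit is hit, so a hostile or broken repository only affects itself.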

The 45-70% reduction in traffic and costs is significant.
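
As a rough check on the snapshot sizes Mikhail listed above, the reduction per compressor can be computed as below. The sizes are the rounded `ls -lh` values from the quoted listing, so the percentages are approximate; the variable and function names are only for illustration.

```python
# Sizes in MB, taken from the quoted `ls -lh snapshot.xml*` listing.
ORIGINAL_MB = 202
COMPRESSED_MB = {"bz2": 89, "gz": 90, "lz4": 124, "lzma": 83}


def reduction_percent(original, compressed):
    """Percentage of bytes saved relative to the uncompressed size."""
    return 100.0 * (1 - compressed / original)


reductions = {
    name: round(reduction_percent(ORIGINAL_MB, size), 1)
    for name, size in COMPRESSED_MB.items()
}

if __name__ == "__main__":
    for name, pct in sorted(reductions.items(), key=lambda kv: kv[1]):
        print(f"{name:5s} {pct:5.1f}% smaller")
```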

> The worst-case RRDP bandwidth requirements are significantly bigger than
> what is needed in the steady state. This is an inherently bad design.
> The only fix is to overprovision by a lot.

The statement on over-provisioning holds for rsync.

In practice, an RP needs to be out of sync for a significant amount of time before
a fallback to snapshot is needed (> 8h for the RIPE NCC repo). In that period, >60%
of objects in the repository have churned because of manifest+CRL rotation.
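
For readers less familiar with when an RP falls back to the snapshot, here is a minimal sketch of the decision as I read RFC 8182: deltas are used only if the session is unchanged and the notification file still lists every serial the RP is missing. The class and function names are hypothetical, and real RP implementations apply additional checks (hashes, size limits, timeouts) that are left out here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Notification:
    session_id: str
    serial: int
    delta_serials: List[int]  # serials of the deltas still listed


@dataclass
class LocalState:
    session_id: str
    serial: int


def plan_sync(local: Optional[LocalState],
              note: Notification) -> Tuple[str, List[int]]:
    """Return ("up-to-date" | "deltas" | "snapshot", serials to apply)."""
    # No state or a new session: only the snapshot can be used.
    if local is None or local.session_id != note.session_id:
        return ("snapshot", [])
    if local.serial == note.serial:
        return ("up-to-date", [])
    # Assumption: a local serial ahead of the server is treated as lost state.
    if local.serial > note.serial:
        return ("snapshot", [])
    needed = list(range(local.serial + 1, note.serial + 1))
    available = set(note.delta_serials)
    if all(s in available for s in needed):
        return ("deltas", needed)
    # The RP has been out of sync longer than the delta window covers.
    return ("snapshot", [])


if __name__ == "__main__":
    note = Notification("s1", 10, [8, 9, 10])
    print(plan_sync(LocalState("s1", 8), note))
    print(plan_sync(LocalState("s1", 5), note))
```

The last branch is the case discussed above: it only triggers once the RP has missed more serials than the publication point keeps deltas for.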

Kind regards,
Ties