[Sidrops] note on flipping back and forth between RRDP / RSYNC

Job Snijders <job@fastly.com> Tue, 30 March 2021 11:34 UTC

Date: Tue, 30 Mar 2021 13:34:11 +0200
From: Job Snijders <job@fastly.com>
To: sidrops@ietf.org
Message-ID: <YGMMs3YEtPhgV5Ss@snel>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/I0uM2MZspfTh3vNTKoMzoXrw_Iw>
Subject: [Sidrops] note on flipping back and forth between RRDP / RSYNC

Dear all,

I'd like to share a simple terminal transcript on how to efficiently
switch from RRDP to RSYNC. This note is in response to a previous message
in which the author appeared to assume rsync clients would fall back
with a 'clean copy': https://mailarchive.ietf.org/arch/msg/sidrops/AtFXCBnwGQR97C7tu2ePXd2WpaI/

For this demonstration an open source utility authored by Nils Fisher
(development supported by APNIC & ARIN) is used: https://github.com/nilsmf/rrdp
This 'rrdp' utility is the reverse of https://github.com/NLnetLabs/rrdpit:
it writes RRDP content to disk, whereas rrdpit turns on-disk content into
RRDP files.

First bootstrap with empty cache (Amsterdam) from the latest RRDP snapshot:

    $ time ../rrdp -d arin https://rrdp.arin.net/notification.xml
        1m21.37s real     0m05.41s user     0m04.39s system

    $ du -sh arin
    84.0M   arin

    $ ls arin/repository/
    arin-rpki-ta     arin-rpki-ta.cer

The 'rrdp' utility creates a filesystem hierarchy based on each object's
'uri' XML attribute, which in a subsequent validation phase is expected
to map to that object's SIA attribute.
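
As a rough sketch of that mapping, a small shell function could translate
an rsync URI into a path below the cache directory. The name uri_to_path
is made up for this note, and dropping the hostname is an assumption read
off the 'ls arin/repository/' output above, not necessarily how the 'rrdp'
utility does it internally:

```shell
# uri_to_path CACHE_DIR URI  (hypothetical helper, not part of 'rrdp')
# Prints the local path for an rsync URI, assuming the hostname is
# dropped and the remaining URI path is kept below CACHE_DIR.
uri_to_path() {
    cache_dir=$1
    uri=$2

    # Only rsync URIs with a path component are accepted.
    case $uri in
        rsync://*/*) ;;
        *) echo "unsupported URI: $uri" >&2; return 1 ;;
    esac

    rest=${uri#rsync://}    # rpki.arin.net/repository/arin-rpki-ta.cer
    path=${rest#*/}         # repository/arin-rpki-ta.cer

    # Refuse '..' segments so a hostile URI cannot escape the cache.
    case "/$path/" in
        */../*) echo "refusing path traversal: $uri" >&2; return 1 ;;
    esac

    printf '%s/%s\n' "$cache_dir" "$path"
}

# Prints: arin/repository/arin-rpki-ta.cer
uri_to_path arin rsync://rpki.arin.net/repository/arin-rpki-ta.cer
```

The '..' check matters because the URIs come from an unvalidated snapshot.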

Because of how URIs in RRDP snapshots / deltas map to SIAs (filepaths) and
'CA Repositories' (directories) on RSYNC servers, it is now possible to
switch over from RRDP to RSYNC, as demonstrated below:

    $ rsync -zrt --stats rsync://rpki.arin.net/repository/ arin/repository/
    Number of files: 30,396 (reg: 28,719, dir: 1,677)
    Number of created files: 0
    Number of deleted files: 0
    Number of regular files transferred: 28,719
    Total file size: 60,399,704 bytes
    Total transferred file size: 60,399,704 bytes
    Literal data: 0 bytes
    Matched data: 60,399,704 bytes
    File list size: 1,495,650
    File list generation time: 0.001 seconds
    File list transfer time: 0.000 seconds
    Total bytes sent: 1,219,102
    Total bytes received: 2,776,383
    
    sent 1,219,102 bytes  received 2,776,383 bytes  319,638.80 bytes/sec
    total size is 60,399,704  speedup is 15.12

Only ~ 4 megabytes of network traffic in total (1.2 MB sent, 2.8 MB
received), instead of re-fetching the full repository (60 megabytes of
file data, 84 megabytes on disk)!
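
The '~ 4 megabytes' figure is simply the sent plus received byte counts
from the '--stats' output. A small sketch that extracts the total and
recomputes the speedup follows; the name rsync_traffic is made up here,
and it assumes the English-language stats format shown in the transcript:

```shell
# rsync_traffic: read 'rsync --stats' output on stdin and print
# "<total bytes on the wire> <speedup>". Hypothetical helper.
rsync_traffic() {
    awk -F': ' '
        /Total bytes sent:/     { gsub(/[^0-9]/, "", $2); sent = $2 }
        /Total bytes received:/ { gsub(/[^0-9]/, "", $2); recv = $2 }
        /Total file size:/      { gsub(/[^0-9]/, "", $2); size = $2 }
        END {
            total = sent + recv
            if (total == 0) exit 1      # no stats found in the input
            printf "%d %.2f\n", total, size / total
        }
    '
}

# Prints: 3995485 15.12
rsync_traffic <<'EOF'
Total file size: 60,399,704 bytes
Total bytes sent: 1,219,102
Total bytes received: 2,776,383
EOF
```

The recomputed ratio matches the 'speedup is 15.12' line that rsync printed.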

Recycling information retrieved via RRDP (or any other means) is far
more efficient than reaching out to the RSYNC endpoint with an empty
cache:

    $ rm -rf arin && mkdir -p arin/repository
    $ time rsync -zrt --stats rsync://rpki.arin.net/repository/ arin/repository/ | tail -3
    sent 560,295 bytes  received 46,906,887 bytes  1,460,528.68 bytes/sec
    total size is 60,406,353  speedup is 1.27
        0m32.66s real     0m02.16s user     0m06.09s system

Wrapping up
-----------

In the above demonstration an *unvalidated* RRDP-seeded cache was synced
with the RSYNC server. Relying Parties are best off if they separately
maintain a copy of the validated cache, and 'incoming' datastores for
RRDP & RSYNC sync attempts. A rogue RRDP / RSYNC server ideally is not
able to wipe out the locally stored validated cache with trash.
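
One way to keep a rogue server from trashing the validated cache is to
sync into an 'incoming' directory and only swap it in after validation
succeeds. A minimal sketch, in which promote_if_valid and the validator
command are placeholders made up for this note (a real RP would plug in
its own validation step):

```shell
# promote_if_valid INCOMING VALIDATED VALIDATOR [ARGS...]
# Runs VALIDATOR with INCOMING appended as its last argument. Only when
# it succeeds is a copy of INCOMING moved into place as VALIDATED.
promote_if_valid() {
    [ $# -ge 3 ] || { echo "usage: promote_if_valid IN OUT CMD..." >&2; return 2; }
    incoming=$1
    validated=$2
    shift 2

    if ! "$@" "$incoming"; then
        echo "validation failed; leaving $validated untouched" >&2
        return 1
    fi

    # Copy first, then rename, so a failed copy never damages VALIDATED.
    staging="$validated.new.$$"
    rm -rf "$staging" "$validated.old"
    cp -R "$incoming" "$staging" || return 1

    # Two renames: there is a brief window without a VALIDATED directory.
    if [ -d "$validated" ]; then
        mv "$validated" "$validated.old" || return 1
    fi
    mv "$staging" "$validated"
}

# Example (not run here; 'my-validator' is a placeholder command):
#   rsync -zrt rsync://rpki.arin.net/repository/ incoming/repository/
#   promote_if_valid incoming validated my-validator
```

The previous validated copy is kept as 'validated.old' until the next
successful promotion, which gives a cheap rollback point.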

Rob Austein mentioned his implementation has such an elegant mechanism:
https://mailarchive.ietf.org/arch/msg/sidrops/rAk9iyjD3fuPLie3DriFgldWPUk/

RPs syncing up with the RSYNC server (because RRDP service is unavailable)
can keep copies of previously fetched RRDP delta files in hopes that
when the RRDP service comes back the RRDP session can be resumed.
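
Whether a session can be resumed comes down to comparing the stored
session_id and serial against the notification file. The sketch below
follows the general RFC 8182 logic. The function name and argument order
are invented for this note, and it assumes serials fit in the shell's
integer arithmetic and that the notification file lists a contiguous
range of deltas:

```shell
# rrdp_resume_plan LOCAL_SESSION LOCAL_SERIAL REMOTE_SESSION REMOTE_SERIAL OLDEST_DELTA
# Prints one of: up-to-date, deltas, snapshot. Hypothetical helper.
# OLDEST_DELTA is the lowest delta serial listed in the notification file.
rrdp_resume_plan() {
    local_session=$1
    local_serial=$2
    remote_session=$3
    remote_serial=$4
    oldest_delta=$5

    if [ "$local_session" != "$remote_session" ]; then
        echo snapshot       # session was reset; saved state cannot be reused
    elif [ "$local_serial" -eq "$remote_serial" ]; then
        echo up-to-date
    elif [ "$local_serial" -gt "$remote_serial" ]; then
        echo snapshot       # should not happen within one session
    elif [ "$oldest_delta" -le $((local_serial + 1)) ]; then
        echo deltas         # every delta after local_serial is still listed
    else
        echo snapshot       # the server no longer lists the deltas we need
    fi
}

# Prints: deltas
rrdp_resume_plan example-session 10 example-session 14 8
```

Falling back to the snapshot is always safe; resuming from deltas is
merely the cheaper path when the saved state still lines up.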

All in all - switching back and forth between RSYNC and RRDP should not
impose undue burden on either RPs or the Publication Point servers, iff
the RPs are specifically designed to minimize load.

Arguably HTTPS "If-Modified-Since" fetches to https://rrdp.arin.net/notification.xml
consume fewer resources than RSYNC requests. Thus it makes sense to
prefer RRDP and, if RRDP is not available, to not query the RSYNC service
as frequently as one would query the RRDP notification URL.
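
A sketch of what that could look like with curl, whose -z option sends
"If-Modified-Since" based on a local file's modification time. Both
function names are made up for this note, the polling intervals are
illustrative placeholders rather than recommendations, and the
conditional fetch assumes the server sends a Last-Modified header:

```shell
# fetch_notification URL FILE  (hypothetical helper)
# Conditional GET: -z sends If-Modified-Since based on FILE's mtime,
# and -R stores the server's Last-Modified time on the download.
fetch_notification() {
    url=$1
    file=$2
    if [ -f "$file" ]; then
        code=$(curl -sS -R -z "$file" -o "$file.tmp" -w '%{http_code}' "$url") || return 1
    else
        code=$(curl -sS -R -o "$file.tmp" -w '%{http_code}' "$url") || return 1
    fi
    case $code in
        200) mv "$file.tmp" "$file"; echo changed ;;
        304) rm -f "$file.tmp"; echo unchanged ;;
        *)   rm -f "$file.tmp"; return 1 ;;
    esac
}

# poll_interval TRANSPORT: seconds to wait before the next attempt.
# RSYNC gets the longer interval because each attempt costs more.
poll_interval() {
    case $1 in
        rrdp)  echo 600 ;;
        rsync) echo 3600 ;;
        *)     return 1 ;;
    esac
}

# Example (not run here):
#   fetch_notification https://rrdp.arin.net/notification.xml notification.xml
```

An 'unchanged' result means nothing else needs to be fetched in that
round, which is where the saving over an RSYNC session comes from.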

Kind regards,

Job