Re: [Sidrops] [WG ADOPTION] Adoption call: draft-timbru-sidrops-publication-server-bcp - ENDS 02/08/2024

Job Snijders <job@fastly.com> Tue, 06 February 2024 17:04 UTC

Date: Tue, 06 Feb 2024 18:03:52 +0100
From: Job Snijders <job@fastly.com>
To: Tim Bruijnzeels <tbruijnzeels@ripe.net>
Cc: Ties de Kock <tdekock@ripe.net>, Russ Housley <housley@vigilsec.com>, IETF SIDRops <sidrops@ietf.org>, IETF SIDRops Chairs <sidrops-chairs@ietf.org>, sidrops-ads@ietf.org, Keyur Patel <keyur@arrcus.com>
Message-ID: <ZcJmeFCmU9Txsk7M@snel>
References: <87h6j1kug1.wl-morrowc@ops-netman.net> <B60D7B39-FA81-45AF-BCBD-2784F91B43C3@vigilsec.com> <ZcFNNfrkMFxKf5hN@snel> <BBE2320C-4525-4713-B4AF-3F00ECD4228A@ripe.net> <ZcIuI7lS1OtOW_xT@snel> <EFFA95AA-F07D-490B-BEC3-0446ED2D3AA2@ripe.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <EFFA95AA-F07D-490B-BEC3-0446ED2D3AA2@ripe.net>
X-Clacks-Overhead: GNU Terry Pratchett
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/IAg_CX2czNz8vuNml2U6AE4KLNA>

On Tue, Feb 06, 2024 at 05:19:11PM +0100, Tim Bruijnzeels wrote:
> > The notification file (by specification) indeed is mutable.
> > 
> > I think it is helpful to point out that the RRDP deltas really only
> > need to be generated once, as some implementers seem to have gotten
> > this wrong in the past.
> 
> To the best of my knowledge no RRDP implementation has intentionally
> regenerated deltas.

> There have been issues where a server was restored from backup to an
> earlier state and it was unaware of the state changes since the
> backup.
> 
> So, I would like to include text in this BCP that instructs the
> Publication Server to perform a full RRDP session reset in case they
> restored from backup.

Yes, the 'reset session' after backup restore is a good point.
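
A minimal sketch of what such a reset could look like, assuming the
Publication Server keeps its RRDP state in a small record. The names
RrdpState and reset_session are hypothetical and not taken from any
implementation; the only protocol facts relied on are that an RRDP
session is identified by a version 4 UUID and that a new session
starts at serial 1 with a fresh snapshot and no deltas.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class RrdpState:
    """Hypothetical in-memory view of a Publication Server's RRDP state."""
    session_id: str
    serial: int
    deltas: list = field(default_factory=list)


def reset_session(state: RrdpState) -> RrdpState:
    """Return the state a server could adopt after a restore from backup.

    The restored state may be unaware of serials that were already
    published, so the old session is abandoned instead of being continued.
    The input is only accepted to show where the call fits; none of its
    values are reused.
    """
    return RrdpState(
        session_id=str(uuid.uuid4()),  # new random version 4 UUID
        serial=1,                      # a new session starts at serial 1
        deltas=[],                     # no deltas carry over to the new session
    )


restored = RrdpState(session_id="restored-from-backup", serial=42, deltas=["41-to-42"])
fresh = reset_session(restored)
```

RPs that see the changed session_id in the notification file fetch the
snapshot, so the bandwidth considerations further down apply right
after such a reset.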

> > The situation I had in mind was an example from last year, when for
> > one of the regional internet registries all my alerts cleared within
> > 2 hours after RRDP was disabled.
> > 
> > You are right to point out that the RRDP notification file usually
> > is smaller than the rsync filename transfer list, but in turn, the
> > rsync transfer list is way smaller than the RRDP snapshot.
> 
> I am in two minds about this.
> 
> I think the BCP should not recommend disabling RRDP, it should
> recommend that enough bandwidth is available and/or a CDN is used.
> 
> The problem happens when this cannot be done. In that case RRDP
> degrades badly, as described in the document, because all RPs fall
> back to full snapshots. This makes the bandwidth load worse.
> 
> In the case of your example there were higher layer (non-technical)
> reasons why the bandwidth capacity could not be increased and a CDN
> could not be used at the time.
> 
> In this specific case disabling RRDP helped because even though there
> could still be bandwidth issues affecting RPs this allowed enough of
> them to perform a sync so that subsequent data usage was reduced.
> 
> So, yes disabling RRDP helped here, but no, I don’t think it should be
> recommended best practice. I want to think about wording that captures
> this...

Or phrased differently: the best practice is to ensure there is
sufficient bandwidth available, while disabling RRDP is a dirty hack :-)

Perhaps some kind of guidance can be included as to what 'sufficient
bandwidth' is? Here is a starter:

  "Size of snapshot" times "Number of deployed RP instances" should
  comfortably fit in a 15-minute delivery window.

(I picked 15 minutes, because rpki-client by default timeboxes the
synchronization of each individual repository to 15 minutes.)

So, for an RRDP server serving a 207 megabyte snapshot to 3000 RPs, the
operator would need *at least* (207*8*3000)/900 = 5520 megabit/sec after
a session reset. To make it fit comfortably, double or triple this number.
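
The same arithmetic as a small sketch, in case someone wants to plug in
their own numbers. The function name and the headroom parameter are
made up for illustration; the 900-second default is the 15-minute
window from above.

```python
def required_mbit_per_sec(snapshot_megabytes, rp_count,
                          window_seconds=900, headroom=1):
    """Bandwidth needed to hand one full snapshot to every RP in the window.

    snapshot_megabytes: size of the RRDP snapshot in megabytes
    rp_count:           number of deployed RP instances
    window_seconds:     delivery window (900 s = 15 minutes)
    headroom:           safety multiplier, e.g. 2 or 3 to fit "comfortably"
    """
    total_megabits = snapshot_megabytes * 8 * rp_count
    return total_megabits / window_seconds * headroom


minimum = required_mbit_per_sec(207, 3000)                 # 5520.0
doubled = required_mbit_per_sec(207, 3000, headroom=2)     # 11040.0
tripled = required_mbit_per_sec(207, 3000, headroom=3)     # 16560.0
```

This is a worst-case figure: it assumes every RP fetches the full
snapshot in the same window, and it ignores protocol overhead and
compression.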

Given the affiliations of the authors, I'm sure that group can do a
better job and speak from experience about how capacity planning is
done. :-)

Kind regards,

Job