Re: [Sidrops] Alvaro Retana's Discuss on draft-ietf-sidrops-rov-no-rr-03: (with DISCUSS and COMMENT)

Randy Bush <randy@psg.com> Tue, 23 August 2022 01:12 UTC

Date: Mon, 22 Aug 2022 18:12:18 -0700
Message-ID: <m2czcrrcel.wl-randy@psg.com>
From: Randy Bush <randy@psg.com>
To: Alvaro Retana via Datatracker <noreply@ietf.org>
Cc: The IESG <iesg@ietf.org>, draft-ietf-sidrops-rov-no-rr@ietf.org, SIDR Operations WG <sidrops@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/sidrops/kSDD-3zcXHkxWmIxd6DhUTe3z28>

aha!  an alvaro review; always excellent.  thank you!

> As Jeff Haas wrote on the idr list [1]:
> 
>    This isn't any different than any other over-aggressive
>    provisioning tool's impact.
> 
> Was there any consideration given to making a general recommendation
> and not just limiting it to ROV?  I can see the direct impact on
> rfc6811/rfc8481, and how general BGP advice is out of scope for
> sidrops.  However, I am primarily curious whether there is anything
> particular to ROV to focus the recommendation this way.

we are not aware of automated provisioning tools which have a policy
update frequency on the order of ROV, which can now be seen on the order
of ten minutes.

this document was a response to real operational events, not theory.

the authors are not aware of other provisioning tools which have fluxed
policy to the point of other providers de-peering due to the route
refresh load.

but, a comment that other high flux rate policy changes might also cause
similar problems would not hurt.  how about

      Other mechanisms, such as automated policy provisioning, which
      have flux rates similar to ROV (i.e. on the order of minutes),
      could very well cause similar problems.

> (2) I have a couple of issues with this paragraph from §4.  Addressing
> them should be relatively easy:
> 
>    When RPKI data cause one or more paths to be dropped due to ROV,
>    those paths MUST NOT be evaluated for best path, but MUST be saved
>    (either separately or marked) so they may be reevaluated with
>    respect to new RPKI data.
> 
> (2a) "paths to be dropped due to ROV, those paths MUST NOT be evaluated for
> best path"
> 
> Neither rfc6811 nor rfc8481 require that routes be "dropped due to ROV". 
> rfc8481 requires that "Absent specific operator configuration, policy MUST NOT
> be applied."
> 
> Please clarify that the trigger above ("dropped due to ROV") is defined by the
> operator and is not just a result of ROV.

embarrassing point given i wrote 8481 <blush>

   When RPKI data cause one or more paths to be dropped by operator
   policy due to ROV, those paths MUST NOT be evaluated for best route,
   but MUST be saved (either separately or marked) so they may be
   reevaluated with respect to new RPKI data.

or was part of your point wanting that to be "MAY be saved?"
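
to make the "saved (either separately or marked)" idea concrete, here is a minimal python sketch of the "marked" variant.  it is only an illustration under stated assumptions, not how any real implementation stores routes: the names Vrp, Path, rov_state, and MarkedAdjRibIn are hypothetical, the validation follows the rfc6811 valid / invalid / notfound outcomes in simplified form (no AS_SET handling), and dropping invalids is modelled as an operator policy flag, per rfc8481.

```python
from dataclasses import dataclass
from ipaddress import ip_network


@dataclass(frozen=True)
class Vrp:
    """Validated ROA Payload: prefix, max length, origin AS (hypothetical type)."""
    prefix: str
    max_length: int
    asn: int


@dataclass
class Path:
    """One received path; rov_dropped is the 'mark' (hypothetical type)."""
    prefix: str
    origin_asn: int
    peer: str
    rov_dropped: bool = False


def rov_state(path, vrps):
    """Simplified RFC 6811 outcome: 'valid', 'invalid', or 'notfound'."""
    net = ip_network(path.prefix)
    covered = False
    for vrp in vrps:
        vnet = ip_network(vrp.prefix)
        if net.version == vnet.version and net.subnet_of(vnet):
            covered = True
            # AS 0 in a VRP never matches any origin.
            if (vrp.asn != 0 and vrp.asn == path.origin_asn
                    and net.prefixlen <= vrp.max_length):
                return "valid"
    return "invalid" if covered else "notfound"


class MarkedAdjRibIn:
    """Keeps every received path; paths dropped by policy are only marked."""

    def __init__(self, drop_invalid):
        # Assumption: dropping is operator policy, not a result of ROV itself.
        self.drop_invalid = drop_invalid
        self.paths = []

    def _mark(self, path, vrps):
        path.rov_dropped = (self.drop_invalid
                            and rov_state(path, vrps) == "invalid")

    def learn(self, path, vrps):
        self.paths.append(path)
        self._mark(path, vrps)

    def revalidate(self, vrps):
        # New RPKI data: re-mark the stored paths, no route refresh needed.
        for path in self.paths:
            self._mark(path, vrps)

    def candidates(self):
        # Only unmarked paths are handed to the decision process.
        return [p for p in self.paths if not p.rov_dropped]
```

the point of the sketch is that revalidate() works from local state alone; a speaker that had discarded the marked paths would instead have to ask its neighbors to resend them.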

> (b) "MUST be saved (either separately or marked)"
> 
> For a required action, the description is not clear.  For starters, "marked"
> how?  Separately where?
> 
> From §1.1/rfc4271:
> 
>    The Adj-RIBs-In contains unprocessed routing information that has
>    been advertised to the local BGP speaker by its peers.
> 
> The RIB structures in rfc4271 are conceptual -- but since this document
> requires keeping information (presumably in the Adj-RIB-In), please be more
> specific about where and marked how.

as you just pointed out, for many implementations, adjribin is purely
conceptual.  so what those implementations do is hidden black magic.  so
how do we describe augmenting black magic?  do you have a suggestion?

***  so issue still open  ***

> (3) The following requirement from §5 is outside the scope of this document:
> 
>    If the BGP speaker has insufficient resources to support either of
>    the two proposed options, it MUST NOT be used for Route Origin
>    Validation.  I.e. the knob in Section 4 should only be used in very
>    well known and controlled circumstances.
> 
> Requiring a node not to be used for ROV is a powerful statement.  It
> basically invalidates the base operation specified on rfc6811/rfc8481
> by always requiring the mechanism in this document.  While I
> understand the potential resource demands, selecting a node to perform
> a specific operation in a particular operator's network is outside the
> scope of this document.

well, i am not sure we're on the same page here.  what was intended was
an operational requirement that a speaker that can not do rov without
attacking their neighbors (either by having a full adjribin or by the §4
hack) should simply not do rov.

>
> Instead, I would like to see guidance to the operator to consider not using the
> specific piece of equipment to perform a particular function.  This can be as
> easy as:
> 
>     If the BGP speaker has insufficient resources to support either
>     of the two proposed options, the operator is strongly encouraged
>     to consider an alternate piece of equipment to perform Route Origin
>     Validation.
> 
> The second part of the sentence ("I.e. ...") sounds like a better
> recommendation -- and, clearly, not the same as "MUST NOT be used".

how about

   If the BGP speaker has insufficient resources to support either of
   the two proposed options, it MUST NOT be used for Route Origin
   Validation.  The equipment should either be replaced with capable
   equipment or ROV not used.  I.e. the knob in Section 4 should only
   be used in very well known and controlled circumstances.

> (1) This document should be tagged as replacing
> draft-ymbk-sidrops-rov-no-rr.

tagged where/how?

> (2) Expand ROV on first use.

ack

> (3) rfc4271 doesn't talk about a "best path" -- it does talk about a
> Decision Process that results in a best route (or ineligible routes).
> Please be consistent with existing terminology.

ok.  a bit delicate, but will try.

> (3) The reference to route-refresh should include rfc2918.

ok

> (4) §4: s/there MUST be a knob allowing operator control of this
> feature.  Such a knob MUST NOT be per peer, as this could cause
> inconsistent behavior./an implementation MUST provide a global
> mechanism to control the operation.
> 
> A knob makes me think of the CLI -- I'm sure you want the possibility to
> control the behavior in other ways: YANG, etc..

      As storing these routes could cause problems in resource
      constrained devices, there MUST be a global operation, CLI, YANG,
      ... allowing operator control of this feature.  Such a control
      MUST NOT be per peer, as this could cause inconsistent behavior.
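
as an illustration of "global, not per peer", a small python sketch of a config loader follows.  everything in it is an assumption made for the example: the setting name keep_rov_dropped, the PeerConfig / BgpConfig types, and the dict-shaped input are hypothetical and do not come from any vendor CLI or YANG model.

```python
from dataclasses import dataclass, field


@dataclass
class PeerConfig:
    """Per-peer settings; deliberately has no keep_rov_dropped field."""
    address: str
    remote_as: int


@dataclass
class BgpConfig:
    """Speaker-wide settings, including the single global control."""
    keep_rov_dropped: bool = False
    peers: list = field(default_factory=list)


def load_config(raw):
    """Build a BgpConfig from a dict, rejecting any per-peer override."""
    peers = []
    for entry in raw.get("peers", []):
        if "keep_rov_dropped" in entry:
            # A per-peer setting could make the speaker behave
            # inconsistently across neighbors, so refuse it outright.
            raise ValueError("keep_rov_dropped is global, not per peer")
        peers.append(PeerConfig(entry["address"], entry["remote_as"]))
    return BgpConfig(bool(raw.get("keep_rov_dropped", False)), peers)
```

whether the setting arrives via CLI, YANG, or something else, the sketch only shows the shape of the constraint: one value for the whole speaker, and an error rather than silent acceptance if someone tries to set it on a neighbor.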

> (5) §5: "Operators...SHOULD ensure that the BGP speaker implementation is not
> causing unnecessary Route Refresh requests to neighbors."
> 
> What is the interoperability-related requirement with "ensuring" that makes
> Normative language needed?  What does "ensure" entail?  When is it ok for the
> operator to not "ensure"?  Why is this action only recommended and not
> required?  After all, eliminating unnecessary route refresh requests is the
> purpose of this document.
> 
> There's a similar phrase later on without Normative language.
> 
> s/SHOULD/should

ack

> (6) §5: "the operator SHOULD enable the vendor's knob"
> 
> Same questions as above: Why is Normative language needed here?  When
> is it ok to not enable the functionality?  Why is this action
> recommended and not required?  ...

ok

> (7) §5: "Pre-policy filtering...SHOULD be used to reduce this
> exposure."
> 
> When is it ok to not use pre-policy filtering?  Why is this action
> recommended and not required?

a bit of grey?

   Operators using the specification in Section 4 should be aware that a
   misconfigured neighbor might erroneously send a massive number of
   paths, thus consuming a lot of memory.  Hence pre-policy filtering
   such as described in [I-D.sas-idr-maxprefix-inbound] could be used to
   reduce this exposure.
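
a rough python sketch of what a pre-policy limit guards against is below.  it is not taken from [I-D.sas-idr-maxprefix-inbound]; the class name PrePolicyLimit, its methods, and the choice to just report "over limit" are assumptions for illustration.  what a real speaker does once the limit is exceeded is left out.

```python
class PrePolicyLimit:
    """Counts prefixes received per neighbor before any policy is applied."""

    def __init__(self, limit):
        self.limit = limit
        self.received = {}  # neighbor -> set of prefixes currently held

    def on_update(self, neighbor, prefix):
        """Record a received prefix; return False once over the limit."""
        seen = self.received.setdefault(neighbor, set())
        seen.add(prefix)
        # Counted before policy, so paths that ROV policy would drop
        # (but which are kept for re-evaluation) count as well.
        return len(seen) <= self.limit

    def on_withdraw(self, neighbor, prefix):
        """Forget a withdrawn prefix so it no longer counts."""
        self.received.get(neighbor, set()).discard(prefix)
```

because the count is taken before policy, it bounds the memory used by the saved-but-dropped paths, which a post-policy count would not see.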

> (8) rfc6811 and rfc8481 should be Normative references.

sure

> (9) s/(IXPs)which/(IXPs) which

ack

thanks for a thorough review!!

randy