Re: [pim] AD Review of draft-ietf-pim-bfd-p2mp-use-case-05

Alvaro Retana <aretana.ietf@gmail.com> Tue, 17 August 2021 15:19 UTC

From: Alvaro Retana <aretana.ietf@gmail.com>
In-Reply-To: <202107220742327030208@zte.com.cn>
References: <202107220742327030208@zte.com.cn>
MIME-Version: 1.0
Date: Tue, 17 Aug 2021 15:19:26 +0000
Message-ID: <CAMMESsznPjjXD44S5gc=QeAEdZA4cEOwPJJcxPbgxxiktiOoaA@mail.gmail.com>
To: gregory.mirsky@ztetx.com
Cc: mmcbride7@gmail.com, pim-chairs@ietf.org, draft-ietf-pim-bfd-p2mp-use-case@ietf.org, pim@ietf.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/d6r2cA93Ra25L1rATWxPbuTtsEw>
Subject: Re: [pim] AD Review of draft-ietf-pim-bfd-p2mp-use-case-05

On July 21, 2021 at 7:42:42 PM, gregory.mirsky@ztetx.com wrote:


Greg:

Hi!

Thanks for the update!!

I have some comments in-line.  I am also attaching (below) a review of
-06, which mostly contains minor issues and nits related to
new/updated text.


The main item I would like WG input on is the generalization of the
mechanism.  In short, bootstrapping the BFD session is the main focus
of this document; I don't see a reason to avoid generalizing its use
to other scenarios.  We already agree on the wider applicability --
the incremental changes needed to generalize the text are far smaller
than the process it would take to do it later.

Mike: as Shepherd/Chair, please start a conversation in the WG as
needed.  Please take a look at the discussion below.


Thanks!

Alvaro.



...
> Sender: AlvaroRetana
...
> This document specifies two things: (1) a Hello option to help the
> tail bootstrap the BFD session, and, (2) the actions that the tail
> takes when a failure is detected. The former is common to all cases,
> while the latter depends on the role of the head with respect to the
> tail. The actions are basically an acceleration of what would
> naturally happen (if BFD was not used and the failure detection was
> "slow"). Is this a fair characterization of the document?

GIM>> Yes, absolutely correct.

...
> Are there other cases that could use this mechanism to track? I can
> think of a couple of cases: monitor PIM neighbors that send Joins, RPF
> neighbors, Assert winners... As with the DR case, for example, these
> cases don't require actions beyond an acceleration. It would be ideal
> if the document could cover these cases, and possibly others, in a
> generic way -- I can't think of good phrasing right now, but I'm sure
> you can. ;-)

GIM>> I agree that the defined PIM Hello BFD Discriminator option can be used
GIM>> by not only PIM DR/BDR nodes. Indeed, there are other use cases where a
GIM>> faster detection will improve the convergence in the control plane and
GIM>> minimize the negative impact on the multicast data plane. These use
GIM>> cases may be covered in the future.

Maybe I'm missing something obvious, and would like to understand what. :-)

The Hello option helps the tail bootstrap a BFD session, correct?  If
so, there's nothing in that description about the function of the tail
(or the head).  The point that I'm trying to make above is that the
only thing that BFD provides in any use case is faster detection of a
failure (which is very significant, of course!), so regardless of the
application or function of the tail/head, the operation can be
generalized.  Is this not true?  What am I missing?

To be more specific, the description in §2.1 (Using P2MP BFD in PIM
DR/BDR Monitoring) is a generic description, and not one that only
applies to a DR/BDR.  In fact, the text says that any node "regardless
of its role, MAY become a head of a p2mp BFD session" -- which means
that it is up to the tail to monitor it or not.

The last two paragraphs in §2.1 do mention the DR/BDR function, but
they could easily be generalized:

OLD>
   If the tail detects a MultipointHead failure [RFC8562], it MUST
   delete the corresponding neighbor state.  If the failed head was the
   DR (or BDR), the DR (or BDR) election mechanism in [RFC7761] or
   [I-D.ietf-pim-dr-improvement] is followed.

   If the head ceases to include the BFD Discriminator PIM Hello option
   in its PIM-Hello message, tails MUST close the corresponding
   MultipointTail BFD session.  Thus the tail stops using BFD to monitor
   the head and reverts to the procedures defined in [RFC7761] and
   [I-D.ietf-pim-dr-improvement].


NEW>
   If the tail detects a MultipointHead failure [RFC8562], it MUST
   delete the corresponding neighbor state.  If the head ceases to
   include the BFD Discriminator PIM Hello option in its PIM-Hello
   message, tails MUST close the corresponding MultipointTail BFD
   session.  In both cases, the tail continues to follow the
   specification related to the function of the head.
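
To make the generalized NEW text concrete, here is a minimal sketch of
the tail's behavior in Python.  It is only an illustration under stated
assumptions: the Neighbor and Tail classes and their method names are
hypothetical, not taken from the draft, and the sketch assumes that
withdrawing the option closes the session without touching the PIM
neighbor state.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Neighbor:
    """PIM neighbor state kept by the tail (hypothetical structure)."""
    address: str
    # None means there is no MultipointTail session for this neighbor.
    bfd_discriminator: Optional[int] = None


@dataclass
class Tail:
    neighbors: Dict[str, Neighbor] = field(default_factory=dict)

    def on_hello(self, address: str, discriminator: Optional[int]) -> None:
        """Process a Hello; discriminator is None when the option is absent."""
        nbr = self.neighbors.setdefault(address, Neighbor(address))
        # Option present: bootstrap (or keep) the MultipointTail session.
        # Option absent: close the session; the neighbor itself is kept
        # (assumption: PIM state is not affected by closing the session).
        nbr.bfd_discriminator = discriminator

    def on_bfd_failure(self, address: str) -> None:
        """MultipointHead failure detected: delete the neighbor state."""
        self.neighbors.pop(address, None)
```

Nothing in the sketch depends on whether the head is a DR, a BDR, or
anything else; whatever role-specific procedure applies would simply
run after the neighbor state changes.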



GIM>> As I think of it, one aspect would be homogeneity of p2mp BFD
GIM>> capability throughout the domain. In other words, what happens if some
GIM>> PIM nodes don't support the BFD Discriminator option and do not use
GIM>> p2mp BFD? How would their slow (regular) convergence impact other
GIM>> nodes? But that, I think, is for further discussion and work. Would
GIM>> you agree?

I don't.

In fact, this is a very important deployment point that I had
overlooked.  If the support is not homogeneous, then some parts of the
network will converge (to the new state) faster than others. As with
unicast routing, I think the main effect may be a longer-than-expected
inconsistency, but no worse than without BFD.  The specific effect
relates to the use case...but because the deployment would be
localized (the DR and other routers on the LAN, for example), then any
negative effect of not supporting BFD (i.e. behaving as today) would
be localized.

Thank you for bringing this up.  If you consider the effect
significant for a specific case then please add a couple of sentences.


...
> 183 3.1. Using P2MP BFD in PIM DR/BDR Monitoring
...
> 229 If the head ceased to include BFD TLV in its PIM-Hello message, tails
> 230 MUST close the corresponding MultipointTail BFD session. Thus the
> 231 tail stops using BFD to monitor the head and reverts to the
> 232 procedures defined in [RFC7761] and [I-D.ietf-pim-dr-improvement].
...
> [major] Let me see if I understand: if the head doesn't use the BFD
> hello option anymore then the tail can gracefully stop using BFD.
> IOW, this way the BFD session does not expire and result in the DR
> being declared dead. Is that it?

GIM>> Yes, that is what we've intended - revert to "slow" detection.

> Given that the BFD session can be bootstrapped at the tail by manually
> configuring the corresponding discriminator, it seems that stopping
> the use of the BFD hello option may not result in the expected
> outcome. ???

GIM>> Yes, the head's My Discriminator value can be provisioned using the
GIM>> management plane. If that is the case, then I think this document is not
GIM>> applicable as the head and leaves use RFC 8562 without any additions.

The problem is that the text now says this:

   If the head ceases to include the BFD Discriminator PIM Hello option
   in its PIM-Hello message, tails MUST close the corresponding
   MultipointTail BFD session.  Thus the tail stops using BFD to monitor
   the head...

s/MUST/SHOULD   Provisioning the node is the exception, so the action
should be recommended and not required.


...
> 288 5. Security Considerations
...
> 299 An implementation that supports this specification SHOULD use a
> 300 mechanism to control the maximum number of BFD sessions that can be
> 301 active at the same time.
>
> [major] rfc8562 already requires "protective measures to prevent an
> infinite number of MultipointTail sessions from being created". It
> is then not needed for this document to recommend anything that is
> required elsewhere.

GIM>> Done.

??  You left the paragraph in.


> [major] What new security risks are introduced by the mechanism in
> this draft? In general, a rogue node can stop sending or delay BFD
> packets causing the tail to conclude that the head is down: the DR/BDR
> may change causing instability. I was surprised that rfc8562 did not
> mention the interaction risk, but rfc5880 already does. I feel that
> something needs to be mentioned specific to this document, even if it
> is highlighted that the risk is not new.

GIM>> Then that "rogue" node is the PIM DR/BDR, not a man-in-the-middle. And
GIM>> the attack is by making the p2mp BFD session expire on leaves while
GIM>> still periodically sending PIM Hello. AFAIK, since PIM DR/BDR election
GIM>> takes several Hello cycles, I don't think that that behavior will affect
GIM>> the multicast service. Perhaps I'm missing something, please advise.

Yes, the problem is that §2.1 says that the tail "MUST delete the
corresponding neighbor state".  This results in the DR not being
elected for a while.

I see what you mean: if the DR is still there then the election will
probably elect it again.  However, in the meantime the DR may not
think it is the DR anymore if the other routers in the LAN start a new
DR election. (??)   Please add the explanation (or something like it)
to make it clear that the risk is mitigated by the "double set of
hellos".

In the general case...  If tracking the sender of a Join, for example,
the effect would be more significant: an outage would exist until the
next Join is received.


...
> 310 7.1. Normative References
> ..
> 327 [RFC5881] Katz, D. and D. Ward, "Bidirectional Forwarding Detection
> 328 (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
> 329 DOI 10.17487/RFC5881, June 2010,
> 330 .
>
> [minor] This reference can be Informative.

I know you moved it...but now that you added a reference to it in §2.3
then we need it to be Normative.  Sorry.


...
> [End of review -05.]



[Start -06]
[Line numbers from idnits.]

...
129	2.  BFD Discriminator PIM Hello Option
...
153	   If the value of the OptionLength field is not equal to 4, the BFD
154	   Discriminator PIM Hello option is considered malformed, and the
155	   receiver MUST stop processing PIM Hello options.  If the value of the
156	   My Discriminator field equals zero, then the BFD Discriminator PIM
157	   Hello option MUST be considered invalid, and the receiver MUST ignore
158	   it.  The receiver SHOULD log the notification regarding the malformed
159	   or invalid BFD Discriminator Hello option under the control of a
160	   throttling logging mechanism.

[major] "MUST stop processing PIM Hello options"

Stop in the current Hello message?  Should it ignore all the options
or just the ones after this one?  In all future Hello messages?

I haven't thought about this enough, but there could be an effect on
other functionality.  What is that effect?  I couldn't find anywhere a
general way to handle malformed Hello options -- did I miss it?
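
To illustrate why the wording matters, here is a sketch of one possible
reading of the quoted text: a malformed option stops processing of the
remaining options in the current Hello only, and options parsed before
it are kept.  That reading is an assumption, not something the draft
states.  The option type is still TBA, so BFD_DISCRIMINATOR_OPTION
below is a hypothetical placeholder, and parse_hello_options is an
illustrative name.  The only thing taken as given is the usual Hello
option layout of a 2-octet type, a 2-octet length, and the value.

```python
import struct
from typing import Dict, Optional, Tuple

# Hypothetical placeholder: the real option type is still TBA.
BFD_DISCRIMINATOR_OPTION = 0xFFFF


def parse_hello_options(payload: bytes) -> Tuple[Dict[int, bytes], Optional[int]]:
    """Return (accepted options by type, My Discriminator or None)."""
    options: Dict[int, bytes] = {}
    discriminator: Optional[int] = None
    offset = 0
    while offset + 4 <= len(payload):
        opt_type, opt_len = struct.unpack_from("!HH", payload, offset)
        value = payload[offset + 4 : offset + 4 + opt_len]
        offset += 4 + opt_len
        if len(value) < opt_len:
            break  # truncated option: nothing more can be parsed
        if opt_type == BFD_DISCRIMINATOR_OPTION:
            if opt_len != 4:
                # Malformed: stop here.  Options seen so far are kept and
                # later ones are dropped (one reading of "MUST stop").
                break
            (my_disc,) = struct.unpack("!I", value)
            if my_disc != 0:
                discriminator = my_disc
            # A zero discriminator is invalid: ignore only this option.
            continue
        options[opt_type] = value
    return options, discriminator
```

Under this reading the outcome depends on where the malformed option
sits in the Hello: anything placed after it is silently lost, which is
exactly the kind of side effect on other functionality that the text
should address.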


[nit] s/log the notification/log a notification


162	2.1.  Using P2MP BFD in PIM DR/BDR Monitoring
...
169	   If a PIM-SM router is configured to monitor the head by using p2mp
170	   BFD, referred to through this document as 'tail', receives PIM-Hello
171	   packet with BFD Discriminator PIM Hello option, the tail MAY create a
172	   p2mp BFD session of type MultipointTail, as defined in [RFC8562].

[minor] s/router is configured/router that is configured

[nit] s/receives PIM-Hello packet with BFD Discriminator/receives a
PIM-Hello packet with the BFD Discriminator


...
188	   If the head ceases to include the BFD Discriminator PIM Hello option
189	   in its PIM-Hello message, tails MUST close the corresponding
190	   MultipointTail BFD session.  Thus the tail stops using BFD to monitor
191	   the head and reverts to the procedures defined in [RFC7761] and
192	   [I-D.ietf-pim-dr-improvement].

[minor] "...MUST close the corresponding MultipointTail BFD session"

It might be good to add that the PIM state is not affected by
this action.


194	2.2.  P2MP BFD in PIM DR Load Balancing

196	   [RFC8775] specifies the PIM Designated Router Load Balancing (DRLB)
197	   functionality.  Any PIM router that advertises the DRLB-Cap Hello
198	   Option can become the head of a p2mp BFD session, as specified in
199	   Section 2.1.  The head router administratively sets the
200	   bfd.SessionState to Up in the MultipointHead session [RFC8562] only
201	   if it is a Group Designated Router (GDR) Candidate, as specified in
202	   Sections 5.5 and 5.6 of [RFC8775].  If the router is no longer the
203	   GDR, then it MUST shut down following the procedures described in
204	   Section 5.9 [RFC8562].  For each GDR Candidate that includes BFD
205	   Discriminator option in its PIM Hello, the PIM DR creates a
206	   MultipointTail session [RFC8562].  PIM DR demultiplexes BFD sessions
207	   based on the value of the My Discriminator field and the source IP
208	   address.  If PIM DR detects a failure of one of the sessions, it MUST
209	   remove that router from the GDR Candidate list and immediately
210	   transmit a new DRLB-List option.

[] Continuing with my theme of generalizing this specification...
This section says everything that the last section already specified
in a generic way.  IOW, it is not really needed.

There is one thing that this paragraph adds: "If the router is no
longer the GDR, then it MUST shut down following the procedures
described in Section 5.9 [RFC8562]."   Yes, shutting down the BFD
session is important, but so is not including the BFD Discriminator
option in the Hello anymore.  As with everything else, this part can
also be generalized:

   If the head is no longer serving the function that prompted it
   to be monitored, then it MUST cease including the BFD Discriminator
   PIM Hello option in its PIM-Hello message, and it MUST shut down
   the BFD session following the procedures described in Section 5.9
   [RFC8562].
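
For reference, the DR-side behavior quoted from §2.2 (demultiplexing on
My Discriminator plus source IP address, then removing a failed router
from the GDR Candidate list) can be sketched as below.  This is a rough
illustration only: DrMonitor and its members are hypothetical names,
and appending to drlb_lists_sent stands in for actually transmitting a
new DRLB-List option.

```python
from typing import Dict, List, Tuple

SessionKey = Tuple[str, int]  # (source IP address, My Discriminator)


class DrMonitor:
    """Hypothetical DR-side view of MultipointTail sessions."""

    def __init__(self) -> None:
        self.sessions: Dict[SessionKey, str] = {}  # key -> candidate address
        self.gdr_candidates: List[str] = []
        # Stand-in for transmitting DRLB-List options.
        self.drlb_lists_sent: List[List[str]] = []

    def add_candidate(self, address: str, discriminator: int) -> None:
        """Create a MultipointTail session for a GDR Candidate."""
        self.sessions[(address, discriminator)] = address
        if address not in self.gdr_candidates:
            self.gdr_candidates.append(address)

    def on_session_down(self, source_ip: str, discriminator: int) -> None:
        """A session failed: drop the candidate and send a new list."""
        candidate = self.sessions.pop((source_ip, discriminator), None)
        if candidate is None:
            return  # unknown session: nothing to do
        if candidate in self.gdr_candidates:
            self.gdr_candidates.remove(candidate)
        self.drlb_lists_sent.append(list(self.gdr_candidates))
```

The same structure would serve any other monitored function; only the
action taken in on_session_down is specific to DRLB.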


212	2.3.  Multipoint BFD Encapsulation

214	   The MultipointHead of a p2mp BFD session when transmitting BFD
215	   Control packet:

[nit] s/packet/packets


217	      MUST set TTL or Hop Limit value to 255 (Section 5 [RFC5881]);

[major] "MUST set...RFC5881"   This action is already required in an
RFC that this document depends on, please don't specify the behavior
again.  I understand that rfc5682 can be used in multi-hop scenarios,
but rfc5881 is the source here.    s/MUST/must


...
222	3.  IANA Considerations
...
227	   +=============+================+===================+===============+
228	   | Value Name  | Length Number  | Name Protocol     | Reference     |
229	   +=============+================+===================+===============+
230	   | TBA         | 4              | BFD Discriminator | This document |
231	   |             |                | Option            |               |
232	   +-------------+----------------+-------------------+---------------+

[major] Please use the same field names as in the registry: Value,
Length, Name, Reference

[EoR -06]