[pim] AD Review of draft-ietf-pim-drlb-10

Alvaro Retana <aretana.ietf@gmail.com> Fri, 21 June 2019 16:46 UTC

Return-Path: <aretana.ietf@gmail.com>
X-Original-To: pim@ietfa.amsl.com
Delivered-To: pim@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 090DE120124; Fri, 21 Jun 2019 09:46:45 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.995
X-Spam-Level:
X-Spam-Status: No, score=-1.995 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, NORMAL_HTTP_TO_IP=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Fo6EduIOAA4v; Fri, 21 Jun 2019 09:46:38 -0700 (PDT)
Received: from mail-ed1-x536.google.com (mail-ed1-x536.google.com [IPv6:2a00:1450:4864:20::536]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 7C861120073; Fri, 21 Jun 2019 09:46:34 -0700 (PDT)
Received: by mail-ed1-x536.google.com with SMTP id p15so10940869eds.8; Fri, 21 Jun 2019 09:46:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:mime-version:date:message-id:subject:to:cc; bh=d38fJypYvSdxArlmK5x1Hl9rouRw3ehzHc8TEpSxWvA=; b=b+rNRbuTtC/OGh9ySsgeJ81VbKQhmfjfwg7K1iMqTo/qZXuQqmGsWu1G5Dtl0s74b5 F+CYE2rc30xYaN/Q2+1yzLUQEy8i5BqVTCCsZdlTlbSy2yaEJVyBcqXramFIqxaqaQR9 FO5TSb+szO4rwSVY8nA3LMAj5ShHuo9ojEFtANJDQ+Hl9j/EqQKNBXJ1xkxXDmPKXvhw eLUayRxRdFuI+cpDtfvZegodwYMOc6tmxwXjV/BQuVcPRXMHuA4jpHwrs0vrhkxE+oB0 uZJyR5zZa/P9BKJsAGaJi+k7T3auxexBGYsoISGwvMkO9AyoEYTbHkFCiSuD8QEMsquk 8CkA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:mime-version:date:message-id:subject:to:cc; bh=d38fJypYvSdxArlmK5x1Hl9rouRw3ehzHc8TEpSxWvA=; b=nxG36Cyx3nrTJKvi+6XDUMpmL+6S6VF5aItrmr9N91+JCWi1bzyB3sLBpRcZoIVxNa J1OY9KZZ7g4hEtspxanF471rDHW8z0w+LjDKr1gRwWHvBAxBBXAmUKRFSKq89GwcB8Sq YoqCZ8jfy7wFm2qjSKVcwZC7GsrOtRWBlBds+FB7/h6JSgFIqzM0Dl/jtHXF5RATqiuM 7Mf2/KmSLRsZFyEt8XtoF/o0xm42zCQBzzQKmGSRDcJv+/SIqXzLGpVYti3ol/IQBIBq WODT+tuGUdU2SPi5MkNJw/SfQOl2B/PBJxbaxZnNsvwi1OxFiMoZJh7cxp7gTwAtIB/L fTNw==
X-Gm-Message-State: APjAAAV6BNsSYd7ehoPy7PDoxX6694bX8VumzYyGu2/ovahVdNWv0oew +2z7s034jFsV1KxxbsCt7F2BZZ6xMbMAMElU2upTsA==
X-Google-Smtp-Source: APXvYqw9hid60tbXwUzMVTLXUcxiKzJ8tCTYu/ObpQ4xW4vtCi9TA9H17Hx+GmMvQY2g1tkksRnmCaxEIqHGA1RTDNw=
X-Received: by 2002:a17:906:892:: with SMTP id n18mr91809188eje.10.1561135591994; Fri, 21 Jun 2019 09:46:31 -0700 (PDT)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Fri, 21 Jun 2019 12:46:30 -0400
From: Alvaro Retana <aretana.ietf@gmail.com>
MIME-Version: 1.0
Date: Fri, 21 Jun 2019 12:46:30 -0400
Message-ID: <CAMMESszYeVa9_2EMQjNCUP-8PzK3g_Zbv0cQ5vi7nA8j4Kp9Fw@mail.gmail.com>
To: draft-ietf-pim-drlb@ietf.org
Cc: Mike McBride <mmcbride7@gmail.com>, pim-chairs@ietf.org, pim@ietf.org
Content-Type: multipart/alternative; boundary="000000000000683a04058bd83436"
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/8eyqWap1Xnohlp_ZAhvCZ3Qttw4>
X-Mailman-Approved-At: Sat, 22 Jun 2019 08:08:49 -0700
Subject: [pim] AD Review of draft-ietf-pim-drlb-10
X-BeenThere: pim@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Protocol Independent Multicast <pim.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/pim>, <mailto:pim-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/pim/>
List-Post: <mailto:pim@ietf.org>
List-Help: <mailto:pim-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/pim>, <mailto:pim-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 21 Jun 2019 16:46:45 -0000

Dear authors:

First of all, thank you for reviving this work!!

I just finished reading the document and have significant concerns (please
see details in line).  In fact, I am pointing at the same general issue
that the original AD review [1] did: under-specification.  Adding text in
6.2 has resulted in behavior being specified (inconsistently!) more than
once.

Also, the specification of the Hash Algorithm is included in §4 (Functional
Overview).  Not being in an appropriately named section is significant.
Please move this part to §6 (Protocol Specification), or maybe its own
section.

Even though there is much work needed, I don't want to return this document
to the WG and risk more delay.  While some major issues will require
changes, I think that many of them are about the structure of the
document.  If needed, I will rely on Mike (Chair/Shepherd) to coordinate
further WG review.

Thanks!

Alvaro.

[1] https://mailarchive.ietf.org/arch/msg/pim/K61q4--5ZBb9RTMkeud-5MnmuOw


[Line numbers from id-nits.]


...
90 1.  Introduction

92   On a multi-access LAN such as an Ethernet, one of the PIM routers is
93   elected as a DR.  The PIM DR has two roles in the PIM-SM protocol.
94   On the first hop LAN, the PIM DR is responsible for registering an
95   active source with the Rendezvous Point (RP) if the group is
96   operating in PIM-SM.  On the last hop LAN, the PIM DR is responsible
97   for tracking local multicast listeners and forwarding to these
98   listeners if the group is operating in PIM-SM.

[minor] The first reference to rfc7761 is below, when talking about DR
election.  Consider moving that reference to this first paragraph so that
the stage is set early.

[nit] I know DR was extended in the Abstract, the acronym is used above and
the extended version again below...  Please extend DR above and use the
contracted form elsewhere.


...
114   Assume R1 is elected as the Designated Router.  According to
115   [RFC7761], R1 will be responsible for forwarding traffic to that LAN
116   on behalf of any local members.  In addition to keeping track of IGMP
117   and MLD membership reports, R1 is also responsible for initiating the
118   creation of source and/or shared trees towards the senders or the
119   RPs.

[minor] We need Informative references to IGMP and MLD.

121   Forcing sole data plane forwarding responsibility on the PIM DR
122   uncovers a limitation in the protocol.  In comparison, even though an
123   OSPF DR or an IS-IS DIS handles additional duties while running the
124   OSPF or IS-IS protocols, they are not required to be solely
125   responsible for forwarding packets for the network.  On the other
126   hand, on a last hop LAN, only the PIM DR is asked to forward packets
127   while the other routers handle only control traffic (and perhaps drop
128   packets due to RPF failures).  Hence the forwarding load of a last
129   hop LAN is concentrated on a single router.

[minor] "In comparison..."  I think that comparing the PIM DR to an OSPF
DR/IS-IS DIS is completely out of place since the justification for this
work is to share forwarding responsibilities...and the IGP DR/DIS are not
responsible for forwarding at all!  IOW, the comparison doesn't make sense.

[minor] RPF failures are only mentioned here.  Either expand on them and
explain why they are relevant to this work, or remove the text to avoid
confusion.

131   This leads to several issues.  One of the issues is that the
132   aggregated bandwidth will be limited to what R1 can handle towards
133   this particular interface.  It is very common that the last hop LAN
134   consists of switches that run IGMP/MLD or PIM snooping.  This allows
135   the forwarding of multicast packets to be restricted only to segments
136   leading to receivers who have indicated their interest in multicast
137   groups using either IGMP or MLD.  The emergence of the switched
138   Ethernet allows the aggregated bandwidth to exceed, sometimes by a
139   large number, that of a single link.  For example, let us modify
140   Figure 1 and introduce an Ethernet switch in Figure 2.

[nit] "this particular interface"  Which interface?  I'm sure you mean the
one to the core networks...but please be explicit.

[minor] "IGMP/MLD or PIM snooping"  We need a reference.


...
184   There is a limitation in the hash algorithm used in this document,
185   but this document provides the option to have different and more
186   consistent hash algorithms in the future.

[minor] I think that the limitations discussion should be left to §4.3.1.
IOW, I suggest taking this paragraph out.


...
193 2.  Terminology

195   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
196   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
197   document are to be interpreted as described in [RFC2119].

[major] Please use the rfc8174 template.


...
204   o  GDR: GDR stands for "Group Designated Router".  For each multicast
205      flow, either a (*,G) for ASM, or an (S,G) for SSM, a hash
206      algorithm (described below) is used to select one of the routers
207      as a GDR.  The GDR is responsible for initiating the forwarding
208      tree building process for the corresponding multicast flow.

[nit] s/GDR: GDR stands for "Group Designated Router"./GDR: Group
Designated Router.

[minor] ASM needs to be expanded

210   o  GDR Candidate: a last hop router that has the potential to become
211      a GDR.  A GDR Candidate must have the same DR priority and must
212      run the same GDR election hash algorithm as the DR router.  It
213      must send and process new PIM Hello Options as defined in this
214      document.  There might be more than one GDR Candidate on a LAN,
215      but only one can become GDR for a specific multicast flow.

[major] This paragraph uses "must" several times.  Should that be Normative
language?

[??] Why is it a requirement to have the same DR priority?  It seems to me
that there would be more options and even better load sharing if more
routers could be GDR.


...
229 4.  Functional Overview
...
256   Hash Masks are defined for Source, Group and RP separately, in order
257   to handle PIM ASM/SSM.  The masks, as well as a sorted list of GDR
258   Candidate Addresses, are announced by the DR in a new DR Load
259   Balancing GDR (DRLBGDR) PIM Hello Option.

[nit / personal opinion] The abbreviations for the new Options are awful --
I keep having to think about the meaning and whether they are spelled
correctly at every step.  Perhaps something reader-friendly acronyms would
be better.  Suggestions:

- DR Load Balancing Capability PIM Hello Option: LB Option
- DR Load Balancing GDR PIM Hello Option: GDR List Option (maybe even GL)

261   A hash algorithm based on the announced Source, Group, or RP masks
262   allows one GDR to be assigned to a corresponding multicast state.
263   And that GDR is responsible for initiating the creation of the
264   multicast forwarding tree for multicast traffic.

[major] Is there a possibility that the assigned GDR for a specific
multicast state can't fulfill its duties?  This document assumes that all
the GDRs are able to service all states...but that may not be true.
Because every GDR candidate calculates who the GDR is for a specific state,
it may never know what an alternate GDR is not able to forward traffic.
 [Does this make sense?]

266 4.1.  GDR Candidates

268   GDR is the new concept introduced by this specification.  GDR
269   Candidates are routers eligible for GDR election on the LAN.  To
270   become a GDR Candidate, a router MUST support this specification,
271   have the same DR priority and run the same GDR election hash
272   algorithm as the DR on the LAN.

[nit] There is a lot of repetition in the text...for example, all the
contents on this paragraph were already mentioned in the last section...
Cleaning up the text would be nice.

[major] "To become a GDR Candidate, a router MUST support this
specification..."  First of all, this is an obvious statement: the GDR is
defined in this document.  Second, there's no Normative value in using MUST.


...
289 4.2.  Hash Mask and Hash Algorithm

291   A Hash Mask is used to extract a number of bits from the
292   corresponding IP address field (32 for v4, 128 for v6) and calculate
293   a hash value.  A hash value is used to select a GDR from GDR
294   Candidates advertised by PIM DR.  For example, 0.0.255.0 defines a
295   Hash Mask for an IPv4 address that masks the first, the second, and
296   the fourth octets.

[minor] s/v4/IPv4 s/v6/IPv6


...
306   The hash masks need to be configured on the PIM routers that can
307   potentially become a PIM DR, unless the implementation provides
308   default Hash Mask values.  An implementation SHOULD provide masks
309   with default values 255.255.255.255 (IPv4) and
310   FFFF:FFFF:FFFF:FFFF:FFFFF:FFFF:FFFF:FFFF (IPv6).

[minor] Can you provide any guidance on how to configure the hash masks?
The example below shows a mask of 0.0.255.0, which seems both "unnatural"
(for the average person used to set masks) and very specific (in order for
N to have a specific value).  Maybe it's just me, but I would have set the
mask to something similar to 255.255.0.0/0.0.255.255.  Guidance should be
added to §8 (Manageability Considerations).

[major] What does "provide" mean in this context?  I guess that it means
that if no hash mask is configured, then the default should be used.  Is
that correct?

[major] When would an implementation not provide those masks?  IOW, why
aren't you using MUST?

312   o  If the group is in ASM mode and the RP Hash Mask announced by the
313      PIM DR is not 0, calculate the value of hashvalue_RP [Section 4.3]
314      to determine GDR.

[major] "the RP Hash Mask...is not 0"  By 0 you mean 0.0.0.0 and ::,
right?  Please be specific!  The same is used in other places.

[nit] s/determine GDR/determine the GDR/g


...
323   A simple Modulo hash algorithm is defined in this document.  However,
324   to allow another hash algorithms to be used, a 1-octet "Hash
325   Algorithm" field is included in DRLBC Hello Option to specify the
326   hash algorithm used by a last hop router.

[nit] s/included in DRLBC Hello Option/included in the DRLBC Hello Option


...
332 4.3.  Modulo Hash Algorithm

[major] The specification of the algorithm doesn't belong in the
"Functional Overview" section.

334   The Modulo hash algorithm is discussed here with a detailed
335   description on hashvalue_RP.  The same algorithm is described in
336   brief for hashvalue_Group using the group address instead of the RP
337   address for an ASM group with zero RP_hashmask, and also with
338   hashvalue_SG for a the source address of an (S,G), instead of the RP
339   address,

[minor] "zero RP_hashmask" is used without prior definition, or a forward
reference

341   o  For ASM groups, with a non-zero RP_Hash Mask, hash value is
342      calculated as:

[minor] "non-zero RP_Hash Mask" is used without prior definition, or a
forward reference

[minor] "RP_Hash Mask", "RP Hash Mask" and "RP_hashmask" are used.  Are
they meant to represent the same thing?  Please be consistent.

344         hashvalue_RP = (((RP_address & RP_hashmask) >> N) & 0xFFFF) % M

[major] The notation used should be explained somewhere.  Note that the
examples below help infer what's going on, but those are examples...I'm
concerned with the *specification* part of this document.


...
382         Compare hashvalue_Group with Ordinal number assigned to Router
383         X, to decide if Router X is the GDR.

[nit] s/Compare hashvalue_Group with Ordinal number/Compare the
hashvalue_Group with the Ordinal number

385   o  For SSM groups, a hash value is calculated using both the Source
386      and Group Hash Mask:

388         hashvalue_SG = ((((Source_address & Source_hashmask) >> N_S) &
389         0xFFFF) ^ (((Group_address & Group_hashmask) >> N_G) & 0xFFFF))
390         % M

[major] Again, the notation must be explained somewhere.

[minor] An example for this more complicated formula would be nice.  In
fact, it may be a good idea to first define the algorithm and then (in a
sub-section) show examples.

392 4.3.1.  Limitations

394   The Modulo Hash Algorithm has poor failover characteristics when a
395   shared LAN has more than two GDRs.  In the case of more than two GDRs
396   on a LAN, when one GDR fails, all of the groups may be reassigned to
397   a new GDR, even if they were not assigned to the failed GDR.
398   However, many deployments use only two routers on a shared LAN for
399   redundancy purposes.  Future work may define new hash algorithms
400   where only groups assigned to the failed GDR get reassigned.

[minor] s/reassigned to a new GDR/reassigned to a single GDR

402 4.4.  PIM Hello Options

404   When a last hop PIM router sends a PIM Hello for an interface with
405   this specification enabled, it includes a new option, called "Load
406   Balancing Capability (DRLBC)".

[nit] s/sends a PIM Hello for an interface/sends a PIM Hello out an
interface

408   Besides this DRLBC Hello Option, the elected PIM DR also includes a
409   new "DR Load Balancing GDR (DRLBGDR) Hello Option".  The DRLBGDR
410   Hello Option consists of three Hash Masks as defined above and also a
411   sorted list of GDR Candidate addresses on the last hop LAN.

413   The elected PIM DR uses DRLBC Hello Option advertised by all routers
414   on the last hop LAN to compose the DRLBGDR Option.  The GDR
415   Candidates use the DRLBGDR Hello Option advertised by the PIM DR to
416   calculate the hash value.

[?] I couldn't figure out from rfc7761 whether all Options have to be
included in all Hellos, all the time. Do they?

[major]  It looks like the DR can change the contents of the DRLBGDR if a
new GDR comes into the network, or one leaves, but there is no
specification related to when the DR should reconsider the contents of the
DRLBGDR.  This specification would belong somewhere in §6.

418 5.  Hello Option Formats

420 5.1.  PIM DR Load Balancing Capability (DRLBC) Hello Option

422      0                   1                   2                   3
423      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
424     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
425     |           Type = TBD          |         Length = 4            |
426     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
427     |                     Reserved                  |Hash Algorithm |
428     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

[major] Please use OptionType and OptionLength [rfc7761].

430          Figure 3: Capability Hello Option

[minor] The title doesn't reflect the full name (or even DRLB)...

432      Type: TBD

[major] Even though the assignment is temporary, please include here the
values.

434      Length: 4

436      Hash Algorithm: 0 for Modulo

[minor] Even though 0 is the only value defined, that is not the definition
of this field.

438   This DRLBC Hello Option MUST be advertised by last hop routers on
439   interfaces with this specification enabled.

441 5.2.  PIM DR Load Balancing GDR (DRLBGDR) Hello Option

443     0                   1                   2                   3
444     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
445     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
446     |           Type = TBD          |         Length                |
447     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
448     |                            Group Mask                         |
449     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
450     |                            Source Mask                        |
451     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
452     |                            RP Mask                            |
453     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
454     |                    GDR Candidate Address(es)                  |
455     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

[major] Please use OptionType and OptionLength [rfc7761].

457          Figure 4: GDR Hello Option

[minor] The title doesn't reflect the name being used (DRLBGDR) for this
option.

459      Type: TBD

[major] Even though the assignment is temporary, please include here the
values.

461      Length: (3 + n) x (4 or 16) where n is the number of GDR
462      candidates.

[nit] s/(4 or 16) where/(4 or 16), where

[minor] I know that rfc7761 says that the OptionLength's unit is bytes, but
it wouldn't hurt repeating it here.

464      Group Mask (32/128 bits): Mask

466      Source Mask (32/128 bits): Mask

468      RP Mask (32/128 bits): Mask

[major] All the descriptions are the same!  §4.2 also lists the masks, but
they are not explicitly described anywhere.  Describe them in either
section and point to the other.

470         All masks MUST be in the same address family as the Hello IP
471         header.

[minor] It sounds relatively obvious that the masks and addresses would use
the same AF...maybe a short heads up at the start of this section would
make it even clearer.

473      GDR Address (32/128 bits): Address(es) of GDR Candidate(s)

[major] I think that the DRs own address must be advertised in this
list...but that is not specified.  If it is not advertised, then it would
not forward any traffic.

475         All addresses must be in the same address family as the Hello
476         IP header.  The addresses are sorted in descending order.  The
477         order is converted to the ordinal number associated with each
478         GDR candidate in hash value calculation.  For example, if
479         addresses advertised are R3, R2, R1, the ordinal number
480         assigned to R3 is 0, to R2 is 1 and to R1 is 2.

[major] "All addresses must be in the same address family as the Hello IP
header."  Should that be a MUST (to match the statement about the masks)?
Perhaps this can be covered in the text at the start of this section.

[major] What should a receiver do if the masks don't match the IP header?
Ignore the option?

[major] "The addresses are sorted in descending order." Should this be
Normative?  Does the ordering (ascending/descending/random) really matter?
It seems to me that because the DR is the only one making the advertisement
then it could do whatever and the hash would still work, am I missing
something?

482         If the "Interface ID" option, as specified in [RFC6395], is
483         present in a GDR Candidate's PIM Hello message, and the "Router
484         ID" portion is non-zero:

[minor] rfc6395 uses "Router Identifier" (and not "Router ID").

486         +  For IPv4, the "GDR Candidate Address" will be set directly
487            to the "Router ID".

489         +  For IPv6, the "GDR Candidate Address" will be set to the
490            IPv4-IPv6 translated address of the "Router ID", as
491            described in [RFC4291], that is the "Router-ID" is appended
492            to the prefix of 96 bits of zeroes.

[major] "IPv4-IPv6 translated address"  I couldn't find that in rfc4291,
but I think you mean the "IPv4-Compatible IPv6 Address" (§2.5.5.1), right?
If so, rfc4291 says that this type of addresses is deprecated.

It seems to me that the purpose of the representation is not to use the
result as an address, but to represent the ID advertised by the router.  If
so, then describing what is wanted, and taking the reference to rfc4291 out
would be enough.

OTOH, why is this extra step/check needed?  The DR election [rfc7761]
simply uses the source address of the Hello message.

494         If the "Interface ID" option is not present in a GDR
495         Candidate's PIM Hello message, or if the "Interface ID" option
496         is present but the "Router ID" field is zero, the "GDR
497         Candidate Address" will be the IPv4 or IPv6 source address of
498         the PIM Hello message.

500         This DRLBGDR Hello Option MUST only be advertised by the
501         elected PIM DR.

[nit] s/This DRLBGDR Hello Option/The DRLBGDR Hello Option

[major] What should a router do if the DRLBGDR Hello Option is received
from a non-DR?

503 6.  Protocol Specification

505 6.1.  PIM DR Operation

507   The DR election process is still the same as defined in [RFC7761].  A
508   DR that has this specification enabled on an interface advertises the
509   new DRLBGDR Hello Option, which contains mask values from user
510   configuration, followed by a sorted list of GDR Candidate Addresses,
511   from the highest value to the lowest value.  Moreover, same as non-DR
512   routers, the DR also advertises DRLBC Hello Option to indicate its
513   capability of supporting this specification and the type of its GDR
514   election hash algorithm.

[minor] In several places the "has this specification enabled" or "support
this specification" phrases are used.  While it is relatively ok, it is
also unnecessary: this document is precisely specifying what to do if you
support it.  It would be great/better if you could find a different way to
refer to "this specification"; maybe saying something like "supports DR
Load Balancing"...or simply avoid requiring support (as it is implicit).

[major] If multiple algorithms exist, and the DR has support for several,
is the DR expected to use the algorithm that results on the most candidate
GDRs?  Should that consideration be a configuration issue?

516   If a PIM DR receives a PIM Hello with the DRLBGDR Option, the PIM DR
517   SHOULD ignore the TLV.

[major] When would the DR not ignore the DRLBGDR Option?  If it is not
ignored, what should it do?  IOW, why is that not a MUST?

519   If a PIM DR receives a neighbor DRLBC Hello Option, which contains
520   the same hash algorithm as the DR, and the neighbor has the same DR
521   priority as the DR, PIM DR SHOULD consider the neighbor as a GDR
522   Candidate and insert the GDR Candidate's Address into the sorted list
523   of the DRLBGDR Option.  However, the DR MAY have policies limiting
524   which GDR Candidates, or the number of GDR Candidates to include.

[major] I think that the "MAY" doesn't really have Normative value in this
document because it is not specifying a behavior, it is just stating a
fact.  s/MAY/may

526 6.2.  PIM GDR Candidate Operation

...
534   If this specification is enabled on an interface, the router MUST
535   include the DRLBC Hello Option in its PIM Hello on the interface.
536   Note that the presence of the DRLBC Option in PIM Hello does not
537   guarantee that this router would be considered as a GDR candidate.
538   Once DR election is done, the DRLBGDR Hello Option would be received
539   from the current PIM DR on the link which would contain a list of
540   GDRs selected by the PIM DR.

[minor] s/contain a list of GDRs/contain a list of GDR candidates

542   A router only acts as a GDR candidate if it is included in the GDR
543   list of the DRLBGDR Hello Option.

[major] Should the receiving router verify that the DR announced the same
hash algorithm in its DRLBC?  What should happen if the receiver's address
is listed in the DRLBGDR, but the advertised hash algorithms don't match?
Even though it is specified in several places that the hash algorithms must
be the same, there's a possibility (bug, attack) that the algorithms don't
match.  If the DRLBGDR is used (when no match in the algorithms is
verified), then it may result in an unexpected state...  [See also my
comments in the Security Considerations section.]


...
553   If the PIM DR does not support this specification, GDR election will
554   not take place, and only the PIM DR joins the multicast tree.

[minor] This last paragraph is not needed as §7 already addresses this case.

556 6.2.1.  Router Receives New DRLBGDR

558   The first time a router receives a DRLBGDR option from the PIM DR, it
559   MUST process the option and check if it is in the GDR list.

[major] "The first time...MUST process..."  What about other times?  It
looks like §6.2.2 addresses this question.  It seems to me that this
document would be clearer if a single (for example) "Receiving a DRLBGDR"
section existed and the specification didn't make this unnecessary
distinction...

[major] Note that the first bullet (below) says that no action is
needed...which seems in contradiction with "MUST process" (above),
specially because the second bullet talks about "MUST process"...  Instead
of using Normative language in the header, use it when talking about
specific actions.

561   1.  If a router is not listed as a GDR candidate in DRLBGDR, no
562       action is needed.

[minor] s/in DRLBGDR/in the DRLBGDR/g   Note that sometimes "DRLBGDR
option" or "DRLBGDR Hello Option" are used...or simply "DRLBGDR" as in this
case.  Please be consistent.

564   2.  If a router is listed as a GDR candidate in DRLBGDR, then it MUST
565       process each of the groups, or source and group pairs if SSM, in
566       the IGMP/MLD reports.  The masks are announced in the PIM Hello
567       by the DR in the DRLBGDR Hello Option.  For each group in the
568       reports that is in ASM mode, and each source and group pair if
569       the group is in SSM mode, it (PIM Router) needs to run the hash
570       algorithm (described in section 4.3) based on the announced
571       Source, Group or RP masks to determine if it is the GDR for
572       specified group, or source and group pair.  If the hash result is
573       to be the GDR for the multicast flow, it does build the multicast
574       forwarding tree.  If it is not the GDR for the multicast flow, no
575       action is needed.

[major] It seems that "MUST process each of the groups" is equivalent to
"needs to run the hash algorithm", is that correct?  Please don't repeat
information and be clear about the actions.  "process" is not clear
enough...I think that "run the algorithm" may not be precise enough
either.  Maybe something along the lines of "MUST use the GDR selection
algorithm...according to the process in §4.3)" -- I'm sure you can come up
with better text.

577 6.2.2.  Router Receives Updated DRLBGDR

[major] The same comments from above about being explicit and using
Normative language appropriately apply to the text in this section.

...
591          If it was the GDR for a group, or source and group pair if
592          SSM, and the new hash result chose it as the GDR, then no
593          processing is required.

595          If it was the GDR for a group, or source and group pair if
596          SSM, earlier and now it is no longer the GDR, then it sets its
597          assert metric for the multicast flow to be
598          (PIM_ASSERT_INFINITY - 1), as explained in Section 6.3.

[major] Some of the language used to mean similar things is different,
making the specification imprecise.  For example, just in the last two
paragraphs: "the new hash result chose it as the GDR" vs "it is no longer
the GDR"; no longer being a GDR is the result of running the algorithm...

[minor] Where is PIM_ASSERT_INFINITY defined?  We need a reference.  I
looked at rfc7761 and found infinite_assert_metric() -- is it the same
thing?  Also, I found one other RFC (rfc7910) that uses
PIM_ASSERT_INFINITY, but didn't see it defined there either. :-(

[major] §6.3 uses MUST to describe the behavior above...but no Normative
language is used here.

[minor] §6.3 presents the justification to use PIM_ASSERT_INFINITY -
1...but it also seems to want to specify the behavior.  Please specify the
behavior only once.

600          If it was not the GDR for a group, or source and group pair if
601          SSM, earlier, and the new hash does not make it GDR, then no
602          processing is required.

[nit] It looks like the explanation of the process could be simplified...
 "if no change in the local state..."


...
628 6.3.  PIM Assert Modification

630   It is possible that the identity of the GDR might change in the
631   middle of an active flow.  Examples when this could happen include:

633      When a new PIM router comes up

635      When a GDR restarts

[minor] Do you mean a restart as in a graceful restart...or simply the GDR
going down?

637   When the GDR changes, existing traffic might be disrupted.
638   Duplicates or packet loss might be observed.  To illustrate the case,
639   consider the following scenario where there are two flows G1 and G2.
640   R1 is the GDR for G1, and R2 is the GDR for G2.  When R3 comes up
641   online, it is possible that R3 becomes GDR for both G1 and G2, hence
642   R3 starts to build the forwarding tree for G1 and G2.  If R1 and R2
643   stop forwarding before R3 completes the process, packet loss might
644   occur.  On the other hand, if R1 and R2 continue forwarding while R3
645   is building the forwarding trees, duplicates might occur.

[nit] s/R3 comes up online/R3 comes up

647   This is not a typical deployment scenario but might still happen.
648   Here we describe a mechanism to minimize the impact.  We essentially
649   want to minimize packet loss.  Therefore, we would allow a small
650   amount of duplicates and depend on PIM Assert to minimize the
651   duplication.

[nit] "This is not a typical deployment scenario but might still happen."
Sounds very possible to me...

[style nit] Some parts of the text are written using "we"...please don't
write in the first person.  Instead use the 3rd person.  For example:
s/Here we describe a mechanism to minimize the impact./This section
describes a mechanism...


...
658   With the introduction of GDR, the following modification to the
659   Assert packet MUST be done: if a router enables this specification on
660   its downstream interface, but it is not a GDR (before network event
661   it was GDR), it would adjust its Assert metric to
662   (PIM_ASSERT_INFINITY - 1).

[minor] The text above is sloppy for a Normative statement.  It is not
clear, until "before network event" is mentioned that the statement doesn't
apply to any router that is not a GDR.  As I mentioned in §6.2.2, please
specify the behavior only once!

664   Using the above example, for G1, assume R1 and R3 agree on the new
665   GDR, which is R3.  R1 will set its Assert metric as
666   (PIM_ASSERT_INFINITY - 1).  That will make R3, which has normal
667   metric in its Assert as the Assert winner.

[minor] Examples are good.  In this case, I think that dividing the example
with normative paragraphs in between may make it confusing.  Instead,
define the behavior first and then provide a worked out example.  Maybe
referring back to the Figures would help...

669   For G2, assume it takes a slightly longer time for R2 to find out
670   that R3 is the new GDR and still considers itself being the GDR while
671   R3 already has assumed the role of GDR.  Since both R2 and R3 think
672   they are GDRs, they further compare their metric and IP addresses.
673   If R3 has the better routing metric, or the same metric but a better
674   tie-breaker, the result will be consistent during GDR selection.  If
675   unfortunately, R2 has the better metric or the same metric but a
676   better tie-breaker, R2 will become the Assert winner and continues to
677   forward traffic.  This will continue until:

679   The next PIM Hello Option from DR selects R3 as the GDR.  R3 will
680   then build the forwarding tree and send an Assert.

[minor] This last part of the example seems to consider a corner case in
which R2 finds out later about R3 being the GDR -- I'm not sure how this is
possible if R1 and R2 are on the same LAN, which is why I cal it a corner
case...and it just seems to help confuse the reader.

[minor] Note also a couple of details:

(1) "PIM Hello Option from DR selects R3 as the GDR": Which option?  The
option itself doesn't select anyone, it just provides the information so
others can.

(2) "R3 will then build the forwarding tree and send an Assert."  Yes, but
until R2 uses PIM_ASSERT_INFINITY - 1 it will keep winning.  Ahh..I see
this is covered next...

682   The process continues until R2 agrees to the selection of R3 as the
683   GDR, and sets its own Assert metric to (PIM_ASSERT_INFINITY - 1),
684   which will make R3 the Assert winner.  During the process, we will
685   see intermittent duplication of traffic but packet loss will be
686   minimized.  In the unlikely case that R2 never relinquishes its role
687   as GDR (while every other router thinks otherwise), the proposed
688   mechanism also helps to keep the duplication to a minimum until
689   manual intervention takes place to remedy the situation.

[major] Why would R2 never relinquish its role?  I can imagine the corner
case you identified here where R2 doesn't receive R1s Hello...but that
could result in even R2 thinking it is the DR.  Is there a chance that the
interpretation of the algorithm could end up producing inconsistent
results?  I haven't thought about this too much, but it seems that the only
possibility is if the candidate GDRs have a different view of the
(*,G)/(S,G) state:  for example, if they think the RP is different for a
specific group...but I don't know how likely that is?  Is there an
opportunity for someone upstream to alter the information so that the local
LAN don't consistently converge on who the GDRs should be?

Is this a potential security issue?  Am I being paranoid?

691 7.  Compatibility

[minor] s/Compatibility/Backward Compatibility


...
704 8.  Manageability Considerations

706   Only the routers announcing the same Hash Algorithm as the DR would
707   be considered as GDR candidates.  Network administrators need to make
708   sure that the desired set of routers announce the same algorithm.
709   Migration between different algorithms is not considered in this
710   document.

[minor] What are the considerations the operator should take into account
when configuring/selecting which routers are GDR candidates?

[minor] Please take a look at rfc5706 (Guidelines for Considering
Operations and Management of New Protocols and Protocol Extensions) for
other considerations for this section.


712 9.  IANA Considerations

714   IANA has temporarily assigned type 34 for the PIM DR Load Balancing
715   Capability (DRLBC) Hello Option, and type 35 for the PIM DR Load
716   Balancing GDR (DRLBGDR) Hello Option in the PIM-Hello Options
717   registry.  IANA is requested to make these assignments permanent when
718   this document is published as an RFC.  The string TBD should be
719   replaced by the assigned values accordingly.

[nit] The last sentence is not needed.

[minor] The registry currently says that for the DRLBC the length is
variable.  That needs to be updated in this document.


...
735 9.2.  Assignment of new hash algorithms

737   Assignment of new hash algorithms is done according to the "IETF
738   Review" model, see [RFC5226].

[major] rfc5226 has been declared Obsolete by rfc8126.

740 10.  Security Considerations

742   Security of the new DR Load Balancing PIM Hello Options is only
743   guaranteed by the security of PIM Hello messages, so the security
744   considerations for PIM Hello messages as described in PIM-SM
745   [RFC7761] apply here.

[major] I think that the new functionality includes a couple of new
vulnerabilities...if a router is subverted:

- the DR may not include all the appropriate GDRs in the list...or may
assign a hash that doesn't result in appropriate election of GDRs...or may
even advertise an unsupported Hash Algorithm to avoid others from even
being candidates...

- the candidate GDRs may ignore the advertisements and start forwarding
anyway (resulting in duplication)...or could not take a group over
(resulting in black holes)...

I know that the general concept of a forged Hello is covered in
rfc7761...but I think it is important to talk about vulnerabilities
specific to this document.  If anything, it shows that the consequences
have been considered.