Re: [pim] AD Review of draft-ietf-pim-drlb-10

Stig Venaas <> Fri, 11 October 2019 21:59 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 4A7D912010C for <>; Fri, 11 Oct 2019 14:59:00 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -1.898
X-Spam-Status: No, score=-1.898 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_NONE=0.001] autolearn=ham autolearn_force=no
Authentication-Results: (amavisd-new); dkim=pass (2048-bit key)
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id yK0TF4LoiNv2 for <>; Fri, 11 Oct 2019 14:58:56 -0700 (PDT)
Received: from ( [IPv6:2a00:1450:4864:20::531]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id 623CA12010E for <>; Fri, 11 Oct 2019 14:58:55 -0700 (PDT)
Received: by with SMTP id v8so9976448eds.2 for <>; Fri, 11 Oct 2019 14:58:55 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc:content-transfer-encoding; bh=vr2k8IoNRlZBKmUesJLcB4XWcC3yZwoEgcC87B2MLO4=; b=URqcsWsPbZTV67l/3edONkz9kOuoCgFfgmiVYxvOQxf1jdQSYSG9bI0YbDwS6chFWN x/h4dmqDG3G7Irvy8BjrnklVNumCScwwF9inz/JadP/wjDu+Ox8ZkSvutq5H8PkJzbn7 bb6UtLOk2br4ddECVnkE4JykFNaSRcwtzUUOrCwCaHLH198up6kxcPq8m/zRmyXexvFx gxmb/xZvpEAbCYTS/pmFOY2t0kcBkQgk+iFygTgNQz9Q7bdK+d83jA58CzMJ0ovHgiLI NpuCh8MUOR1/tYpdwZl2Kop42ugdHrlPl/FARQQS1ya53tJDori25KGsG9/n+l6InOid GRmA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc:content-transfer-encoding; bh=vr2k8IoNRlZBKmUesJLcB4XWcC3yZwoEgcC87B2MLO4=; b=CdSsVJ2L7MnvF/hBLfE1CTlAKQC0zEc70T+WbGIPH9B6F62wGaRQ0zuHUYVniAy418 6wWIRo8e/djSH7y1OJs7fQNMgu5k1ZWKWVmOfVBOi3FCbrpEaQHh2cT7D5Xj5p6z38Ks XJ4YxbOPBzf1pWdAuTQOZinMh4tpQiVwT4DWK83vCEuRkO7LG1fGJwgdQFOChua8X813 Oq3oBlXBWhu0+Im9K4mxOV/BJF3Lsf9WZFhwbREltfVRWpl/DP9alIy0ssmCOUzUwenl 23TmnUFY3YfqPaUGp3UxMBD7dPWSRCESimMlZjvSREPeH9JLQTXLE/QqFvYHtC8heG6e +ssg==
X-Gm-Message-State: APjAAAWm0iVdYyHcRkTtfyp1IZexih5UvS9zcNQJGYMtQlzeNTJo2Z0I aI9ZnSsqFs9LuishLX8LjtViACYM1q7u6Cc+iY7JlCi4Ous=
X-Google-Smtp-Source: APXvYqysVvcDu0lwVwfxqBciA2ruMnxoP6UUlDfPM+0f5X2/rv19gTyohG90bY4rdLmlth5rrUTm/WQi0XbQptuTRaY=
X-Received: by 2002:aa7:d4d9:: with SMTP id t25mr15922056edr.76.1570831132902; Fri, 11 Oct 2019 14:58:52 -0700 (PDT)
MIME-Version: 1.0
References: <>
In-Reply-To: <>
From: Stig Venaas <>
Date: Fri, 11 Oct 2019 14:58:41 -0700
Message-ID: <>
To: Alvaro Retana <>
Cc:, Mike McBride <>,,
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <>
X-Mailman-Approved-At: Fri, 11 Oct 2019 16:33:52 -0700
Subject: Re: [pim] AD Review of draft-ietf-pim-drlb-10
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Protocol Independent Multicast <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Fri, 11 Oct 2019 21:59:00 -0000


I've posted a version 11 which I think addresses your comments. Please
see my comments inline though.

Dear WG, there are a ton of changes, so please have a look at this
latest version and see if you have any concerns.

On Sat, Jun 22, 2019 at 8:08 AM Alvaro Retana <> wrote:
> Dear authors:
> First of all, thank you for reviving this work!!
> I just finished reading the document and have significant concerns (please see details in line).  In fact, I am pointing at the same general issue that the original AD review [1] did: under-specification.  Adding text in 6.2 has resulted in behavior being specified (inconsistently!) more than once.
> Also, the specification of the Hash Algorithm is included in §4 (Functional Overview).  Not being in an appropriately named section is significant.  Please move this part to §6 (Protocol Specification), or maybe its own section.
> Even though there is much work needed, I don't want to return this document to the WG and risk more delay.  While some major issues will require changes, I think that many of them are about the structure of the document.  If needed, I will rely on Mike (Chair/Shepherd) to coordinate further WG review.

Thanks, I think I've addressed these. See further comments below.

See inline please.

> Thanks!
> Alvaro.
> [1]
> [Line numbers from id-nits.]
> ....
> 90 1.  Introduction
> 92   On a multi-access LAN such as an Ethernet, one of the PIM routers is
> 93   elected as a DR.  The PIM DR has two roles in the PIM-SM protocol.
> 94   On the first hop LAN, the PIM DR is responsible for registering an
> 95   active source with the Rendezvous Point (RP) if the group is
> 96   operating in PIM-SM.  On the last hop LAN, the PIM DR is responsible
> 97   for tracking local multicast listeners and forwarding to these
> 98   listeners if the group is operating in PIM-SM.
> [minor] The first reference to rfc7761 is below, when talking about DR election.  Consider moving that reference to this first paragraph so that the stage is set early.


> [nit] I know DR was extended in the Abstract, the acronym is used above and the extended version again below...  Please extend DR above and use the contracted form elsewhere.

> ...
> 114   Assume R1 is elected as the Designated Router..  According to
> 115   [RFC7761], R1 will be responsible for forwarding traffic to that LAN
> 116   on behalf of any local members.  In addition to keeping track of IGMP
> 117   and MLD membership reports, R1 is also responsible for initiating the
> 118   creation of source and/or shared trees towards the senders or the
> 119   RPs.
> [minor] We need Informative references to IGMP and MLD.
> 121   Forcing sole data plane forwarding responsibility on the PIM DR
> 122   uncovers a limitation in the protocol.  In comparison, even though an
> 123   OSPF DR or an IS-IS DIS handles additional duties while running the
> 124   OSPF or IS-IS protocols, they are not required to be solely
> 125   responsible for forwarding packets for the network.  On the other
> 126   hand, on a last hop LAN, only the PIM DR is asked to forward packets
> 127   while the other routers handle only control traffic (and perhaps drop
> 128   packets due to RPF failures).  Hence the forwarding load of a last
> 129   hop LAN is concentrated on a single router.
> [minor] "In comparison..."  I think that comparing the PIM DR to an OSPF DR/IS-IS DIS is completely out of place since the justification for this work is to share forwarding responsibilities...and the IGP DR/DIS are not responsible for forwarding at all!  IOW, the comparison doesn't make sense.

Agree, done.

> [minor] RPF failures are only mentioned here.  Either expand on them and explain why they are relevant to this work, or remove the text to avoid confusion.

Not relevant. Done.

> 131   This leads to several issues.  One of the issues is that the
> 132   aggregated bandwidth will be limited to what R1 can handle towards
> 133   this particular interface.  It is very common that the last hop LAN
> 134   consists of switches that run IGMP/MLD or PIM snooping.  This allows
> 135   the forwarding of multicast packets to be restricted only to segments
> 136   leading to receivers who have indicated their interest in multicast
> 137   groups using either IGMP or MLD.  The emergence of the switched
> 138   Ethernet allows the aggregated bandwidth to exceed, sometimes by a
> 139   large number, that of a single link.  For example, let us modify
> 140   Figure 1 and introduce an Ethernet switch in Figure 2.
> [nit] "this particular interface"  Which interface?  I'm sure you mean the one to the core networks...but please be explicit.
> [minor] "IGMP/MLD or PIM snooping"  We need a reference.
Fixed these.
> ...
> 184   There is a limitation in the hash algorithm used in this document,
> 185   but this document provides the option to have different and more
> 186   consistent hash algorithms in the future.
> [minor] I think that the limitations discussion should be left to §4.3.1.  IOW, I suggest taking this paragraph out.

> ...
> 193 2.  Terminology
> 195   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
> 196   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
> 197   document are to be interpreted as described in [RFC2119].
> [major] Please use the rfc8174 template.
> ...
> 204   o  GDR: GDR stands for "Group Designated Router".  For each multicast
> 205      flow, either a (*,G) for ASM, or an (S,G) for SSM, a hash
> 206      algorithm (described below) is used to select one of the routers
> 207      as a GDR.  The GDR is responsible for initiating the forwarding
> 208      tree building process for the corresponding multicast flow.
> [nit] s/GDR: GDR stands for "Group Designated Router"./GDR: Group Designated Router.
> [minor] ASM needs to be expanded
Done these.
> 210   o  GDR Candidate: a last hop router that has the potential to become
> 211      a GDR.  A GDR Candidate must have the same DR priority and must
> 212      run the same GDR election hash algorithm as the DR router.  It
> 213      must send and process new PIM Hello Options as defined in this
> 214      document.  There might be more than one GDR Candidate on a LAN,
> 215      but only one can become GDR for a specific multicast flow.
> [major] This paragraph uses "must" several times.  Should that be Normative language?
Trying to only use normative language in the specification part.
> [??] Why is it a requirement to have the same DR priority?  It seems to me that there would be more options and even better load sharing if more routers could be GDR.

That means you would have to disable the feature on routers that you
do not want to take part. One can easily configure the same priority
on all the routers that should take part. It is similar to how it is
now, you configure the highest priority on all (except now it is at
most 1) the routers you want to have the DR role.

> ...
> 229 4.  Functional Overview
> ...
> 256   Hash Masks are defined for Source, Group and RP separately, in order
> 257   to handle PIM ASM/SSM.  The masks, as well as a sorted list of GDR
> 258   Candidate Addresses, are announced by the DR in a new DR Load
> 259   Balancing GDR (DRLBGDR) PIM Hello Option.
> [nit / personal opinion] The abbreviations for the new Options are awful -- I keep having to think about the meaning and whether they are spelled correctly at every step.  Perhaps something reader-friendly acronyms would be better.  Suggestions:
> - DR Load Balancing Capability PIM Hello Option: LB Option
> - DR Load Balancing GDR PIM Hello Option: GDR List Option (maybe even GL)

Changed to DRLB-Cap and DRLB-List. Let me know what you think. I think
DRLB for DR Load Balancing is obvious.

> 261   A hash algorithm based on the announced Source, Group, or RP masks
> 262   allows one GDR to be assigned to a corresponding multicast state.
> 263   And that GDR is responsible for initiating the creation of the
> 264   multicast forwarding tree for multicast traffic.
> [major] Is there a possibility that the assigned GDR for a specific multicast state can't fulfill its duties?  This document assumes that all the GDRs are able to service all states...but that may not be true.  Because every GDR candidate calculates who the GDR is for a specific state, it may never know what an alternate GDR is not able to forward traffic.  [Does this make sense?]

I responded to this earlier. This is no different than making sure the
DR can fulfill its duties today.

> 266 4.1.  GDR Candidates
> 268   GDR is the new concept introduced by this specification.  GDR
> 269   Candidates are routers eligible for GDR election on the LAN.  To
> 270   become a GDR Candidate, a router MUST support this specification,
> 271   have the same DR priority and run the same GDR election hash
> 272   algorithm as the DR on the LAN.
> [nit] There is a lot of repetition in the text...for example, all the contents on this paragraph were already mentioned in the last section....  Cleaning up the text would be nice.

Tried to do that.

> [major] "To become a GDR Candidate, a router MUST support this specification..."  First of all, this is an obvious statement: the GDR is defined in this document.  Second, there's no Normative value in using MUST.

Removed some obvious statements and tried to use normative language in
the protocol specification section.
> ...
> 289 4.2.  Hash Mask and Hash Algorithm
> 291   A Hash Mask is used to extract a number of bits from the
> 292   corresponding IP address field (32 for v4, 128 for v6) and calculate
> 293   a hash value.  A hash value is used to select a GDR from GDR
> 294   Candidates advertised by PIM DR.  For example, defines a
> 295   Hash Mask for an IPv4 address that masks the first, the second, and
> 296   the fourth octets.
> [minor] s/v4/IPv4 s/v6/IPv6

> ...
> 306   The hash masks need to be configured on the PIM routers that can
> 307   potentially become a PIM DR, unless the implementation provides
> 308   default Hash Mask values.  An implementation SHOULD provide masks
> 309   with default values (IPv4) and
> [minor] Can you provide any guidance on how to configure the hash masks?  The example below shows a mask of, which seems both "unnatural" (for the average person used to set masks) and very specific (in order for N to have a specific value).  Maybe it's just me, but I would have set the mask to something similar to  Guidance should be added to §8 (Manageability Considerations).

> [major] What does "provide" mean in this context?  I guess that it means that if no hash mask is configured, then the default should be used.  Is that correct?
Right, reworded.

> [major] When would an implementation not provide those masks?  IOW, why aren't you using MUST?

Masks are needed, but if there are no defaults, the implementation
could always require the user to configure them. The defaults are just
for convenience.

> 312   o  If the group is in ASM mode and the RP Hash Mask announced by the
> 313      PIM DR is not 0, calculate the value of hashvalue_RP [Section 4.3]
> 314      to determine GDR.
> [major] "the RP Hash not 0"  By 0 you mean and ::, right?  Please be specific!  The same is used in other places.

Defined what mean by zero.

> [nit] s/determine GDR/determine the GDR/g
> ...
> 323   A simple Modulo hash algorithm is defined in this document.  However,
> 324   to allow another hash algorithms to be used, a 1-octet "Hash
> 325   Algorithm" field is included in DRLBC Hello Option to specify the
> 326   hash algorithm used by a last hop router.
> [nit] s/included in DRLBC Hello Option/included in the DRLBC Hello Option
> ...
> 332 4.3.  Modulo Hash Algorithm
> [major] The specification of the algorithm doesn't belong in the "Functional Overview" section.
Agree, moved.
> 334   The Modulo hash algorithm is discussed here with a detailed
> 335   description on hashvalue_RP.  The same algorithm is described in
> 336   brief for hashvalue_Group using the group address instead of the RP
> 337   address for an ASM group with zero RP_hashmask, and also with
> 338   hashvalue_SG for a the source address of an (S,G), instead of the RP
> 339   address,
> [minor] "zero RP_hashmask" is used without prior definition, or a forward reference
> 341   o  For ASM groups, with a non-zero RP_Hash Mask, hash value is
> 342      calculated as:
> [minor] "non-zero RP_Hash Mask" is used without prior definition, or a forward reference
> [minor] "RP_Hash Mask", "RP Hash Mask" and "RP_hashmask" are used.  Are they meant to represent the same thing?  Please be consistent.
> 344         hashvalue_RP = (((RP_address & RP_hashmask) >> N) & 0xFFFF) % M
> [major] The notation used should be explained somewhere.  Note that the examples below help infer what's going on, but those are examples...I'm concerned with the *specification* part of this document.

> ...
> 382         Compare hashvalue_Group with Ordinal number assigned to Router
> 383         X, to decide if Router X is the GDR.
> [nit] s/Compare hashvalue_Group with Ordinal number/Compare the hashvalue_Group with the Ordinal number
> 385   o  For SSM groups, a hash value is calculated using both the Source
> 386      and Group Hash Mask:
> 388         hashvalue_SG = ((((Source_address & Source_hashmask) >> N_S) &
> 389         0xFFFF) ^ (((Group_address & Group_hashmask) >> N_G) & 0xFFFF))
> 390         % M
> [major] Again, the notation must be explained somewhere.
Fixed all of the above.

> [minor] An example for this more complicated formula would be nice.  In fact, it may be a good idea to first define the algorithm and then (in a sub-section) show examples.
Done the last part.
> 392 4.3.1.  Limitations
> 394   The Modulo Hash Algorithm has poor failover characteristics when a
> 395   shared LAN has more than two GDRs.  In the case of more than two GDRs
> 396   on a LAN, when one GDR fails, all of the groups may be reassigned to
> 397   a new GDR, even if they were not assigned to the failed GDR.
> 398   However, many deployments use only two routers on a shared LAN for
> 399   redundancy purposes.  Future work may define new hash algorithms
> 400   where only groups assigned to the failed GDR get reassigned.
> [minor] s/reassigned to a new GDR/reassigned to a single GDR
> 402 4.4.  PIM Hello Options
> 404   When a last hop PIM router sends a PIM Hello for an interface with
> 405   this specification enabled, it includes a new option, called "Load
> 406   Balancing Capability (DRLBC)".
> [nit] s/sends a PIM Hello for an interface/sends a PIM Hello out an interface

> 408   Besides this DRLBC Hello Option, the elected PIM DR also includes a
> 409   new "DR Load Balancing GDR (DRLBGDR) Hello Option".  The DRLBGDR
> 410   Hello Option consists of three Hash Masks as defined above and also a
> 411   sorted list of GDR Candidate addresses on the last hop LAN.
> 413   The elected PIM DR uses DRLBC Hello Option advertised by all routers
> 414   on the last hop LAN to compose the DRLBGDR Option.  The GDR
> 415   Candidates use the DRLBGDR Hello Option advertised by the PIM DR to
> 416   calculate the hash value.
> [?] I couldn't figure out from rfc7761 whether all Options have to be included in all Hellos, all the time. Do they?
If you mean if every periodic hello contains the same options (unless
a change), then yes.

> [major]  It looks like the DR can change the contents of the DRLBGDR if a new GDR comes into the network, or one leaves, but there is no specification related to when the DR should reconsider the contents of the DRLBGDR.  This specification would belong somewhere in §6.
> 418 5.  Hello Option Formats
> 420 5.1.  PIM DR Load Balancing Capability (DRLBC) Hello Option
> 422      0                   1                   2                   3
> 423      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> 424     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 425     |           Type = TBD          |         Length = 4            |
> 426     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 427     |                     Reserved                  |Hash Algorithm |
> 428     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> [major] Please use OptionType and OptionLength [rfc7761].
If you look at the actual option definitions in 7761, you will see
they look exactly like this. The same for other PIM documents.
Hence I'm keeping this.

> 430          Figure 3: Capability Hello Option
> [minor] The title doesn't reflect the full name (or even DRLB)...
> 432      Type: TBD
> [major] Even though the assignment is temporary, please include here the values.
OK, fixed.
> 434      Length: 4
> 436      Hash Algorithm: 0 for Modulo
> [minor] Even though 0 is the only value defined, that is not the definition of this field.
> 438   This DRLBC Hello Option MUST be advertised by last hop routers on
> 439   interfaces with this specification enabled.
> 441 5.2.  PIM DR Load Balancing GDR (DRLBGDR) Hello Option
> 443     0                   1                   2                   3
> 444     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> 445     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 446     |           Type = TBD          |         Length                |
> 447     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 448     |                            Group Mask                         |
> 449     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 450     |                            Source Mask                        |
> 451     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 452     |                            RP Mask                            |
> 453     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 454     |                    GDR Candidate Address(es)                  |
> 455     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> [major] Please use OptionType and OptionLength [rfc7761].
See above comment.
> 457          Figure 4: GDR Hello Option
> [minor] The title doesn't reflect the name being used (DRLBGDR) for this option.
> 459      Type: TBD
> [major] Even though the assignment is temporary, please include here the values.
OK, fixed
> 461      Length: (3 + n) x (4 or 16) where n is the number of GDR
> 462      candidates.
> [nit] s/(4 or 16) where/(4 or 16), where
> [minor] I know that rfc7761 says that the OptionLength's unit is bytes, but it wouldn't hurt repeating it here.
I forgot this, can add it in revision 12 if you like.
> 464      Group Mask (32/128 bits): Mask
> 466      Source Mask (32/128 bits): Mask
> 468      RP Mask (32/128 bits): Mask
> [major] All the descriptions are the same!  §4.2 also lists the masks, but they are not explicitly described anywhere.  Describe them in either section and point to the other.
> 470         All masks MUST be in the same address family as the Hello IP
> 471         header.
> [minor] It sounds relatively obvious that the masks and addresses would use the same AF...maybe a short heads up at the start of this section would make it even clearer.
> 473      GDR Address (32/128 bits): Address(es) of GDR Candidate(s)
> [major] I think that the DRs own address must be advertised in this list...but that is not specified.  If it is not advertised, then it would not forward any traffic.
Yes. Fixed.
> 475         All addresses must be in the same address family as the Hello
> 476         IP header.  The addresses are sorted in descending order.  The
> 477         order is converted to the ordinal number associated with each
> 478         GDR candidate in hash value calculation.  For example, if
> 479         addresses advertised are R3, R2, R1, the ordinal number
> 480         assigned to R3 is 0, to R2 is 1 and to R1 is 2..
> [major] "All addresses must be in the same address family as the Hello IP header."  Should that be a MUST (to match the statement about the masks)?  Perhaps this can be covered in the text at the start of this section.
Fixed. Reworded.
> [major] What should a receiver do if the masks don't match the IP header?  Ignore the option?
I guess I missed this. I would say the option should be ignored. Can
fix in revision 12.

> [major] "The addresses are sorted in descending order." Should this be Normative?  Does the ordering (ascending/descending/random) really matter?  It seems to me that because the DR is the only one making the advertisement then it could do whatever and the hash would still work, am I missing something?
I added text explaining this. The main reason for sorting is to get
consistent GDR selections regardless of which order the DR learns of
the GDR candidates. Or if different implementations do it differently,
then the list gets reordered if the DR changes, which causes needless

> 482         If the "Interface ID" option, as specified in [RFC6395], is
> 483         present in a GDR Candidate's PIM Hello message, and the "Router
> 484         ID" portion is non-zero:
> [minor] rfc6395 uses "Router Identifier" (and not "Router ID").
Missed this. Can fix in revision 12.
> 486         +  For IPv4, the "GDR Candidate Address" will be set directly
> 487            to the "Router ID".
> 489         +  For IPv6, the "GDR Candidate Address" will be set to the
> 490            IPv4-IPv6 translated address of the "Router ID", as
> 491            described in [RFC4291], that is the "Router-ID" is appended
> 492            to the prefix of 96 bits of zeroes.
> [major] "IPv4-IPv6 translated address"  I couldn't find that in rfc4291, but I think you mean the "IPv4-Compatible IPv6 Address" (§, right?  If so, rfc4291 says that this type of addresses is deprecated.
> It seems to me that the purpose of the representation is not to use the result as an address, but to represent the ID advertised by the router.  If so, then describing what is wanted, and taking the reference to rfc4291 out would be enough.
> OTOH, why is this extra step/check needed?  The DR election [rfc7761] simply uses the source address of the Hello message.
Generally not that useful, but might be useful if the router has
multiple interfaces on the LAN and you want to treat them the same. Or
if you want to enforce a certain router to be the GDR for certain
flows without changing the IP address.
> 494         If the "Interface ID" option is not present in a GDR
> 495         Candidate's PIM Hello message, or if the "Interface ID" option
> 496         is present but the "Router ID" field is zero, the "GDR
> 497         Candidate Address" will be the IPv4 or IPv6 source address of
> 498         the PIM Hello message.
> 500         This DRLBGDR Hello Option MUST only be advertised by the
> 501         elected PIM DR.
> [nit] s/This DRLBGDR Hello Option/The DRLBGDR Hello Option
> [major] What should a router do if the DRLBGDR Hello Option is received from a non-DR?
Fixed. Ignore.
> 503 6.  Protocol Specification
> 505 6.1.  PIM DR Operation
> 507   The DR election process is still the same as defined in [RFC7761].  A
> 508   DR that has this specification enabled on an interface advertises the
> 509   new DRLBGDR Hello Option, which contains mask values from user
> 510   configuration, followed by a sorted list of GDR Candidate Addresses,
> 511   from the highest value to the lowest value.  Moreover, same as non-DR
> 512   routers, the DR also advertises DRLBC Hello Option to indicate its
> 513   capability of supporting this specification and the type of its GDR
> 514   election hash algorithm.
> [minor] In several places the "has this specification enabled" or "support this specification" phrases are used.  While it is relatively ok, it is also unnecessary: this document is precisely specifying what to do if you support it.  It would be great/better if you could find a different way to refer to "this specification"; maybe saying something like "supports DR Load Balancing"...or simply avoid requiring support (as it is implicit).
Fixed I think.
> [major] If multiple algorithms exist, and the DR has support for several, is the DR expected to use the algorithm that results on the most candidate GDRs?  Should that consideration be a configuration issue?
Just a configuration issue. This and other things can be considered
later if we define multiple algorithms, but I think it may be
sufficient to just leave it to configuration. The admin would
determine the most suitable algorithm that is supported by the routers
that should take part.

> 516   If a PIM DR receives a PIM Hello with the DRLBGDR Option, the PIM DR
> 517   SHOULD ignore the TLV.
> [major] When would the DR not ignore the DRLBGDR Option?  If it is not ignored, what should it do?  IOW, why is that not a MUST?
Yes, this need to be a MUST.
> 519   If a PIM DR receives a neighbor DRLBC Hello Option, which contains
> 520   the same hash algorithm as the DR, and the neighbor has the same DR
> 521   priority as the DR, PIM DR SHOULD consider the neighbor as a GDR
> 522   Candidate and insert the GDR Candidate's Address into the sorted list
> 523   of the DRLBGDR Option.  However, the DR MAY have policies limiting
> 524   which GDR Candidates, or the number of GDR Candidates to include.
> [major] I think that the "MAY" doesn't really have Normative value in this document because it is not specifying a behavior, it is just stating a fact.  s/MAY/may
> 526 6.2.  PIM GDR Candidate Operation
> ...
> 534   If this specification is enabled on an interface, the router MUST
> 535   include the DRLBC Hello Option in its PIM Hello on the interface.
> 536   Note that the presence of the DRLBC Option in PIM Hello does not
> 537   guarantee that this router would be considered as a GDR candidate.
> 538   Once DR election is done, the DRLBGDR Hello Option would be received
> 539   from the current PIM DR on the link which would contain a list of
> 540   GDRs selected by the PIM DR.
> [minor] s/contain a list of GDRs/contain a list of GDR candidates
> 542   A router only acts as a GDR candidate if it is included in the GDR
> 543   list of the DRLBGDR Hello Option.
> [major] Should the receiving router verify that the DR announced the same hash algorithm in its DRLBC?  What should happen if the receiver's address is listed in the DRLBGDR, but the advertised hash algorithms don't match?  Even though it is specified in several places that the hash algorithms must be the same, there's a possibility (bug, attack) that the algorithms don't match.  If the DRLBGDR is used (when no match in the algorithms is verified), then it may result in an unexpected state...  [See also my comments in the Security Considerations section.]
> ...
> 553   If the PIM DR does not support this specification, GDR election will
> 554   not take place, and only the PIM DR joins the multicast tree.
> [minor] This last paragraph is not needed as §7 already addresses this case.
> 556 6.2.1.  Router Receives New DRLBGDR
> 558   The first time a router receives a DRLBGDR option from the PIM DR, it
> 559   MUST process the option and check if it is in the GDR list..
> [major] "The first time...MUST process..."  What about other times?  It looks like §6.2.2 addresses this question.  It seems to me that this document would be clearer if a single (for example) "Receiving a DRLBGDR" section existed and the specification didn't make this unnecessary distinction...
> [major] Note that the first bullet (below) says that no action is needed...which seems in contradiction with "MUST process" (above), specially because the second bullet talks about "MUST process"...  Instead of using Normative language in the header, use it when talking about specific actions.
> 561   1.  If a router is not listed as a GDR candidate in DRLBGDR, no
> 562       action is needed.
> [minor] s/in DRLBGDR/in the DRLBGDR/g   Note that sometimes "DRLBGDR option" or "DRLBGDR Hello Option" are used...or simply "DRLBGDR" as in this case.  Please be consistent.
> 564   2..  If a router is listed as a GDR candidate in DRLBGDR, then it MUST
> 565       process each of the groups, or source and group pairs if SSM, in
> 566       the IGMP/MLD reports.  The masks are announced in the PIM Hello
> 567       by the DR in the DRLBGDR Hello Option.  For each group in the
> 568       reports that is in ASM mode, and each source and group pair if
> 569       the group is in SSM mode, it (PIM Router) needs to run the hash
> 570       algorithm (described in section 4.3) based on the announced
> 571       Source, Group or RP masks to determine if it is the GDR for
> 572       specified group, or source and group pair.  If the hash result is
> 573       to be the GDR for the multicast flow, it does build the multicast
> 574       forwarding tree.  If it is not the GDR for the multicast flow, no
> 575       action is needed.
> [major] It seems that "MUST process each of the groups" is equivalent to "needs to run the hash algorithm", is that correct?  Please don't repeat information and be clear about the actions.  "process" is not clear enough...I think that "run the algorithm" may not be precise enough either.  Maybe something along the lines of "MUST use the GDR selection algorithm...according to the process in §4.3)" -- I'm sure you can come up with better text.
Tried to address this.
> 577 6.2.2.  Router Receives Updated DRLBGDR
> [major] The same comments from above about being explicit and using Normative language appropriately apply to the text in this section.
Tried to address it.
> ...
> 591          If it was the GDR for a group, or source and group pair if
> 592          SSM, and the new hash result chose it as the GDR, then no
> 593          processing is required.
> 595          If it was the GDR for a group, or source and group pair if
> 596          SSM, earlier and now it is no longer the GDR, then it sets its
> 597          assert metric for the multicast flow to be
> 598          (PIM_ASSERT_INFINITY - 1), as explained in Section 6.3.
> [major] Some of the language used to mean similar things is different, making the specification imprecise.  For example, just in the last two paragraphs: "the new hash result chose it as the GDR" vs "it is no longer the GDR"; no longer being a GDR is the result of running the algorithm...
> [minor] Where is PIM_ASSERT_INFINITY defined?  We need a reference.  I looked at rfc7761 and found infinite_assert_metric() -- is it the same thing?  Also, I found one other RFC (rfc7910) that uses PIM_ASSERT_INFINITY, but didn't see it defined there either. :-(
Addressed all of these.
> [major] §6.3 uses MUST to describe the behavior above...but no Normative language is used here.
> [minor] §6.3 presents the justification to use PIM_ASSERT_INFINITY - 1...but it also seems to want to specify the behavior.  Please specify the behavior only once.
Fixed I think.
> 600          If it was not the GDR for a group, or source and group pair if
> 601          SSM, earlier, and the new hash does not make it GDR, then no
> 602          processing is required.
> [nit] It looks like the explanation of the process could be simplified...  "if no change in the local state..."
> ...
> 628 6.3.  PIM Assert Modification
> 630   It is possible that the identity of the GDR might change in the
> 631   middle of an active flow.  Examples when this could happen include:
> 633      When a new PIM router comes up
> 635      When a GDR restarts
> [minor] Do you mean a restart as in a graceful restart...or simply the GDR going down?
Just going down. Fixed.
> 637   When the GDR changes, existing traffic might be disrupted.
> 638   Duplicates or packet loss might be observed.  To illustrate the case,
> 639   consider the following scenario where there are two flows G1 and G2.
> 640   R1 is the GDR for G1, and R2 is the GDR for G2.  When R3 comes up
> 641   online, it is possible that R3 becomes GDR for both G1 and G2, hence
> 642   R3 starts to build the forwarding tree for G1 and G2.  If R1 and R2
> 643   stop forwarding before R3 completes the process, packet loss might
> 644   occur.  On the other hand, if R1 and R2 continue forwarding while R3
> 645   is building the forwarding trees, duplicates might occur..
> [nit] s/R3 comes up online/R3 comes up
> 647   This is not a typical deployment scenario but might still happen.
> 648   Here we describe a mechanism to minimize the impact.  We essentially
> 649   want to minimize packet loss.  Therefore, we would allow a small
> 650   amount of duplicates and depend on PIM Assert to minimize the
> 651   duplication.
> [nit] "This is not a typical deployment scenario but might still happen." Sounds very possible to me...
Agree! Removed that sentence.
> [style nit] Some parts of the text are written using "we"...please don't write in the first person.  Instead use the 3rd person.  For example: s/Here we describe a mechanism to minimize the impact./This section describes a mechanism...
> ...
> 658   With the introduction of GDR, the following modification to the
> 659   Assert packet MUST be done: if a router enables this specification on
> 660   its downstream interface, but it is not a GDR (before network event
> 661   it was GDR), it would adjust its Assert metric to
> 662   (PIM_ASSERT_INFINITY - 1).
> [minor] The text above is sloppy for a Normative statement.  It is not clear, until "before network event" is mentioned that the statement doesn't apply to any router that is not a GDR.  As I mentioned in §6.2.2, please specify the behavior only once!
Fixed I think.
> 664   Using the above example, for G1, assume R1 and R3 agree on the new
> 665   GDR, which is R3.  R1 will set its Assert metric as
> 666   (PIM_ASSERT_INFINITY - 1).  That will make R3, which has normal
> 667   metric in its Assert as the Assert winner.
> [minor] Examples are good.  In this case, I think that dividing the example with normative paragraphs in between may make it confusing.  Instead, define the behavior first and then provide a worked out example.  Maybe referring back to the Figures would help...
Addressed this.
> 669   For G2, assume it takes a slightly longer time for R2 to find out
> 670   that R3 is the new GDR and still considers itself being the GDR while
> 671   R3 already has assumed the role of GDR.  Since both R2 and R3 think
> 672   they are GDRs, they further compare their metric and IP addresses.
> 673   If R3 has the better routing metric, or the same metric but a better
> 674   tie-breaker, the result will be consistent during GDR selection.  If
> 675   unfortunately, R2 has the better metric or the same metric but a
> 676   better tie-breaker, R2 will become the Assert winner and continues to
> 677   forward traffic.  This will continue until:
> 679   The next PIM Hello Option from DR selects R3 as the GDR.  R3 will
> 680   then build the forwarding tree and send an Assert.
> [minor] This last part of the example seems to consider a corner case in which R2 finds out later about R3 being the GDR -- I'm not sure how this is possible if R1 and R2 are on the same LAN, which is why I cal it a corner case...and it just seems to help confuse the reader.
Agree, removed corner case.
> [minor] Note also a couple of details:
> (1) "PIM Hello Option from DR selects R3 as the GDR": Which option?  The option itself doesn't select anyone, it just provides the information so others can.
> (2) "R3 will then build the forwarding tree and send an Assert."  Yes, but until R2 uses PIM_ASSERT_INFINITY - 1 it will keep winning.  Ahh..I see this is covered next...
> 682   The process continues until R2 agrees to the selection of R3 as the
> 683   GDR, and sets its own Assert metric to (PIM_ASSERT_INFINITY - 1),
> 684   which will make R3 the Assert winner.  During the process, we will
> 685   see intermittent duplication of traffic but packet loss will be
> 686   minimized.  In the unlikely case that R2 never relinquishes its role
> 687   as GDR (while every other router thinks otherwise), the proposed
> 688   mechanism also helps to keep the duplication to a minimum until
> 689   manual intervention takes place to remedy the situation.
> [major] Why would R2 never relinquish its role?  I can imagine the corner case you identified here where R2 doesn't receive R1s Hello...but that could result in even R2 thinking it is the DR.  Is there a chance that the interpretation of the algorithm could end up producing inconsistent results?  I haven't thought about this too much, but it seems that the only possibility is if the candidate GDRs have a different view of the (*,G)/(S,G) state:  for example, if they think the RP is different for a specific group...but I don't know how likely that is?  Is there an opportunity for someone upstream to alter the information so that the local LAN don't consistently converge on who the GDRs should be?
There were some mistakes in this text. I've tried to fix it. R2 will
relinquish its role, it's just a timing thing. It may keep forwarding
until next periodic assert winner message. I've mentioned this in the
new text.

> Is this a potential security issue?  Am I being paranoid?
If bad info is announced, we would still get a stable situation
without duplication.
> 691 7.  Compatibility
> [minor] s/Compatibility/Backward Compatibility
> ...
> 704 8.  Manageability Considerations
> 706   Only the routers announcing the same Hash Algorithm as the DR would
> 707   be considered as GDR candidates.  Network administrators need to make
> 708   sure that the desired set of routers announce the same algorithm.
> 709   Migration between different algorithms is not considered in this
> 710   document.
> [minor] What are the considerations the operator should take into account when configuring/selecting which routers are GDR candidates?
> [minor] Please take a look at rfc5706 (Guidelines for Considering Operations and Management of New Protocols and Protocol Extensions) for other considerations for this section.
Added a lot of text here.
> 712 9.  IANA Considerations
> 714   IANA has temporarily assigned type 34 for the PIM DR Load Balancing
> 715   Capability (DRLBC) Hello Option, and type 35 for the PIM DR Load
> 716   Balancing GDR (DRLBGDR) Hello Option in the PIM-Hello Options
> 717   registry.  IANA is requested to make these assignments permanent when
> 718   this document is published as an RFC.  The string TBD should be
> 719   replaced by the assigned values accordingly.
> [nit] The last sentence is not needed.
> [minor] The registry currently says that for the DRLBC the length is variable.  That needs to be updated in this document.
> ...
> 735 9.2.  Assignment of new hash algorithms
> 737   Assignment of new hash algorithms is done according to the "IETF
> 738   Review" model, see [RFC5226].
> [major] rfc5226 has been declared Obsolete by rfc8126.
> 740 10.  Security Considerations
> 742   Security of the new DR Load Balancing PIM Hello Options is only
> 743   guaranteed by the security of PIM Hello messages, so the security
> 744   considerations for PIM Hello messages as described in PIM-SM
> 745   [RFC7761] apply here.
> [major] I think that the new functionality includes a couple of new vulnerabilities...if a router is subverted:
OK, added text here.
> - the DR may not include all the appropriate GDRs in the list...or may assign a hash that doesn't result in appropriate election of GDRs...or may even advertise an unsupported Hash Algorithm to avoid others from even being candidates...
> - the candidate GDRs may ignore the advertisements and start forwarding anyway (resulting in duplication)...or could not take a group over (resulting in black holes)....
> I know that the general concept of a forged Hello is covered in rfc7761...but I think it is important to talk about vulnerabilities specific to this document.  If anything, it shows that the consequences have been considered..
> _______________________________________________
> pim mailing list

Thanks for a ton of great comments! Sorry the document was in such a poor shape.