Re: [pim] AD Review of draft-ietf-pim-drlb-10

Stig Venaas <stig@venaas.com> Tue, 27 August 2019 21:54 UTC

To: Alvaro Retana <aretana.ietf@gmail.com>
Cc: draft-ietf-pim-drlb@ietf.org, Mike McBride <mmcbride7@gmail.com>, pim-chairs@ietf.org, pim@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/2H7GYUYscbVVcicajnFcN_g8scY>

Hi

On Sat, Jun 22, 2019 at 8:08 AM Alvaro Retana <aretana.ietf@gmail.com> wrote:
>
> Dear authors:
>
> First of all, thank you for reviving this work!!
>
> I just finished reading the document and have significant concerns (please see details in line).  In fact, I am pointing at the same general issue that the original AD review [1] did: under-specification.  Adding text in 6.2 has resulted in behavior being specified (inconsistently!) more than once.
>
> Also, the specification of the Hash Algorithm is included in §4 (Functional Overview).  Not being in an appropriately named section is significant.  Please move this part to §6 (Protocol Specification), or maybe its own section.
>
> Even though there is much work needed, I don't want to return this document to the WG and risk more delay.  While some major issues will require changes, I think that many of them are about the structure of the document.  If needed, I will rely on Mike (Chair/Shepherd) to coordinate further WG review.

Thanks, I agree with most of your comments and will try to address
them. There are a couple of technical points I would like to discuss,
though. Please see below.
I have kept only the points I'm commenting on.

> [??] Why is it a requirement to have the same DR priority?  It seems to me that there would be more options and even better load sharing if more routers could be GDR.

An administrator can choose exactly which routers should take part by
giving them all the highest priority. The hashing will cause all of
the routers with the highest priority to share the burden more or less
equally, so no prioritization is needed among the routers in the
set. An administrator could potentially give all the routers the
highest priority.
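
As a rough illustration of that selection rule (a sketch only, not text
from the draft; the function name and the example addresses and
priorities are made up):

```python
def gdr_candidates(priorities):
    """Return the routers that share the highest DR priority.

    `priorities` maps a router address (string) to its DR priority.
    Only the routers tied at the top value take part in load balancing.
    """
    top = max(priorities.values())
    return sorted(addr for addr, prio in priorities.items() if prio == top)


# Hypothetical LAN: two routers at priority 10, one at priority 5.
routers = {"192.0.2.1": 10, "192.0.2.2": 10, "192.0.2.3": 5}
print(gdr_candidates(routers))  # ['192.0.2.1', '192.0.2.2']
```

Raising 192.0.2.3 to priority 10 would bring it into the set as well.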

> [major] Is there a possibility that the assigned GDR for a specific multicast state can't fulfill its duties?  This document assumes that all the GDRs are able to service all states...but that may not be true.  Because every GDR candidate calculates who the GDR is for a specific state, it may never know what an alternate GDR is not able to forward traffic.  [Does this make sense?]

It is assumed that a router is only configured as GDR if it is
expected to have the necessary connectivity, etc., to fulfill its role.
This is no different from expecting today that any router elected as
the DR has the necessary connectivity. Implementations could
potentially have some way of automatically decreasing the priority in
failure scenarios to avoid being the DR. This is not specific to DRLB,
though.

> [minor] Can you provide any guidance on how to configure the hash masks?  The example below shows a mask of 0.0.255.0, which seems both "unnatural" (for the average person used to set masks) and very specific (in order for N to have a specific value).  Maybe it's just me, but I would have set the mask to something similar to 255.255.0.0/0.0.255.255.  Guidance should be added to §8 (Manageability Considerations).

The reason for allowing such masks rather than just prefix lengths is
to allow load-balancing regardless of how service providers utilize
their address space. I will try to think of examples. It also allows
ensuring that certain groups are always hashed to the same router
(the hash value computed is exactly the same for those groups).
In other words, H(G1) == H(G2) if G1 and G2 are the same after the
mask is applied. Hence the same router will be selected as the GDR for
both G1 and G2.
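
A small sketch of that property, using a simplified mask, shift, and
modulo hash. This is an assumption for illustration; the exact hash is
whatever the draft specifies and may differ. The function names and
group addresses are made up; the mask 0.0.255.0 is the one from the
example in the draft.

```python
import ipaddress


def trailing_zeros(value):
    """Number of zero bits below the lowest set bit (0 for value 0)."""
    return (value & -value).bit_length() - 1 if value else 0


def toy_hash(group, mask, num_candidates):
    """Illustrative stand-in hash: mask the group, drop the mask's
    trailing zero bits, then reduce modulo the number of candidates."""
    g = int(ipaddress.IPv4Address(group))
    m = int(ipaddress.IPv4Address(mask))
    return ((g & m) >> trailing_zeros(m)) % num_candidates


mask = "0.0.255.0"
# Same third octet, so both groups are identical after masking.
print(toy_hash("239.1.5.1", mask, 3))    # 2
print(toy_hash("239.2.5.200", mask, 3))  # 2
# Different third octet, so the result may differ.
print(toy_hash("239.1.6.1", mask, 3))    # 0
```

With this mask only the third octet of the group address influences
which router is selected.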

> [major] What does "provide" mean in this context?  I guess that it means that if no hash mask is configured, then the default should be used.  Is that correct?
>
> [major] When would an implementation not provide those masks?  IOW, why aren't you using MUST?

Just using default masks would be sufficient for most deployments.

> [?] I couldn't figure out from rfc7761 whether all Options have to be included in all Hellos, all the time. Do they?

In general, every Hello message would contain the exact same options
unless some state (typically config) changes.

> [major] "The addresses are sorted in descending order." Should this be Normative?  Does the ordering (ascending/descending/random) really matter?  It seems to me that because the DR is the only one making the advertisement then it could do whatever and the hash would still work, am I missing something?

The listed routers are the GDRs, and when you use modulo, all the GDRs
need to agree on the order for the modulo result to indicate which
router is responsible.
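
To make the ordering point concrete, here is a minimal sketch (function
name, addresses, and hash value are made up for illustration). Two
routers that hold the same candidate list in a different local order
still pick the same GDR, provided both sort it the same way before
indexing:

```python
import ipaddress


def responsible_router(candidates, hash_value):
    """Pick the GDR for a hash value from an agreed (descending) order."""
    ordered = sorted(candidates, key=ipaddress.IPv4Address, reverse=True)
    return ordered[hash_value % len(ordered)]


# The same three candidates, as stored locally by two different routers.
view_a = ["192.0.2.1", "192.0.2.3", "192.0.2.2"]
view_b = ["192.0.2.2", "192.0.2.1", "192.0.2.3"]

# With a common sort order, both views agree on the result.
print(responsible_router(view_a, 4))  # 192.0.2.2
print(responsible_router(view_b, 4))  # 192.0.2.2

# Indexing the unsorted local lists gives different answers.
print(view_a[4 % 3], view_b[4 % 3])   # 192.0.2.3 192.0.2.1
```

Any fixed order would work for the modulo; what matters is that every
GDR uses the same one.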

> [major] Should the receiving router verify that the DR announced the same hash algorithm in its DRLBC?  What should happen if the receiver's address is listed in the DRLBGDR, but the advertised hash algorithms don't match?  Even though it is specified in several places that the hash algorithms must be the same, there's a possibility (bug, attack) that the algorithms don't match.  If the DRLBGDR is used (when no match in the algorithms is verified), then it may result in an unexpected state...  [See also my comments in the Security Considerations section.]

In the worst case, if one of the GDRs doesn't support the listed
algorithm, it cannot take part. There is no way for it to compute which
groups it should be responsible for. So if the DR violated the spec
(or there is an attack), this would result in no router acting as GDR
for some groups.
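
A toy model of that failure mode (all names and values are
hypothetical, and the per-group hash values are simply given rather
than computed): groups that map to a router which cannot run the
advertised algorithm end up without a forwarder.

```python
def forwarders(group_hashes, ordered_gdrs, supports_algorithm):
    """Map each group to its forwarding router, or None if unserved.

    group_hashes:       group name -> hash value (assumed precomputed)
    ordered_gdrs:       GDR list in the agreed order
    supports_algorithm: router -> True if it can run the advertised hash
    """
    result = {}
    for group, value in group_hashes.items():
        router = ordered_gdrs[value % len(ordered_gdrs)]
        # A router that cannot compute the hash never learns it is the
        # GDR for this group, so nobody forwards for it.
        result[group] = router if supports_algorithm[router] else None
    return result


gdrs = ["R3", "R2", "R1"]
support = {"R1": True, "R2": False, "R3": True}
print(forwarders({"G1": 0, "G2": 1, "G3": 2}, gdrs, support))
# {'G1': 'R3', 'G2': None, 'G3': 'R1'}
```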

> [major] Why would R2 never relinquish its role?  I can imagine the corner case you identified here where R2 doesn't receive R1s Hello...but that could result in even R2 thinking it is the DR.  Is there a chance that the interpretation of the algorithm could end up producing inconsistent results?  I haven't thought about this too much, but it seems that the only possibility is if the candidate GDRs have a different view of the (*,G)/(S,G) state:  for example, if they think the RP is different for a specific group...but I don't know how likely that is?  Is there an opportunity for someone upstream to alter the information so that the local LAN don't consistently converge on who the GDRs should be?
>
> Is this a potential security issue?  Am I being paranoid?

Let me think about this some more. But there is a risk of getting
duplication or no forwarder if Hellos are lost. What is essential is
that whenever the DR sends a list of participating routers, all routers
receive the Hello. (If the list didn't change, there is no issue if 1-2
Hellos are lost.) It should not be a problem if other Hellos are lost.
Note that with regular PIM, there are also possible issues with loss
and two routers thinking they are DRs, or all routers thinking for a
while that they are not DRs.

It will take a while for us to revise the document.

Thanks for the good and detailed comments,
Stig