Re: [MBONED] WGLC for draft-ietf-mboned-dc-deploy

Olufemi Komolafe <> Sat, 29 February 2020 21:59 UTC

From: Olufemi Komolafe <>
Date: Sat, 29 Feb 2020 21:59:10 +0000
Cc: Mike McBride <>, Leonard Giuliano <>, MBONED WG <>
To: "Holland, Jake" <>

Appreciate the insightful feedback, Jake.  Thanks for taking the time to conduct the review.

We’ll take your comments on board and try to improve the draft.


> On 28 Feb 2020, at 21:57, Holland, Jake <> wrote:
> Hi Mike,
> Sorry for taking so long.
> This draft seems borderline to me, mostly on editorial grounds.  I support moving
> forward, but I'd prefer to see some tuning of the text first.
> There are some excellent technical insights in here worth publishing, but as an
> informational doc I think it needs to be easier to read and give advice that's
> less tentative, caveat-filled, and speculative.
> Overall, I think it's about 35% too wordy, and as it stands I'm not sure the
> people trying to set up their data center will bother reading it deeply enough to
> extract its valuable insights, whereas more straightforward prose that gets to the
> point would make it useful to them.
> I was trying to do a by-section walkthrough with suggestions, but it was taking me
> way too long, and will maybe never be really done, and incorporated too many
> judgement calls, so I'll just throw in a few examples of what I'm talking about, and
> hope the authors can use them as a guide for how I'd suggest they focus some of their
> efforts.
> Overall, I think this doc should go forward, and provides some value even as-is, but
> I think it would be more than twice as useful if the text were revised with an eye toward
> being concise and decisive, with a specific target audience in mind.  And so I urge the
> authors to consider doing so.
> ----
> Editorial:
> 1. With respect, this bit from 2.2 reads to me like 3 lines of awful word salad that
> would be better said as "Overlays provide":
>  "The
>   often fervent and arguably partisan debate about the relative merits
>   of these overlay technologies belies the fact that, conceptually, it
>   may be said that these overlays mainly simply provide"
> This is one of the worst examples I saw, but the overwhelming bulk of my editorial
> objections are about text that's got similarities to this.  It's gotta be tighter text;
> nobody I know can read that kind of stuff for long.  Everything similar to this is the
> main thing that I'd like to see changed.
> I'm not giving a complete list of detailed examples in this review, but when I said
> "35% too wordy overall" in the intro, I mean to suggest that it's probably possible to
> say the same thing more effectively by cutting or rephrasing the least essential 35%
> of the words.
> For the particular snippet above, I was able to suggest about a 96% cut.  Most of the
> rest of the text is much less severe, but has similar opportunities distributed
> liberally throughout, IMO.
> 2.  Every sentence with "likely" or "future" in it seems speculative, and usually like
> it's trying to justify why someone would bother reading this doc.
> I suggest assuming instead that whoever got as far as trying to read this doc already
> strongly suspects they want to roll out multicast in a datacenter, and wants to know
> how to do it, what to watch out for, where they have to make tricky choices, and what
> the important factors in those choices are.  I think they won't care whether things
> looked likely when the doc was first written, and will be annoyed at having to wade
> through that kind of speculation.
> 3. The "widely available" deployment guides and best practices in 3.4 should include
> example references, IMO.  Searching for "PIM best practices" gives a bunch of "Project
> Information Management" junk.
> 4. North/South East/West should get a definition and maybe a reference, I don't think
> these terms have a well-established usage in the RFC series yet.  Probably leaf/spine
> also.
> 5. The "Applications" section would be better split into subsections.  It's sort of a
> wall of text that changes subjects a lot.
> 6. I think 4.3 is far too abstract.  Phrases like "enticing possibility" and "novel
> algorithms and concepts" elide the problem being discussed to the point I don't really
> know what it's talking about from reading it.
> The reference to [Shabaz19] is a good step in the right direction, but I'd recommend
> pulling in some of the references it contains in its "comprehensive overview of other
> approaches", and describing the problems they're solving, along with the pros and cons
> (especially since an ACM reference comes with a paywall), and trim most of the abstract
> description of the solution space in the first 3 paragraphs.
> ---
> Technical:
> Though my feedback is mainly about editorial issues, I'll also suggest adding one new
> technical section about gotchas to watch out for.
> I don't insist it be added, especially if it's all well-covered in the references for
> the deployment guides and best practices mentioned in 3.4, but I thought I'd offer a
> few particulars as suggestions to include in such a section.  It's likely there are
> some others I haven't encountered, but below are a few of the most obnoxious that have
> bitten me or that I've heard of.
> I think what ties these together as nasty gotchas is that you think your network is
> working fine, but then it suddenly stops and you have to debug it.  I think these are
> probably the failure modes that are most important to highlight.
> There may be other such failure scenarios worth listing, but these are the ones I know
> of offhand:
> - It's important to get redundancy in your IGMP/MLD querier setup, because snooping
> relies on seeing the membership reports.  It's easy to accidentally get traffic that
> works for 60 or 120 seconds after the spontaneous report from the initial join, then
> stops working because nothing is sending the query that causes re-sending of the
> report.  Alternatively, when the snooping info expires, traffic starts flooding
> everywhere in the layer 2 LAN instead of only to the joined groups.  Either outcome
> can cause disruptions in service.
> - It's important to disable IGMPv2 everywhere if you rely on SSM, because seeing IGMPv2
> messages can put the devices on a LAN into compatibility mode.  This can even happen
> spontaneously if the right sequence of IGMPv3 messages is dropped, and it can be
> persistent once it happens and the devices on the LAN continue sending the v2 messages.
> This can result in service disruptions when using PIM-SSM or otherwise relying on SSM
> for specific (S,G)s, since the older IGMP versions don't have the necessary SSM info.
> (With a reference to section 7 of RFC 3376, and probably similar for MLDv2.)
> - There's a failure mode from having too many joined groups to rebuild the membership
> state in the RPF tree before the membership expires.  This can also cause a persistent
> service disruption after a single link failure with redundant paths but not a redundant
> forwarding tree, on an otherwise functional network, and even on a network that can
> recover successfully with fewer groups joined.  So it can be a nasty surprise that gets
> worse with scale of multicast usage, and it would have a threshold that depends on the
> timers. (I raise this more tentatively because it hasn't hit me, but I've heard of it
> happening.)
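
The timing behind the first two gotchas above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any particular switch: the class and function names are hypothetical, the timer defaults are the RFC 3376 section 8 defaults, and the snooping entry is assumed to live for one Group Membership Interval (real snooping timeouts are vendor-specific, and the 60 or 120 seconds mentioned above may correspond to different settings). Hosts are assumed to answer every general query immediately and no packets are lost.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IgmpTimers:
    """IGMPv3 timer inputs; the defaults are the RFC 3376 section 8 defaults."""
    robustness: int = 2
    query_interval_s: float = 125.0
    query_response_interval_s: float = 10.0

    @property
    def group_membership_interval_s(self) -> float:
        # (Robustness Variable * Query Interval) + Query Response Interval
        return self.robustness * self.query_interval_s + self.query_response_interval_s

    @property
    def older_version_querier_present_timeout_s(self) -> float:
        # RFC 3376 defines this with the same formula; a host that hears an
        # IGMPv2 query stays in v2 compatibility mode until this timer expires,
        # and every further v2 query restarts it.
        return self.robustness * self.query_interval_s + self.query_response_interval_s


def snooping_entry_expiry(timers: IgmpTimers,
                          querier_present: bool,
                          horizon_s: float) -> Optional[float]:
    """Return when the snooping entry expires, or None if it is still
    alive at horizon_s. The unsolicited report is assumed to arrive at t=0.

    Assumption: the entry lifetime equals the Group Membership Interval.
    """
    expiry = timers.group_membership_interval_s
    if querier_present:
        # Each general query elicits a report that refreshes the entry.
        t = timers.query_interval_s
        while t <= horizon_s and t < expiry:
            expiry = t + timers.group_membership_interval_s
            t += timers.query_interval_s
    return expiry if expiry <= horizon_s else None


if __name__ == "__main__":
    timers = IgmpTimers()
    print("no querier:  ", snooping_entry_expiry(timers, False, 3600.0))
    print("with querier:", snooping_entry_expiry(timers, True, 3600.0))
    print("v2 compat hold time:", timers.older_version_querier_present_timeout_s)
```

Under these assumptions, the entry created by the initial join simply ages out when nothing is sending queries, while a working querier keeps refreshing it for the whole hour modeled here. The same arithmetic suggests why a stray IGMPv2 query matters on an SSM network: a single one can hold hosts in compatibility mode for minutes.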
> ---
> I guess I'll leave it at that in the interest of actually sending a review out this
> time (I started and got stuck on this response about 3 times, starting in October).
> I hope these comments are helpful, and I do think the doc is worth publishing, though
> I'd ideally like to see it become easier to read first.
> Thanks and regards,
> Jake
> On 2/27/20, 3:13 PM, "Mike McBride" <> wrote:
>    mboned crew,
>    Only one response to the wglc. One more day. These types of drafts are
>    what this wg is chartered to produce. Please give it a quick read and
>    respond either way. If it's not useful we will drop it. But if you
>    find it at all useful please respond so we can finally be done and
>    move to iesg.
>    thanks!
>    mike
>    On Thu, Feb 6, 2020 at 12:27 PM Leonard Giuliano
>    <> wrote:
>> We would like to begin working group last call on Multicast in the Data
>> Center Overview.  This draft has been recently updated based on feedback
>> from last year's WGLC, where there was some support, but not enough
>> responses to advance the draft.  Please post whether you support/oppose
>> the advancement of the draft as well as any comments you may have to the
>> list by Feb 28.
>> Most recent version of the draft can be found here:
>> -Chairs
>> _______________________________________________
>> MBONED mailing list