Re: [pim] draft-wijnands-pim-source-discovery-bsr call for adoption

Leonard Giuliano <lenny@juniper.net> Wed, 06 November 2013 16:41 UTC

Date: Wed, 06 Nov 2013 08:40:47 -0800
From: Leonard Giuliano <lenny@juniper.net>
To: Toerless Eckert <eckert@cisco.com>
In-Reply-To: <20131104174051.GQ6467@cisco.com>
Message-ID: <20131106073718.C70099@eng-mail01.juniper.net>
References: <A03B605F-3865-4862-9B91-3639CB8730C2@cisco.com> <20131030124704.F72700@eng-mail01.juniper.net> <25A558C2-5E1E-45BC-B86F-6AB2C61F1075@cisco.com> <20131030133107.A72700@eng-mail01.juniper.net> <52729601.9040606@venaas.com> <20131031104817.O72868@eng-mail01.juniper.net> <5272B584.9030805@venaas.com> <20131031131130.R97291@eng-mail01.juniper.net> <20131101001433.GX6467@cisco.com> <20131104075423.N23002@eng-mail01.juniper.net> <20131104174051.GQ6467@cisco.com>
Cc: Mike McBride <mmcbride7@gmail.com>, "pim@ietf.org" <pim@ietf.org>
Subject: Re: [pim] draft-wijnands-pim-source-discovery-bsr call for adoption

Thanks Toerless.  

The problem is that this is a point solution with very narrow 
applicability.  If the goal is to make things simpler, my suggestion is to 
analyze the current problems with existing protocols and mechanisms, 
understand the requirements and use cases, and then start from the ground 
up to determine how best to solve this problem.  This document jumps to 
one particular solution- perhaps extending BSR to behave like DM state 
refresh is the best way to do this, or perhaps there are better solutions 
if we put other things on the table.  Maybe in the end we have a new mode: 
PIM-Simple.

FWIW, I have come to believe the underlying assumption that there is a 
constant linear relationship between simplicity and mcast deployment 
likelihood is not the case- experience and empirical evidence do not 
seem to support this assumption.  That is, the belief that if only we make 
things simple enough, folks will happily turn on mcast.  This has led us 
on a quixotic journey over the years, as various mechanisms have been 
tried with the goal of just making things a little simpler- SSM, IGMPv3 
lite, URD, state refresh, IGMP Proxy, PIM-Bidir, static IGMP joins, LW 
IGMPv3, AnycastRP with PIM, SSM mapping, Embedded RP, mLDP, etc.  Each of 
these, in some way, promised to make things a little simpler, with the 
assumption that widespread deployment was just around the corner if only 
we just made things a little easier.

In economics, there is a phenomenon known as the "Penny Gap", which says 
that the biggest hurdle for some services is not the question of making it 
cheap enough, but rather getting someone to pay anything at all- to 
pay the first penny.  I have observed this same phenomenon in mcast.  No 
matter how simple you make the protocols/config, the biggest problem is 
getting someone to turn it on at all- to config the first line.  
Generally, the only reason someone turns on mcast is bc they are forced to 
do so for some reason.  And once they have to turn it on, the complexity 
question isn't much of a barrier.  We've seen this with MVPNs- the same 
folks who scoffed at Internet mcast for years and claimed it was too 
complex were more than willing to turn on a bunch of MVPN mechanisms that 
are exponentially more complex.

After all, the complexity of mcast doesn't lie in its configuration, but 
rather its understanding.  In most cases, mcast is "damn simple" to 
config and get up and running.  The problem is when it breaks- you have to 
understand how mcast actually works in order to tshoot it.  In that way, I 
would argue that just using BSR to flood source info is not that much 
simpler than having RPs+MSDP+shared trees.  With this proposal, one 
still has to understand SPTs, pim joins/prunes, igmp reports, what BSR is 
and how it carries this new source info, new policy to restrict these new 
BSR messages from leaving your domain, new policy to capture the claimed 
benefits of not caching this source info on non-LHRs, etc.  At the end of 
the day, is all of this really any easier than picking 2 or 3 routers to 
be RPs and running MSDP between them (or with AnycastRP w PIM, no need for 
MSDP at all)?  Viewed in totality, the gains are marginal at best and are 
unlikely to get over any real hurdles.

I suspect this proposal will make mcast deployment no more enticing no 
matter how much anyone complains RPs and shared trees are too complex.

-Lenny

On Mon, 4 Nov 2013, Toerless Eckert wrote:

-) Thanks, Lenny.
-) 
-) The PIM-DMv2 thought experiment is a good idea.
-)   - i still would not have SSM forwarding and PIM-control-plane but
-)     something different. Duplication of operational knowledge requirements.
-)     (assuming of course i will also have PIM-SSM capable apps in the network!).
-)   - I very much assume i would still have even on intermediate
-)     hops data-triggered events
-)   - Still wouldn't be able to solve the basic prune issues where PIM-DM
-)     is worse than DVMRP.
-)   - If i'd outgrow the scale limits of flooding i can't smoothly change
-)     the SA signaling model to something better without changing the
-)     underlying forwarding/PIM_signaling plane from DM to SM. 
-)   - I could not plug potentially different SA signaling mechanisms
-)     (eg: better than just flood) into the system.
-)   - Wouldn't create an incentive to get rid of bursty source apps
-) 
-) Is the incentive for this new idea great enough to make it happen ?
-) I think we should well investigate this. As i tried to elaborate in the
-) prior mail more, i think i've seen a lot of cases where really bad workarounds
-) were used to come close to this, and there are IMHO equally many
-) networks where multicast isn't deployed because PIM-SM/MSDP is too 
-) difficult and PIM-DM known to not be a problem of scale but reliability
-) due to rot of implementation/specs and above described limits of mechanisms.
-) 
-) Cheers
-)     Toerless
-) 
-) On Mon, Nov 04, 2013 at 08:46:16AM -0800, Leonard Giuliano wrote:
-) > 
-) > Toerless- thanks for the details and explanation.  
-) > 
-) > The big challenge for ASM multicast routing protocols is source discovery.  
-) > That is, if the receiver doesn't tell you who the source is, the network 
-) > needs to figure this out.  PIM-DM accomplished this with flooding so that 
-) > all routers in the domain become aware of the sources and can join using 
-) > an SPT.  It is really simple, however due to this flooding, it is not 
-) > scalable nor interdomain-friendly.  PIM-SM, on the other hand, uses RPs as 
-) > a meeting place to know all source info, then starts with shared trees 
-) > that may switch to SPTs.  PIM-SM is way more complex, but is far more 
-) > scalable and supportive of interdomain.
-) > 
-) > This proposal, like DM, simply reuses the flooding concept to distribute 
-) > source info throughout the domain.  By ignoring bursty source apps (which 
-) > is perfectly reasonable and defensible, IMO), it just floods this source 
-) > control traffic, rather than the actual data, which DM does at least 
-) > initially (DM state refresh can be used subsequently to refresh state by 
-) > just reflooding the source control traffic, rather than data).  This 
-) > proposal does have to periodically reflood the source info to refresh 
-) > state, which is similar to the DM state refresh.  Now, there is some 
-) > benefit with this proposal by not requiring all the non-LHRs to cache the 
-) > source messages (though, I can't seem to find this in the draft), but 
-) > given the limited use cases you have in mind, that's probably not much 
-) > gain.  So, comparing this proposal to DM: flooding of the data packets 
-) > initially is eliminated (but there still is control traffic flooding), it 
-) > is more of a flood-and-join model than a flood-and-prune model, and there 
-) > is some minor state reduction on some intermediate routers.  All said, 
-) > IMO, these are minimal, sublinear benefits over DM, but benefits 
-) > nonetheless.
-) > 
-) > Now, doing roughly the same thing in 2 different protocols is done all the 
-) > time, sometimes even for valid reasons ;)  Variety is, after all, the 
-) > spice of life.  However, alternatives are usually used when there is a 
-) > high demand for this mechanism.
-) > 
-) > In requesting adoption, the implicit question is if this is a problem 
-) > worth working on.  Some will say yes, some no, but in most cases it's 
-) > supposition and guesswork.  In this case, we have clear, unambiguous 
-) > evidence in the PIM-DM experience to guide us when it comes to using 
-) > flooding to distribute source info through a domain to eliminate the 
-) > complexity of SM.  And that experience with DM is not good.  The DM spec 
-) > was left for Experimental Limbo almost a decade ago.  Years before that, 
-) > state refresh, which IMO is about 75% of what this new proposal does, was 
-) > abandoned while still incomplete.  I don't recall seeing any proposals in 
-) > all the time since to do anything more to DM.  Is this complete lack of 
-) > interest in DM not a clear indicator?
-) > 
-) > Put another way, try the following thought experiment:  imagine someone 
-) > submitted a draft for PIM-DMv2.  It sought to fix the problems of DM, 
-) > eliminating data flooding for the initial flood, fixing state refresh for 
-) > the subsequent refloods, adding the option to not cache the state on the 
-) > non-LHRs, and moving towards explicit join.  Would there be enough support 
-) > to adopt that?  I for one would not support a PIM-DMv2 effort based on the 
-) > experiences of PIM-DMv1 and the fact that these improvements are marginal 
-) > and sublinear.  
-) > 
-) > IMO, this draft is essentially PIM-DMv2 by another name.
-) > 
-) > 
-) > On Thu, 31 Oct 2013, Toerless Eckert wrote:
-) > 
-) > -) Oh boy...
-) > -) 
-) > -) Short version:
-) > -)  I support the adoption of this idea wholeheartedly
-) > -) 
-) > -) Long version:
-) > -)  Long set of rants below, some of which might hopefully be convertible into
-) > -)  useful explanatory text for the draft:
-) > -) 
-) > -) a) Benefit:
-) > -) 
-) > -)    Ice may have also other use cases in mind (that i am just missing out
-) > -)    on right now), but IMHO the biggest ones are the cases where PIM-SM/MSDP is
-) > -)    the best solution today: Not SSM (for apps that can't and will never be able
-) > -)    to do SSM) and no source-scale issue (eg: no Bidir-case).
-) > -) 
-) > -)    These standard PIM-SM/MSDP deployments provide great but redundant
-) > -)    job security for vendor consultants and customer architects  to breed,
-) > -)    feed, cocker and babysit RPs and MSDP: Placement, platform selection,
-) > -)    redundancy, performance, configuration, troubleshooting, partitioning
-) > -)    effects and the like. This complexity is 90% of the cost of running
-) > -)    ASM multicast. And IMHO it is 95% of the reason why multicast is not
-) > -)    deployed in more eg: enterprises.
-) > -) 
-) > -)    So, yes, this proposal could put me out of one of my jobs because
-) > -)    i am giving exactly this type of "how to care for your RPs" talk to
-) > -)    customers - *sob* ;-)
-) > -) 
-) > -)    Oh, and i forgot the lovely breed of DRs and RPs: Lots and lots of
-) > -)    register tunnels which have even trickled into HW implementations
-) > -)    because chip designers couldn't find a better feature to waste ASIC
-) > -)    real estate - and admittedly because customers had real problems without
-) > -)    that HW acceleration in some key situations.
-) > -) 
-) > -)    So yes, this proposal would remove the revenues for companies even
-) > -)    selling register tunnel HW acceleration as an add-on in their
-) > -)    boxes *sob* *sob* ;-)
-) > -) 
-) > -) b) What happened that this idea came up:
-) > -) 
-) > -)    1. We took a decade to explore all SSM transition options. The customers who
-) > -)      understand and can, will go to SSM, and use things like SSM
-) > -)      mapping only as a migration solution - as intended.
-) > -)      
-) > -)      But there is a frightening number of customers who are so annoyed
-) > -)      by PIM-SM/RP/MSDP that they even consider SSM-mapping deployments to
-) > -)      be better than PIM-SM/RP/MSDP even if there is clearly no way to get
-) > -)      the apps towards native SSM.
-) > -) 
-) > -)      And we even had asks for SSM mappings in DC-cluster environments
-) > -)      where one out of 40 sources would send to a group, but with static
-) > -)      SSM mapping you would have 40 times as much state as with PIM-SM.
-) > -)      That says a lot about the complexity of RP/MSDP operations.
-) > -) 
-) > -)    2. Multicast trees IMHO are getting denser.  In the 90s it was hip to
-) > -)      use multicast for any app because you could, now you predominantly
-) > -)      see new multicast deployments of multicast for apps that can't live 
-) > -)      without it. And those are apps with a lot of replication == large
-) > -)      set of receivers.
-) > -)    
-) > -)    3. ASM to do source discovery is becoming less relevant. Traditional
-) > -)      one-off-hacky service discovery was like this: an app developer goes
-) > -)      to IANA, asks for a multicast group, and programs his app clients
-) > -)      to send a multicast request packet to the group, and has his app
-) > -)      groups listen to that. Luckily we're getting over that and that
-) > -)      eliminates the real painful set of bursty-source-apps.
-) > -) 
-) > -) c) Why this solution
-) > -)    
-) > -)    b.1) clearly indicates that SSM operationally in the network is
-) > -)    very successful with operators. It's even successful to the extent
-) > -)    that they consider going way beyond the usefulness of SSM mapping,
-) > -)    and we should really stop that from happening (with this proposal) because
-) > -)    that will cause backlash against SSM in some future when it starts to
-) > -)    fail to be manageable under change/expansion.
-) > -) 
-) > -)    b.1) + b.2) to me means that we want a solution that effectively
-) > -)    automates SSM mapping. It also means that for the key class of
-) > -)    deployments, it's not a problem to have some control plane flooding
-) > -)    of Source-Active state into places where it's not needed. The total
-) > -)    number of states we flood is limited by the total number of 
-) > -)    forwarding states we could build. I would guess that >= 90% of
-) > -)    enterprise PIM deployments have less than 10,000 multicast states
-) > -)    and with IPv6 and timeouts that's less than half a megabyte of memory
-) > -)    without even trying any optimizations on how we store SAs. And a
-) > -)    very high likelihood that all these SAs are useful for the router anyhow
-) > -)    (b.2).
-) > -)    
-) > -)    b.3) also means that it wouldn't leave a lot of PIM-SM apps behind 
-) > -)    that can not use this solution, so this proposal has a real good chance
-) > -)    to replace PIM-SM in the majority of deployments with something that's
-) > -)    simpler: purely based on SSM forwarding plus the most lightweight way
-) > -)    to keep ASM apps running.
-) > -)    
-) > -)    To rephrase: The original goal of SSM to simplify deployment did not
-) > -)    happen because SSM could only be added to PIM-SM/RP/MSDP because of
-) > -)    all the ASM leftover apps. With this proposal we could really have a large
-) > -)    set of deployments avoid having ever to learn or worry about RPs, MSDP,
-) > -)    register tunnels and all their associated care.
-) > -) 
-) > -) d) Why not PIM-DM
-) > -) 
-) > -)    PIM-DM might have been quite on-par with PIM-SM in the late 90s,
-) > -)    but i think since then it has deteriorated. It still is as encumbered by
-) > -)    data-triggered events as it ever was.
-) > -) 
-) > -)    HW implementations in routers most certainly will not have optimized
-) > -)    for that in the same way as the optimizations done for PIM-SM/SSM in
-) > -)    platforms (can't of course speak for all vendors). If i remember correctly,
-) > -)    assert is still different between DM and SM/SSM (SM/SSM using the same).
-) > -) 
-) > -)    State refresh was never finalized as an RFC. PIM-DM itself is not
-) > -)    standards track. We never fixed the limitations DM even had vs.
-) > -)    DVMRP. There are even important products not supporting DM at all,
-) > -)    and those who do will have more interop issues than PIM-SM/SSM interop.
-) > -) 
-) > -)    Fixing up that badly, incompletely specified and mostly legacy
-) > -)    implemented DM router hardware forwarding plane as opposed to using the
-) > -)    PIM-SSM forwarding plane plus one tiny bit from PIM-SM (first hop router
-) > -)    new (S,G) discovery) is the best choice to maintain all the HW/implementation
-) > -)    benefits of what we have been using in most deployments in the last
-) > -)    10 years AND to strip down the complexity in the forwarding plane to the
-) > -)    minimum needed to support a form of SSM that can be zero-config for
-) > -)    the majority of deployments.
-) > -) 
-) > -)    Last but not least: Maybe i remember DM incorrectly, but i thought
-) > -)    that even with SR it was necessary to have on all routers (S,G)
-) > -)    forwarding-plane ("prune") state. That's a totally different problem
-) > -)    from <= 0.5MByte control plane memory.
-) > -) 
-) > -)    Last but not last: As i mentioned in any rant: I want an SSM network,
-) > -)    so i want customers to have only the need to do SSM forwarding and
-) > -)    control plane debugging and as little more as possible. With PIM-DM
-) > -)    i would continue to have a totally orthogonal forwarding plane and
-) > -)    control plane that just shares the name "PIM", but that's really double
-) > -)    effort. With this proposal i have SSM for everything plus some add-on
-) > -)    for ASM-apps, where i have no idea how we could do the ASM-add on
-) > -)    any easier.
-) > -) 
-) > -) Cheers
-) > -)     Toerless
-) > -) 
-) > -) On Thu, Oct 31, 2013 at 01:27:27PM -0700, Leonard Giuliano wrote:
-) > -) > 
-) > -) > -) > And as for the claim that this is more like SSM- this is no more
-) > -) > -) > SSM-friendly than DM.
-) > -) > -) 
-) > -) > -) It looks a lot like SSM from routing perspective. We're using SM with
-) > -) > -) only SPTs and no RPs, just like with SSM. The only difference is where
-) > -) > -) the source discovery is done.
-) > -) > -) 
-) > -) > 
-) > -) > All you are doing is changing the name of the protocol, not the behavior.  
-) > -) > You are just making SM act like DM and saying it is better bc it is now SM 
-) > -) > and not DM. The problem with PIM-DM is not the name "PIM-DM"; the problem 
-) > -) > is that it floods source data across the domain.  If you just make SM 
-) > -) > start flooding source info, that isn't any better.  If you can live with a 
-) > -) > solution that floods source data across the domain so that you can use 
-) > -) > SPTs only and don't care about scale, what's the difference between this 
-) > -) > solution and DM, besides the names of the protocols/messages?
-) > -) > 
-) > -) > If it looks like a duck, swims like a duck, and quacks like a duck, then 
-) > -) > it probably is a duck. (http://en.wikipedia.org/wiki/Duck_test)
-) > -) > 
-) > -) > _______________________________________________
-) > -) > pim mailing list
-) > -) > pim@ietf.org
-) > -) > https://www.ietf.org/mailman/listinfo/pim
-) > -) 
-) 
-) -- 
-) ---
-) Toerless Eckert, eckert@cisco.com
-) Cisco NSSTG Systems & Technology Architecture
-) SDN: Let me play with the network, mommy!
-) 
-)
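
Note on the memory figure quoted above: the thread estimates that 10,000 Source-Active states "with IPv6 and timeouts" fit in less than half a megabyte. The sketch below is one rough way to check that arithmetic. It is not taken from the draft or from any implementation. The function name `sa_cache_bytes`, the 8-byte timer field, and the assumption of zero per-entry overhead are all illustrative guesses; only the 16-byte IPv6 address size and the 10,000-state count are fixed by the text.

```python
IPV6_ADDR_BYTES = 16  # an IPv6 address is 128 bits


def sa_cache_bytes(num_states, addr_bytes=IPV6_ADDR_BYTES, timer_bytes=8):
    """Raw bytes needed to hold num_states (S,G) Source-Active entries.

    Each entry is modeled as a source address, a group address, and one
    expiry timer. The 8-byte timer and the absence of any container
    overhead are assumptions made only for this estimate.
    """
    per_entry = 2 * addr_bytes + timer_bytes
    return num_states * per_entry


if __name__ == "__main__":
    total = sa_cache_bytes(10_000)  # the state count used in the thread
    print(f"{total} bytes ({total / 2**20:.2f} MiB)")
```

Under these assumptions the raw data stays below half a megabyte, in line with the quoted estimate. A real router would add per-entry overhead (hash tables, pointers, per-interface data), so this is at best a lower bound.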