Re: Comments on draft-yourtchenko-colitti-nd-reduce-multicast
Andrew Yourtchenko <ayourtch@cisco.com> Fri, 28 February 2014 15:27 UTC
Date: Fri, 28 Feb 2014 16:26:35 +0100
From: Andrew Yourtchenko <ayourtch@cisco.com>
X-X-Sender: ayourtch@ayourtch-mac
To: Erik Nordmark <nordmark@acm.org>
Subject: Re: Comments on draft-yourtchenko-colitti-nd-reduce-multicast
In-Reply-To: <530C9CFD.2000409@acm.org>
Message-ID: <alpine.OSX.2.00.1402272036140.42594@ayourtch-mac>
References: <5305AF13.5060201@acm.org> <75B6FA9F576969419E42BECB86CB1B89115F99A9@xmb-rcd-x06.cisco.com> <alpine.OSX.2.00.1402211620560.49053@ayourtch-mac> <75B6FA9F576969419E42BECB86CB1B89115F9BAE@xmb-rcd-x06.cisco.com> <alpine.OSX.2.00.1402212129450.52880@ayourtch-mac> <530C9CFD.2000409@acm.org>
User-Agent: Alpine 2.00 (OSX 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: text/plain; format="flowed"; charset="US-ASCII"
X-Originating-IP: [10.61.221.73]
Archived-At: http://mailarchive.ietf.org/arch/msg/ipv6/vDMfc5XQFgY0D3yYf2o38nH9xHA
Cc: IETF IPv6 <ipv6@ietf.org>
Erik,

On Tue, 25 Feb 2014, Erik Nordmark wrote:

> On 2/21/14 12:36 PM, Andrew Yourtchenko wrote:
>>
>> Myself I did not like this portion of my reply with no quantifiable data
>> (because operating on the basis of a belief is not good engineering).
>>
>> I happened to have a couple of hours of offline time, during which I tried
>> to sketch the scenario and get as far as I can without building a
>> lab.
>>
>> It's a *very rough sketch*. If you think there are parts that can be
>> made better in it, tell me.
> Andrew,
> I like this sketch. A few comments below.
>>
>> The initial assumptions for this thought experiment are as follows:
>>
>> 1) We have 10000 clients in a single /64.
>>
>> 2) There are multiple APs that bridge the traffic from wired onto
>> wireless medium, with the client count limited to 100 per AP.
>>
>> 3) There is a 20x speed difference between unicast transmission and
>> multicast transmission: the effective multicast speed is assumed to be
>> 1 Mbps, the effective unicast speed is assumed to be 20 Mbps.
>>
>> 4) The APs are assumed to be "naive", i.e. they do not perform any snooping
>> nor multicast->unicast conversions, but at the same time they are able to
>> bridge the unicast traffic without flooding it between multiple access
>> points. I.e. we assume a model where we have a single router (or an FHRP
>> pair) and a set of 100 APs bridging the traffic.
>>
>> (Corollary from the above: the effective unicast capacity is 100*20 Mbps,
>> whereas the effective multicast capacity is 1*1 Mbps, therefore the
>> difference in throughput is 2000-fold).
>>
>> Let's first consider the steady state. Suppose each host downloads
>> a file at 0.1 Mbps. Within each AP, therefore, we have a 50% capacity
>> utilization (0.1*100 = 10 Mbps, we have 20 Mbps capacity).
>>
>> It's easy to see this comfortably accommodates all the hosts.
>> Obviously the unicast NUD in this traffic is fairly minimal, so I don't
>> think it's even worth counting how much it is.
>>
>> Now, let's look at a potential failure.
>>
>> At the time of the NUD probe, it's enough to lose the 3 retries,
>> spaced 1 second apart. The default reachable time is 30 seconds,
>> with the random jitter of 0.5 of that.
>>
>> So, all that is needed to achieve a mass NUD failure is a ~30 second
>> outage during the period when all the hosts are sending the traffic.
>>
>> A reboot of the majority of the networking gear takes several times longer
>> than this.
>>
>> Therefore, a crash of the default gateway during the peak hour is
>> one guaranteed trigger for this to happen.
>>
>> So, now we have a situation of 10000 hosts which have deleted their default
>> gateway from the neighbor table, and send the multicast neighbor
>> solicitations for it.
>>
>> Assume the NS is 64 bytes, 10000 hosts sending such a packet means 64 bytes
>> * 8 bit * 10000 hosts = 5120000 bits/sec - or, 5 Mbps.
> Just to make it clear, each NS retransmission is 5 Mbits and the
> retransmission time is 1 second. Hence with RFC 4861 we get 5 Mbps for
> 3 seconds.
> However, the hosts aren't that likely to all discover the NUD failure in the
> same 3 second window. With ReachableTime=30 seconds it depends on when their
> last successful NUD probe or higher-level reachability advice came in. Hmm -
> with TCP the reachability advice might come with every left window edge
> advancement, i.e. they would synchronize quite closely.

Yes, it's a bit of a fringe assumption here of all the hosts being active
and then losing the connectivity, and somewhat uniform implementations of
TCP (or stuff like name resolution). In reality I think they will be
spaced out more.

>
>>
>> Note that this is only the shared bandwidth downstream back to the hosts -
>> the airtime spent by the upstream traffic is 20x faster, so I am
>> gratuitously discarding it.
>>
>> Since the APs cannot send the traffic at this rate, obviously, we will
>> need to drop some of it. Note, if the clients succeed with the ND to the
>> default gateway, they will start streaming data again, so the effective
>> multicast throughput will drop to 0.5 Mbps as soon as a noticeable
>> portion of clients recovers.
> While the multicast NS will see downstream drops, what matters for the
> resolution is that the router receives it and responds. Thus why wouldn't
> that part work?
> The NA from the router will be unicast so if that unicast isn't dropped due
> to the multicasts then it should resolve quickly. I don't know if unicast is
> prioritized higher than multicast or not.
>
> (We'll see downstream overload due to the multicasts, but that might just
> last for 1 second if the NS/NA to/from the router gets through.)

Yes, since the router is on the wired side, the bandwidth degradation due
to a "dummy" multicast retransmission is the biggest problem. The NS will
go unicast over the air, then go to the wired + multicast over the air,
and the unicast reply from the default gateway on the wired side will go
over unicast.

>
>>
>> Let's assume the best case of 20% of the clients managing to recover within
>> the first second.
>>
>> As we are approaching full recovery, a smaller number of clients will be
>> able to get their multicast NS sent because the airtime is being taken by
>> the payload traffic. Anyway, let's discard that and assume that every
>> second 20% of the clients will recover.
>>
>> This means that the recovery of the full set of the clients in these
>> conditions will take an *absolute* minimum of 5 seconds.
>>
>> Did I prove myself wrong ? Seems so. We can see that with some of the
>> relaxed assumptions I took, the hosts seem to recover.
>>
>> But, let's add to this a little bit of mDNS, and other multicast-loving
>> protocols, which tend to generate a fair chunk of traffic when they detect
>> the network was "restored".
>>
>> This can shrink the available capacity several times. Add to this that the
>> host does not necessarily stream at 100 kbps, but might have higher data
>> rates, and I think we can consider the available
>> multicast capacity at startup to be 1/10th of what it is in theory.
> Yes, in general TCP will fill the pipe thus at each AP all of the downstream
> bandwidth will be used.
>
> But if the streaming comes via the default router (which crashed or was
> unreachable for a few seconds), then the stream would also slow down due to
> TCP, in which case there is less of a multicast overload issue. There might
> be cases where this TCP slowdown doesn't happen though.

Yes, indeed. Also, given the high utilisation of the media at that point,
I wonder if the cross-channel interference will play a bigger role,
therefore affecting the results.

This seems like an extremely tricky beast to model with any reasonable
accuracy!

--a

>
> Erik
>
>>
>> This is where things may become interesting - 1/10th of the capacity
>> means that it will take not 5, but 50 seconds for all the hosts to recover.
>>
>> This means that the hosts which recovered in the very first second will
>> already be sending NUD traffic while the network is still under stress. If
>> these packets are lost, the hosts might fall back into the pool of the
>> "orphans" who are sending the multicast, because they delete their
>> neighbor entry.
>>
>> There are some other things to consider:
>>
>> - I deliberately kept the scenario here with *one* ND entry.
>> Assume your hosts are talking with 3-4 other hosts besides the default
>> gateway. This increases the load and proportionally makes the dangerous
>> state easier to achieve.
>>
>> - another factor that I am omitting - that such a storm of ND onto the
>> default gateway might cause rate-limiting of the control plane packets on
>> the gateway.
>> With some of the limits being as low as 1000 pps, this might
>> give a recovery time on the order of minutes even without the wireless
>> multicast being the bottleneck, yet still resulting in a lot of multicast
>> NS in the air during the slow recovery.
>>
>> This is about as precise a construct as I can build.
>>
>> Is it perfect ? No, by no means. It also does assume that in the case of
>> the default gateway the wireless performance will be the limiting factor - I
>> think it won't - so it's more of an appropriate scenario for a case of
>> p2p communications on the network. The default-gateway-only case will be
>> inherently much more stable, I think - because the multicast on the wired
>> side is fast, so the drops on the wireless side will not matter.
>>
>> It's probably worth it to change the text into "There is potential for the
>> failing NUD to *contribute* to a longer recovery and possible creation of
>> the locked situation in the case of flash failure - but the exact
>> quantification of the impact in such an environment is a topic of further
>> study".
>>
>> And then maybe I could dump the above thought experiment into a separate
>> draft, to see if the folks could contribute to the experiment / maybe
>> someone could run it - and reference it in this item ?
>>
>> It seems like an interesting area to dig a bit more in - creating a
>> suitable model and playing with the parameters to see where it breaks seems
>> like a useful exercise to understand how many hosts there can be in a
>> single /64 on WiFi with a "naive" set of access-points.
>>
>> Thoughts ?
>>
>> --a
>>
>
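
The back-of-envelope numbers in the quoted sketch can be reproduced with a
few lines of Python. This is only a minimal illustration under the
assumptions stated in the thread (10000 hosts, 100 hosts per AP, 20 Mbps
unicast, 1 Mbps multicast, 64-byte NS). The constant and function names are
hypothetical, and the model ignores jitter, retransmissions, mDNS and
interference. Without rounding 5.12 Mbit down to 5, the recovery times come
out at roughly 5.1 and 51 seconds rather than 5 and 50.

```python
# Rough model of the thought experiment; all names are illustrative only.

HOSTS = 10_000          # clients in the single /64
HOSTS_PER_AP = 100      # client limit per access point
UNICAST_MBPS = 20.0     # assumed effective unicast rate per AP
MULTICAST_MBPS = 1.0    # assumed effective multicast rate
PER_HOST_MBPS = 0.1     # steady-state download rate per host
NS_BYTES = 64           # assumed size of one Neighbor Solicitation


def steady_state_utilization():
    """Fraction of one AP's unicast capacity used in the steady state."""
    return PER_HOST_MBPS * HOSTS_PER_AP / UNICAST_MBPS


def ns_burst_mbits():
    """Megabits of multicast offered when every host sends one NS."""
    return NS_BYTES * 8 * HOSTS / 1_000_000


def recovery_seconds(capacity_fraction=1.0):
    """Seconds needed to carry one NS per host over the multicast channel.

    capacity_fraction scales the nominal multicast rate, e.g. 0.1 for the
    "1/10th of what it is in theory" case.
    """
    return ns_burst_mbits() / (MULTICAST_MBPS * capacity_fraction)


if __name__ == "__main__":
    print("steady-state AP utilization:", steady_state_utilization())
    print("NS burst (Mbit):", ns_burst_mbits())
    print("recovery at full multicast capacity (s):", recovery_seconds())
    print("recovery at 1/10th capacity (s):", recovery_seconds(0.1))
```

Changing the constants (for example HOSTS_PER_AP or MULTICAST_MBPS) gives a
quick feel for where the recovery time starts to exceed the 30 second
reachable time mentioned above.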