Re: Reducing the battery impact of ND

Erik Nordmark <nordmark@sonic.net> Fri, 17 January 2014 17:21 UTC

Message-ID: <52D96663.6060005@sonic.net>
Date: Fri, 17 Jan 2014 09:20:35 -0800
From: Erik Nordmark <nordmark@sonic.net>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: Lorenzo Colitti <lorenzo@google.com>, Andrew 👽 Yourtchenko <ayourtch@gmail.com>
Subject: Re: Reducing the battery impact of ND
References: <CAKD1Yr29S=O5L4DfhNoiVieWPkgBJ2veuOu6ig5rwgK4ELz7Xw@mail.gmail.com>
In-Reply-To: <CAKD1Yr29S=O5L4DfhNoiVieWPkgBJ2veuOu6ig5rwgK4ELz7Xw@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Cc: 6man Chairs <6man-chairs@tools.ietf.org>, 6man WG <ipv6@ietf.org>

On 1/10/14 8:25 PM, Lorenzo Colitti wrote:
> Andrew,
>
> thank you for bringing solid data and analysis to the equation. That is
> always appreciated.
>
> On Sat, Jan 11, 2014 at 8:57 AM, Andrew 👽 Yourtchenko
> <ayourtch@gmail.com <mailto:ayourtch@gmail.com>> wrote:
>
>     2) The bigger problem - the battery life:
>
>
> This can be a real problem, yes. But as you say, I think we can mitigate
> it well without changing ND at all.

Lorenzo,

One of your "yes please" below is a change to ND - Andrew's b) about 
restarting RS would require a (small) standards update.

And if you work out the details on b) I suspect you'll find that to 
completely avoid the periodic multicast RAs the routers need to know 
whether hosts expect periodic RA as opposed to all the hosts doing the 
RS restart. That would imply some more protocol change.

And just to be perfectly clear, I don't think assuming (near) perfect 
MLD snooping including in WiFi APs is desirable. Hence my focus has 
been to remove as much as possible of the ND multicasts including the 
DAD and NS packets.

But my overall observation is that if we think we can fix (even a subset 
of) this without a standards update, then we are just fooling ourselves.

Regards,
    Erik

>     WiFi is a pretty expensive thing to do, power-wise. To help extend the
>     battery, the mobile devices try very hard to shut off the radio
>     whenever they can. The wireless station would go to sleep, and inform
>     the AP about this fact - so the AP can buffer the frames for it [1].
>     Further optimizations in this area were introduced by 802.11e (WMM-PS)
>     [2], where frames destined to the station can be transmitted
>     more efficiently and the radio duty cycle is reduced.
>
>     Spontaneous all-hosts multicasts interfere with these mechanisms, thus
>     the battery drain on the devices is much higher
>
>
> One thing that I think we haven't discussed here is that in principle,
> there are *very few* all-hosts multicasts - basically, only RA. Even RS
> is all-routers multicast, not all hosts. So really the only thing we're
> talking about is solicited-node multicasts for NS (and occasional
> unsolicited NAs, which should only really happen when nodes change their
> MAC addresses). Note that I'm not talking about malicious nodes here
> because any malicious node can decide to spam the link with multicast
> messages regardless of what the ND protocol looks like.
>
> I think that using multicast instead of broadcast and defining
> solicited-node multicast addresses based on the IID were very wise
> choices. This is because there are so many different groups (2^24) that
> even in very large networks, each solicited-node multicast group will
> likely only have one member. So - at least in principle - there are two
> layers at which the battery impact of NS packets can be reduced to zero
> or close to zero:
>
>  1. If the AP keeps state of what MAC addresses have joined which
>     multicast groups, it can selectively turn multicast NS into a set of
>     unicast NS (L2 unicast; not L3 unicast). This will be a pretty
>     significant bandwidth optimization both because a) as you say,
>     unicast is faster than multicast, and b) because one multicast NS
>     will almost certainly turn into only one unicast NS. Win/win.
>  2. Even if the AP does not do #1, if the wifi chipset in the device
>     keeps state of what multicast groups the device has joined, then the
>     wifi chipset can simply drop packets that aren't interesting to the
>     device's main CPU. From experience we know that this can also lead
>     to massive battery savings - the wifi chipset is basically on a lot
>     of the time anyway, because it has to listen for beacons and
>     incoming packets, and this sort of filtering can be pretty efficient.
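
As a rough illustration of option 1 above, the sketch below derives the solicited-node multicast group from the low 24 bits of an address (as defined in RFC 4291) and looks it up in a snooping table. The `SnoopingTable` class, the MAC addresses, and the 2001:db8:: documentation addresses are hypothetical and only stand in for whatever state a real AP would keep.

```python
import ipaddress

def solicited_node_group(addr: str) -> ipaddress.IPv6Address:
    """Map a unicast address to its solicited-node group ff02::1:ffXX:XXXX."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))
    return ipaddress.IPv6Address(base | low24)

class SnoopingTable:
    """Hypothetical AP state: multicast group -> MACs that joined it."""

    def __init__(self):
        self.members = {}

    def join(self, mac, group):
        self.members.setdefault(group, set()).add(mac)

    def l2_destinations(self, group):
        # L2 unicast destinations that replace the multicast transmission.
        return sorted(self.members.get(group, set()))

table = SnoopingTable()
table.join("02:00:00:00:00:01",
           solicited_node_group("2001:db8::1234:5678:9abc:def0"))
table.join("02:00:00:00:00:02",
           solicited_node_group("2001:db8::aaaa:bbbb:cccc:dddd"))

# An NS for the first address reaches only the one station that joined.
group = solicited_node_group("2001:db8::1234:5678:9abc:def0")
print(group)                         # ff02::1:ffbc:def0
print(table.l2_destinations(group))  # ['02:00:00:00:00:01']
```

With 2^24 possible groups, a lookup like this will almost always return a single MAC address, which is the "one multicast NS turns into one unicast NS" case.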
>
> That leaves RAs. Multicast RAs are a bane for battery life, because
> every time a device joins the network, it sends an RS. If the router
> responds with a multicast RA, then all devices on the link get a packet
> which they didn't need. On large wifi networks, I've seen this cause RAs
> once every 3-4 seconds, which is really painful in terms of battery
> life. Fortunately, there's an easy solution here: respond to router
> solicitations with unicast RAs sent to the sender. This is pretty
> trivial, and it doesn't require any state anywhere. There is a corner
> case where, if 10000 devices come online at the same time because the AP
> just booted up, you can get a thundering herd problem and have to send
> 10000 unicast packets, but this can be optimized too: if the router
> sees that it's sending more than 100 solicited unicast RAs per second,
> it can simply send a multicast RA instead.
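
The fallback described above can be sketched as a small decision function. This is only an illustration under assumptions: the `RaResponder` class, its method name, and the one-second sliding window are hypothetical, and the threshold of 100 per second is the figure from the text. A real router would also rate-limit the multicast RAs themselves.

```python
from collections import deque

class RaResponder:
    """Decide how to answer each Router Solicitation (illustrative only)."""

    def __init__(self, max_unicast_per_sec=100):
        self.max_unicast_per_sec = max_unicast_per_sec
        self.recent_unicast = deque()  # send times within the last second

    def on_rs(self, now):
        # Forget unicast RAs that are at least one second old.
        while self.recent_unicast and now - self.recent_unicast[0] >= 1.0:
            self.recent_unicast.popleft()
        if len(self.recent_unicast) >= self.max_unicast_per_sec:
            # Thundering herd: one multicast RA covers everyone.
            return "multicast"
        self.recent_unicast.append(now)
        return "unicast"

responder = RaResponder(max_unicast_per_sec=100)
burst = [responder.on_rs(now=0.0) for _ in range(150)]
print(burst.count("unicast"), burst.count("multicast"))  # 100 50
print(responder.on_rs(now=2.0))                          # unicast
```

The only state is a queue of recent send times, so nothing per host has to be remembered.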
>
>     - especially in a typical network where the mobile devices move, and
>     might trigger an RS/RA on each L2 roam between BSSIDs.
>
>
> The client devices I'm familiar with don't send RSes when L2 roaming
> between BSSIDs on the same SSID. Is this a bug? Should they?
>
>     I hear someone saying: "Yes, but this is all about wireless, DC is a
>     different story - it's all switches, they don't hold the state!" - yes
>     and no.
>
>
> I'd argue that it's not "yes and no", it's just "no". The state required
> by the approaches I suggest above is the same amount of state as
> multicast snooping, and it's pretty much the same amount of state
> required by the efficient ND approach - but it doesn't require
> complicating the ND protocol
>
>     Maybe it is worth splitting the problem area into smaller subsets to see
>     whether "cheaper" solutions are possible. Let's first take the
>     "multicast RAs" problem, and assume we do the below steps:
>
>     a) Allow the environments that wish to do so to send solicited RAs
>     unicast (the standard already permits this).
>
>
> Yes please!
>
>     b) Have the hosts restart the Resilient RS [5] process at 1/2 or so
>     of router lifetime expiry: the router now can have the heuristic to
>     know which hosts are supporting the "unicast RA update" mechanism
>     *and* did not receive the periodic RA which would have been sent
>     usually at 1/3 the lifetime expiry (three RAs per lifetime rule of
>     thumb).
>
>
> Yes please!
>
>     c) bump up the allowable MaxRtrAdvInterval and AdvDefaultLifetime to
>     their maximum theoretically possible values (spec says the hosts
>     should already be able to handle those; I did not test, though) - this
>     would further reduce the periodic RA frequency, from 2.5
>     hours today to ~18.2 hours, or, if we feel adventurous, put all-ones
>     to be a "solicited-only RAs by default" - thus, unless another router
>     sends an RA with different lifetime, refreshing the router info
>     becomes host-driven, with an option for a router to override at any
>     time.
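
The timing figures in steps b) and c) can be checked with a few lines of arithmetic. The sketch below assumes the 9000-second upper bound on AdvDefaultLifetime from RFC 4861 and the 16-bit Router Lifetime field (in seconds); the function and constant names are hypothetical, and the 1/3 and 1/2 fractions are the rules of thumb quoted in the text.

```python
ADV_DEFAULT_LIFETIME_MAX = 9000   # seconds; current RFC 4861 upper bound
ROUTER_LIFETIME_FIELD_MAX = 0xFFFF  # largest value of the 16-bit field

def to_hours(seconds):
    return seconds / 3600

def schedule(lifetime):
    """Rule-of-thumb timers (seconds) for a given router lifetime."""
    return {
        "periodic_ra_interval": lifetime / 3,  # three RAs per lifetime
        "rs_restart_at": lifetime / 2,         # host restarts RS (step b)
        "expiry": lifetime,
    }

print(round(to_hours(ADV_DEFAULT_LIFETIME_MAX), 1))   # 2.5
print(round(to_hours(ROUTER_LIFETIME_FIELD_MAX), 1))  # 18.2
print(schedule(ADV_DEFAULT_LIFETIME_MAX))
```

A host that reaches `rs_restart_at` without a refresh has already missed at least one periodic RA, which is what the router-side heuristic in step b) relies on.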
>
>
> This sort of depends on what device you have. For example - on a mobile
> phone, receiving one packet every 2.5 hours is *so far down* in the
> noise that it just doesn't matter. Your phone is doing a massive amount
> of stuff already - it's syncing your email, receiving wifi beacons,
> checking calendar alarms, etc. etc. All of that uses way more CPU than
> receiving one multicast RA.
>
> There may be other devices that have lower power requirements, though
> I'm not sure - perhaps these devices simply can't use 802.11-style wifi
> because it's too expensive, and so already have to use something like
> 6LoWPAN. I don't know if there's a use case here.
>
>     These steps take care of the "multicast RA" problem. But again, also,
>     each of them by itself brings incremental usefulness, and is doable on
>     a single device => no "chicken and egg", they gradually shrink the
>     issue - and are compatible with the existing tweaks that solve the
>     same problem (in a more layer-violating way).
>
>
> Care to work together on a document that provides operational guidance
> to optimize battery impact of ND on wifi networks? Since they involve
> zero protocol changes, v6ops would be a good candidate group for it. And
> it might help persuade vendors that are not your employer to implement
> stuff like this :-)
>
>     Now let's think of updating the various caches while roaming:
>
>
> How much more do you think this sort of thing will help, assuming we do
> the "100% feasible using the existing protocol" stuff above?
>
>     NB: the above does not take care of the "defending the host's address
>     on behalf of the sleeping node" problem.
>
>
> An NS is 72 bytes, and if we do multicast snooping, then a sleeping node
> is only ever going to be woken up if a node with the same last 3 address
> bytes shows up. That should be extremely rare.
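
The 72-byte figure and the "extremely rare" claim can be sanity-checked as follows. The sketch assumes an NS carrying a single Source Link-Layer Address option for an Ethernet-style MAC, and it assumes the low 24 bits of addresses are uniformly random; the constant and function names are hypothetical, and the link size of 10000 hosts is borrowed from the example earlier in the thread.

```python
IPV6_HEADER = 40   # fixed IPv6 header
NS_MESSAGE = 24    # ICMPv6 header (4) + reserved (4) + target address (16)
SLLA_OPTION = 8    # type (1) + length (1) + 6-byte MAC, assumed present

NS_SIZE = IPV6_HEADER + NS_MESSAGE + SLLA_OPTION
print(NS_SIZE)  # 72

GROUPS = 2 ** 24   # solicited-node groups: one per low-24-bit value

def p_shares_group(hosts):
    """Chance that some other host falls in one given host's group."""
    return 1 - (1 - 1 / GROUPS) ** (hosts - 1)

print(f"{p_shares_group(10000):.6f}")
```

Under these assumptions, even on a link with 10000 hosts a given sleeping node shares its group with another host well under 0.1% of the time.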
>
> Cheers,
> Lorenzo