Re: Comments on draft-yourtchenko-colitti-nd-reduce-multicast

Andrew Yourtchenko <ayourtch@cisco.com> Fri, 28 February 2014 15:26 UTC

Date: Fri, 28 Feb 2014 16:25:51 +0100
From: Andrew Yourtchenko <ayourtch@cisco.com>
To: Erik Nordmark <nordmark@acm.org>
Subject: Re: Comments on draft-yourtchenko-colitti-nd-reduce-multicast
Archived-At: http://mailarchive.ietf.org/arch/msg/ipv6/jiB0QDvIzsChyz7wn5ODRSo9zEk
Cc: IETF IPv6 <ipv6@ietf.org>

On Tue, 25 Feb 2014, Erik Nordmark wrote:

> On 2/20/14 8:56 AM, Andrew Yourtchenko wrote:
>> 
>> Yeah, it's a very useful optimization to consider; we discussed it already, 
>> I think. The reason I did not include it yet was to go over the tradeoffs 
>> without the pressure of the submission deadline being a few hours away :)
>> 
>> 1) from the efficiency standpoint this approach is fantastic for the wired 
>> and for the wireless (for the implementations that do the mcast->ucast 
>> conversion over the air the win is a bit less, but nonetheless there is a 
>> win since we send only one packet on the wired side instead of many and 
>> save the CPU on the gateway).
>> 
>> 2) This adds an instantaneous behavior change into the network at the peak 
>> load conditions - so if that code path has a problem, this creates a 
>> hard-to-debug situation.
>> 
>> 3) There are a few trickier scenarios with L3 roaming (hosts arriving from 
>> other subnets onto the same AP) and that AP having a single 802.11 group 
>> encryption key, which this behavior might make easier to break 
>> accidentally.
> I don't quite understand concern #3 (and #2). Since the routers don't know 
> whether hosts rely on unsolicited multicast RAs, the multicast RAs must 
> function. Thus adding the flash crowd optimization to multicast solicited RAs 
> wouldn't be anything new or different - it would merely appear in one 
> additional case.

My worry is more about the humans than the hosts :-)

char *foo = NULL;
if(unlikely(event)) {
   // Almost never happens
   *foo = 'x';   // NULL dereference: the deliberate strawman bug
} else {
   // Almost always happens
}

is a strawman bug which will be harder to catch just because it is 
triggered only in rare circumstances.

That said, since the router/wireless infra will have to ratelimit ingress 
RSs anyway, they might as well send a multicast RA at that point.

>>> 
>>> In section 4.3 there are two suggestions: MLD snooping and L2 unicast or 
>>> L3 multicast packets. I think there are some operational concerns around 
>>> MLD snooping - I hope others can fill in some information in that space 
>>> since I don't know the issues.
>>> (Editorially it would be clearer if those two are separate sub-sections.)
>>> 
>>> The multicast-over-unicast refers to SAVI as a way to collect the state, 
>>> but it doesn't specify where you see that state used. Presumably for DAD 
>>> (and NS in general unless you also do section 4.7)? But SAVI doesn't claim 
>>> to have all addresses since it is concerned with conflicts. Thus a host 
>>> can exist and be silent - but DAD would still expect to reach it. Or a 
>>> host could have moved to a different port and SAVI having stale state - 
>>> yet DAD should work. My point is that the details of how the state is 1) 
>>> maintained and 2) used to forward packets are key to analyzing how the 
>>> neighbor discovery functionality and robustness would be affected by this 
>>> idea.
>> 
>> The hosts that are "quiet" are indeed a problem. However, I think it might 
>> be very OS-specific - today's smartphones are anything but quiet :-)
>> So I think this will be much more of an issue for some special-purpose 
>> devices, which are also predominantly servers.

> I guess my first question was how this would work when there is no SAVI-like 
> state for an address. (In that case you don't have a unicast address to which 
> you can send the packet.) Would you multicast or drop?

My instinct is to say "multicast" since that will provide the 
backwards-compatible recovery.

But I think it can be more nuanced. Piggybacking on the resolicit idea 
from this thread below, if the gateway keeps track of the counts of 
"vanilla" and "resolicit-capable" hosts over time, it might be able to 
make an informed decision. If it saw a legacy host within a certain 
timeframe in the past (I am assuming that it would always see the router 
solicitation), then it would err on the side of multicasting, else if 
there are no vanilla hosts and the router did not clean up any of the 
neighbors within a router lifetime, there's no point to multicast.


>> I think the smart-meter type of device would be in this category. But to 
>> access those, the remote party will need to know their address... so, if a 
>> device does some sort of service registration upon boot-up, then it will 
>> not be a quiet device anymore.
> But the service registration could have happened when the device was first 
> installed - I don't think that is likely to happen each time the device wakes 
> up.

True, it actually should *not* happen.

So the failure scenario is:

1) the host does behave according to today's specs
2) the host is providing a service for the remote hosts
3) the router has deleted the neighbor entry for that host
4) the host does not call home to recreate this entry
5) the router has decided not to multicast the NS to resolve the address of this host

It seems like a robust implementation can fairly easily avoid (3) and 
(4) - whereas a router can try to be more conservative about (5).

>> I haven't seen this failure in the "consumer portable devices" networks 
>> that I ran - so this is purely a mental construct.

> Yes, but IPv6 needs to work for other sleepy devices. We want whatever 
> improvements we do to the standards in this space to be useful for the next 
> 20 years or so. (Products, whether end devices or routers/switches can 
> operate on shorter time scales and with more limited applicability than the 
> core standards.)

Indeed. Another aspect is to try to make improvements in such a way that 
they are not required on two sides at once to be useful - this should 
help with their adoption.

>
>> 
>> I do agree that there are limitations - but I think it is very important to 
>> collect the data about the scope of these limitations, to keep a balanced 
>> approach.
>> 
>> The data I have as of today based on 5-6 networks with 10-20K hosts says it 
>> is quite minimal. But I reserve the right to be wrong and thus would be 
>> very interested to hear about other live networks where it is different.
>> 
>> (side note: if these sleepy hosts are quiet and are concerned that their 
>> address be known, they might consider using DHCPv6 as well).

> Using DHCPv6 doesn't reduce multicast (neither for address resolution nor for 
> DAD). One could change DHCP to record the link-layer addresses so that the 
> routers would ask the DHCP servers for the link-layer addresses, but that 
> seems like a lot of change.
>
> Also, see my response to Ole on the other ways that the DHCPv6 we currently 
> have doesn't help solve the problems at hand. [I see others brought up the 
> same concerns around DHCPv6 on the list.]

yeah. I won't fork the discussion and will see if I can contribute 
anything within that thread :-)

>
>> 
>>> 
>>> In section 4.4 the document refers to proxy but without specifying which of 
>>> the different proxy approaches you have in mind.
>>> There is the proxy ND RFC, and there is the DAD proxy internet-draft in 
>>> 6man. Are you referring to one of those, or some slightly different form 
>>> of proxy? (For example, ND proxy would respond with the LLA of the 
>>> router/AP, but that might result in host movement looking like a duplicate 
>>> address - again depends on the details.)
>> 
>> I don't know the spec. I saw it implemented in a product, where two 
>> attached clients A and B during the ND saw the packet sequences that were 
>> close enough to the ND spec to achieve identical practical result and be 
>> compliant to the letter of it, yet allowed to perform the process more 
>> optimally from the broader perspective.
>> 
>> Arguably this kind of behavior would not require its own spec ?

> Yes, because depending on the details it might break SeND and other ND
> options, or it might only be robust and useful in very limited applicability 
> for instance in specific topologies (such as a single wireless controller).

That'd be a bug for that implementation to fix.

Arguably having a spec in this particular case might do more harm than 
good - if there is a bug in the spec which an implementation follows, then 
first a spec has to be fixed, then the code has to be fixed.

>
>>> I don't have any issue with removing the somewhat arbitrary 9000 second 
>>> max AdvDefaultLifetime in section 5.1. However, the tradeoff for what 
>>> default lifetime to use in section 4.5 needs to take into account one 
>>> additional factor.
>>> The default lifetime serves to garbage collect entries from the default 
>>> router list should a router silently disappear. Thus for links that do not 
>>> have a fixed (set of) link-local address(es) for the router(s), having a 
>>> high default lifetime means that after a failure the hosts would have one 
>>> entry in the default router list which is unreachable - until that high 
>>> lifetime expires. I don't know if there has been a study on the 
>>> performance impact that would have on the hosts e.g., how often they 
>>> would re-probe the default router.
>> 
>> This is indeed a useful consideration that is worth noting in a doc that 
>> would be removing the 9000 seconds max limit. Mind if I sketch a -00 on this 
>> matter and send you the possible text ?
> That would be excellent.

Ok, I'll sketch it today/on weekend and send unicast.

>
>>> Section 4.6 has the same concern. But 4.5 and 4.6  makes lots of sense 
>>> e.g., in a VRRP deployment where the link-local address of the virtual 
>>> default router would always be the same. Ditto for networks with a single 
>>> point of failure single router at a fixed address (e.g., if the router is 
>>> always at fe80::1 or some other fixed address.) Thus I think we should 
>>> recommend 4.5 and 4.6 within that applicability. Added benefit is 
>>> that the routers control it, hence the operator of the network can set the 
>>> values higher for VRRP or single router cases.
>> 
>> I totally agree! We probably might have a series of documents for different 
>> types of use cases - this could be very helpful for the folks deploying 
>> them. We could turn this doc into a "large-scale high-density WiFi 
>> (stadiums, exhibitions, campuses, etc.)" and then have a couple of others 
>> describing other types of deployments. Let's discuss this in London ?
> Yep.

>>> I think it would be good to separate out the second half of section 4.7 
>>> (blocking link-locals) into a separate section, since it is quite 
>>> different than clearing the on-link bit. That idea has significant 
>>> implications since it changes the IP subnet model (RFC 4903 talks about 
>>> this.) I'm not saying that we shouldn't consider this, but I do think it 
>>> would fall in a very different category than the other ideas in this 
>>> draft. Might even be best to have a separate draft on this radical idea so 
>>> it can be explored fully.
>> 
>> I thought RFC5942 was clarifying this (by the way I need to reference it).
> 5942 re-states the intent of 4861 with more detail. But doesn't change the 
> above semantics. (It does remove these two bullets from on-link:
>       *  a Neighbor Advertisement (NA) message is received for the
>          (target) address, or
>       *  any Neighbor Discovery message is received from the address.
> )
>
> The only case I know of where "blocking link-locals" is part of our standards 
> is for multi-link subnets, for instance in 6lowpan. But in that case it was 
> part of the architecture from the start. The issue we have with the general 
> ND evolution is that we need to support the existing assumptions, and many of 
> those assumptions come from IPv4. For instance, the use and semantics of an 
> IPv4 link-local multicast has been carried to IPv6 link-locals.
>
> Removing that will cause some breakage.

Forward-referencing to [*] below.

>> 
>> My thought process was as follows: suppose we have 9000 people, each with a 
>> smartphone, wandering inside a large building or on campus. We want to 
>> limit the broadcast domains. So we will split these 9000 hosts into 256
>> subnets, each with 35-36 hosts. Those are small enough to be close to what 
>> is a typical "small network" so everything will be fine, and the protocols 
>> which do use link-local addresses will actually be able to function.
>> 
>> Good ? Seems so, but, there are problems with this approach:
>> 
>> 1) "Lobby ambassador" problem: everyone arrives through the same entrance, 
>> and turns on their portable device there. Therefore, the assignment of a 
>> host to the subnet must happen there, not based on physical location, but 
>> on some loadbalancing algorithm. I have 256 subnets, so e.g. for 
>> predictability we can use the first byte of a hash of the host's mac address.
>> 
>> 2) I need to span those 256 vlans across the entire venue, and ensure I 
>> place each of the hosts into the correct VLAN on each AP. A typical network 
>> of such a scale will have about 300 APs. I still need to send RAs within 
>> each vlan. I need to ensure that the hosts do not receive the "wrong" RAs. 
>> (and remember, there is not really a way in a naive 802.11 implementation to 
>> do "selective" multicast).
>> 
>> 3) I have people that are able to see each other using multicast 
>> advertisements, but what relation do they have to each other ? Equality of 
>> the first byte of the hash of their mac address. Not very useful.
>> 
>> So I conclude, that while splitting the whole crowd into smaller subnets 
>> somewhat takes care of the floods at the expense of additional complexity, 
>> the resulting "working" link-local multicast protocols do not really make 
>> any sense - because the partition of the network does not have any other 
>> logic than pure loadbalancing.
> I agree with all that.
>
> For those types of deployments having /127 subnets (plus DHCPv6 PD if someone 
> wants a personal area network behind their phone) makes a lot of sense.
>
> But my issue is that the goals we are trying to achieve are more general than 
> the stadium (or broadband subscriber) use case.
>

Agreed - though I think it might make sense to more explicitly document the 
use case and see what still applies ?

e.g.: say we have sensors on a factory floor. Those sensibly will sit on a 
different SSID compared to all the chatty smartphones worn by the workers; 
and given they will be much more "well-behaved", we may not need to be as 
aggressive in suppressing the traffic between the hosts.

Given we are concerned about their energy consumption, most probably the 
direct access to them will be restricted (else one can just ping -f and 
drain the battery) - so that traffic will be quite controlled.

This is a wildly different use case from a high-density BYOD where we 
essentially have a swarm of almost hostile devices outside of our 
control right in the middle of the network.


>> Of course, I need to disclaim: doing this in the home network with one AP 
>> and < 100 devices all belonging to a single person is not useful at all.
>> 
>> Doing this in a network with 9000 devices belonging to different people is 
>> the thing that made the most sense.
> Agreed.
>> 
>> Now that we've cut off all the services, we need to figure out how to bring 
>> them back... And I think the further mental construct here that would be 
>> useful is to consider this as a giant homenet with a lot of guests. Thus, I 
>> think from the services discovery standpoint, the products of the DNS-SD 
>> workgroup will be very beneficial here.

> I don't think such an approach (first break it, then get folks to fix 
> whatever broke whether that means fixing an IETF standard, some common 
> implementation, or various to us unknown and proprietary approaches makes 
> sense for a retrofit.

[*]

Talking specifically about mDNS, it breaks itself as we increase the # of 
hosts - whether we keep the multicast on or not - we just get to pick the 
way it breaks:

1) we keep the multicast - so every phone gets the 10000s of 
advertisements and cannot make any use of them anyway; it only drains the 
battery. (from my experiments, the # of services usable within the UI is 
limited)

2) we get rid of the multicast - and break the discovery upfront in a 
predictable way, declaring it within the AUP for the network.

(Aside: I did not mention it in the doc because I wanted to keep it 
focused on the ND, but in the high-density environments I specifically 
block udp/5353 multicast in its entirety. This does not have to be this 
way, and grouping the hosts into "broadcast domains" based on some 
criterion might be a different approach, but nothing implements 
this, to the best of my knowledge).

>
> Only makes sense for a greenfield deployment where there are no assumptions 
> that existing protocols and code continue to work.

Above, we are discussing the use cases different from HD WiFi that do not 
exist today. I count those as green field.

>> 
>>> 
>>> 4.8 seems to conflate the address assignment with DAD. Just because we 
>>> might want to centralize the DAD checks doesn't imply that we want to 
>>> remove the ability for the host to pick its own privacy enhanced 
>>> interface-IDs to form its addresses.
>>> From a deployment perspective DHCPv6 is available for address assignment, 
>>> but I don't think we want to require that for WiFi or other links which have 
>>> packet loss.
>> 
>> This is more of a fallback scenario for those who want a 100% guarantee of 
>> address uniqueness in the network - using the existing mechanism.

> But RFC 3315 doesn't guarantee uniqueness, which is why the host needs to 
> perform DAD in addition.

I need to correct myself. "Those who want a bigger guarantee".

No mechanism gives a 100% guarantee. We asymptotically approach 100% to 
varying degrees.

>> 
>> Also as a trigger for another small discussion:
>> 
>> I think it's worth modeling the real-world experience in the networks with 
>> varying packet loss to see at which point it stops being usable, and go 
>> from there.
>> 
>> The classic IETF "Built-in hotel WiFi" topic: is it worth being extra-sure 
>> that DAD works in that scenario ? Or is it worth fixing the connectivity by 
>> bringing the loss down to acceptable level ? At how many retransmits do we 
>> declare a failure ? Could we then explore a similar approach to make the 
>> DAD more robust instead - keeping in mind the probability of failure should 
>> be quite low ?
>> 
>> This will be a good robustness improvement for the hosts that will 
>> immediately benefit them.

> If we want a DAD probe mechanism that is more robust to failure then I think 
> we should just reuse the techniques in ACD (RFC 5227) which performs ongoing 
> conflict detection. However, that results in more broadcast/multicast messages!

For the 802.11 case we can have a reasonable assurance that there is no 
duplication of the MAC address (else a bunch of L2 stuff would have been 
broken). So then the question of duplicates reduces to detecting the cases 
where the existing MAC<->IP mapping does not hold. If both of the hosts 
involved in the collision are active, the router should be able to notice 
this just from the unicast traffic ?
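
To make the question concrete, here is a minimal sketch of such a check: a table that remembers which MAC each IPv6 address was last seen with, and flags a mismatch. Everything in it (`binding_table`, `observe_unicast`, the 64-entry size, the linear scan) is made up for illustration, and it assumes the router sees the source MAC and source address of every unicast packet. A host that legitimately changes its MAC would also show up as a conflict, so this only narrows the problem down.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAX_BINDINGS 64   /* arbitrary size for the sketch */

struct binding {
    uint8_t ip[16];       /* IPv6 source address */
    uint8_t mac[6];       /* source MAC it was first seen with */
    bool    used;
};

struct binding_table {
    struct binding e[MAX_BINDINGS];
};

enum observe_result {
    BINDING_NEW,          /* address not seen before, now recorded */
    BINDING_MATCH,        /* same MAC as recorded */
    BINDING_CONFLICT,     /* same address, different MAC: possible duplicate */
    BINDING_TABLE_FULL    /* no room to record a new address */
};

/* Record or check one (address, MAC) pair gleaned from a unicast packet. */
enum observe_result observe_unicast(struct binding_table *t,
                                    const uint8_t ip[16],
                                    const uint8_t mac[6])
{
    struct binding *free_slot = NULL;

    assert(t != NULL);
    for (size_t i = 0; i < MAX_BINDINGS; i++) {
        struct binding *b = &t->e[i];
        if (!b->used) {
            if (free_slot == NULL)
                free_slot = b;        /* remember the first empty slot */
            continue;
        }
        if (memcmp(b->ip, ip, 16) == 0)
            return memcmp(b->mac, mac, 6) == 0 ? BINDING_MATCH
                                               : BINDING_CONFLICT;
    }
    if (free_slot == NULL)
        return BINDING_TABLE_FULL;
    memcpy(free_slot->ip, ip, 16);
    memcpy(free_slot->mac, mac, 6);
    free_slot->used = true;
    return BINDING_NEW;
}
```

The table has to start out zeroed (e.g. with memset) so that every slot reads as unused.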

>
> If we want both fewer multicast packets *and* a more robust DAD (including 
> efficiently handling DAD for sleeping nodes), then I don't see any approach 
> other than making some devices on the link be able to speak more 
> authoritatively about the addresses present on the link. Those devices can 
> try to build that state implicitly (gleaning from packets on the link), or 
> explicitly. (I'll try to capture the differences between those before next 
> week.)

The DAD also has the explicit function of detecting the duplicate MAC 
addresses. If we omit this function, I think the process can be simplified 
by quite a bit ?

Duplicate MAC addresses are an interesting corner case - but forcing 
everyone to pay the tax for this corner case might be something worth 
reevaluating ?

>> 
>>> I don't understand 4.9. Should I read it as a host shutting things down 
>>> if it goes to sleep even for a short time, and then waking up and 
>>> multicasting a
>> 
>> depends on the definition of "short". Some hosts nap during the 100ms 
>> interval between the 802.11 beacons. Obviously would not work here.
>> Maybe not do it within 5 seconds of turning the screen off.
>> But I certainly know my gadgets stop being reachable if I do not touch them 
>> for a day.
> The loss of reachability is expected.
> But there is also a question about the efficiency when the device wakes up. 
> Whether it needs to start from scratch (multicast RS, do DAD multicasts and 
> wait for a second), or whether it can do DNA (unicast a NS to the router(s) 
> to check it is on the same link) and avoid waiting for the lack of a response 
> to a DAD probe.

If the DNA succeeds, and therefore we're on the same link, then we have some 
memory of it ?

Some more on this at [**] below.

Also, the DNA with a unicast NS makes me wonder - can the host achieve 
a similar function by sending a unicast RS *if* it remembers that the 
router earlier was sending the solicited RAs unicast ?

(Treat this as a thought experiment, I realize that I'm probably trying
to rewrite a dozen specs with one sentence ;-)

>> 
>> Sure it needs more whiteboarding. But if we assume one MAC address = one 
>> stack (this is a fair assumption I think?), it should be possible, here is 
>> a raw idea:
> Need to bring a whiteboard to London ... ;-)

I'll get a pack of napkins! :-)

>> 
>> By the destination of the RS being unicast, the router knows the host is 
>> "resolicit-capable". The source address contains the MAC address, so it is 
>> sufficient information to create a neighbor entry with a flag 
>> "resolicit-capable" in it.
> Or add a flag to the RS to say "I will re-solicit" ...

Yes. I tried to avoid having an explicit field - since the overall system 
should converge within 1 router lifetime anyway.

>> 
>> Subsequent neighbor entries with the same MAC address will inherit the 
>> "resolicit-capable" flag.
>> 
>> (Data structure detour: don't have to index by MAC address to do this.
>> Take two counting bloom filters "legacy hosts' MAC addresses" and 
>> "resolicit-capable hosts' MAC addresses". In case of a false positive 
>> match, prefer the membership in the "legacy hosts" to avoid blackout).
>> 
>> Keep the running counters of number of "resolicit-capable" ND entries and 
>> "legacy" ND entries.
> I can see how you increment those counters (based on seeing a RS with a new 
> MAC address.)

Assuming no explicit field: you increment the "legacy" counter upon the 
multicast RS from a host you do not have a neighbor entry for. Then you 
decrement the legacy counter and increment the "resolicit-capable" counter 
upon seeing a unicast RS.
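
A minimal sketch of that bookkeeping, assuming no explicit field in the RS. The type and function names are hypothetical, and two cases the text above does not cover are my assumptions: a unicast RS from a host with no neighbor entry is counted straight as resolicit-capable, and a multicast RS from an already known host changes nothing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-neighbor classification; not from any real router code. */
enum host_class { HOST_UNKNOWN, HOST_LEGACY, HOST_RESOLICIT_CAPABLE };

struct nd_counters {
    unsigned legacy;             /* entries created by a multicast RS */
    unsigned resolicit_capable;  /* entries that have sent a unicast RS */
};

/* Apply one received RS to the running counters.
 * cur is the current class of the sender's neighbor entry (HOST_UNKNOWN if
 * there is none); the return value is the class to store back in it. */
enum host_class on_router_solicit(struct nd_counters *c,
                                  enum host_class cur,
                                  bool rs_is_unicast)
{
    assert(c != NULL);

    if (!rs_is_unicast) {
        if (cur == HOST_UNKNOWN) {
            c->legacy++;                 /* new host, assume legacy for now */
            return HOST_LEGACY;
        }
        return cur;                      /* assumption: no change for known hosts */
    }

    if (cur == HOST_LEGACY) {
        c->legacy--;                     /* it re-solicited via unicast: promote */
        c->resolicit_capable++;
        return HOST_RESOLICIT_CAPABLE;
    }
    if (cur == HOST_UNKNOWN) {
        c->resolicit_capable++;          /* assumption: count directly as capable */
        return HOST_RESOLICIT_CAPABLE;
    }
    return cur;                          /* already resolicit-capable */
}
```

The decrement when an entry is deleted is left out on purpose, since that is exactly the open question in this part of the thread.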

> But when do you decrement them?
>
> (I don't think NUD from the router can be used for this, because the host can 
> be unreachable from the router for a few seconds due to radio issues, yet the
> host will still consider itself connected and not restart by sending an RS.)

Yes, I was counting this particular case as a "NUD is impatient" type of 
situation which should not happen :-).

But I think this is fixable relatively easily by having a single timer of 
"legacy quiet hosts sleepy interval": if you lose all legacy hosts, you wind 
up a timer and behave in a legacy fashion until it expires. If the legacy 
hosts reappear before this interval expires, you reset this timer.

I think the very same logic could govern the decision "to multicast or to 
drop?" above, as well... - if no legacy hosts and timer expired - drop, 
else multicast.

The value for the timer is probably something that would need to be set 
administratively. The bonus is that the default value can be "infinity" - 
that is, always send multicast RAs => today's behavior.
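
Here is how that single timer could look, as a sketch only. The names (`legacy_guard`, `should_multicast`) and the use of seconds of router uptime as the clock are assumptions; the legacy host count is expected to come from whatever bookkeeping tracks the ND entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default from the text: "infinity", i.e. always multicast (today's behavior). */
#define QUIET_INTERVAL_INFINITE UINT64_MAX

struct legacy_guard {
    unsigned legacy_hosts;      /* current number of "legacy" ND entries */
    uint64_t quiet_interval_s;  /* administratively set sleepy interval */
    uint64_t timer_start_s;     /* when the last legacy host went away */
};

/* Feed in the new legacy count whenever it changes. */
void legacy_guard_update(struct legacy_guard *g, unsigned legacy_hosts,
                         uint64_t now_s)
{
    if (legacy_hosts == 0 && g->legacy_hosts > 0)
        g->timer_start_s = now_s;  /* lost all legacy hosts: wind up the timer */
    g->legacy_hosts = legacy_hosts;
}

/* true: behave in the legacy fashion (multicast); false: free to drop. */
bool should_multicast(const struct legacy_guard *g, uint64_t now_s)
{
    assert(now_s >= g->timer_start_s);
    if (g->legacy_hosts > 0)
        return true;               /* legacy hosts present */
    if (g->quiet_interval_s == QUIET_INTERVAL_INFINITE)
        return true;               /* timer never expires */
    return now_s - g->timer_start_s < g->quiet_interval_s;
}
```

A zero-initialized guard with a finite interval multicasts for the first interval after boot, which errs on the conservative side. If legacy hosts come back and leave again, the timer simply restarts from the moment the last one left.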


>> 
>> When it's time to send a periodic RA, look at the counters, and if the 
>> count of "legacy" hosts is zero, no need to send it.
>> 
>> This algorithm gives multiple deployment choices:
>> 
>> 1) "legacy" mode:
>> 
>> set the RA interval at 1/4 of the router lifetime. This way none of the 
>> hosts ever reaches the half remaining time, resolicits will never be sent, 
>> business as usual same as today.
>> 
>> 2) "resolicit-capable" mode:
>> 
>> set the RA interval at 3/4 of the router lifetime. Then by this time we 
>> will have seen the RS from all the capable nodes, so we can decide whether
>> to send a multicast RA or not.
> Presumably the RSs can be lost. One way to handle this is by aggressive 
> retransmissions (no RA after 1 second, then resend RS). Another way is to 
> space the "retransmissions" inside the lifetime range e.g., by having the 
> host send a RS when 1/3 of the router lifetime has passed, next when 2/3 has 
> passed. (Generalize to k/N for N transmissions before giving up.)

Yes, indeed. I chose the "start at time T with retransmits + exponential 
backoff" because this seems to take better care of a wider range of the 
possible loss on the medium: i.e. if you just space 3 retransmissions 
within the lifetime, as the loss increases, the chances to get the RA 
decrease. The exponentially decayed retransmits do not seem to have this 
problem - they autotune. They might create a denser burst in case of the 
loss though.
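
For illustration, a sketch of the send times such a backoff would produce. The function name and the parameters (first RS at `start_s`, first gap of `first_gap_s` seconds, gap doubling each time, stop at `lifetime_s`) are assumptions, not values from any spec. The host would stop at the first RA it receives, so on a clean link only the first entry is ever used, and on a lossy link more of them are - which is one way to read "autotune".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[] with the times (seconds since the last RA) at which an RS would
 * be sent, doubling the gap after every attempt. Returns the number of
 * entries written; attempts at or after lifetime_s are not scheduled. */
size_t rs_backoff_schedule(uint32_t start_s, uint32_t first_gap_s,
                           uint32_t lifetime_s,
                           uint32_t *out, size_t out_len)
{
    uint64_t t = start_s;        /* 64-bit so the doubling cannot overflow */
    uint64_t gap = first_gap_s;
    size_t n = 0;

    assert(out != NULL || out_len == 0);
    assert(first_gap_s > 0);

    while (t < lifetime_s && n < out_len) {
        out[n++] = (uint32_t)t;
        t += gap;
        gap *= 2;
    }
    return n;
}
```

With a 9000 second lifetime, a start at 4500 seconds and a 4 second first gap, this gives 11 attempts, most of them in a dense burst right after the start - the burst mentioned above.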

>> 
>> This of course needs to be worked on in order to avoid the synchronization 
>> issues (i.e. the host can not just blast an RS straight at half lifetime). 
>> But they are solvable: Assuming even the existing router lifetime limit of 
>> 2.5 hours, and RA interval of 1.5 hours, we have a space +- 0.5 hour of 
>> jitter.
>> 
>> Taking an arbitrary guess of 180000 hosts within that /64, a uniformly 
>> distributed jitter will give ~100pps of RSs which is more than manageable.
> Your suggestions made me realize the stuff I put in section 8.9 in 
> efficient-nd-05 is more complex than it needs be. No need to worry about 
> prefix and other lifetimes - sufficient to look at the default router 
> lifetime and make sure (by sending RSs) that the host hears from all the 
> routers. That's good.
>> 
>> NB: this is a very first napkin sketch, so it is fairly simplistic. But I 
>> think if we were to tinker a bit more with the timers, this can be made to 
>> work.
>> 
>> There is also a question - what happens if the router has to clean up the 
>> legacy host ND entry due to resource constraints - this can be taken care 
>> of by temporarily going into "legacy only" mode for a period of a couple of 
>> RA lifetimes.

> The router is free to just discard the NCEs - unless we change DAD to use 
> them in some proxy approach.
> Even if we change DAD that way, the router can just do the SAVI thing (send a 
> unicast NS to the host and see if it still there) - no need to multicast RAs 
> to clean up.
> However, the unicast NS cleanup (or any other cleanup driven by the router 
> expecting the host to respond) has issues with sleeping hosts. (I just 
> realized there might be different forms of sleep - completely off and will 
> wake up based on a timer, or being woken up by arriving (unicast) packets. I 
> don't know if there is confusion around that and whether we need different 
> terms to make it more clear.)

I think the router might handle them in a FIFO fashion: keep them while 
it has enough space, and start purging them if there is growing 
contention.

I think the timer idea I wrote about above should take care of this case.

>> 
>> Of course, it requires the sleepy hosts to wake up every now and then. But 
>> with a lifted router lifetime limit, it will be every 9 hours.
> Assuming DAD is handled without involving the hosts.

yup.

>> 
>> Probably doable, but let's take a SWAG just to sanity check:
>> http://www.ti.com/lit/ds/symlink/cc3000.pdf and suppose they were one day 
>> to implement IPv6 with the similar power consumption.
>> 
>> I count the 802.11g, because of more efficient spectrum usage. This gives 
>> us maximum power consumption of 207mA, and a shutdown current of 5uA.
>> 
>> Now, assuming we couple it with something like 
>> http://www.atmel.com/Images/Atmel-2586-AVR-8-bit-Microcontroller-ATtiny25-ATtiny45-ATtiny85_Datasheet.pdf
>> which has standby of 0.1uA, and 12mA of active consumption (graph at page 
>> 173 of this pdf)
>> 
>> Let's say 20 seconds should be enough to make all the necessary 802.11 
>> arrangements, send the data, send the solicitation, and receive 
>> advertisement.
> That leaves plenty of time to do the current DAD.
>
> However, Margaret had some numbers a while back with lots more frequent 
> wakeups, but with very short runtime. I don't know if that was captured in 
> RFC 6574 or in Margaret's slides from the workshop. In any case, one issue I 
> remember is that for IPv4 the runtime was a lot less than one second, but 
> with IPv6 there was an additional 1 second to wait for DAD to complete which 
> blew the power budget.

This is an interesting and so far completely undiscussed aspect - one that 
impacts battery life but does not have much to do with multicast ! So, 
essentially we have to wait 1 extra second before we can use the address.

Assuming it is indeed a problem, notice that this is a problem for the 
router-less segment as well => therefore should be addressed in the hosts 
themselves independently of the network. (i.e.: assuming a host 
self-assigns an IPv6 link-local address, probably the devices are 
*already* paying the tax on it, even in IPv4-only networks!)

I think we should square it into a separate corner and attack there :-)
(probably another draft?)

A couple of approaches for further discussion:

1) the host does not have to *wait wide awake* to receive the packet - 
since the reply will wake it up just the same. So it is a matter of more 
careful host programming to send a DAD NS, go back to a nap, and then wake 
up either when an NA is received or when 1 second expires.

2) investigate a different approach to a "1 second" timeout, treating it 
not as a fixed value, but as an envelope: at zero your chance 
of performing successful DAD is zero; then somewhere at around the 
inter-host RTT+sigma the usefulness of it is maximized, then further 
increase of the interval does not increase its usefulness much. [**] So, 
supposedly, the hosts that wake up and detect that they are on the same 
link, might use the memory about the network characteristics to shorten 
this interval.
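
A sketch of what (2) could mean in practice, under loud assumptions: the host has kept a few round-trip samples from its previous stay on this link, the mean absolute deviation stands in for sigma, and the multiplier `k` is a free knob. The name `dad_wait_ms` and the sample array are hypothetical. With no memory of the link the function falls back to the 1 second discussed above, and it never waits longer than that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DAD_WAIT_MAX_MS 1000u   /* the fixed 1 second wait used today */

/* Pick a DAD wait of mean RTT + k * mean absolute deviation, capped at
 * DAD_WAIT_MAX_MS. rtt_ms holds n remembered RTT samples in milliseconds. */
uint32_t dad_wait_ms(const uint32_t *rtt_ms, size_t n, uint32_t k)
{
    uint64_t sum = 0, dev = 0, mean, wait;

    assert(rtt_ms != NULL || n == 0);
    if (n == 0)
        return DAD_WAIT_MAX_MS;          /* no memory of this link */

    for (size_t i = 0; i < n; i++)
        sum += rtt_ms[i];
    mean = sum / n;

    for (size_t i = 0; i < n; i++)       /* spread of the samples around the mean */
        dev += rtt_ms[i] > mean ? rtt_ms[i] - mean : mean - rtt_ms[i];
    dev /= n;

    wait = mean + (uint64_t)k * dev;
    return wait > DAD_WAIT_MAX_MS ? DAD_WAIT_MAX_MS : (uint32_t)wait;
}
```

Whether a shortened wait is safe at all still depends on the host being sure it is back on the same link, which is the DNA question above.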

--a

>
>   Erik
>> 
>> Now, assuming we power it off the typical NiMH 1.2V elements with 1800mAh 
>> capacity, the continuous running time off it will be 1800*3600/220 = 29454 
>> seconds.
>> 
>> Let's assume for simplicity that we wake up every 8 hours, so this gives 60 
>> seconds per day. This gives us the active life span of 490 days.
>> 
>> Given that even the low self-discharge batteries 
>> (http://en.wikipedia.org/wiki/Nickel%E2%80%93metal_hydride_battery#Low_self-discharge_cells) 
>> have a retention rate of 70%-85% within a year, probably this is a reasonable 
>> lifecycle (also for this reason I am not accounting for the standby 
>> current, it's comparable to self-discharge).
>> 
>> This is of course also a quick napkin sketch, a properly engineered 
>> approach would take into account the 802.11 maintenance stuff - and, also 
>> it's really the "extreme" case which does not listen to the packets while 
>> asleep - so the lifetime would certainly vary - but I think this shows that 
>> this is not a totally unreasonable path.
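
The napkin arithmetic quoted above, written out so the inputs are easy to vary. This is only a sketch with the same simplifications: 1800mAh capacity, roughly 220mA while awake (207mA radio plus 12mA MCU, rounded), three 20-second wakeups a day, standby current and self-discharge ignored. The function name is made up.

```c
#include <assert.h>

/* Days of operation for a device that is awake awake_s_per_day seconds a day
 * and draws active_ma while awake. Integer math, rounds down. */
long battery_life_days(long capacity_mah, long active_ma, long awake_s_per_day)
{
    assert(capacity_mah > 0 && active_ma > 0 && awake_s_per_day > 0);

    /* mAh * 3600 = mA-seconds; dividing by mA gives seconds of awake time */
    long awake_budget_s = capacity_mah * 3600 / active_ma;

    return awake_budget_s / awake_s_per_day;
}
```

battery_life_days(1800, 220, 60) reproduces the 490 days from the sketch; the intermediate awake budget is the 29454 seconds.
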
>> 
>> --a
>> 
>>> 
>>> Regards,
>>>    Erik
>>> 
>>> 
>>> 
>> 
>