Re: [v6ops] new draft: draft-elkins-v6ops-multicast-virtual-nodes

On 9/20/14, Nalini Elkins <nalini.elkins@insidethestack.com> wrote:
>
>
>>
>>>>
>>>>>A directed broadcast ping on IPv4 gives pretty much the same result.
>>>>>Did you test the effects of that ?
>>>>
>>>> I had not.  But, since you mentioned it, I did it on two different
>>>> Windows
>>>> machines.  The one that is the server in question had the following
>>>> results:
>>>>
>>>> C:\Users\Administrator>ping x.x.x.255
>>>>
>>>> Pinging x.x.x.255 with 32 bytes of data:
>>>> Request timed out.
>>>> Request timed out.
>>>> Request timed out.
>>>> Request timed out.
>>>>
>>>> Ping statistics for x.x.x.255:
>>>>     Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
>>>>
>>>> Arp cache was not updated.   I did a packet trace in the background and
>>>> indeed no ICMP replies were seen.
>>>>
>>>> I did the same ping x.x.x.255 on one of my client PCs and saw:
>>>>
>>>> Pinging x.x.x.255 with 32 bytes of data:
>>>> Request timed out.
>>>> Request timed out.
>>>> Request timed out.
>>>> Request timed out.
>>>>
>>>> But this time the packet trace showed that actually ICMP replies were
>>>> sent.
>>>>
>>>
>>>>So, besides this cosmetic difference the behavior is identical for the
>>>>case of IPv4 ping ?
>>>
>>> Andrew, I think you may have misunderstood me.  The behavior is very
>>> definitely not identical in IPv4.   In IPv4, many people block or disable
>>> directed broadcast PING.  Such
>>
>>>On the same subnet ? You wrote that the replies were indeed amplified
>>>and sent, or did I misunderstand ?
>>
>> The replies on the same subnet appear to be blocked in one case and not
>> blocked in another case.
>> So, one cannot make a blanket statement (at least based on two data
>> points!).
>>
>
>>With the test I did within my home network, I saw amplified replies on
>>both, so one can not make a blanket statement indeed and it depends on
>>the OS and the setup.
>
>>
>>> PINGs can be used for amplification in Smurf attacks.
>>>
>>> http://www.techrepublic.com/article/understanding-a-smurf-attack-is-the-first-step-toward-thwarting-one/
>>>
>>> Ping to FF02::1 is definitely amplification when 10 echo requests can
>>> create 2,000+ echo replies.   When you do a Ping on Linux, it is
>>> continuous until you stop it.
>>> So, you may very easily do amplification without even meaning to.
>>
>>>This is all link-local. The reason Smurf is dangerous is that one can
>>>send the request from across the internet. Here it's on the same
>>>segment - so it's a customer-provider relationship, and there are a ton
>>>of existing operational mechanisms that can be used to take care
>>>of this if this is a concern - documenting them might be useful.
>>>Saying that the hosts should not reply altogether - not; for the
>>>reasons outlined in the previous mail.
>>
>>>(Of course this is just my experience, would be interesting to hear
>>>others' opinions).
>>
>> As I said in another response, in IPv6 subnets, there may be many other
>> nodes on link with you.   We were using the Ping to FF02::1 as a graphic
>> example of how it is possible to impact both yourself and other nodes
>> without even meaning to.
>
>>So you ping ff02::1 on a network with X hosts, where X is large. So
>>each of these X hosts has to process 1 packet and generate 1 packet,
>>you have to process X responses, you populate ND cache on your hosts
>>with X entries, each of the X hosts populates their ND cache with your
>>address.
>
>>Your host is quite obviously impacted. But the others process a 1/X
>>share of the load compared to you, an equivalent to a regular 1 host
>>exchange. So I don't see how they are impacted.
>
> On Windows, I did a :  ping ff02::1 -n 10
>
> where the -n is the count.
>
> So for each 1 ping request I send out, each of my neighbors sends me 10
> responses.

No, the "-n 10" means you send 10 requests, each of which triggers a
response from each of the neighbors, at least according to Microsoft
documentation at
http://technet.microsoft.com/en-us/library/cc737478%28v=ws.10%29.aspx

> So, I have made them do 10 times as much work as me.

>No, you did not. You made them do the same amount of work as you did
>sending the packets - you sent 10 requests, they sent 10 replies.

You are correct. I was wrong in my explanation of multicast ping amplification. 
How I believe it works is this:

Situation:  node A is on a subnet with 25 other nodes (B-Z)

1.  Node A sends 10 ICMP ping requests to FF02::1
2.  Nodes B through Z each send Node A back 10 ICMP ping replies

Impact: with very little work by Node A, he has made B - Z do work &
created network congestion with 250+ packets.
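
The arithmetic in the scenario above can be sketched as follows. This is
only an illustration under simplifying assumptions (every neighbor answers
every request, with no loss or rate limiting); the function name
`multicast_ping_packets` is hypothetical and not taken from the draft.

```python
def multicast_ping_packets(requests_sent: int, responding_neighbors: int):
    """Return (echo replies, total echo packets on the link).

    Assumes every neighbor answers every echo request sent to FF02::1.
    """
    replies = requests_sent * responding_neighbors
    return replies, requests_sent + replies

# Node A sends 10 echo requests; nodes B through Z (25 nodes) each reply.
replies, total = multicast_ping_packets(10, 25)
print(replies, total)  # 250 260
```

The sender's cost grows with the number of requests only, while the traffic
on the link grows with requests times neighbors.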

> Also, since in this case, I had about 200 neighbors, the link had a bunch
> of traffic on it (the ping requests and replies from everyone, as well as
> the neighbor discovery packets) which were in the way of anyone wanting to
> do actual productive work.

>ICMP echo requests in the syntax you described are sent ~1 per second,
>so the replies are about 200 per second. It's a very small amount of
>traffic.

In this case.  The surprise (at least to us) was that it happens at all.  Also,
it highlighted the point of node isolation - that one node can impact others
and the network with both link-local and multicast traffic.
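
For scale, the reply rate quoted above can be turned into a rough bandwidth
figure. This is a back-of-the-envelope sketch, not a measurement: the frame
size is an assumption (Windows ping's default 32-byte payload plus ICMPv6,
IPv6 and Ethernet headers, ignoring FCS and preamble), and `reply_load` is a
hypothetical helper name.

```python
# Assumed sizes in bytes (not from the thread): 32-byte ping payload,
# 8-byte ICMPv6 header, 40-byte IPv6 header, 14-byte Ethernet header.
PAYLOAD, ICMPV6_HDR, IPV6_HDR, ETH_HDR = 32, 8, 40, 14

def reply_load(neighbors: int, requests_per_second: float = 1.0):
    """Return (replies per second, bits per second) arriving at the pinging node."""
    frame_bytes = PAYLOAD + ICMPV6_HDR + IPV6_HDR + ETH_HDR  # 94 bytes
    replies_per_second = neighbors * requests_per_second
    return replies_per_second, replies_per_second * frame_bytes * 8

rate, bits = reply_load(200)
print(f"{rate:.0f} replies/s, about {bits / 1e6:.2f} Mbit/s")
```

Under these assumptions the load is small at one request per second, but it
scales linearly with both the number of neighbors and the request rate.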

BTW, this is also a security issue.

>
>>
>> We discovered this by accident.   We were doing testing for another
>> purpose and were quite surprised to see that both servers were seeing all
>> the
>> multicast packets and even more surprised when we did the PING to FF02::1.
>
>>Nick has a good point: this whole situation would benefit from some more
>>debugging.
>
> Such as what?  BTW, I have sent Nick the trace & can send to you as well.

>Sure, feel free to send it.

Doing so privately.

>
>>
>>>
>>>
>>>>>Of course, private VLANs or (if we are talking VMs) just using p2p
>>>>>links with /128s would help with this in environments where the hosts
>>>>>can not be trusted - and this of course is not virtual/physical
>>>>>specific.
>>>>
>>>> Yes.  We wanted to bring up the topic of isolation of nodes for
>>>> discussion.
>>>
>>>>This is a useful topic in general - especially in light of the
>>>>MLD-related thread, but the draft seems to focus on a very corner case
>>>>- there are many much better reasons not to do large L2 networks.
>>>
>>> We wanted to pick one specific instance.  Maybe we went TOO small!
>>
>> No, the more focused the doc is, the better.
>>
>>>
>>>>
>>>>>If we're talking specifically virtual environment, here's an approach
>>>>>on how to use ebtables to isolate the hosts:
>>>>
>>>>>ebtables -P FORWARD DROP
>>>>>ebtables -F FORWARD
>>>>># let the traffic flow from uplink to any ports
>>>>>ebtables -A FORWARD -i $uplinkPort -j ACCEPT
>>>>># let the traffic flow from any ports to uplink
>>>>>ebtables -A FORWARD -o $uplinkPort -j ACCEPT
>>>>
>>>>>(source:http://serverfault.com/questions/388544/is-it-possible-to-enable-port-isolation-on-linux-bridges)
>>>>
>>>> I think this is very good.  But, unfortunately not very well known.
>>>> Also,
>>>> is this possible for all platforms or just Linux?
>>>
>>>>This was just my first google.com search result when searching for
>>>>"private vlans linux".
>>>
>>>>The first google.com search result on "private vlans windows" brings
>>>>up
>>>> http://blogs.technet.com/b/scvmm/archive/2013/06/04/logical-networks-part-iv-pvlan-isolation.aspx
>>>>which seems to document similar technique.
>>>
>>>>"private vlan vmware" results in
>>>>http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1010691
>>>>as its first hit.
>>>
>>>>Would an IETF publication be easier to find than the above ?
>>>
>>> Maybe not but the hosting company who provided us the service where we
>>> did
>>> this real life test does not seem to have read any of the documents you
>>> reference.
>>
>>>If you were not spoofing the source, then the only DoS target was
>>>yourself, and the inter-customer resource isolation should take care
>>>of the rest. This relates to the "Oops I don't know what I'm doing"
>>>part.
>>
>> Actually, not.  In this case, there was no inter-customer resource
>> isolation.  So, we were impacting everyone.   As I say, we discovered
>> this by accident.
>
>>Again, it's obvious to see how you're impacting yourself but it's
>>unclear to me how you were impacting the others.
>
>>I did a quick test in my lab, and the only node that significantly
>>grew the ND cache in my test was the one sending the ping. The rest
>>just added only the sender's address. Sure, the sender also triggered
>>NUD on the addresses, but again it is only the sender that was
>>impacted - the rest had just 1 packet to handle, which is not much.
>
>>Please help me understand where the problem is?
>
> See above.
> Also, I believe it is not a question of the neighbor cache.
> We just showed that to prove that other nodes actually were responding.
>
> Also, our topic actually was a bit more broad.  We were not talking only
> of the PING to FF02::1, but of multicast.   Multicast is seen by all the
> neighbors and can be quite a bit of the link activity.   Again, getting in
> the way of productive traffic.
>
> We may be spending a bit too much time focused solely on the Ping.
> I stand by my statement that the Ping can be an amplification.
> However, we used the example of the Ping to show that if you have a
> number of neighbors on link, as shown by the Ping, then they will see
> all the multicast packets generated as well.

>The address is named all-hosts multicast, so it is logical for the
>hosts to see it and react.

>
>>
>> I am thinking that the bigger question that we should address is
>> node isolation.  I think that people implementing IPv6 have got the
>> /64 for subnet in their heads pretty well.  But, not the other part
>> about how multicast - and other packets impact nodes on link with you.
>> And that they should do node isolation.
>>
>> What do you think?
>
>>Give each vhost a routed /64 and use link-local next-hops - this will
>>give the same "broadcast profile" as for IPv4, and reduce the hoster's
>>resources needed to manage the allocated addresses, and still allows
>>the hoster to sell 65536 vhosts, assuming the said hoster has only a /48.
>
>>Then on the shared segment that carries the routed traffic,
>>intra-subnet apply the same security policy to the ff02::1 IPv6
>>traffic as you do to IPv4 x.x.x.255 originated within and destined for
>>that subnet - private VLANs, p2p links, or nothing.
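
The 65536 figure above follows from the prefix lengths: a /48 holds
2^(64-48) distinct /64s. A minimal sketch using Python's standard
`ipaddress` module is shown here; the prefix 2001:db8::/48 is the
documentation prefix chosen purely for illustration, and the helper names
are hypothetical.

```python
import ipaddress
from itertools import islice

def routed_64_count(allocation: str) -> int:
    """Number of /64 prefixes contained in the given IPv6 allocation."""
    net = ipaddress.ip_network(allocation)
    return 2 ** (64 - net.prefixlen)

def first_routed_64s(allocation: str, n: int):
    """First n /64 prefixes of the allocation, one per vhost."""
    net = ipaddress.ip_network(allocation)
    return [str(s) for s in islice(net.subnets(new_prefix=64), n)]

print(routed_64_count("2001:db8::/48"))      # 65536
print(first_routed_64s("2001:db8::/48", 3))
# ['2001:db8::/64', '2001:db8:0:1::/64', '2001:db8:0:2::/64']
```

Each vhost would then get one of these prefixes routed to it via its
link-local address, so its multicast stays off the other vhosts' subnets.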
>
>
>
> I think this focuses too narrowly on the Ping.    I need to think some
> more on how to talk about isolation.

>Subnets are the industry-standard way of isolating the broadcast domains.

Then, IMHO, there needs to be a "Best Practices" document for how to create
subnets.  I am not of the opinion that creating a subnet that spans multiple
virtual machines, which could really expect to have separation from each
other, is a great practice.

I believe that in the wild, /64 appears to be the standard for a subnet no matter
what the circumstances are.

--a

>
>
> --a
>
>>
>>>Now, if you start to spoof your source and send the traffic at high
>>>rates, we're into the ToS violation territory, and the hosting company
>>>probably assumes a certain degree of good citizenship from its clients
>>>towards each other.
>>
>>>If you were to try this kind of link-local smurf, why not
>>>try stealing other VM's IPv4 addresses and running MITM on them ? Or
>>>try spoofing the default router ? Or try sweep-scanning their entire
>>>/64 from Internet at high rate ? Or get into legacy and try
>>>ARP-flooding with broadcast ARP requests/replies at line rate ?
>>
>>>There usually is a very simple mitigation to all of the above: first
>>>warn the hosting customer that is being naughty, then terminate the
>>>contract. I know for sure it is being practiced.
>>
>> We discovered this by accident & were using the example of Ping FF02::1 to
>> illustrate what can happen.
>>
>>
>>
>> --a
>>
>>>
>>> What to do is a good question.  IPv6 implementation at end user sites is
>>> new
>>> to many & mistakes are being made.  I do not know what is the answer.
>>>
>>>>
>>>>>So looks like the question at hand is:
>>>>
>>>>>"Should IPv6 nodes respond to Ping to FF0x::1?"
>>>>
>>>>>Which can be rephrased differently to ease the start of the discussion:
>>>>
>>>>>"What are the legitimate uses of a ping to ff0x::1 ?"
>>>>
>>>>>Right ?
>>>>
>>>> Yes.
>>>
>>>>All right, so I'll mention my use of ping6 ff0x::1:
>>>
>>>>* quick check to find "a few hosts that are alive on this link"
>>>>(obviously not to be done on large segments).
>>>
>>>>* A way to trigger the remote side's ND process without having to do
>>>>one myself (I know the multicast packet to all-hosts *will* usually
>>>>get received regardless of the underlying issues with the L2.5
>>>>infrastructure, such as snooping).
>>>
>>>>* A way to further debug malfunctioning L2.5 infra, by comparing the
>>>>reaction to pings to ff02::1, solicited node multicast, other
>>>>multicast groups, etc.
>>>
>>>>Summary: the ff02::1 current ping behavior does help in a non-trivial
>>>>number of cases.
>>>
>>>>The security aspects the draft mentions indeed do exist, but in a
>>>>properly configured network can be easily mitigated as I've shown
>>>>above. So I'd be against changing the functionality in the existing
>>>>stacks.
>>>
>>> I don't know, Andrew.  I worry about the broadcast nature of the ping.
>>> I feel like it is trouble waiting to happen.
>>>
>>>
>>> --a
>>>
>>>>
>>>> --a
>>>>
>>>>
>>>> On 9/19/14, fred@cisco.com <fred@cisco.com> wrote:
>>>>> A new draft has been posted, at
>>>>> http://tools.ietf.org/html/draft-elkins-v6ops-multicast-virtual-nodes.
>>>>> Please take a look at it and comment.
>>>>>
>>>>> _______________________________________________
>>>>> v6ops mailing list
>>>>> v6ops@ietf.org
>>>>> https://www.ietf.org/mailman/listinfo/v6ops
>>>>>
>>>
>>