Re: [Int-area] WG Adoption Call: IP over Intentionally Partitioned Links

"Pascal Thubert (pthubert)" <pthubert@cisco.com> Mon, 03 April 2017 17:34 UTC

From: "Pascal Thubert (pthubert)" <pthubert@cisco.com>
To: Erik Nordmark <nordmark@acm.org>, Richard Li <renwei.li@huawei.com>, "int-area@ietf.org" <int-area@ietf.org>, Wassim Haddad <wassim.haddad@ericsson.com>, "Dorothy Stanley (DStanley@arubanetworks.com)" <DStanley@arubanetworks.com>
Date: Mon, 03 Apr 2017 17:34:22 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/int-area/nUsKp2bTN2b6_FeVkON59h8-fmA>

Hello Erik

-----Original Message-----
From: Erik Nordmark [mailto:nordmark@acm.org] 
Sent: Wednesday, 29 March 2017 17:46
To: Pascal Thubert (pthubert) <pthubert@cisco.com>; Erik Nordmark <nordmark@acm.org>; Richard Li <renwei.li@huawei.com>; int-area@ietf.org; Wassim Haddad <wassim.haddad@ericsson.com>; Dorothy Stanley (DStanley@arubanetworks.com) <DStanley@arubanetworks.com>
Subject: Re: [Int-area] WG Adoption Call: IP over Intentionally Partitioned Links

Pascal,

I think the root of this discussion is the distinction between IPPL and multi-link subnets, which are similar but not the same.
Thus I'm planning to add this paragraph to the end of section 1:
    The cases covered in this document are where the link has been
    intentionally partitioned, which is different from the cases where a
    collection of links are joined to have a common IP subnet prefix.  An
    example of the differences is the expected behavior for packets sent
    to link-local IP addresses.  The issues for such multi-link subnets
    are described in [RFC4903].

I don't know if we should add more text to further clarify this distinction.

[Pascal] I have to reread the whole draft, but we basically need to make sure that all the consequences of IPPL vs. a broadcast domain are clearly identified in some section.
Comparing with the ML subnet is useful; it helps in understanding what IPPL is. Are there consequences beyond link-scope activities?

please see more inline.

On 03/02/2017 06:29 AM, Pascal Thubert (pthubert) wrote:
> Erik wrote:
> Do you know of any existing document/protocol which describes such 
> intentionally partitioned tunnel setups?
>
> If so there would be something we could reference, and there would be 
> a reason to clarify the scope in this document.
>
>
>
>
>
> [Pascal] I'm thinking of the cases described in IPv6 ND proxies, RFC 
> 4389; the case above would be scenario 2, and the desire would be to 
> avoid copying the L2 broadcasts over the possibly slow PPP link. ND 
> proxy can indeed be useful in subnet configurations where people want 
> to isolate portions of the layer 2 network, not necessarily for 
> security reasons but also, say, to limit the broadcast domain and 
> protect wireless clients. The configurations proposed in the RFC force 
> going through L3 and appear to me to be forms of IPPL. Would you agree?

The IPPL draft does mention RFC4389 not as a network topology but as a potential solution. You might want to look at that text if you haven't already.

So while in principle IPPL can be applied to a subnet split across tunnels, I still don't have a document to reference which shows an IPPL tunnel setup.

[Pascal] Interconnecting L2 clouds over L3 tunnels is a very useful technique. If that falls into IPPL, then this should be mentioned.
Maybe it would help to provide drawings for a set of reference examples to which the draft applies and, in contrast, one or two that are different (e.g. the ML subnet)?

>> * Finally there's the multilink subnet whereby the router
> interconnects the limited ports with the core at L3. In that case, the 
> router has multiple ports on the subnet, as opposed to the PVLAN where 
> it has only one, thus the risk of ARP loop. This case appears to be 
> not covered, though it makes the proxy ARP operation much more 
> natural. Is that correct?
>
>
>
> For me to understand your case you have to separate out the L3 
> configuration and what it sees as L3 interfaces (whether those are L3 
> ports or SVIs); for IPPL it doesn't matter which mapping there is 
> between L3 interfaces and L2 ports.
>
>
>
> So are the ports you refer to L2 ports with IP seeing an SVI? Or 
> something else?
>
>
>
> [Pascal] No, I’m not talking about an SVI but about placing a router 
> where the IPPL draft places a switch.

But that would be out of scope of this draft.
IPPL is looking at cases where layer 2 is intentionally set up to provide partial connectivity, and at what layer 3 needs to do to handle such links correctly, i.e. to cope with the L2 behavior.

If layer3 is doing something (like multi-link subnets) to gather multiple links into a subnet prefix, then layer3 is fully in control.

Note that in a multi-link subnet case, packets sent to link-local addresses (IPv4/v6 multicast and IPv6 unicast) would by definition stay within the scope of the link. The subnet scope is larger than the link.

For IPPL the link itself is partitioned, hence the expectation is that packets addressed to link-local destinations will be delivered. (Of course, in some deployments there might be security policies that result in blocking some particular 5-tuple.)

[Pascal]  understood and agreed...

>
>
>
> The router also has the effect of partitioning the L2 broadcast domain.

No, in that case the router is gluing things together, not coping with some L2 behavior. The broadcast domain would be partitioned before the router glues together the multiple links into one subnet.

[Pascal] idem

> I understand that IEEE Std 802.11 recommends the use of ND proxy to 
> protect the wireless edge, which is yet another form of topology, not 
> made explicit by RFC 4389, but covered by draft-ietf-6lo-backbone-router 
> for the context of constrained power. Ccing Dorothy to make sure I’m 
> not misrepresenting the IEEE position.

That sounds like a combined L2 and L3 solution, which is different than IPPL.

[Pascal] idem

> My question is really about the scope of the IPPL document, considering
> that the term IPPL seems a lot larger in scope than private vlans…

The document has three examples (DSL, Cable, and private VLANs) where 
the L2 forwarding of frames on the link is not transitive yet the link 
is seen as a single link by IP, and where IP has to cope with the 
implications of the L2 behavior. Private VLANs have the richest behavior 
of those and hence require the most care.

Intentionally joining multiple links into a multi-link IP subnet is 
different and we already have that behavior specified in other RFCs.

[Pascal] idem

> [Pascal] I agree that the DAD messages that the PVLAN nodes see are issued by the
> router, and that’s how the end devices are kept unaware of each other,
> and that’s part of the router being a proxy. My point was not about the
> information being transported with the same packet, but about the loop
> avoidance text in the proxy operation:
>
> “
>
> For the ARP proxy to be robust it MUST avoid loops where router1
>    attached to the link sends an ARP request which is received by
>    router2 (also attached to the link), resulting in an ARP request from
>    router2 to be received by router1.
> “
> Point is, the L3 switch makes a difference between what’s coming from an
> access port that’s private and what’s coming from a port connected to
> the primary vlan (the backbone) where the routers are sitting. In
> particular, DAD requests coming from the primary are not proxied back to
> the primary, so there is no loop. The proxy checks if it has a matching
> state on a private port in which case it replies, otherwise it drops the
> DAD packet.

Such a VLAN ID check is what that section suggests. Do you see something 
missing?

[Pascal] Works for me.
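
The ingress-based rule discussed just above can be sketched as follows. This is only an illustration of the behavior described in the thread (requests arriving from the primary VLAN are answered when there is matching state on a private port and dropped otherwise, so nothing is proxied back to the primary). The names `Proxy`, `PortKind`, `Action`, and `handle_request` are hypothetical, and the handling of requests arriving on a private port is an assumption, not something the draft or this thread specifies.

```python
from dataclasses import dataclass, field
from enum import Enum


class PortKind(Enum):
    PRIVATE = "private"   # access port in the private VLAN
    PRIMARY = "primary"   # port facing the primary VLAN (backbone, routers)


class Action(Enum):
    REPLY = "reply"                        # answer on behalf of a known node
    DROP = "drop"                          # no matching state, do nothing
    PROXY_TO_PRIMARY = "proxy_to_primary"  # re-issue the request on the primary


@dataclass
class Proxy:
    # target IP address -> private port where that address was learned
    bindings: dict = field(default_factory=dict)

    def learn(self, ip: str, private_port: str) -> None:
        self.bindings[ip] = private_port

    def handle_request(self, target_ip: str, ingress: PortKind) -> Action:
        if ingress is PortKind.PRIMARY:
            # Never proxied back to the primary, hence no loop between routers.
            return Action.REPLY if target_ip in self.bindings else Action.DROP
        # Assumption: a request from a private port is answered if the target
        # is known locally, otherwise re-issued toward the primary only.
        if target_ip in self.bindings:
            return Action.REPLY
        return Action.PROXY_TO_PRIMARY


proxy = Proxy()
proxy.learn("2001:db8::10", "access-1")
print(proxy.handle_request("2001:db8::10", PortKind.PRIMARY))
print(proxy.handle_request("2001:db8::99", PortKind.PRIMARY))
```

The point of keying the decision on the ingress port kind is that a request seen on the primary can never cause a new request on the primary, which is the property the loop-avoidance text asks for.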

>> "  At a minimum, the reception of an ARP request MUST NOT  result in sending an ARP request, and..." The "and" seems to read like this is a rule and there is more. But this is not required to obtain the intended loop avoidance; consider for instance the suggestion that
> the routers do not forward ARPs coming from other routers known by MAC@
> but forward the others. The quoted text appears to kill the design above. I'd
> suggest removing that quoted text, or saying that this is one way of
> achieving the recommendation to avoid loops that is already there.

If router1 has sent out an ARP request using the promiscuous VLAN ID, 
then the bridges will deliver it to all the ports. Hence there is no 
purpose for router2 to send an ARP request for the same target as a 
result of receiving the ARP request from router1.

[Pascal] I had in mind the SAVI-like operation where the switch (or the AP) bars the ARPs that are not destined for the nodes discovered on the access ports that it serves, but forwards the ARP or regenerates one (e.g. to revalidate its state) if it has a match. But that operation may be beyond the scope of this document?

> [Pascal] I thought this discussion is about proxying the ARP. And I’m
> saying that “the reception of an ARP request” may actually “ result in
> sending an ARP request” as long as the flooding domain is controlled and
> loops are avoided, as discussed just above.  Probably me being confused
> about the scope of the text I was reading.

Let's get together and draw the packet flow so we can understand the 
case you have in mind.

[Pascal] that's what I was describing above, which can be seen as a filter that isolates/protects the access port, plus ML subnet. The latter is sorted out. The former is not really specified anywhere but is implemented, e.g. by SAVI switches to protect the wireless access against excessive broadcasts. This text makes full sense in the context you describe, but seems to preclude the sort of operation I have in mind here. Is there a way to make it more open?

>>     "For that reason it is NOT RECOMMENDED to configure outbound multicast forwarding from private VLANs." Not all protocols would be impacted in the same manner. I think that the recommendation, which applies to ND more than to many others because of DAD, could be
> more specific, like that whatever is done to propagate a multicast
> beyond the limited scope allowed by the PVLAN should ensure that the
> multicast is either harmless to the protocol it is propagating, or not
> echoed back to the source device at all. Then again, the latter is
> something that can be done in a L3 switch and a ML subnet router.
 >
>
>
>
> ND packets are never forwarded in any useful way since forwarding (as
> described in RFCs) is an IP operation which includes decrementing the
> ttl/hopcount and ND verifies that hopcount is 255 on receipt.
>
>
>
> [Pascal] I was confused again. I thought the forwarding term here was
> describing the switch behavior, and I read this text as a recommendation
> that the switch does not propagate the multicast.

I'll make this more clear by using "IP forward" and "L2 forward" instead 
of the generic "forward".
[Pascal] Or use "routing" for "L3 forwarding"?
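
The distinction between "IP forward" and "L2 forward" for ND can be illustrated with the hop limit check mentioned in the quoted text: an IP forwarding hop decrements the hop limit, while bridging leaves the IP header untouched, so only the latter still passes the ND check for 255. This is a minimal sketch; the function names are hypothetical and the packet is reduced to its hop limit.

```python
from typing import Optional

ND_HOP_LIMIT = 255  # ND messages are sent with, and checked for, hop limit 255


def ip_forward(hop_limit: int) -> Optional[int]:
    """Hop limit after one IP forwarding hop, or None if the packet is discarded."""
    if hop_limit <= 1:
        return None
    return hop_limit - 1


def l2_forward(hop_limit: int) -> int:
    """A bridge forwards the frame without touching the IP header."""
    return hop_limit


def nd_accepts(hop_limit: Optional[int]) -> bool:
    """Receiver-side check: accept only ND messages that were never IP forwarded."""
    return hop_limit == ND_HOP_LIMIT


print(nd_accepts(l2_forward(ND_HOP_LIMIT)))  # bridged ND message
print(nd_accepts(ip_forward(ND_HOP_LIMIT)))  # routed ND message
```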

> Section 8 tries to explain that issue with
>
>     IP Multicast which spans across multiple IP links and that have
>     senders that are on community or isolated ports require additional
>     forwarding mechanisms in the routers that are attached to the
>     promiscuous ports, since the routers need to forward such packets out
>     to any allowed receivers in the private VLAN without resulting in
>     packet duplication.  For multicast senders on isolated ports such
>     forwarding would result in the sender potentially receiving the
>     packet it transmitted.  For multicast senders on community ports, any
>     receivers in the same community VLAN are subject to receiving
>     duplicate packets; one copy directly from layer 2 from the sender and
>     a second copy forwarded by the multicast router.
>
> [Pascal] I must admit I do not fully understand this text. Maybe a
> picture with the routers and the switches showing the packet duplication
> at work?

Let me see what I can do.

[Pascal]  : )
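
Pending a picture, the duplication described in the quoted Section 8 text can be approximated with a small model. It is a sketch under stated assumptions: the port names and the `PORTS` table are made up, the L2 delivery rules are the usual private VLAN ones (isolated ports reach only promiscuous ports, community ports reach their own community plus promiscuous ports, promiscuous ports reach everyone), and the router is assumed to naively re-send the multicast packet on the same link from its promiscuous port.

```python
from collections import Counter

# Hypothetical topology: port name -> (port kind, community id or None)
PORTS = {
    "router": ("promiscuous", None),
    "a1": ("community", "A"),
    "a2": ("community", "A"),
    "b1": ("community", "B"),
    "iso1": ("isolated", None),
}


def l2_delivers(src: str, dst: str, ports=PORTS) -> bool:
    """Would the private VLAN bridges deliver a frame from src to dst?"""
    if src == dst:
        return False  # a frame is not sent back out of its ingress port
    src_kind, src_comm = ports[src]
    dst_kind, dst_comm = ports[dst]
    if src_kind == "promiscuous" or dst_kind == "promiscuous":
        return True
    if src_kind == "community" and dst_kind == "community":
        return src_comm == dst_comm
    return False  # isolated ports talk only to promiscuous ports


def copies_received(sender: str, router: str = "router", ports=PORTS) -> dict:
    """Count copies per port when the router naively re-sends onto the link."""
    copies = Counter()
    for dst in ports:
        if l2_delivers(sender, dst, ports):
            copies[dst] += 1  # direct layer 2 copy from the sender
    if copies[router]:
        for dst in ports:
            if l2_delivers(router, dst, ports):
                copies[dst] += 1  # copy forwarded by the multicast router
    return dict(copies)


print(copies_received("a1"))
print(copies_received("iso1"))
```

With a sender on community port `a1`, the peer `a2` ends up with two copies (one directly at layer 2, one from the router) and `a1` gets its own packet back; with a sender on `iso1`, the only copy the sender's own port sees is the one the router re-sent.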

Thanks for everything; looking forward to the next version.

Take care,

Pascal
