[rbridge] Developing a hybrid router/bridge.

touch at ISI.EDU (Joe Touch) Fri, 07 May 2004 14:16 UTC

From: "touch at ISI.EDU"
Date: Fri, 07 May 2004 14:16:31 +0000
Subject: [rbridge] Developing a hybrid router/bridge.
In-Reply-To: <409BFB1F.2080602@sun.com>
References: <9501A65C15B56148AAB5D40CC03E81F1042B30C1@wanewporms01.gsm1900.org> <409BFB1F.2080602@sun.com>
Message-ID: <409BFC84.7040106@isi.edu>
X-Date: Fri May 7 14:16:31 2004

I guess a larger question to this issue is:

	why is an Rbridge worried about forwarding a packet twice?

The loop isn't created by the campus; it's external. That's like a single 
Internet router detecting a loop in the Internet as a whole. I don't 
think it is reasonable to assume a router can do that. That's for 
Internet routing to avoid, IMO...

Joe

Radia Perlman wrote:

> To answer Konrad's question about traceroute, I think it's pretty clear 
> that *if* Rbridges
> decrement TTL, then they'd need to send ICMP messages when TTL expires, and
> therefore appear as a hop in traceroute.
> 
> However, I agree with Joe that if we are making IP think this is one 
> subnet, then we can't
> decrement the IP TTL.
> 
> There was a case that I was concerned about, but I was having trouble 
> finding a remotely
> plausible example to explain it with. I think I have one now, and 
> luckily also I think I have
> a solution that will allay my paranoid fantasies.
> 
> I was kind of worried about a packet EVER not being protected by a hop 
> count. Is it possible
> for a packet to be encapsulated across the RBridged campus, 
> decapsulated, and then reintroduced?
> 
> The answer is yes, in a weird temporary case.
> 
> The temporary case is where there are two DRs on a LAN. This would be a 
> temporary situation (but temporary bridge loops can be quite devastating).
> 
> Let's say it happened because R1's LAN and R2's LAN suddenly got 
> connected by a bridge coming up. Let's say that endnode D is on R2's LAN 
> and S is on R1's LAN. Let's say that S sends a packet for D. R1 might 
> encapsulate it, send it across the campus, where it might be received by 
> R2 (since there is still a route across the campus to R2...routing 
> hasn't yet coped with the merging of the two LANs, say), and 
> decapsulated. Then R1 would pick it up again, ...
> 
> If multiple LANs got merged simultaneously this way then theoretically 
> not only could you have (temporary) looping, but proliferation.
> 
> So that was why I was kind of hoping that RBridges would decrement the 
> IP TTL.
> 
> But I have a proposal for solving this problem without introducing a lot 
> of complexity (since I suspect people will not have sufficient sympathy 
> with my paranoid fantasies to be willing to
> introduce a lot of complexity to solve them).
> 
> What I want to do is have it be possible for R2 to recognize when a 
> packet is being sent by an endnode and when it was decapsulated by 
> another RBridge. This is not possible for non-IP packets, since they 
> have to be "transparent"...there are no bits to safely play with.
> 
> However, with IP, we can play with the layer 2 header. IP nodes don't 
> (as far as I know...anyone know any different?) care what the source 
> address in the layer 2 header looks like when an IP packet is received.
> 
> How about having a specific, constant MAC address, say "X",  that means 
> "transmitted by an RBridge".
> When an RBridge decapsulates an IP packet onto the destination LAN, it 
> can set the source
> address in the layer 2 header to be X. The rule will be that an RBridge 
> is not allowed to forward a packet that has layer 2 source address=X.
> 
> This won't solve the potential problem with non-IP packets, but we're 
> really trying to just make sure this works for IP packets.
> 
> Comments?
> 
> Radia
> 
> 
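
(A minimal sketch of the source-address rule proposed above, in Python. The frame type, the function names, and the value used for "X" are all hypothetical; the thread does not assign an address, so a locally administered placeholder stands in for it. Only IP frames are re-marked, since non-IP frames have to stay transparent.)

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for the reserved address "X"; no value is given in
# the thread. 02:... is a locally administered unicast address.
RBRIDGE_MARKER_MAC = "02:00:00:00:00:01"

ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_IPV6 = 0x86DD


@dataclass(frozen=True)
class Frame:
    """Simplified layer 2 frame (illustrative type, not from the draft)."""
    src_mac: str
    dst_mac: str
    ethertype: int
    payload: bytes = b""


def is_ip(frame: Frame) -> bool:
    return frame.ethertype in (ETHERTYPE_IPV4, ETHERTYPE_IPV6)


def decapsulate_to_lan(inner: Frame) -> Frame:
    """Egress RBridge: mark IP frames by rewriting the layer 2 source to X.

    Non-IP frames are returned unchanged because they must stay transparent.
    """
    if is_ip(inner):
        return replace(inner, src_mac=RBRIDGE_MARKER_MAC)
    return inner


def may_forward(frame: Frame) -> bool:
    """Ingress RBridge: refuse any frame whose layer 2 source is X."""
    return frame.src_mac.lower() != RBRIDGE_MARKER_MAC
```

In the two-DR scenario, the frame that R2 decapsulates onto the merged LAN would carry source X, so R1 would decline to pick it up again. Non-IP frames pass through unmarked and remain unprotected, as noted in the proposal.
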
> Roeder, Konrad wrote:
> 
>> I agree with Joe that MPLS switches are L2 devices with a special L2.5 
>> shim layer.  As part of their behavior, MPLS switches do/can decrement 
>> TTL.  So there is precedent here in L2 (L2.5?) devices meddling with 
>> TTL.  I believe that RBridges could decrement TTL to prevent loops, 
>> although doing so is not "a clean L2 implementation" because 
>> decrementing TTL indicates either a second of time went away or that a 
>> router was traversed.
>>
>> Another point to bring up: Do we want the rbridge to behave 
>> differently depending on what protocol is carried in L3?  If it's IP 
>> it uses TTL, if it's not it's decrementing a hop counter in its 
>> encapsulation.  Do we really want to add this complexity?
>>
>> Going back briefly to the subject of the traceroute kludge... if TTL 
>> gets decremented from 1 to 0, does an Rbridge send an ICMP Time Exceeded 
>> message?  Doing so would make an RBridge appear on a traceroute.  Not 
>> doing so, would make an RBridge appear as a *.  I suppose this is an 
>> implementation detail, but it might be worthwhile specifying the 
>> interaction.  (traceroute is a nice debugging tool)
>>
>> Konrad
>>
>> Konrad Roeder
>> Broadband Wireless Network Engineer
>>
>> T-Mobile USA, Inc.
>> Office:  425-748-2381
>> PCS:   425-444-2076
>> FAX:    425-748-3050
>> 12920 SE 38th Street
>> Bellevue, WA  98006
>>
>> e-mail: konrad.roeder@t-mobile.com
>>
>>
>> Message: 3
>> Date: Thu, 06 May 2004 09:39:56 -0700
>> From: Joe Touch <touch@ISI.EDU>
>> Subject: Re: [rbridge] RE: rbridge Digest, Vol 1, Issue 2
>> To: "Developing a hybrid router/bridge." <rbridge@postel.org>
>> Message-ID: <409A6A5C.9020305@isi.edu>
>> Content-Type: text/plain; charset="us-ascii"
>>
>>
>>
>> Radia Perlman wrote:
>>  
>>
>>> I'm not sure it's useful to try to answer a philosophical question 
>>> like whether RBridge is
>>> layer 2 or layer 3. I don't think there's any agreed-upon definition, 
>>> or that it's all that useful
>>> to categorize. One might say it is layer 3 because it participates in 
>>> some layer 3 protocols
>>> (like ARP and ND).
>>>   
>>
>>
>> It may be useful to distinguish between what is provided to the 
>> outside of an rbridge (i.e., what it emulates) and how it operates 
>> internally.
>>
>> An rbridge (at least to me) emulates an L2 bridge - in which case the 
>> TTL would not be decremented at all.
>>
>> When such a device emulates an L3 router, it is closer (IMO) to 
>> Virtual Routers (see the draft; this was developed in the X-Bone project
>> for recursive Internets) - which are VERY closely related, but each 
>> [L2 and L3] present unique issues. This draft focuses on the 
>> L2-providing device.
>>
>> Internally, either may be implemented with L2 encapsulation, L3 
>> encapsulation, or carrier pigeon ;-)
>>
>> ARP and ND are interesting cases because they are L3 protocols that 
>> use L2 information, i.e., they are 'glue' layers of a sort. They must 
>> be interfaced to the edge of a campus of rbridges carefully, but I 
>> don't see any showstoppers.
>>
>> Other more complex cases would be MPLS and other so-called layer 2.5 
>> protocols. I confess they never made much sense in that regard to me; 
>> IMO, MPLS is just a different L2 protocol that is layered over other 
>> L2 protocols.
>>
>>  
>>
>>> Someone privately told me there are certain IP protocols that will 
>>> not work if the TTL
>>> gets decremented, like "local broadcast".
>>>   
>>
>>
>> RFC1812 talks specifically about how to process all-1's broadcast 
>> (local broadcast), and how it MUST NOT be forwarded if the TTL is 
>> decremented. There are other cases, e.g., subnet broadcast, which 
>> SHOULD NOT be forwarded (although described as permitted in 1812, this 
>> is deprecated in RFC2644).
>>
>> In many ways, the definition of an IP subnet is "that in which the IP 
>> TTL is not decremented".
>>
>> I agree with Radia that this is more important for things like:
>>     - broadcast
>>     - multicast
>>
>> which would affect:
>>     - ARP/RARP
>>     - DHCP
>>     - BOOTP
>>     - IGMP
>>     - ICMP
>>     etc...
>>
>> (see draft-ietf-pilc-link-design for details - notably the broadcast 
>> and multicast sections, which I was responsible for).
>>
>> Joe
>> -------------- next part --------------
>> A non-text attachment was scrubbed...
>> Name: signature.asc
>> Type: application/pgp-signature
>> Size: 250 bytes
>> Desc: OpenPGP digital signature
>> Url : 
>> http://www.postel.org/pipermail/rbridge/attachments/20040506/38070885/signature-0001.bin 
>>
>>
>> ------------------------------
>>
>> _______________________________________________
>> rbridge mailing list
>> rbridge@postel.org
>> http://www.postel.org/mailman/listinfo/rbridge
>>
>>
>> End of rbridge Digest, Vol 1, Issue 3
>> *************************************
>>  
>>
> 
> 
From Radia.Perlman at Sun.COM  Fri May  7 14:18:57 2004
From: Radia.Perlman at Sun.COM (Radia Perlman)
Date: Fri May  7 14:19:26 2004
Subject: [rbridge] Forwarding based on layer 3 vs layer 2 destination
In-Reply-To: <409BFB1F.2080602@sun.com>
References: <9501A65C15B56148AAB5D40CC03E81F1042B30C1@wanewporms01.gsm1900.org>
	<409BFB1F.2080602@sun.com>
Message-ID: <409BFD41.4000202@sun.com>

Currently the rules for forwarding a packet in the internet draft are:

a) if it's non-IP, forward based on the inner layer 2 header destination 
address
b) if it's off-campus IP, forward based on the inner layer 2 header 
destination address
c) if it's on-campus IP, forward based on the inner layer 3 header 
destination address

The reason for making c) be a different case was so that two IP nodes on 
the same campus,
but connected via incompatible layer 2 protocols, could talk.
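
The three rules can be sketched as a single decision function. This is an illustration only: `InnerFrame`, `forwarding_key`, and the use of a prefix list to decide what counts as on-campus are assumptions made for the example, not something the draft specifies.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


@dataclass(frozen=True)
class InnerFrame:
    """Encapsulated frame as seen by an RBridge (illustrative type)."""
    dst_mac: str
    dst_ip: Optional[str] = None  # None means the payload is not IP


def forwarding_key(frame: InnerFrame,
                   campus_prefixes: Sequence[Network]) -> Tuple[str, str]:
    """Return which header field to forward on, and its value.

    campus_prefixes is an assumed way of telling on-campus from
    off-campus destinations.
    """
    # a) non-IP: inner layer 2 destination
    if frame.dst_ip is None:
        return ("l2", frame.dst_mac)

    addr = ipaddress.ip_address(frame.dst_ip)
    on_campus = any(addr in net for net in campus_prefixes)

    # c) on-campus IP: inner layer 3 destination
    if on_campus:
        return ("l3", frame.dst_ip)

    # b) off-campus IP: inner layer 2 destination
    return ("l2", frame.dst_mac)
```

Only case c) consults the layer 3 header; without it, the function would return the layer 2 key unconditionally.
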

Question: What kind of incompatible layer 2 protocols might there be? Do 
we want to
support this case? Clearly it would be simpler to always forward based 
on the layer 2 destination
address as put in by the source.

Radia