Re: [OSPF] Comments / questions on RFC3623 and Update GR part 1 : BMA env
Erblichs <erblichs@earthlink.net> Fri, 17 November 2006 00:13 UTC
Message-ID: <455CFFAA.E39D6B25@earthlink.net>
Date: Thu, 16 Nov 2006 16:17:46 -0800
From: Erblichs <erblichs@earthlink.net>
To: sujay gupta <sujay.ietf@gmail.com>
Subject: Re: [OSPF] Comments / questions on RFC3623 and Update GR part 1 : BMA env
References: <45590834.B988E949@earthlink.net> <b33c82d0611151043h5f2680a6u51274e2292e41d1d@mail.gmail.com>
Cc: ospf@ietf.org
Sujay Gupta,

Let me give you some insight into part of this thinking. IMO, GR has added a few holes in our converged-routing assumptions, and your update might be able to close a couple of them. There is a lot here! The summary: think multiple times before you exit prematurely based on a topology (LSDB) change, and assume that you need router "X" no matter what.

So... why don't you tell me how an LSDB change is going to be communicated if the env is BMA, GR router "X" is the DR, and no BDR exists? I consider this a huge hole. I don't really see a DR-other communicating LSAs to other DR-others to let them learn about a topology change.

Second, let's assume that router X is an extremely important router within our area, with the connection to the Internet, etc. If it weren't, we would / COULD just take it down. Thus, router "X" is a DR and is the ONLY path to a set of routes and end-systems within this area. Then let's conservatively assume that the topology change is the removal of a single router-LSA that has no effect after an SPF calculation.

If router "X" has successfully entered its grace period / GR interval, aren't we black-holing the routers, routes, and end-systems that reside on the other side of router "X", since router "X" CANNOT probe to verify that a topology change has occurred? It is in non-stop forwarding mode. It will still forward pkts to the removed LSA's routes until... so what? The pkts that router "X" forwards to, say, unreachable Z will cause minimal harm. They will be dropped, and most likely an ICMP message will be sent on the reverse path. The ICMP message might be interpreted by X, but if it is just forwarded...

So what we have here is the inability to let router "X" state that its routes are important enough that most topology changes should be balanced against a delay in making the change.
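The "ONLY path" condition above can be checked mechanically. What follows is a minimal sketch, not anything defined by RFC 3623 or the update draft: it assumes the area topology has already been reduced to a plain adjacency map, and the names `reachable`, `stranded_without`, and the example routers are hypothetical.

```python
from collections import deque

def reachable(adj, source, removed=None):
    """Return the set of nodes reachable from source, never entering `removed`."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr != removed and nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return seen

def stranded_without(adj, source, router):
    """Nodes that source can reach only by going through `router`."""
    with_router = reachable(adj, source)
    without_router = reachable(adj, source, removed=router)
    return with_router - without_router - {router}

# Hypothetical area: A and B share the LAN with X; Z and INET sit behind X only.
area = {
    "A": ["B", "X"],
    "B": ["A", "X"],
    "X": ["A", "B", "Z", "INET"],
    "Z": ["X"],
    "INET": ["X"],
}

print(sorted(stranded_without(area, "A", "X")))
```

A non-empty result means that dropping router "X" from the calculation black-holes those destinations, which is the cost a helper would be weighing against the topology change.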
New or changed LSAs that don't show up as primary LSAs after the SPF calculation COULD in theory, IMO, also be ignored. The originator could in theory check whether the LSA needs to be communicated. (That is: just because a minor topology change has occurred, do we need to go through all the work and possibly isolate router "X"'s dependencies?) If we feel that the new, changed, or removed LSA should be communicated... router X still isn't going to be converged until it resyncs its LSDB. If the LSA change results in a better route, is it worth isolating router X's entities?

Third, most large OSPF router vendors support setting the LSA refresh time on non-DNA LSAs. Suppose we are in a normal environment where the admin does this to decrease the number of LSA refreshes, rather than taking the drastic step of using DNA LSAs. Then a 45 min LSA refresh would not be unreasonable. Going with this logic, we have a few simple steps to take. One is that all HELPERS that originate LSAs re-originate them upon first seeing a grace-LSA, in addition to router "X" doing the same. This allows us to take almost the full hour of the grace time. If we could communicate a wished grace time in excess of 1 hour, we could in theory send out DNA LSAs, which would allow a static environment to support grace times in excess of 1 hour. Part of this logic follows the demand circuit support.

Part of this is: can we, and should we, figure out a way to stop router "X", during its grace period/interval, from forwarding pkts based on obsolete OSPF control information? The other part says: why don't we just run in a degraded mode if we realize that router "X" is so important that exiting would isolate ourselves, or so many routes/routers, that we REALLY need to keep router "X" alive as long as feasibly possible?

Third minor item: I didn't see any suggestion of removing entities from the HELPERS' forwarding tables based on removing the GR router because of an LSDB change.
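The refresh-time argument above comes down to simple age arithmetic. Here is a small sketch under stated assumptions: MaxAge is the 3600-second constant from RFC 2328, the LSA in question is not refreshed by anyone while the grace period runs, and the helper names `seconds_until_max_age` and `survives_grace` are hypothetical.

```python
MAX_AGE = 3600  # OSPF MaxAge in seconds (RFC 2328 architectural constant)

def seconds_until_max_age(current_age):
    """Seconds left before an LSA of the given age reaches MaxAge."""
    return max(0, MAX_AGE - current_age)

def survives_grace(age_at_grace_start, grace_period, reoriginated=False):
    """True if the LSA stays below MaxAge for the whole grace period.

    reoriginated=True models the proposal above: the LSA is re-originated
    (age reset to 0) when the first grace-LSA is seen.
    Assumes no other refresh happens during the grace period.
    """
    start_age = 0 if reoriginated else age_at_grace_start
    return grace_period < seconds_until_max_age(start_age)

# An LSA on a 45 min refresh cycle, caught just before its refresh,
# against a 30 min grace period:
print(survives_grace(45 * 60, 30 * 60))
# The same LSA if it is re-originated on receipt of the grace-LSA:
print(survives_grace(45 * 60, 30 * 60, reoriginated=True))
```

With the age reset to zero at the start, any grace period shorter than MaxAge fits, which is the "almost the full hour" mentioned above.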
Mitchell Erblich

------------------

Hi Mitchell,

I have some comments inline;

On 11/14/06, Erblichs <erblichs@earthlink.net> wrote:

Group,

First, I have a few questions about the base and the update. These comments / questions pertain ONLY to BMA envs but are also relevant to other envs.

FYI) IMO, helpers are defined by two major requirements:

a) the ability to receive a grace-LSA before a hello ("new hello") and not restart the adj

b) the ability of a helper "Y" to (re)send a router-LSA, etc. that are in its retransmit list upon router "X" exiting graceful restart.

If this "helper" split functionality is supported, then only a limited number of routers within an area need to be in Full-helper mode. Within BMA envs, helper functionality "b" above is not really necessary in a DR-other for retransmit. We use the DR/BDR for retransmit support.

IF a GR router is the DR and is in its grace period, and a DR-other stops sending hellos, and there are no alternate paths through another router for any forwarding data, then what benefit is there in prematurely ending helper mode? Basically, most algs can determine ahead of time whether the exiting router has contributed a primary link/path. If it hasn't, then it has no consequence. Yes, data pkts that have an intermediate dest will be forwarded one or two extra hops before they are dropped.

(Update: 2.0 Action on route calculation) If a topology change occurs and the GR restarting router is identified as the next hop: by definition it is in non-stop forwarding mode and should be able to forward packets. If this is identified as a topology change and the ONLY route is through this GR router, shouldn't packets be forwarded through it anyway? Not doing so black-holes those routes, IMO.
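One way to picture "determine ahead of time whether the change touches a primary path" is to run SPF on the LSDB before and after the change and compare the results. The sketch below is only an illustration under assumptions: the LSDB is modelled as a weighted adjacency map seen from one helper, and `spf_next_hops`, `change_touches_router`, and the example topology are hypothetical names, not behaviour specified by RFC 3623 or the update draft.

```python
import heapq

def spf_next_hops(graph, root):
    """Dijkstra from root; returns {dest: (cost, first_hop)}."""
    dist = {root: 0}
    first_hop = {}
    heap = [(0, root)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist[node]:
            continue  # stale heap entry
        for nbr, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                # Directly attached neighbours are their own first hop.
                first_hop[nbr] = nbr if node == root else first_hop[node]
                heapq.heappush(heap, (new_cost, nbr))
    return {dest: (dist[dest], first_hop[dest]) for dest in first_hop}

def change_touches_router(before, after, root, restarting):
    """True if any route that changed uses `restarting` as first hop,
    either before or after the change."""
    old = spf_next_hops(before, root)
    new = spf_next_hops(after, root)
    for dest in old.keys() | new.keys():
        if old.get(dest) != new.get(dest):
            hops = {e[1] for e in (old.get(dest), new.get(dest)) if e}
            if restarting in hops:
                return True
    return False

# Hypothetical view from helper H: X is restarting, Z sits behind X, C behind B.
before = {"H": {"X": 1, "B": 1}, "X": {"H": 1, "Z": 1},
          "B": {"H": 1, "C": 1}, "C": {"B": 1}, "Z": {"X": 1}}
lose_c = {"H": {"X": 1, "B": 1}, "X": {"H": 1, "Z": 1},
          "B": {"H": 1}, "Z": {"X": 1}}
lose_z = {"H": {"X": 1, "B": 1}, "X": {"H": 1},
          "B": {"H": 1, "C": 1}, "C": {"B": 1}}

print(change_touches_router(before, lose_c, "H", "X"))
print(change_touches_router(before, lose_z, "H", "X"))
```

Losing C changes only a route through B, so this helper would have no reason to exit on X's account; losing Z changes a route whose first hop is X.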
Suj>> Here topology change also includes routes which are no longer valid. If the LSA seeks a route to be removed and that route involves the GR router, the helper exits, thus black-holing and circumventing the GR router as a router under plain restart. Note here that previously (RFC 3623) the helper was exiting for all reasons, which has been curbed now. So, as you have pointed out, the system will exit only if the LSA affects a route which is ONLY through the restarting router, which makes sense.

Yes, and no. Router X, once he has entered GR, is not in a mode to accept topology changes. He will forward no matter what, based on his old information. Thus, any routing inconsistencies on his part until his GR time period ends CAN result in him routing pkts improperly. This is basic IP forwarding. The originator of the change will be the one who can make the correct IP forwarding decisions when receiving or forwarding IP pkts based on the LSA change. However, because router X is no longer in sync, should he inform others not to send data pkts to router X or accept data pkts from it? At the end of the grace period, or when router X re-enables his OSPF control code, he can be resynced. Until then, yes, we SHOULD have some level of AI that determines the benefit of keeping router X in GR.

Second, in most environments, most LSA changes do not result in any primary path alterations. Thus,

1) Helpers should be able to support a without b. If this is the case, router "X", if it is a DR, should allow all the DR-others to support a, while only the BDR need be a "Full-Helper".

Suj>> I like your analysis in breaking up the helper job into partial and full. IMO "Helper" is a behaviour of a router w.r.t. the GR operation, whereas whether the helper is a "Full-helper" or a "Partial" one depends upon the result of the election. In brief, a "Helper" is always expected to perform the "Full-helper" features as you describe them; if it does less, I guess it is plain lucky!
(Also note that helper support may be a feature switched on / off by the admin, not a partial/full helper.)

2) If router "X" is a DR-other, only one of the DR or BDR needs to be a Full-helper. All LSAs can be placed on the retransmit list for router "X" by the DR or BDR.

3) Topology change. An LSA aging out of the LSDB will not necessarily remove an element in the forwarding table. If this can be detected, should it still force a helper to exit graceful restart?

Suj>> As per the base OSPF operation, a MaxAge LSA in the LSDB should trigger the router to flood the LSA (which in turn should prompt the originator of the LSA to resend a fresh copy; if that fails, the corresponding route ends up removed everywhere!). Thus, if the case happens as you have described, a GR with "strict LSA checking enabled", i.e. catch all topology changes, should force the helper to exit, as it has to flood this LSA to the GR router. Here the update draft would tell you to exit only if the LSA somehow affects a route through the GR router; else let peace prevail, no need to exit.

4) If router "X" is a DR and has entered graceful restart with no BDR present, should the DR-other routers allow the LSAs to go unrefreshed and age out because the grace period has not expired? How can one or more "HELPERS" force a new DR election?

Suj>> IMO the DR-other allows normal aging to continue. If the LSA ages out before grace period expiry, the helpers would exit GR (see above), again applying the same logic. A new election would disturb all previous adjacencies; do we need it?

5) Even with a 30 min max for the grace period, some routers may only retransmit once every 45 mins or so. How can we guarantee that those retransmitted LSAs are not aged out if router "X" is the DR and is down for a longer time? Shouldn't receiving the first grace-LSA from a restarting router initiate a re-send of the LSAs?
Suj>> With LSRefreshTime at 30 min, MaxAge is double that (60 min), and the max grace period is the same as LSRefreshTime, 30 min. I don't see how we reach a stage where the problem will arise. You might want to consider some border cases; I guess the previous arguments (see above) hold good there.

Mitchell Erblich

Best Regards,
Sujay

_______________________________________________
OSPF mailing list
OSPF@ietf.org
https://www1.ietf.org/mailman/listinfo/ospf