Re: draft-ietf-ospf-ospfv3-auth-04.txt

Erblichs <erblichs@EARTHLINK.NET> Sat, 10 July 2004 00:04 UTC

Received: from cherry.ease.lsoft.com (cherry.ease.lsoft.com [209.119.0.109]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id UAA14898 for <ospf-archive@LISTS.IETF.ORG>; Fri, 9 Jul 2004 20:04:17 -0400 (EDT)
Received: from vms.dc.lsoft.com (209.119.0.2) by cherry.ease.lsoft.com (LSMTP for Digital Unix v1.1b) with SMTP id <21.00E0DDA9@cherry.ease.lsoft.com>; Fri, 9 Jul 2004 20:04:17 -0400
Received: from PEACH.EASE.LSOFT.COM by PEACH.EASE.LSOFT.COM (LISTSERV-TCP/IP release 1.8e) with spool id 25268077 for OSPF@PEACH.EASE.LSOFT.COM; Fri, 9 Jul 2004 20:04:15 -0400
Received: from 207.217.120.122 by WALNUT.EASE.LSOFT.COM (SMTPL release 1.0i) with TCP; Fri, 9 Jul 2004 20:04:15 -0400
Received: from user-38lc12k.dialup.mindspring.com ([209.86.4.84] helo=earthlink.net) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 1Bj5Lh-00060W-00 for OSPF@PEACH.EASE.LSOFT.COM; Fri, 09 Jul 2004 17:04:14 -0700
X-Sender: "Erblichs" <@smtp.earthlink.net> (Unverified)
X-Mailer: Mozilla 4.72 [en]C-gatewaynet (Win98; I)
X-Accept-Language: en
MIME-Version: 1.0
References: <8D260779A766FB4A9C1739A476F84FA401F79A8E@daebe009.americas.nokia.com>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-ID: <40EF3379.8AE9BAA0@earthlink.net>
Date: Fri, 09 Jul 2004 17:08:25 -0700
Reply-To: Mailing List <OSPF@PEACH.EASE.LSOFT.COM>
Sender: Mailing List <OSPF@PEACH.EASE.LSOFT.COM>
From: Erblichs <erblichs@EARTHLINK.NET>
Subject: Re: draft-ietf-ospf-ospfv3-auth-04.txt
To: OSPF@PEACH.EASE.LSOFT.COM
Precedence: list

Mukesh,

        Inline..

        Mitchell

Mukesh.Gupta@NOKIA.COM wrote:
>
> Mitchell,
>
> Comments inline..
>
> > > Do you still think that we should mention it everywhere in the
> > > draft that the pkts will be dropped by the IPsec layer ?  Or
> > > we should clarify it more in section 2 and make it explicit
> > > that whenever we say "drop the pkt", it means it is dropped
> > > by the IPsec layer ?
> >
> >         IMO, minimally it should be explicitly stated.
>
> Ok we will try to make it explicit in the next version.
>
> > > By the way, isn't it the way we handle authentication in OSPFv2 ?
> > > If you don't have any authentication and FULL adjacency between
> > > the neighbors and then you change the configuration of one of
> > > the neighbors to use simple/md5 authentication, the adjacency
> > > is not torn down immediately.  It takes the dead interval time
> > > before each of them marks the adjacency down.
> >
> >         First, what we are dealing with here is online
> >         reconfiguration or dynamic reconfiguration. I haven't
> >         really seen any discussion in RFC 2328 wrt changing
> >         values after 2-ways / adjs are established.
>
> I haven't seen anything either.  I tried our implementation
> and reported the findings.
>
> Are there any implementations that will tear down the adjacency
> immediately when the authentication method is changed ?
>
> >         IMO, I have two very different ways of thinking about this.
> >
> >         #1 If a field within a pkt is changed that "forces" the
> >         link partner to now drop the pkt, the router dead
> >         interval time frame is a built-in latency period to
> >         re-establish synchronization of the values. However,
> >         this allows data movement across the adj, where the
> >         adj REALLY is no longer valid / synchronized. In addition,
> >         if another router duplicates another's router-id, but
> >         changes one required synchronized value to invalidate
> >         the pkt, and this one pkt causes the adj to be dropped,
> >         it could be a DoS-type attack.
> >
> >         #2 To properly achieve "Faster Failure Detection",
> >         IMO Goyal, et al, should have considered the reception
> >         of a NOW invalid pkt. This would eliminate the router
> >         dead interval delay in tearing down the adj.
> >
> >         These are the delays / latencies that are built into
> >         tearing down a now no longer valid adj.
> >
> >         So, we should be learning from OSPFv2 the behaviours
> >         that we don't want in v3, and attempt to remove them
> >         versus forcing us to live with our past ?mistakes? for
> >         consistency's sake. Thus, IMO, if authentication is
> >         CHANGED locally the adj should be torn down immediately,
> >         and the reception of a pkt that would cause it to be
> >         dropped by the recvr should affect the state of the
> >         adj. BTW, this should be only one of many fields that
> >         should affect the state of the adj. Don't we have to assume
> >         that the adj is going to fail anyway? Is the repeated
> >         reception of hello pkts that will be dropped
> >         the first indication that the nbr is no longer reachable?
> >         I think yes. Thus, it is a valid nbr state machine event
> >         to the down state. However, since it is so drastic,
> >         I assume that a CLI command should allow this event.
>
> The current behavior is actually helpful during configuration
> changes.  Consider the scenario when an admin is transitioning the
> OSPF (v2 or v3) network from "no authentication" to "simple
> authentication".  Because of the current behavior, he/she gets
> enough time to change the configuration on all the routers without
> bringing the network (or data forwarding) down.  If the router
> tears down the adjacency immediately, there will be a forwarding
> break in this case.
>
> IMHO, it is not worth tearing down the adjacency when the admin
> makes this configuration change just to notify him/her of some
> incorrect configuration.  If the configuration is incorrect
> (mismatching authentication types on routers), the admin will
> know within minutes anyway.

        Is it really helpful? Ask any customer whether he is happy
        with a system that takes minutes to respond to a router CLI
        change, and I think you will lose that customer!

        During this latency period, which can be 1 hr,
        no hellos are being responded to, no LSAs are being
        sent or responded to, the DR doesn't acknowledge new
        nbrs, etc... All due to a CLI timebomb. If it happens
        immediately, the admin should know that he just ran
        a CLI change and that this was the cause.

        The time the system normally takes to acknowledge the
        change results in longer-than-necessary LSDB
        synchronization and failure detection.

        If you listed your nbrs, the list would not actually
        be correct, since you are no longer communicating
        with your nbrs. Let's see. If I am a router and I
        have my hello interval set to X secs, I can set up
        adjs with one set of routers. Then I change it to
        another value and get adjs with another set of routers.
        Then if I just toggle the value back and forth within
        the router dead interval time frame, I will have adjs
        with routers with different hello intervals. Is
        this proper operation?
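
        To make the toggling case concrete, here is a rough sketch,
        not taken from any implementation. It models only the local
        router's view: a received hello is accepted when its
        HelloInterval matches the local one, and a nbr is declared
        down after a dead interval of silence. The neighbor names,
        the intervals (10 s and 5 s), the 40 s dead interval, and the
        function simulate() are all made up for illustration.

```python
def simulate(toggle_period, dead_interval=40, duration=600):
    """Return the set of neighbors the local router still considers up."""
    # Hypothetical neighbors: name -> HelloInterval they advertise and send at.
    neighbors = {"A": 10, "B": 5}
    last_accepted = {name: 0 for name in neighbors}  # time of last accepted hello
    up = set(neighbors)
    for t in range(1, duration + 1):
        # Local HelloInterval alternates between 10 and 5 every toggle_period secs.
        local_hello = 10 if (t // toggle_period) % 2 == 0 else 5
        for name, interval in neighbors.items():
            if name not in up:
                continue
            # A hello arrives every `interval` secs; it is accepted only if
            # its HelloInterval matches the local setting at that moment.
            if t % interval == 0 and interval == local_hello:
                last_accepted[name] = t
            # Inactivity timer: no accepted hello for dead_interval secs.
            if t - last_accepted[name] >= dead_interval:
                up.discard(name)
    return up


if __name__ == "__main__":
    print(simulate(toggle_period=20))   # toggling faster than the dead interval
    print(simulate(toggle_period=300))  # toggling slower than the dead interval
```

        Under these assumptions, toggling every 20 secs keeps both
        nbrs up even though they use different hello intervals,
        while toggling every 300 secs lets each one time out.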

        If a change is made that is going to eventually drop
        the adj, then why wait? If the change is a two-step
        process on a single router, then implement a
        commit-type functionality.
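
        As a minimal sketch of what a commit-type functionality
        could look like (purely hypothetical, not any vendor's CLI):
        changes are staged in a candidate config, and only on commit
        are they applied. If any parameter that both ends must agree
        on has changed, the adjs are torn down at once instead of
        waiting for the router dead interval. The class, the
        parameter names, and the neighbor IDs below are assumptions.

```python
class InterfaceConfig:
    # Hypothetical set of parameters that both ends of an adj must agree on.
    ADJ_CRITICAL = {"hello_interval", "dead_interval", "auth_type", "auth_key"}

    def __init__(self, **running):
        self.running = dict(running)         # config currently in effect
        self.candidate = {}                  # staged, not yet applied
        self.adjacencies = {"nbr1", "nbr2"}  # hypothetical neighbor IDs

    def set(self, key, value):
        """Stage a change; nothing takes effect until commit()."""
        self.candidate[key] = value

    def commit(self):
        """Apply all staged changes at once; return the adjs torn down."""
        changed = {k for k, v in self.candidate.items()
                   if self.running.get(k) != v}
        self.running.update(self.candidate)
        self.candidate.clear()
        torn_down = set()
        if changed & self.ADJ_CRITICAL:
            # Tear down immediately rather than waiting for the dead interval.
            torn_down, self.adjacencies = self.adjacencies, set()
        return torn_down


if __name__ == "__main__":
    cfg = InterfaceConfig(hello_interval=10, auth_type="none")
    cfg.set("auth_type", "md5")   # step one of a two-step change
    cfg.set("auth_key", "k")      # step two; adjs are still up here
    print(cfg.commit())           # both steps applied, adjs dropped together
```

        The two-step change touches the adjs only once, at commit
        time, so the admin sees the effect right after the command
        that caused it.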

        I just think that this is an area that really needs
        a standardized RFC that lets the customer see
        configuration changes take effect immediately.
>
> Regards
> Mukesh