Re: [Isis-wg] two drafts

mike shand <mshand@cisco.com> Fri, 05 March 2010 10:32 UTC

Message-ID: <4B90DDD5.8000701@cisco.com>
Date: Fri, 05 Mar 2010 10:32:53 +0000
From: mike shand <mshand@cisco.com>
User-Agent: Thunderbird 2.0.0.23 (Windows/20090812)
MIME-Version: 1.0
To: lizhenqiang@chinamobile.com
References: <OF08B3EEA9.1495B010-ON482576DD.0039BF4B-482576DD.0039BF59@china.mobile>
In-Reply-To: <OF08B3EEA9.1495B010-ON482576DD.0039BF4B-482576DD.0039BF59@china.mobile>
Content-Type: text/plain; charset="GB2312"
Content-Transfer-Encoding: 8bit
Cc: isis-wg <isis-wg@ietf.org>, lilianyuan@chinamobile.com
Subject: Re: [Isis-wg] two drafts

I fully understand that reality and specification can and do diverge,
but I don't see how yet another piece of specification helps.

If all it is doing is repeating the existing specification then it is
redundant.

If it is proposing changes to improve interoperability then they will
only be fully effective if everyone implements them. In which case,
wouldn't it be better for everyone to implement the correct behaviour in
the first place?

I don't believe we should be making changes to the protocol to
accommodate broken implementations.

Mike


lizhenqiang@chinamobile.com wrote:
> Hi, Mike,
>
> Specification is different from reality and implementation.
> Network flapping caused by this misbehaviour occurred in China Mobile's network in 2007, 5 years after the correct processing mechanism was published.
>
> What is proposed in draft-li-isis-error-lsp-processing helps to increase the interoperability of ISs from different vendors and the robustness of the IS-IS protocol.
>
> Best Regards,
>
> --------------------------------------------------------------------------------
> Zhenqiang Li
> 13911635816
> Department of Network Technology
> China Mobile Research Institute
> 2010-03-05
>
> --------------------------------------------------------------------------------
>
> From: mike shand
> Sent: 2010-03-05 17:45:21
> To: lizhenqiang@chinamobile.com
> Cc: isis-wg
> Subject: Re: [Isis-wg] two drafts
>
> I see no reason to change anything. The correct procedure has been
> published for the last 8 years. I see no benefit in yet another draft
> drawing attention to this.
>
> Mike
>
> lizhenqiang@chinamobile.com wrote:
>   
>> Hi, Les,
>>  
>> I will revise draft-wei-isis-tlv according to the discussion raised here.
>>
>> draft-li-isis-error-lsp-processing was presented at the IETF 75 meeting, and we did not receive further arguments after the answer given in the meeting. Please refer to the minutes.
>> Providing the switch is not merely historic: Cisco GSR routers in the field still have it. From the operators' point of view, at least China Mobile's, such a switch is welcome. But the default state should be what my draft says, in accordance with ISO 10589.
>> Besides, my draft gives some other suggestions, such as providing a correct checksum even in a purge LSP, and a correct processing procedure for LSPs with zero checksum/remaining lifetime.
>> Why do you want to abandon the whole draft?
>>  
>>
>> --------------------------------------------------------------------------------
>>
>> Zhenqiang Li
>> 13911635816
>> Department of Network Technology
>> China Mobile Research Institute
>> 2010-03-05
>>
>> --------------------------------------------------------------------------------
>>
>> From: Les Ginsberg (ginsberg)
>> Sent: 2010-03-05 11:41:55
>> To: 李振强; isis-wg
>> Cc:
>> Subject: RE: RE: [Isis-wg] two drafts
>>  
>> Zhenqiang -
>>  
>> You submitted two drafts.
>>  
>> 1) draft-wei-isis-tlv-00.txt
>>  
>> This is proposing a new TLV to advertise the source of the purge. This has received some favorable interest and in my view (which is not authoritative) should move forward - though significant revisions will need to be made to the draft to address all of the concerns/limitations that have been raised.
>>  
>> 2) draft-li-isis-error-lsp-processing-02.txt
>>  
>> This is proposing to allow an IS to either discard or purge a received corrupted LSP.
>>  
>> It is the latter draft's intentions that I was responding to in my most recent reply to you. The fact that some implementations have historically provided a knob to enable either behavior (discard/purge) is not reason to preserve the purge behavior. Evidence is now clear that discard is the correct behavior and has proven to be safe. There is no reason to reintroduce purging as a valid option - so in my view the second draft should be abandoned.
>>  
>>    Les
>>  
>>   
>>     
>>> -----Original Message-----
>>> From: 李振强 [mailto:lizhenqiang@chinamobile.com]
>>> Sent: Thursday, March 04, 2010 6:37 PM
>>> To: Les Ginsberg (ginsberg); isis-wg
>>> Subject: Re: RE: [Isis-wg] two drafts
>>>
>>>
>>> Hi, Les,
>>>
>>> Thank you for your active discussion on the mailing list.
>>>
>>> Yes, you are right. ISO 10589 specified the correct behavior for
>>> routers receiving a corrupt LSP in 2002. However, the network flapping
>>> occurred in China Mobile's network in 2007. Unfortunately, we did not
>>> even see any corruptedLSPReceived warning from the NMS.
>>>
>>> What we are discussing here is the difference between implementations
>>> and specifications. The purpose of this draft is not to revise the
>>> specification back to purging corrupted LSPs, but to add a new TLV to
>>> record the source of the purge LSP. I believe this method is beneficial
>>> for debugging and locating network problems, such as flapping.
>>>
>>> Best Regards,
>>> ---------------------------------------
>>> Zhenqiang Li
>>> 13911635816
>>> Department of Network Technology
>>> China Mobile Research Institute
>>> 2010-03-05
>>>
>>>
>>>
>>> From: Les Ginsberg (ginsberg)
>>> Sent: 2010-03-04 23:37:03
>>> To: lizhenqiang@chinamobile.com; isis-wg
>>> Cc:
>>> Subject: RE: [Isis-wg] two drafts
>>>
>>> Zhenqiang -
>>>
>>> ISO 10589(2002) Section 7.3.14.2e specifies:
>>>
>>>   <snip   >
>>> An Intermediate system receiving a Link State PDU with an incorrect LSP
>>> Checksum or with an invalid PDU syntax shall
>>>
>>> 1) generate a corruptedLSPReceived circuit event,
>>>
>>> 2) discard the PDU.
>>>   <end snip   >
>>>
>>> This follows a safe practice of dealing with corrupted LSPs in a way
>>> which avoids dangerous LSP storms.
>>>
>>> Your argument is if the corrupted LSP is discarded "silently" then you
>>> do not actually know that corrupted LSPs are being received - so you
>>> would like the spec to be revised to allow ISs to purge corrupted LSPs.
>>> This of course puts the network at risk for LSP storms - which
>>> according to your post below you have actually experienced.
>>>
>>> The safe solution is to follow the spec which not only stipulates that
>>> corrupted LSPs should be discarded but also specifies that a management
>>> event be generated so the occurrence of corrupted LSPs is reported.
>>> Rather than do this, your proposal not only would restore a proven
>>> dangerous behavior (purging corrupted LSPs) but do so in a way which
>>> does NOT generate any events (you specify treating corrupted LSPs in
>>> the same manner as LSPs which have aged out).
>>>
>>> I do not see any reason to support this proposal.
>>>
>>>    Les
>>>
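
The receive-side rule from ISO 10589 Section 7.3.14.2e quoted above (raise a corruptedLSPReceived event, then discard) can be sketched roughly as follows. This is only an illustration in Python, not taken from the standard or from any implementation: the ReceivedLsp and Receiver names, the checksum_ok and syntax_ok flags, and the in-memory event list are all assumptions, and sequence-number comparison and flooding are left out.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Action(Enum):
    DISCARD = "discard"
    ACCEPT = "accept"


@dataclass
class ReceivedLsp:
    lsp_id: str               # hypothetical textual LSP ID
    sequence: int
    checksum_ok: bool         # result of the checksum check, computed elsewhere
    syntax_ok: bool = True    # result of the PDU syntax check, computed elsewhere


@dataclass
class Receiver:
    events: List[str] = field(default_factory=list)
    database: Dict[str, ReceivedLsp] = field(default_factory=dict)

    def receive(self, lsp: ReceivedLsp, circuit: str) -> Action:
        if not (lsp.checksum_ok and lsp.syntax_ok):
            # 1) report the corruption so that it is visible to management
            self.events.append(
                f"corruptedLSPReceived circuit={circuit} lsp={lsp.lsp_id}"
            )
            # 2) discard: the database is untouched and no purge is generated
            return Action.DISCARD
        # Simplification: a real IS compares sequence numbers before storing.
        self.database[lsp.lsp_id] = lsp
        return Action.ACCEPT
```

In this sketch the corrupted copy never touches the stored database and never triggers a purge, while the recorded event keeps the corruption from going unnoticed.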
>>>
>>>   > -----Original Message-----
>>>   > From: isis-wg-bounces@ietf.org [mailto:isis-wg-bounces@ietf.org] On
>>>   > Behalf Of lizhenqiang@chinamobile.com
>>>   > Sent: Thursday, March 04, 2010 1:18 AM
>>>   > To: isis-wg
>>>   > Subject: Re: [Isis-wg] two drafts
>>>   >
>>>   > I did not see the mail I posted to isis-wg@ietf.org using foxmail.
>>>   > I am sending it again using web mail. Sorry for the possible
>>>   > duplication.
>>>   > /////////////////////////////////////
>>>   >
>>>   >
>>>   > Hi, Les, Tony and Sheng,
>>>   >
>>>   > Thank you for your discussion.
>>>   > China Mobile submitted this draft because severe network-wide
>>>   > flapping occurred several times in our network in the past three
>>>   > years. We found the reason was that the products of some vendor
>>>   > generated a purge LSP when they received an LSP with an invalid
>>>   > checksum. The flapping impacted the services carried in our network,
>>>   > and we spent a lot of time and effort finding the reason and locating
>>>   > the flapping source.
>>>   >
>>>   > One disadvantage of discarding the corrupted LSP silently is that
>>>   > there is no indication of network misbehavior. So, some vendors
>>>   > (e.g. Cisco) provide a switch to control the processing: whether to
>>>   > discard the corrupted LSP or to generate a purge LSP.
>>>   >
>>>   > The method described in this draft helps to locate the flapping
>>>   > source when flapping does occur in the network.
>>>   >
>>>   > Best Regards,
>>>   >
>>>   >
>>>   > ------------------------------------------------------------------------
>>>   >
>>>   > Zhenqiang Li
>>>   > 13911635816
>>>   > Department of Network Technology
>>>   > China Mobile Research Institute
>>>   > 2010-03-02
>>>   >
>>>   > ------------------------------------------------------------------------
>>>   >
>>>   > From: Les Ginsberg (ginsberg)
>>>   > Sent: 2010-03-02 17:46:14
>>>   > To: shengcheng; Tony Li; 李振强 <lizhenqiang@chinamobile.com>; isis-wg
>>>   > Cc: duanxiaodong@chinamobile.com; adrian.farrel; Li Lianyuan; Dan King
>>>   > Subject: RE: [Isis-wg] two drafts
>>>   >
>>>   > Sheng -
>>>   >
>>>   > In regards to OSPF, RFC 2328 specifies equivalent behavior when LSAs
>>>   > reach MAXAGE. From Section 14:
>>>   >
>>>   > "As a router ages its link state database, an LSA's LS age may reach
>>>   >     MaxAge.[21] At this time, the router must attempt to flush the
>>>   >     LSA from the routing domain.  This is done simply by reflooding
>>>   >     the MaxAge LSA..."
>>>   >
>>>   > Apart from purging on checksum error (which was long ago removed from
>>>   > the specification) - I am not aware that purging is problematic. So I
>>>   > am not sure what problem it is that you believe will be solved if we
>>>   > just don't purge at all.
>>>   >
>>>   > That said, purging is an optimization. It reduces the size of the
>>>   > stored link state database - something that customers (in my
>>>   > experience) have been concerned about. You could continue to leave
>>>   > stale entries in the database (and in some scenarios this does occur
>>>   > e.g. when an IS fails) and the protocol continues to operate
>>>   > correctly. This fact does not make it desirable to leave stale
>>>   > entries.
>>>   >
>>>   > In regards to the purge history - while it is no doubt true that many
>>>   > folks are not familiar with the old ISO Technical Corrigenda - the
>>>   > issue of purging on checksum error is well known and has been widely
>>>   > discussed in public many times in forums like this - as well as by
>>>   > protocol vendors in their documentation and interaction with
>>>   > customers. It is also discussed in RFC 3719 Section 8 - which is not
>>>   > nearly as lengthy as ISO 10589. :-)
>>>   >
>>>   > So I do not agree that the community of experts has been remiss in
>>>   > addressing this issue nor in publicizing it.
>>>   >
>>>   >    Les
>>>   >
>>>   >    > -----Original Message-----
>>>   >    > From: shengcheng [mailto:shengc@huawei.com]
>>>   >    > Sent: Tuesday, March 02, 2010 12:56 AM
>>>   >    > To: Les Ginsberg (ginsberg); Tony Li; 李振强; isis-wg
>>>   >    > Cc: duanxiaodong@chinamobile.com; adrian.farrel; Li Lianyuan; Dan King
>>>   >    > Subject: Re: [Isis-wg] two drafts
>>>   >    >
>>>   >    > Les,
>>>   >    >
>>>   >    > I know the ISIS purge specification and the "unfortunate" history. :)
>>>   >    >
>>>   >    > Yes, purge should be based on a specific reason, whether it is LSP
>>>   >    > age or DIS change.
>>>   >    > But if I am right, OSPF keeps silent (compared to purging another's
>>>   >    > LSP) in the same scenarios and the protocol still works well in
>>>   >    > reality. So my question is: why can't we follow OSPF's purge (flush)
>>>   >    > behaviour?
>>>   >    >
>>>   >    > Second, I think the "purge" history is well known only to a few ISIS
>>>   >    > experts like you. Many ISIS people may learn of it only long after
>>>   >    > they first learn the protocol.
>>>   >    > It is difficult to find the modification by comparing the 1992 and
>>>   >    > 2002 versions of ISO 10589 (most of the content is the same except
>>>   >    > for the document format).
>>>   >    >
>>>   >    > Anyway, in my personal view, history has made such "mistakes" and
>>>   >    > left a negative effect. Maybe the Group needs to do something to
>>>   >    > improve this.
>>>   >    >
>>>   >    >
>>>   >    > Thanks
>>>   >    > Sheng
>>>   >    >
>>>   >    > ----- Original Message -----
>>>   >    > From: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
>>>   >    > To: "shengcheng" <shengc@huawei.com>; "Tony Li" <tony.li@tony.li>;
>>>   >    > "李振强" <lizhenqiang@chinamobile.com>; "isis-wg" <isis-wg@ietf.org>
>>>   >    > Cc: <duanxiaodong@chinamobile.com>; "adrian.farrel"
>>>   >    > <Adrian.Farrel@huawei.com>; "Li Lianyuan"
>>>   >    > <lilianyuan@chinamobile.com>; "Dan King" <daniel@olddog.co.uk>
>>>   >    > Sent: Tuesday, March 02, 2010 4:36 PM
>>>   >    > Subject: RE: [Isis-wg] two drafts
>>>   >    >
>>>   >    >
>>>   >    >     > Sheng -
>>>   >    >     >
>>>   >    >     > IS-IS does not allow purging LSPs owned by another IS except in
>>>   >    >     > two circumstances:
>>>   >    >     >
>>>   >    >     > 1)LSP ages out
>>>   >    >     > 2)DIS change (new DIS is allowed to purge the old DIS pseudo-node
>>>   >    >     > LSPs- thanx to Tony for reminding me)
>>>   >    >     >
>>>   >    >     > There was an unfortunate error in the original ISO 10589-1992 spec
>>>   >    >     > which specified purging an LSP received w checksum error. This was
>>>   >    >     > quickly corrected in Technical Corrigendum I (ISO speak...)
>>>   >    >     > published in 1993 to specify discard. Unfortunately, many folks
>>>   >    >     > only look at the 1992 spec and are unaware of TC1.
>>>   >    >     >
>>>   >    >     > The 2002 edition of 10589 (the latest version) incorporates TC1.
>>>   >    >     >
>>>   >    >     >   Les
>>>   >    >     >
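
The purge cases listed in this thread (an IS purging its own LSP, an LSP that has aged out, and a new DIS purging the old DIS's pseudonode LSPs) can be written down as a small check. A minimal sketch, assuming hypothetical names (PurgeReason, may_purge, is_new_dis) that come from neither ISO 10589 nor any implementation:

```python
from enum import Enum, auto


class PurgeReason(Enum):
    OWN_LSP = auto()         # the originating IS purges its own LSP
    AGED_OUT = auto()        # remaining lifetime of a stored LSP reached zero
    DIS_CHANGE = auto()      # old DIS pseudonode LSPs after a DIS election
    CHECKSUM_ERROR = auto()  # LSP received with an incorrect checksum


def may_purge(reason: PurgeReason, *, is_new_dis: bool = False) -> bool:
    """Return True if this IS may generate a purge for the given reason."""
    if reason in (PurgeReason.OWN_LSP, PurgeReason.AGED_OUT):
        return True
    if reason is PurgeReason.DIS_CHANGE:
        # Only the newly elected DIS purges the old DIS's pseudonode LSPs.
        return is_new_dis
    # A checksum error is never a reason to purge: the LSP is discarded.
    return False
```

For example, may_purge(PurgeReason.CHECKSUM_ERROR) returns False, which matches the discard behaviour specified since TC1.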
>>>   >    >     >    > -----Original Message-----
>>>   >    >     >    > From: shengcheng [mailto:shengc@huawei.com]
>>>   >    >     >    > Sent: Tuesday, March 02, 2010 12:15 AM
>>>   >    >     >    > To: Tony Li; Les Ginsberg (ginsberg); 李振强; isis-wg
>>>   >    >     >    > Cc: duanxiaodong@chinamobile.com; adrian.farrel; Li Lianyuan; Dan King
>>>   >    >     >    > Subject: Re: [Isis-wg] two drafts
>>>   >    >     >    >
>>>   >    >     >    > Hi Tony,
>>>   >    >     >    >
>>>   >    >     >    > I have a question: can we modify the ISIS behaviour of purging
>>>   >    >     >    > another's LSP so that only a self-originated LSP may be purged,
>>>   >    >     >    > which is similar to what OSPF does?
>>>   >    >     >    >
>>>   >    >     >    > Thanks
>>>   >    >     >    > Sheng
>>>   >    >     >    >
>>>   >    >     >    >
>>>   >    >     >    > ----- Original Message -----
>>>   >    >     >    > From: "Tony Li" <tony.li@tony.li>
>>>   >    >     >    > To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>; "李振强"
>>>   >    >     >    > <lizhenqiang@chinamobile.com>; "isis-wg" <isis-wg@ietf.org>
>>>   >    >     >    > Cc: <duanxiaodong@chinamobile.com>; "adrian.farrel"
>>>   >    >     >    > <Adrian.Farrel@huawei.com>; "Li Lianyuan"
>>>   >    >     >    > <lilianyuan@chinamobile.com>; "Dan King" <daniel@olddog.co.uk>
>>>   >    >     >    > Sent: Tuesday, March 02, 2010 4:04 PM
>>>   >    >     >    > Subject: Re: [Isis-wg] two drafts
>>>   >    >     >    >
>>>   >    >     >    >     > Hi Les,
>>>   >    >     >    >     >
>>>   >    >     >    >     >    > I believe the legitimate purge cases are limited to:
>>>   >    >     >    >     >    >
>>>   >    >     >    >     >    > 1)Originating IS purges its own LSP. This case is not
>>>   >    >     >    >     >    > really of concern here.
>>>   >    >     >    >     >    >
>>>   >    >     >    >     >    > 2)LSP owned by another IS ages out - in which case all
>>>   >    >     >    >     >    > ISs will trigger the purge at roughly the same time. The
>>>   >    >     >    >     >    > enhancement does not seem useful in this case.
>>>   >    >     >    >     >    >
>>>   >    >     >    >     >    > Have I overlooked something??
>>>   >    >     >    >     >
>>>   >    >     >    >     >
>>>   >    >     >    >     > I disagree that having a tag in case 2 is not useful. In
>>>   >    >     >    >     > fact, it would seem very useful in debugging purge
>>>   >    >     >    >     > propagation.
>>>   >    >     >    >     >
>>>   >    >     >    >     > Another case of legitimate purges is when there is a new DIS
>>>   >    >     >    >     > election.  See 8.4.5.d&e.
>>>   >    >     >    >     >
>>>   >    >     >    >     > In general, the cost seems very low, the potential for
>>>   >    >     >    >     > debugging information seems worthwhile.
>>>   >    >     >    >     >
>>>   >    >     >    >     > Why not?
>>>   >    >     >    >     >
>>>   >    >     >    >     > Tony
>>>   >    >     >    >     >
>>>   >    >     >    >     >
>>>   >    >     >    >     > _______________________________________________
>>>   >    >     >    >     > Isis-wg mailing list
>>>   >    >     >    >     > Isis-wg@ietf.org
>>>   >    >     >    >     > https://www.ietf.org/mailman/listinfo/isis-wg
>>>   >    >     >