Re: [OSPF] Solicit comments to OSPF LSA flushing problem statement and mitigation solution

"Dongjie (Jimmy)" <jie.dong@huawei.com> Sat, 26 November 2016 06:39 UTC

From: "Dongjie (Jimmy)" <jie.dong@huawei.com>
To: "Acee Lindem (acee)" <acee@cisco.com>, "ospf@ietf.org" <ospf@ietf.org>
Date: Sat, 26 Nov 2016 06:39:14 +0000
Message-ID: <76CD132C3ADEF848BD84D028D243C92793556086@NKGEML515-MBX.china.huawei.com>
References: <D45B21DE.8A84E%acee@cisco.com>
In-Reply-To: <D45B21DE.8A84E%acee@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ospf/EzatkG8BVhpDWv4tuVHSf5kOY0Y>
Cc: "Zhangxudong (zhangxudong, VRP)" <zhangxudong@huawei.com>, "lizhenqiang@chinamobile.com" <lizhenqiang@chinamobile.com>
Subject: Re: [OSPF] Solicit comments to OSPF LSA flushing problem statement and mitigation solution

Hi Acee,

Thanks a lot for your comments on the drafts.

You are right that the problem could be caused by the cases you described, and there might be other cases of improper LSA flushing. Whatever the root cause, the purpose of the problem statement draft is to emphasize the consequences of this kind of problem. The severe impact is a big headache for operators, which makes them want a more robust protocol in addition to fixing the problematic routers after the problem has occurred in their network.

Your suggestion about rate-limiting non-self-originated flushing makes sense to me. It can be complementary to the solution in draft-dong-ospf-flush-mitigation. Every router in the domain needs to avoid causing trouble for the network, and it also needs to protect itself from being knocked down by another router's fault. Thus, IMO, a complete solution needs to cover both the sender side and the receiver side. What do you think?

Best regards,
Jie

From: Acee Lindem (acee) [mailto:acee@cisco.com]
Sent: Thursday, November 24, 2016 1:05 AM
To: Dongjie (Jimmy) <jie.dong@huawei.com>; ospf@ietf.org
Cc: Zhangxudong (zhangxudong, VRP) <zhangxudong@huawei.com>; lizhenqiang@chinamobile.com
Subject: Re: [OSPF] Solicit comments to OSPF LSA flushing problem statement and mitigation solution

Hi Jie,

Sorry we didn't have time to adequately cover the problem and solution in the OSPF WG meeting.

As I understand it, the two use cases that the problem statement targets are:

   1. Timer bugs - either in the system or local to the OSPF process - that result in the LSA reaching MaxAge faster than the refresh interval (which some implementations violate and most carrier-class implementations jitter).

   2. LS Update packet corruption that affects the LSA Age field but not the rest of the LSA (corruption of the rest would be detected by the LSA checksum).
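
To illustrate the second case, here is a minimal sketch of why a corrupted age can slip past the checksum: the LSA checksum excludes the LS age field, so changing only the age leaves the checksum valid. This is a simplified stand-in, assuming a plain Fletcher-16 sum computed separately from the LSA; real OSPF uses the ISO Fletcher algorithm with the check bytes embedded in the LSA header. The byte layout and the helper names `lsa_checksum` and `set_age` are hypothetical.

```python
MAX_AGE = 3600  # seconds; the MaxAge architectural constant in RFC 2328


def fletcher16(data: bytes) -> int:
    """Plain Fletcher-16 running sums modulo 255 (simplified stand-in)."""
    c0 = c1 = 0
    for byte in data:
        c0 = (c0 + byte) % 255
        c1 = (c1 + c0) % 255
    return (c1 << 8) | c0


def lsa_checksum(lsa: bytes) -> int:
    """Checksum over the LSA, skipping the 2-byte LS age field at offset 0."""
    return fletcher16(lsa[2:])


def set_age(lsa: bytes, age: int) -> bytes:
    """Return a copy of the LSA with its LS age field overwritten."""
    return age.to_bytes(2, "big") + lsa[2:]


# Hypothetical 20-byte LSA header: age 100 followed by arbitrary bytes.
original = set_age(bytes(range(20)), 100)

# Corruption that only touches the age field: the checksum still matches.
corrupted_age = set_age(original, MAX_AGE)

# Corruption elsewhere in the LSA: the checksum no longer matches.
corrupted_body = original[:10] + bytes([original[10] ^ 0xFF]) + original[11:]

age_undetected = lsa_checksum(corrupted_age) == lsa_checksum(original)
body_detected = lsa_checksum(corrupted_body) != lsa_checksum(original)
```

In this sketch a receiver would accept `corrupted_age` as a valid MaxAge LSA and start flushing it, while `corrupted_body` would be discarded.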

At the time of the first polling, many felt that these two problems were both due to bugs and we shouldn't modify the protocol to address them.

If the WG decides to work on this problem, I would prefer solutions that focus on the OSPF router that is max aging the LSAs, as opposed to every router in the domain. In other words, what I would advocate is rate-limiting the max aging of LSAs, with more aggressive rate limiting for LSAs that are not self-originated. We already have some guidance on rate limiting in section 7.4 of RFC 7503. This could be formalized further with the same amount of state preservation as your proposal (https://www.ietf.org/id/draft-dong-ospf-flush-mitigation-00.txt).
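
The rate-limiting idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not the mechanism of RFC 7503 or of the draft: the class name `FlushRateLimiter`, the sliding-window approach, and all limit values are hypothetical. It allows more MaxAge flushes per window for self-originated LSAs than for LSAs originated by other routers.

```python
from collections import deque


class FlushRateLimiter:
    """Sliding-window limit on how many LSAs a router may flush (set to MaxAge).

    Hypothetical sketch: the limits and window are illustrative values only.
    """

    def __init__(self, self_limit: int = 10, other_limit: int = 2,
                 window: float = 5.0) -> None:
        # Separate budgets keyed by "is the LSA self-originated?"
        self.limits = {True: self_limit, False: other_limit}
        self.window = window
        self.history = {True: deque(), False: deque()}

    def allow_flush(self, self_originated: bool, now: float) -> bool:
        """Return True if a flush may be sent at time `now` (seconds)."""
        events = self.history[self_originated]
        # Forget flushes that have fallen out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limits[self_originated]:
            return False  # over budget: the caller should defer this flush
        events.append(now)
        return True


limiter = FlushRateLimiter(self_limit=3, other_limit=1, window=5.0)
first_other = limiter.allow_flush(self_originated=False, now=0.0)
second_other = limiter.allow_flush(self_originated=False, now=1.0)
first_self = limiter.allow_flush(self_originated=True, now=1.0)
later_other = limiter.allow_flush(self_originated=False, now=5.0)
```

With these illustrative settings, a second non-self-originated flush inside the window is deferred while a self-originated flush is still allowed. A deferred flush would need to be queued and retried, which this sketch leaves out.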

Thanks,
Acee


From: OSPF <ospf-bounces@ietf.org> on behalf of Jie Dong <jie.dong@huawei.com>
Date: Wednesday, November 16, 2016 at 11:57 PM
To: OSPF WG List <ospf@ietf.org>
Cc: "Zhangxudong (zhangxudong, VRP)" <zhangxudong@huawei.com>, "lizhenqiang@chinamobile.com" <lizhenqiang@chinamobile.com>
Subject: [OSPF] Solicit comments to OSPF LSA flushing problem statement and mitigation solution

Dear all,

Due to limited time at the OSPF meeting, we were not able to discuss the two drafts related to the OSPF LSA flushing problem in much detail.

The coauthors would like to encourage comments and discussion about both the problem statement and the mitigation solution.

Here is the link to the slides:

https://www.ietf.org/proceedings/97/slides/slides-97-ospf-ospf-maxage-flooding-00.pdf

Best regards,
Jie/Zhenqiang/Xudong