Re: BFD WG adoption for draft-haas-bfd-large-packets

"Acee Lindem (acee)" <acee@cisco.com> Tue, 23 October 2018 17:51 UTC

From: "Acee Lindem (acee)" <acee@cisco.com>
To: Albert Fu <afu14@bloomberg.net>, "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets
Thread-Topic: BFD WG adoption for draft-haas-bfd-large-packets
Thread-Index: AQHUavUM2obCS5tWFUCJRL2ls+mZdKUtHDwA
Date: Tue, 23 Oct 2018 17:51:51 +0000
Message-ID: <5679BA3B-2BBD-4669-8906-CCC26D6EA481@cisco.com>
References: <5BCF58F5029804A2003907DE_0_60079@msllnjpmsgsv06>
In-Reply-To: <5BCF58F5029804A2003907DE_0_60079@msllnjpmsgsv06>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/MUBNlcDt_1nMaJy6cwdg4-8cVRI>


From: "Albert Fu (BLOOMBERG/ 120 PARK)" <afu14@bloomberg.net>
Reply-To: Albert Fu <afu14@bloomberg.net>
Date: Tuesday, October 23, 2018 at 1:23 PM
To: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>, Acee Lindem <acee@cisco.com>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets

(* Resending smaller message *)

Hi Acee,

Please see comments in-line.

Thanks,

Albert

From: acee@cisco.com At: 10/23/18 13:02:49
To: Albert Fu (BLOOMBERG/ 120 PARK) <afu14@bloomberg.net>, rtg-bfd@ietf.org, ginsberg@cisco.com
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets
Hi Albert,

From: "Albert Fu (BLOOMBERG/ 120 PARK)" <afu14@bloomberg.net>
Reply-To: Albert Fu <afu14@bloomberg.net>
Date: Tuesday, October 23, 2018 at 12:45 PM
To: "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>, "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>, Acee Lindem <acee@cisco.com>
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets

Hi Acee,

You are right in that this issue does not happen frequently, but when it does, it is time consuming to troubleshoot and causes unnecessary network downtime to some applications (e.g. between two end hosts, some applications worked fine, but others would intermittently fail when they tried to send large size packets over the failing ECMP path).

So you're saying there is a problem where the data plane interfaces do not support the configured MTU due to a SW bug? I hope these are not our routers 😉

AF> There's no bug.


1)  The issue we have seen is with the Telco network. The router can happily transmit and receive up to configured interface MTU, but the Telco circuit fails to support it. One example is when Telco uses L2VPN to deliver the P2P service to us, but due to some faults, traffic was re-routed to a mis-configured path that did not support our MTU size (e.g. MTU on Telco PE router was not increased to account for MPLS headers for the L2VPN service).
Ok – in this case, you'd need to exercise the data plane with maximum size packets. And make the MTU part of the SLA you secure from the provider.
Thanks, Acee



2) AFAIK, the OSPF MTU detection is based on checking the MTU value in the DBD packet. The actual OSPF packet size may be smaller, so it may not detect a data plane issue in the Telco network during OSPF session establishment.





I believe the OSPF MTU detection is a control plane mechanism to check config, and may not necessarily detect a data plane MTU issue (since OSPF does not support padding). Also, most of our issues occurred after routing adjacency had been established, and without any network alarms.
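
To illustrate the point above, here is a minimal Python sketch of why a check on advertised MTU values can pass while large packets still fail. It is not taken from any router implementation. The Link class, its field names, and the example numbers are hypothetical; the comparison itself follows the Interface MTU rule in RFC 2328 section 10.6.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical model of a P2P circuit between two routers."""
    local_if_mtu: int  # IP MTU configured on our interface
    peer_if_mtu: int   # Interface MTU the neighbor advertises in its DBD packet
    path_mtu: int      # largest IP packet the provider circuit really carries


def dbd_mtu_check_passes(link: Link) -> bool:
    # Control plane check: the DBD packet is rejected only if the advertised
    # Interface MTU is larger than what we can receive without fragmentation.
    # The provider path is never consulted.
    return link.peer_if_mtu <= link.local_if_mtu


def data_plane_carries(link: Link, packet_size: int) -> bool:
    # Data plane reality: the packet must fit every hop, including the
    # provider segment that neither router can see.
    return packet_size <= min(link.local_if_mtu, link.peer_if_mtu, link.path_mtu)


# Assumed numbers: both routers configured for 9000, provider path carries 8974.
link = Link(local_if_mtu=9000, peer_if_mtu=9000, path_mtu=8974)

print("DBD MTU check passes:   ", dbd_mtu_check_passes(link))
print("small OSPF packet (200):", data_plane_carries(link, 200))
print("full-size packet (9000):", data_plane_carries(link, 9000))
```

With these assumed values the adjacency forms and small OSPF packets get through, yet a full-size data packet is dropped, which is the silent failure described in this thread.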

Right. However, if the interface is flapped when the MTU changes, OSPF would detect dynamic MTU changes (e.g., configuration) that the control plane is aware of.

AF> We have encountered the MTU issue without any interface flaps on our routers (no config change on our routers). The MTU issue occurred within the Telco network. Note also some Telco providers that provide WAN circuit spanning several countries may make use of smaller local providers to provide the last mile.  We have seen issues with the smaller providers.
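
To put rough numbers on the L2VPN example from item 1, the following Python sketch estimates how much a pseudowire adds to each customer packet and what size of probe would exercise the full MTU. The function names and defaults are assumptions for illustration only: an Ethernet pseudowire with two MPLS labels (4 bytes each), a 4-byte control word, and an untagged 14-byte customer Ethernet header. A real provider network may use a different label stack or encapsulation.

```python
MPLS_LABEL_BYTES = 4      # one MPLS label stack entry
CONTROL_WORD_BYTES = 4    # pseudowire control word (assumed present)
ETH_HEADER_BYTES = 14     # customer Ethernet header, untagged (assumed)
IPV4_HEADER_BYTES = 20    # IPv4 header without options
ICMP_HEADER_BYTES = 8     # ICMP echo header


def pw_overhead(labels: int = 2, control_word: bool = True) -> int:
    """Bytes the provider adds around one customer IP packet."""
    overhead = ETH_HEADER_BYTES + labels * MPLS_LABEL_BYTES
    if control_word:
        overhead += CONTROL_WORD_BYTES
    return overhead


def max_customer_ip_packet(core_mtu: int, labels: int = 2,
                           control_word: bool = True) -> int:
    """Largest customer IP packet that fits a provider core link."""
    return core_mtu - pw_overhead(labels, control_word)


def ping_payload_for(ip_mtu: int) -> int:
    """ICMP echo payload size that yields an IPv4 packet of exactly ip_mtu."""
    return ip_mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


customer_mtu = 9000   # assumed MTU configured on the customer routers
core_mtu = 9000       # assumed provider core MTU that was NOT raised for MPLS

limit = max_customer_ip_packet(core_mtu)
print("pseudowire overhead:", pw_overhead(), "bytes")
print("largest customer packet that survives:", limit)
print("packets silently dropped above:", limit, "up to", customer_mtu)
print("ICMP payload for a full-MTU probe:", ping_payload_for(customer_mtu))
```

Under these assumptions only the largest packets are lost, which matches the symptom of some applications working while others fail intermittently, and a probe has to be sent at full MTU with fragmentation disallowed to see it.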


Thanks,
Acee