RE: Re: BFD WG adoption for draft-haas-bfd-large-packets

"Les Ginsberg (ginsberg)" <ginsberg@cisco.com> Mon, 29 October 2018 18:13 UTC

From: "Les Ginsberg (ginsberg)" <ginsberg@cisco.com>
To: Albert Fu <afu14@bloomberg.net>, "jhaas@juniper.net" <jhaas@juniper.net>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
Date: Mon, 29 Oct 2018 18:13:53 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/9Q1FpBI6njO_6iankw2aMpxnZ1I>

Albert –

Do not confuse the current lack of detection with when the problem gets introduced.

The fact that the problem is not detected on protocol adjacency formation does not mean the problem gets introduced afterwards. Unless you are saying that folks change the link MTU AFTER the link comes up and has been used for a while, the problem exists as soon as the link comes up. You could therefore detect this in a number of ways:

1) Use IS-IS ☺
2) Enhance other routing protocols to do what IS-IS does
3) Send BFD large packets (Echo or async)
4) Potentially some other OAM mechanism

With regard to #3, unless you believe link MTUs will change “on the fly”, sending the large packets during initial session bringup would be sufficient. If folks are concerned (as I am) that sending large BFD packets all the time introduces some risks for scalability/stability of BFD sessions, then one strategy would be to send the large packets only on session bringup.

FYI, some implementations have knobs on IS-IS to do exactly this, i.e., send padded hellos until the adjacency is formed, then revert to small hellos. In the case of BFD I think there is a more compelling reason to be conservative in how often you send large packets, given where it is implemented and how often the BFD packets are sent.
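
As a rough illustration of the "large packets only on bringup" strategy, here is a minimal sketch. It is not taken from the draft or from any implementation: the names `BfdState`, `tx_length`, `padded_len` and `pad_when_up` are hypothetical, and the only facts assumed from RFC 5880 are the four session states and the 24-byte mandatory section of a control packet without authentication.

```python
from enum import Enum


class BfdState(Enum):
    # The four session states defined in RFC 5880.
    ADMIN_DOWN = 0
    DOWN = 1
    INIT = 2
    UP = 3


# Mandatory section of a BFD control packet, no authentication section.
BFD_MIN_LEN = 24


def tx_length(state: BfdState, padded_len: int, pad_when_up: bool = False) -> int:
    """Pick the length of the next BFD packet to send (hypothetical policy).

    padded_len is the size the operator wants verified on the link.
    pad_when_up=False pads only while the session is being brought up.
    """
    if padded_len < BFD_MIN_LEN:
        raise ValueError("padded_len must be at least BFD_MIN_LEN")
    # Session bringup: always pad, so an MTU problem keeps the session down.
    if state in (BfdState.DOWN, BfdState.INIT):
        return padded_len
    # Steady state: pad only if the operator asked for continuous checking.
    if state is BfdState.UP and pad_when_up:
        return padded_len
    # AdminDown, or Up with padding disabled: send the minimal packet.
    return BFD_MIN_LEN
```

With `pad_when_up=False`, `tx_length(BfdState.DOWN, 1500)` gives 1500 and `tx_length(BfdState.UP, 1500)` gives 24, which mirrors the IS-IS knob described above. Setting `pad_when_up=True` corresponds to padding all the time, with the scaling cost that implies.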

   Les



From: Albert Fu (BLOOMBERG/ 120 PARK) <afu14@bloomberg.net>
Sent: Monday, October 29, 2018 10:39 AM
To: Les Ginsberg (ginsberg) <ginsberg@cisco.com>; jhaas@juniper.net; rtg-bfd@ietf.org
Subject: Re: BFD WG adoption for draft-haas-bfd-large-packets

Hi Les,

> Jeff/Albert -
>
> Given the MTU issue is associated with a link coming up - and the use of Echo would allow the problem to be detected and prevent the BFD session from coming up -
> and you are acknowledging that the protocol allows padded Echo packets today ...
>
> is there really a need to do anything more?
>
> Les
>

Actually, all the issues we have observed were not associated
with a link coming up. The MTU issues occurred after OSPF/BGP had
established adjacency, without any events on the routers.
OSPF/BGP hellos/keepalives continued to be transmitted fine (small
packet size), but applications sending max size packets over the
link would time out and fail.
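
The failure mode described above can be sketched as a toy model. All of the numbers are assumptions chosen for illustration (a 100-byte keepalive, a 1500-byte full-size packet, a path MTU that drops from 1500 to 1400 inside the provider network), and `delivered` and `link_report` are hypothetical helpers, not part of any router software.

```python
KEEPALIVE_LEN = 100   # assumed size of a small hello/keepalive
FULL_SIZE_LEN = 1500  # assumed size of a max size application packet


def delivered(packet_len: int, path_mtu: int) -> bool:
    """A packet gets through only if it fits the smallest MTU on the path.

    Assumes no fragmentation is performed along the path.
    """
    return packet_len <= path_mtu


def link_report(path_mtu: int) -> dict:
    """Show which kinds of traffic survive a given path MTU."""
    return {
        "keepalive": delivered(KEEPALIVE_LEN, path_mtu),
        "full_size": delivered(FULL_SIZE_LEN, path_mtu),
    }
```

Under these assumptions, `link_report(1500)` reports both kinds of traffic as delivered, while `link_report(1400)` still reports the keepalive as delivered but not the full-size packet. The routing protocols therefore see a healthy link while applications fail.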

Hence, I mentioned several times that this issue is rather
time-consuming to troubleshoot, as the cause lies in the Telco network,
outside of our control, and we do not see any alarms.

I did also look at BFD echo mode. As Jeff indicated, this is not
widely deployed (among the vendors we use, only one supports it).

Thanks
Albert