RE: Tuning BFD session times

Alexander Vainshtein <Alexander.Vainshtein@ecitele.com> Sun, 01 April 2018 16:28 UTC

From: Alexander Vainshtein <Alexander.Vainshtein@ecitele.com>
To: Ashesh Mishra <mishra.ashesh@outlook.com>
CC: Jeffrey Haas <jhaas@pfrc.org>, "rtg-bfd@ietf.org" <rtg-bfd@ietf.org>
Subject: RE: Tuning BFD session times
Date: Sun, 01 Apr 2018 16:28:43 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtg-bfd/tD-axFotDAKqjKmwKm_N0oW7DMs>

Ashesh,
I would like to better understand the satellite-link use case that you have described.
In particular, can you please explain why a long RTT affects BFD detection times?
As I see it, what could really affect these times is the variable delay that satellite links introduce in some cases, since the distance between the satellite and the terrestrial antennae may change significantly over time.
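
A minimal sketch of this reasoning, under stated assumptions: the detection time follows the RFC 5880 Asynchronous-mode rule (remote Detect Mult times the agreed remote transmit interval), the 50 ms interval and multiplier of 3 are illustrative values, one-way delay is taken as half of the RTT figures quoted in this thread, and packet loss and transmit jitter are ignored. The helper names are hypothetical.

```python
def detection_time_ms(remote_detect_mult, local_required_min_rx_ms,
                      remote_desired_min_tx_ms):
    """RFC 5880 Asynchronous mode: remote Detect Mult multiplied by the
    agreed remote transmit interval. Note that RTT is not an input."""
    agreed_interval = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_interval


def max_arrival_gap_ms(tx_interval_ms, one_way_delays_ms):
    """Largest gap between consecutive arrivals when packet i is sent at
    i * tx_interval_ms and experiences one_way_delays_ms[i] of delay."""
    arrivals = [i * tx_interval_ms + d for i, d in enumerate(one_way_delays_ms)]
    return max(later - earlier for earlier, later in zip(arrivals, arrivals[1:]))


interval = 50                                      # illustrative TX interval, ms
detect = detection_time_ms(3, interval, interval)  # 3 * 50 ms

# Constant delay: arrivals are shifted, but the gaps still equal the interval.
meo_gap = max_arrival_gap_ms(interval, [65] * 5)   # MEO, 130 ms RTT assumed symmetric
geo_gap = max_arrival_gap_ms(interval, [300] * 5)  # GEO, 600 ms RTT assumed symmetric

# Variable delay: a 110 ms step in one-way delay stretches a single gap.
step_gap = max_arrival_gap_ms(interval, [65, 65, 175, 175, 175])

print(detect, meo_gap, geo_gap, step_gap)
```

In this model a constant delay of any size leaves the inter-arrival gap at the transmit interval, while a sufficiently large step in delay produces one gap that exceeds the detection time and would trigger a false failure.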

What did I miss?

Regards,
Sasha

Office: +972-39266302
Cell:      +972-549266302
Email:   Alexander.Vainshtein@ecitele.com

From: Rtg-bfd [mailto:rtg-bfd-bounces@ietf.org] On Behalf Of Ashesh Mishra
Sent: Sunday, April 1, 2018 5:54 PM
To: Jeffrey Haas <jhaas@pfrc.org>; rtg-bfd@ietf.org
Subject: Re: Tuning BFD session times


Jeff, thanks for kicking-off this discussion on the list!



One additional comment that I wanted to make was around automation. During the meeting there were questions about the need for auto-tuning, and suggestions that the process of determining the interval can/should be manual.



The automation of control in all aspects of dynamic behavior is a priority for network operators. When configured manually, parameters such as BFD intervals are typically set to very conservative values because human latency is very high when responding to changing network conditions. Manual configuration also takes a lot of time and accounts for a significant number of lost opportunities and lost value for operators.



[JH] "applications should generally choose a detection interval that is reasonable for their application rather than try to dynamically discover the lowest possible stable detection interval."

[AM] This depends on the use case. From the point of view of a service provider that delivers long-haul connectivity (a typical scenario in which the link characteristics have large variance), the intent is to provide the best performance. Because such providers deliver connectivity to critical applications, and are often the only way of delivering connectivity in such places, the ability to tune the system to deliver superior up-time drives significant value. Consider a scenario where there is a 130ms RTT link (MEO satellite; LEO will be in the 20-60ms range) whose backup is a 600ms RTT link (GEO satellite), and both are being used to deliver transit connectivity. The rate at which the end-to-end service can run BFD is significantly faster when MEO is active than when GEO is active. The application, in this scenario, may survive the RTT, but business continuity is critical in many cases. Since the long-haul provider cannot control the application, it must provide the best possible failover performance.



[JH] "1. BFD is asymmetric.  This means a receiving BFD implementation must provide feedback to a sending implementation in order for it to understand perceived reliability."

[AM] It may not need to be the BFD implementation that provides the feedback if there are other performance mechanisms running. The challenge is to standardize the mechanism that BFD can use (if the measurement is not self-contained in BFD). You're right in pointing out the challenge in accounting for the CPU delays, and that was the reason for the original proposal for BFD performance measurement. If the measurement is within the BFD realm, it will account for the CPU delays. However, most good BFD engines have relatively deterministic performance and are quite optimized, so the variance with scale and time is not significant (but I concede that not all BFD implementations are good).



[JH] "2. Measurement infrastructure may negatively impact session scale.  Greg, I believe, made this point when discussing host processing issues vs. BFD ingress/egress."

[AM] This is an issue if using a measurement mechanism within BFD (other performance measurement methods are always running in the network for SLA reporting and/or network optimization). Within a metro area with fiber or terrestrial wireless (microwave, LTE, etc.) connectivity, I would likely not need constant auto-tuning. The variance between the primary and backup links in such networks will not be significant enough to affect the BFD parameters. On long-haul links this may be a valuable feature, in which case the additional overhead may be justified. So whether continuous auto-tuning is required, or one-time tuning is enough, depends on the use case.



[JH] "3. Detection interval calculations really need to take into account things that are greater than simple packet transmission times.  As an example, if your measurement is always taken during low system CPU or network activity, how high is your confidence about the interval?  What about scaling vs. number of total BFD sessions?"
[AM] Great questions. Typically, when running high-frequency OAM such as BFD or CFM (or similar), CPU peaks should not affect the OAM performance (a variety of methods, depending on the system on which OAM is running, can ensure that). CPU peaks become a bigger issue if BFD is used to detect continuity for a particular flow (or QoS).

--
Ashesh
________________________________
From: Rtg-bfd <rtg-bfd-bounces@ietf.org> on behalf of Jeffrey Haas <jhaas@pfrc.org>
Sent: Wednesday, March 28, 2018 11:49 AM
To: rtg-bfd@ietf.org
Subject: Tuning BFD session times

Working Group,

We had very active discussion (yay!) at the microphone as part of Mahesh's
presentation on BFD Performance Measurement.
(draft-am-bfd-performance)

I wanted to start this thread to discuss the greater underlying issues this
discussion raised.  In particular, active tuning of BFD session parameters.
Please note that opinions I state here are as an individual contributor.

BFD clients typically want the fastest, most stable detection interval that
is appropriate to their application.  That stability component is very
important since overly aggressive timers can result in unnecessary BFD
session instability, which will impact the subscribing application.  Such
stability is a function of many things, scale of the system running BFD
being a major one.

In my opinion, applications should generally choose a detection interval
that is reasonable for their application rather than try to dynamically
discover the lowest possible stable detection interval.  This is because a
number of unstable factors, such as CPU load, contention with other network
traffic and other things that are outside the general control of many
systems may impact such scale.

That said, here are a few thoughts on active feedback mechanisms:
1. BFD is asymmetric.  This means a receiving BFD implementation must provide
   feedback to a sending implementation in order for it to understand
   perceived reliability.
2. Measurement infrastructure may negatively impact session scale.  Greg, I
   believe, made this point when discussing host processing issues vs. BFD
   ingress/egress.
3. Detection interval calculations really need to take into account things
   that are greater than simple packet transmission times.  As an example,
   if your measurement is always taken during low system CPU or network
   activity, how high is your confidence about the interval?  What about
   scaling vs. number of total BFD sessions?
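
A minimal sketch of the concern in point 3, assuming a purely hypothetical tuning rule (a high percentile of observed inter-arrival gaps plus a safety margin). The rule, the function names and every sample value below are invented for illustration and do not come from any BFD implementation or from the draft.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def suggest_interval_ms(gaps_ms, pct=99, margin_pct=20):
    """Hypothetical rule: percentile of observed gaps, padded by margin_pct."""
    return percentile(gaps_ms, pct) * (100 + margin_pct) / 100


# Invented inter-arrival gaps (ms) for a nominal 50 ms transmit interval.
quiet_gaps = [50, 51, 50, 52, 51, 50, 53, 51]  # low CPU / low traffic
busy_gaps = [50, 58, 75, 62, 90, 55, 70, 66]   # high CPU / contention

quiet_only = suggest_interval_ms(quiet_gaps)
mixed = suggest_interval_ms(quiet_gaps + busy_gaps)

print(quiet_only, mixed)
```

With these invented numbers, a measurement taken only during quiet periods suggests a noticeably smaller interval than one that also covers busy periods, which is the confidence problem raised above.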

I have no strong conclusions here, just some cautionary thoughts.

What are yours?

-- Jeff
