Re: [GROW] Comment on draft-iops-grow-bgp-session-culling

Job Snijders <job@ntt.net> Tue, 14 March 2017 12:13 UTC

Date: Tue, 14 Mar 2017 13:13:44 +0100
From: Job Snijders <job@ntt.net>
To: Tore Anderson <tore@fud.no>
Message-ID: <20170314121344.hwfdjlirgskkg4ho@Vurt.local>
References: <20170313121134.6676bd02@echo.ms.redpill-linpro.com> <71D584DF-94F5-40B3-BCE0-4736354ECCCB@harg.net> <20170314072225.55fdd871@echo.ms.redpill-linpro.com> <58C7BD67.6080308@foobar.org> <20170314111326.3714e0ed@echo.ms.redpill-linpro.com> <58C7D033.8060203@foobar.org> <20170314123054.3b971d1d@echo.ms.redpill-linpro.com> <58C7D895.4070002@foobar.org> <20170314130508.63d4fcba@echo.ms.redpill-linpro.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <20170314130508.63d4fcba@echo.ms.redpill-linpro.com>
X-Clacks-Overhead: GNU Terry Pratchett
User-Agent: NeoMutt/20170306 (1.8.0)
Archived-At: <https://mailarchive.ietf.org/arch/msg/grow/L4Jw86JyyTlNg5ejartx5f1jb88>
Cc: grow@ietf.org
Subject: Re: [GROW] Comment on draft-iops-grow-bgp-session-culling

On Tue, Mar 14, 2017 at 01:05:08PM +0100, Tore Anderson wrote:
> * Nick Hilliard <nick@foobar.org>
> > Tore Anderson wrote:
> > > In other words: in my opinion, BGP session culling should be
> > > considered a BCP even in situations where link state signaling
> > > and/or BFD is used. IP-transit providers should perform culling
> > > towards their customers ahead of maintenance works. Direct peers,
> > > likewise.  
> > 
> > probably not much need if bfd is used because that would operate
> > router-to-router.
> 
> Quite the contrary, there is very much a need in this case too. If there
> are many active routes that will become invalid, converging on
> alternate paths (reprogramming the FIB) can take significantly longer
> than actually detecting the outage (even if it's detected only using
> BGP timers).
> 
> > > IXPs aren't at all special regarding the fundamental need for session
> > > culling, only in the method by which it is accomplished (i.e., using
> > > layer-2 ACLs).  
> > 
> > Correct, but for direct peers over PNIs, etc, the operator will usually
> > have control over the bgp session.  What we're talking about here is a
> > situation where there is an intermediate operator which has no direct
> > admin control over bgp sessions.
> 
> The draft is most definitely also talking about the situations where
> the operator does have admin control over the BGP session (section 2.1).

TEXT:
    In network topologies where BGP speaking routers are directly
    attached to each other, or use fault detection mechanisms such as
    <xref target="RFC5880">BFD</xref>, detecting and acting upon a link
    down event (for example when someone yanks the physical connector)
    in a timely fashion is straightforward.

So we should add text saying that, even though detection is
straightforward and action in response to the event can be initiated
promptly, we cannot be sure that whatever actions the event triggers
will complete in a timely fashion. Therefore the recommendation is to
shut down sessions before doing maintenance, even when the networks are
directly connected to each other.

The above matches my operational experience and aligns with how we
perform router maintenance.

There are a number of considerations:

    - an operator may not know whether they are directly connected
    - even if directly connected, the remote side might not be able to
      converge in a timely fashion

Perhaps the paragraph should just be removed?

Kind regards,

Job