Re: [Idr] Early allocation for draft-ietf-idr-bgp-gr-notification

Job Snijders <job@ntt.net> Mon, 20 March 2017 20:15 UTC

Date: Mon, 20 Mar 2017 21:14:55 +0100
From: Job Snijders <job@ntt.net>
To: Jeffrey Haas <jhaas@pfrc.org>
Cc: "idr-chairs@ietf.org" <idr-chairs@ietf.org>, "idr@ietf.org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/Ye3e4V-epVttjnKQu9kH6pIdWUc>
Subject: Re: [Idr] Early allocation for draft-ietf-idr-bgp-gr-notification

Hi Jeff,

On Mon, Mar 20, 2017 at 03:44:14PM -0400, Jeffrey Haas wrote:
> > Furthermore, shouldn't the "Hard Reset" be called "Really Really
> > Really Reset"? I am skeptical whether the principles of graceful
> > restart should be applied to the notification mechanism. In
> > https://tools.ietf.org/html/draft-iops-grow-bgp-session-culling-00 we are
> > describing a procedure that strongly relies on the currently
> > understood semantics of expiration of BGP Hold Timers and the
> > Administrative Shutdown: "STOP sending traffic".
> 
> I haven't yet read this draft, so take my further comments with that under
> consideration.

ack. Please read the draft. We may have a fundamental problem on our
hands here.

> > I find it hard to reconcile how we can have both "My control-plane
> > temporarily going away, but keep sending traffic" and "data-plane is
> > broken, bgp subsequently dies, don't send traffic". The exact failure
> > scenario which triggers the expiration of the BGP Hold Timers cannot be
> > known at OPEN when the capabilities are exchanged. The GR NOTIFICATION
> > seems to distort the congruency between data-plane and control-plane.

1/ To emphasize the congruency issue here: BGP Hold Timer expiration is
often (from the operator's perspective) an unplanned event. BGP Hold
Timers usually expire because there is an issue with the lower layer
network. If there is an issue with the lower layer network, we should
not continue to forward traffic over that path. If I understand
draft-ietf-idr-bgp-gr-notification correctly, continuing to forward is
exactly what would happen once both sides learn through capabilities
negotiation that they both support draft-ietf-idr-bgp-gr-notification.

"make before break" or "ignore this break" are doable when events are
planned, but this draft touches upon scenarios which happen without
planning from the operator's side, where we should be careful with our
assumptions about why a Hold Timer expired.
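
As a rough illustration of the concern in 1/, the sketch below models the
behaviour as it is read in this message, not as the draft text specifies
it. The names SessionEvent and keeps_forwarding are hypothetical, and the
model deliberately ignores everything except the negotiated capability and
the event type:

```python
from enum import Enum


class SessionEvent(Enum):
    """Hypothetical event names, used only for this illustration."""
    HOLD_TIMER_EXPIRED = "hold_timer_expired"
    CEASE_HARD_RESET = "cease_hard_reset"


def keeps_forwarding(event, both_support_gr_notification):
    """Return True if routes are retained (traffic keeps flowing).

    Assumption: without the negotiated capability, either event tears the
    session down and the routes are withdrawn. With it, only the explicit
    hard reset stops forwarding.
    """
    if not both_support_gr_notification:
        return False
    return event is not SessionEvent.CEASE_HARD_RESET


# A Hold Timer expiry caused by a broken lower layer is treated the same
# as a planned control-plane restart once the capability is negotiated.
print(keeps_forwarding(SessionEvent.HOLD_TIMER_EXPIRED, True))
print(keeps_forwarding(SessionEvent.HOLD_TIMER_EXPIRED, False))
```

The point of the sketch is that the function has no input describing why
the Hold Timer expired, which is the information that cannot be known at
OPEN.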

> > Perhaps there should be an implementation guideline which encourages
> > vendors to by default, disable this mechanism on non-RFC3021/RFC6164
> > links?
> > 
> > Has draft-idr-bgp-gr-notification been vetted in the wild? What were
> > the results and under which circumstances is the mechanism useful?
> > Am I missing something?
> 
> The non-obvious connection is it's a requirement of
> draft-uttaro-idr-bgp-persistence (long-lived graceful restart).
> There's code shipping for this feature.  I *believe* that it's another
> vendor as well, but can't confirm from memory.  
> 
> There has been some discussion that LLGR needs to be resurrected from
> zombie draft state.

2/ So draft-uttaro-idr-bgp-persistence is a zombie from the IETF
perspective, but there is shipping code, so in order to resurrect
draft-uttaro-idr-bgp-persistence properly (and finish the project?),
draft-ietf-idr-bgp-gr-notification needs to move forward. Is that the
chain of events?

3/ If I may ask, why isn't BGP Cease NOTIFICATION message subcode 4
"Administrative Reset" used to perform the 'hard reset' function, with
subcode 9 (the suggested value) instead requested as a new feature called
'soft reset'? This way we don't break people's preconceptions about what
it means to type in "clear neighbor 1.2.3.4".

By declaring "subcode 9" to be a hard clear, it appears to me the
meaning of 'Administrative Reset (subcode 4)' is redefined, and
redefining existing constructs might not be an easy task.
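
For reference, the two messages discussed in 3/ differ in a single octet on
the wire. The sketch below builds a Cease NOTIFICATION using the standard
BGP-4 message layout (16-octet marker, 2-octet length, 1-octet type, then
error code, subcode and optional data). The function name build_cease is
hypothetical, and subcode 9 is only the value suggested in this thread, not
an assigned code point at the time of writing:

```python
import struct

BGP_MARKER = b"\xff" * 16      # all-ones marker from the BGP-4 header
MSG_TYPE_NOTIFICATION = 3      # BGP message type: NOTIFICATION
ERROR_CEASE = 6                # NOTIFICATION error code: Cease
SUBCODE_ADMIN_RESET = 4        # Cease subcode: Administrative Reset
SUBCODE_SUGGESTED = 9          # value suggested in the draft (assumption)


def build_cease(subcode, data=b""):
    """Encode a Cease NOTIFICATION with the given subcode."""
    # 19 octets of header plus error code, subcode and any data.
    length = 19 + 2 + len(data)
    header = BGP_MARKER + struct.pack("!HB", length, MSG_TYPE_NOTIFICATION)
    return header + struct.pack("!BB", ERROR_CEASE, subcode) + data


admin_reset = build_cease(SUBCODE_ADMIN_RESET)
suggested = build_cease(SUBCODE_SUGGESTED)

# Only the final octet (the subcode) differs between the two messages.
print(admin_reset.hex())
print(suggested.hex())
```

Which of the two subcodes means "drop everything" and which means "retain
routes" is purely a matter of definition, which is why the choice of
default matters for existing operator expectations.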

Kind regards,

Job