Re: [Idr] Fwd: New Version Notification for draft-spaghetti-idr-bgp-sendholdtimer-05.txt

Jeffrey Haas <jhaas@pfrc.org> Thu, 04 August 2022 12:09 UTC

Date: Thu, 04 Aug 2022 08:09:42 -0400
From: Jeffrey Haas <jhaas@pfrc.org>
To: Claudio Jeker <cjeker@diehard.n-r-g.com>
Cc: Job Snijders <job=40fastly.com@dmarc.ietf.org>, Robert Raszuk <robert@raszuk.net>, heasley <heas@shrubbery.net>, "idr@ietf. org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/NaAg0H0rT5X3kYfJLwozfkmh2HU>

Claudio,

On Thu, Aug 04, 2022 at 09:39:09AM +0200, Claudio Jeker wrote:
> On Wed, Aug 03, 2022 at 01:03:17PM -0400, Jeffrey Haas wrote:
> > The underlying conditions we want from my perspective are:
> > - We have detected we are unable to write further data to the TCP connection
> >   and have more data to send.  Certainly EWOULDBLOCK would be a common
> >   example.  How about EAGAIN?  Depends on the stack.  Other conditions such as
> >   ENOBUFS, EINTR, ENOMEM, ENOSPC lead to questions as to whether this is a
> >   useful POSIX indication that you can't write vs. you can't write because TCP
> >   is not progressing.
> 
> EWOULDBLOCK is normally the same as EAGAIN (at least on anything modern).
> EINTR should be handled anyway because it is caused by signal delivery and
> has nothing to do with this issue. Generally after EINTR the same call is
> retried immediately.
> 
> ENOBUFS, ENOMEM and ENOSPC are proper error conditions that need some
> action (normally a fatal write error that causes the connection to close).
> Also ENOMEM is not an error send(2) should return according to POSIX.

My faith that any particular implementation actually follows POSIX is low at
the best of times.  I've been doing this stuff for Too Long.

I, and some of the other vendors on this list, also have the joy that our
TCP stacks have been doctored for hardware integration and non-stop routing
purposes.  This makes the situation even more murky.
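
As an illustration only, the errno classification quoted above could be
sketched as follows.  The function name and the three category strings are
hypothetical, and the mapping assumes a stock POSIX-like stack; a doctored
stack may need a different table.

```python
import errno

# Hypothetical categories, following the reasoning quoted above.
RETRY_NOW = "retry"      # EINTR: a signal interrupted the call, retry it
WOULD_BLOCK = "blocked"  # EAGAIN/EWOULDBLOCK: the socket buffer is full
FATAL = "fatal"          # ENOBUFS, ENOMEM, ENOSPC, ...: treat as a write error


def classify_send_error(err: int) -> str:
    """Map an errno value from a failed send to one of the categories."""
    if err == errno.EINTR:
        return RETRY_NOW
    # On most modern systems these two are the same number; check both anyway.
    if err in (errno.EAGAIN, errno.EWOULDBLOCK):
        return WOULD_BLOCK
    return FATAL
```

Only the "blocked" category says anything about TCP not progressing, so in
this sketch it is the only one that should keep a send hold timer running.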

Enke has done us a nice service in pointing out the Linux TCP_USER_TIMEOUT
feature which nicely encapsulates much of the behavior we want, but that
won't be universally supported either.  And even in that case, we likely
want to know at the application layer that was the reason the session closed
so we can locally take action on that.
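
A minimal sketch of using that option, assuming a platform whose Python
socket module exposes TCP_USER_TIMEOUT (Linux does; others may not).  The
helper names are hypothetical.  Note that ETIMEDOUT can also come from the
ordinary retransmission timeout, so the second helper is only a hint about
why the session closed, not proof.

```python
import errno
import socket


def set_tcp_user_timeout(sock: socket.socket, timeout_ms: int) -> bool:
    """Try to set TCP_USER_TIMEOUT (milliseconds); False if unsupported."""
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        return False  # this platform does not offer the option
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    except OSError:
        return False  # the stack rejected the option
    return True


def closed_by_timeout(exc: OSError) -> bool:
    """Report whether a socket error looks like a TCP-level timeout."""
    return exc.errno == errno.ETIMEDOUT
```

A caller that gets False from set_tcp_user_timeout() still needs an
application-level timer, which is the case the draft has to cover.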

My caveat here is that it might seem "obvious" what to write in terms of
observable socket behavior... but it's not, and trying to discuss too much
at that layer will likely not be as successful as we'd like.

> > - That said, if the TCP session starts draining and data starts moving, and
> >   we become able to write, the timer should be canceled.  
> >  
> >   + How does that manifest vs. kevent/select/poll/etc.?
>  
> In OpenBGPD a timeout is used that is rearmed whenever data is enqueued.
> If the timer fires, the socket did not become writeable for sendholdtimer
> interval and the session is reset. This works with poll, select, kevent, etc.

Offering observations from my own implementation: just because your timer
fired doesn't mean the socket made no progress in draining.

To work around that condition, we peek at the TCP state when the holdtimer
fires. 
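
The timer behavior discussed here could be sketched roughly as below.  This
is neither OpenBGPD's code nor mine; the class, its method names, and the
tcp_made_progress callback are all hypothetical.  It assumes the timer is
armed when data becomes pending, restarted or canceled when the socket
accepts data, and that a caller-supplied check gets a last look at TCP
state before the session is reset.

```python
import time
from typing import Callable, Optional


class SendHoldTimer:
    """Hypothetical send hold timer driven by the caller's event loop."""

    def __init__(self, interval: float,
                 clock: Callable[[], float] = time.monotonic):
        self.interval = interval
        self.clock = clock
        self.deadline: Optional[float] = None  # None: timer not running

    def on_enqueue(self) -> None:
        # Data is now pending; start the timer unless it is already running.
        if self.deadline is None:
            self.deadline = self.clock() + self.interval

    def on_write_progress(self, queue_empty: bool) -> None:
        # The socket accepted data: cancel if nothing is left, else restart.
        self.deadline = None if queue_empty else self.clock() + self.interval

    def should_reset_session(
            self,
            tcp_made_progress: Callable[[], bool] = lambda: False) -> bool:
        if self.deadline is None or self.clock() < self.deadline:
            return False
        # Timer fired: peek at TCP state before declaring the session dead.
        if tcp_made_progress():
            self.deadline = self.clock() + self.interval
            return False
        return True
```

How tcp_made_progress is implemented is deliberately left open, since that
is exactly the stack-specific part.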

> > There are a number of other messier edge cases where you're not at a zero
> > window, but still can't write to the socket.  Basically any case where you
> > can't write pending data to the socket for the sendholdtimer interval is
> > probably a Good Enough condition.
> 
> IMO if a BGP speaker was not able to send out (or enqueue to send out) a
> message for sendholdtimer interval then it is clear that the BGP session
> is not functioning and the other side is lying about its holdtimer
> (because we know there was no keepalive sent for more than the holdtimer
> time).

We agree here.

The implication is to be careful in the document about using a zero window
as the gating condition for the feature.

-- Jeff