Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

Jeff Tantsura <jefftant.ietf@gmail.com> Fri, 11 December 2020 23:57 UTC

From: Jeff Tantsura <jefftant.ietf@gmail.com>
Mime-Version: 1.0 (1.0)
Date: Fri, 11 Dec 2020 15:57:20 -0800
Message-Id: <57DF4DA1-256A-4FA9-8827-EFF6D9ED2A2E@gmail.com>
References: <2F238121-E468-4D0F-A0FF-9D82E44C3247@arrcus.com>
Cc: John Scudder <jgs=40juniper.net@dmarc.ietf.org>, Job Snijders <job@sobornost.net>, idr@ietf.org
In-Reply-To: <2F238121-E468-4D0F-A0FF-9D82E44C3247@arrcus.com>
To: Keyur Patel <keyur@arrcus.com>
X-Mailer: iPhone Mail (18B92)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/HSyR0VOnCWWW6ZjFp9M_hAH4hh4>
Subject: Re: [Idr] TCP & BGP: Some don't send terminate BGP when holdtimer expired, because TCP recv window is 0

The trade-off is (as often happens) between stability and convergence.
Given the severity, I’d prefer a formalized approach rather than an implementation artifact (at the mercy of the Product Manager in charge ;-))

Regards,
Jeff

> On Dec 11, 2020, at 15:30, Keyur Patel <keyur@arrcus.com> wrote:
> 
> One comment inlined #Keyur
> 
> On 12/11/20, 12:04 PM, "Idr on behalf of John Scudder" <idr-bounces@ietf.org on behalf of jgs=40juniper.net@dmarc.ietf.org> wrote:
> 
>    [all hats on]
> 
>    Hi Job,
> 
>    Thanks for bringing this up.
> 
>    To take the liberty of summarizing your wall of text :-) you’re saying that you believe BGP should tear down its session if it’s unable to send a message for the duration of the hold time. 
> 
>    Given that the conversation last time was inconclusive I think this is a good thing for the WG to discuss again. If you want to, you (or someone) could turn the idea into a short draft that updates RFC 4271, and we could have a WG adoption discussion about it. It might help focus the discussion but it’s not mandatory.
> 
>    I’ll point out a few things to start with —
> 
>    - Making it mandatory to apply hold time to the sending of messages would potentially make BGP peerings less stable. It clearly can’t make them *more* stable. Of course one can argue that if you haven’t been able to send a message for the hold time, the session has failed its metric of usefulness anyway, so any veneer of stability at this point is a harmful sham.
>    - If I recall correctly, RST doesn’t work (or may not work) if you’re using the MD5 TCP option. Nothing much to be done, but be aware.
>    - There is nothing stopping an implementation from doing what you describe now. The formalism that keeps you within the letter of 4271 would be that the implementation supplies a configuration option, that you set to enable the behavior. Once you’ve done that, when the implementation notices that the hold time has been exceeded in the outbound direction, it generates a ManualStop event for the session. 
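
As a minimal sketch of the last bullet above, the opt-in behavior could look roughly like this. Everything here is hypothetical and not taken from any implementation: the class `SendHoldMonitor`, its `enabled` option, and the string `"ManualStop"` standing in for the RFC 4271 FSM event. It assumes the caller reports when output is queued and when the transport accepts bytes.

```python
from dataclasses import dataclass
from typing import Optional

MANUAL_STOP = "ManualStop"  # stand-in for the RFC 4271 FSM event


@dataclass
class SendHoldMonitor:
    """Hypothetical per-session tracker for outbound progress."""

    hold_time: float           # negotiated hold time in seconds (0 = disabled)
    enabled: bool = False      # the operator-set configuration option
    last_progress: float = 0.0
    pending: bool = False      # True while queued output has not drained

    def queued(self, now: float) -> None:
        # Start the clock only when output first becomes pending.
        if not self.pending:
            self.pending = True
            self.last_progress = now

    def progress(self, now: float, drained: bool) -> None:
        # Call whenever the transport accepts some bytes.
        self.last_progress = now
        self.pending = not drained

    def poll(self, now: float) -> Optional[str]:
        # Emit ManualStop once nothing could be sent for a full hold time.
        stalled = self.pending and now - self.last_progress >= self.hold_time
        if self.enabled and self.hold_time > 0 and stalled:
            return MANUAL_STOP
        return None
```

With the option left at its default of off, `poll()` never returns an event, which keeps the default behavior unchanged.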
> 
> #Keyur: +1 to what John said. This could very well be an implementation knob that generates a ManualStop event.
> 
> Regards,
> Keyur
> 
>    Thanks,
> 
>    —John
> 
>> On Dec 11, 2020, at 2:23 PM, Job Snijders <job@sobornost.net> wrote:
>> 
>> 
>> Dear group,
>> 
>> Not too long ago an incident [1] in one Autonomous System resulted in
>> the global Internet being unusable in many parts of the world for
>> multiple hours. Some have reported the root cause was a 'configuration
>> error', however I believe much of the observed communication blackouts
>> in the global routing system stemmed from a pre-existing condition: a
>> specific implementation property present in multiple implementations
>> currently in use in the default-free zone.
>> 
>> Usually when an incident happens in one AS, affected parties can through
>> unilateral action 'route around the problem', but the ability to 'route
>> around problems' critically depends on the ability to distribute
>> WITHDRAW or UPDATE messages. When messages are not processed, what
>> generally was assumed to be a unilaterally solvable problem, now requires
>> coordination between *all* neighbors of the suffering AS.
>> 
>> The global routing system requires every participant to process BGP
>> messages, because the alternative is intervention on thousands of BGP
>> devices to manually shut down thousands of BGP sessions disconnecting the
>> AS suffering from an incident, to help the rest of the default-free
>> zone. I speak from experience when saying that coordinating a disconnection
>> of an AS at global scale is incredibly hard and slow, and many approval
>> levels must be worked through. It takes *hours* of phone calls & email
>> chains, a time window during which internet traffic is routed towards
>> stale (now blackholing) locations.
>> 
>> In the average ISP's network design using IBGP Route Reflectors, these
>> blackout effects are aggravated when BGP sessions landing in such
>> devices are not terminated when TCP causes the BGP session to stall.
>> 
>> The problem of how TCP and BGP-4 can interact has been discussed before,
>> but I'm not sure the working group followed up with any publication
>> detailing the problem and the solution.
>> 
>>   https://mailarchive.ietf.org/arch/msg/idr/q0Sx5d3zZjfOmOQ4lO2OZAHh9Lc/
>> 
>> Does everyone agree BGP-4 sessions MUST be terminated using a TCP RST
>> (instead of a BGP-4 Cease NOTIFICATION) if the peer has indicated for
>> the duration of the Hold Timer that the TCP receive window is zero?
>> I'm fine with there being buttons to make this different, but the
>> default for routers in the global Internet routing system should be to
>> consider the remote peer to be 'a lost cause' when it won't accept new
>> BGP messages for the duration of the hold timer.
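
For illustration only, one common way for a user-space process on a POSIX system to make the kernel send a RST instead of a FIN is to close the socket with `SO_LINGER` enabled and a zero timeout. The sketch below demonstrates this on a loopback connection. The helper names `abort_with_rst` and `demo` are made up for this example, the two-int `struct linger` layout is an assumption that holds on Linux and the BSDs but not on Windows, and detecting that the peer has advertised a zero window for the whole hold time is not shown.

```python
import socket
import struct


def abort_with_rst(sock: socket.socket) -> None:
    """Close sock so that the kernel resets the connection."""
    # l_onoff=1, l_linger=0: discard unsent data and send RST on close.
    linger = struct.pack("ii", 1, 0)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)
    sock.close()


def demo() -> str:
    """Report what the remote side observes after an abortive close."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    server.settimeout(5.0)
    abort_with_rst(client)
    try:
        data = server.recv(1)
        return "fin" if data == b"" else "data"
    except ConnectionResetError:
        return "reset"
    finally:
        server.close()
```

Because the unsent queue is discarded, this kind of close does not depend on the peer opening its receive window.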
>> 
>> Perhaps RFC 4271 Section 6.5 should be amended as follows:
>> 
>> OLD:
>>   If a system does not receive successive KEEPALIVE, UPDATE, and/or
>>   NOTIFICATION messages within the period specified in the Hold Time
>>   field of the OPEN message, then the NOTIFICATION message with the
>>   Hold Timer Expired Error Code is sent and the BGP connection is
>>   closed.
>> 
>> NEW:
>>   If a system does not receive (or is unable to send) successive
>>   KEEPALIVE, UPDATE, and/or NOTIFICATION messages within the period
>>   specified in the Hold Time field of the OPEN message, then the
>>   NOTIFICATION message with the Hold Timer Expired Error Code is sent
>>   and the BGP connection is closed. If the NOTIFICATION message cannot
>>   be sent, the BGP connection is closed.
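
One possible reading of the NEW text, as a rough sketch: attempt the NOTIFICATION once without blocking, then close whether or not it went out. The message layout follows RFC 4271 (16-octet marker, 2-octet length, type 3, error code 4 for Hold Timer Expired). The function names are hypothetical, and a real implementation would also have to decide what to do with data already queued ahead of the NOTIFICATION.

```python
import socket
import struct

NOTIFICATION = 3          # BGP message type
HOLD_TIMER_EXPIRED = 4    # NOTIFICATION error code


def build_notification(error_code: int, subcode: int = 0) -> bytes:
    """Encode a NOTIFICATION with no data field (21 octets)."""
    body = struct.pack("!BB", error_code, subcode)
    header = b"\xff" * 16 + struct.pack("!HB", 19 + len(body), NOTIFICATION)
    return header + body


def close_on_hold_expiry(sock: socket.socket) -> bool:
    """Try to send the NOTIFICATION, then close regardless.

    Returns True only if the whole message was handed to the kernel.
    """
    msg = build_notification(HOLD_TIMER_EXPIRED)
    try:
        sock.setblocking(False)   # never wait on a stalled peer
        sent = sock.send(msg)
    except OSError:               # includes BlockingIOError (buffer full)
        sent = 0
    finally:
        sock.close()
    return sent == len(msg)
```

A return value of False covers both a full send buffer and a partial write; in either case the connection is already closed by the time the caller sees it.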
>> 
>> This is an ongoing problem. I suspect the BGP Nyancat's [2] discoloration at
>> the left most eye might have been caused by an active TCP session
>> keeping a stale BGP session alive. But also the observations from "BGP
>> Zombies: an Analysis of Beacons Stuck Routes" [3] could be explained by
>> the problematic interaction between TCP and BGP.
>> 
>> I appreciate the work the IDR working group has done to *SOFTEN* the
>> blow from implementation defects on global routing (RFC 7606 is a
>> brilliant example of this), but I fear in this case there is no subtle
>> way to say goodbye when the peer doesn't process messages in a timely
>> fashion. It might be good to document this.
>> 
>> Kind regards,
>> 
>> Job
>> 
>> [1]: https://www.reuters.com/article/level-3-communi-outages-idUSL2N1CB00C
>> [2]: https://labs.ripe.net/Members/cteusche/bgp-meets-cat
>> [3]: https://www.iij-ii.co.jp/en/members/romain/pdf/romain_pam2019.pdf
>> 
>> _______________________________________________
>> Idr mailing list
>> Idr@ietf.org
>> https://urldefense.com/v3/__https://www.ietf.org/mailman/listinfo/idr__;!!NEt6yMaO-gk!WnfNFxBMMXzuVhI23_QuKvcPfiG3Jwero3GwHhk0hhH6WNn1W0XWUkMMXdwc-g$