Re: [Idr] draft-spaghetti-idr-bgp-sendholdtimer - Feedback requested

Robert Raszuk <robert@raszuk.net> Wed, 28 April 2021 16:52 UTC

MIME-Version: 1.0
References: <CAL=9YSVy+mvxvAv+maxkUSzPbe0bfnUy-XJJTtcVhi3S3bm=WQ@mail.gmail.com> <20210423212348.GB19004@pfrc.org> <CAOj+MMGH+y-gxSLaakknWSPFLEk9ikkUU1fa=3H0FjkokAbg3w@mail.gmail.com> <20210424004838.GC19004@pfrc.org> <CAOj+MMH5yzpPZjdUcfXV4cxCORqCsQY4X+niBjnwxjPfN-tsJA@mail.gmail.com> <BYAPR11MB3207E4A0BDC3367E21886C55C0439@BYAPR11MB3207.namprd11.prod.outlook.com> <20210427124724.GA21146@pfrc.org> <BYAPR11MB32077A59B783B81E5D4D2297C0409@BYAPR11MB3207.namprd11.prod.outlook.com> <CAOj+MMG9Nz=FEMr+HNm2A6T3bGRw4qnds_3FU9ZqV9Uisfozuw@mail.gmail.com> <AM7PR07MB62482A4E83C32BCAA1F95F94A0409@AM7PR07MB6248.eurprd07.prod.outlook.com>
In-Reply-To: <AM7PR07MB62482A4E83C32BCAA1F95F94A0409@AM7PR07MB6248.eurprd07.prod.outlook.com>
From: Robert Raszuk <robert@raszuk.net>
Date: Wed, 28 Apr 2021 18:52:17 +0200
Message-ID: <CAOj+MME3owkivap4UxEdC_6SnqC8tHKSZBNFoJVmVvQ6SnAJ3A@mail.gmail.com>
To: tom petch <ietfc@btconnect.com>
Cc: "Jakob Heitz (jheitz)" <jheitz@cisco.com>, "idr@ietf. org" <idr@ietf.org>, Ben Cox <ben=40benjojo.co.uk@dmarc.ietf.org>
Content-Type: multipart/alternative; boundary="0000000000002a625005c10b3429"
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/DuxsC4JZkNwgvbj6aCStxQUMQMM>
Subject: Re: [Idr] draft-spaghetti-idr-bgp-sendholdtimer - Feedback requested
List-Id: Inter-Domain Routing <idr.ietf.org>

> Once we cannot write to a peer over the existing TCP session, we could try
> to OPEN a new TCP session to the same address:179 with the same security
> strings. We could use the same or a different BGP_ID.
>
> <tp>
> As per RFC4271, that sets up a second FSM and leads to connection
> collision and the shutdown of one system, the one being dependent on who
> initiated the stuck session, so it could fall to the stuck system to act,
> except it is stuck so it cannot.
>
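
For reference, the collision rule in RFC 4271, Section 6.8, can be sketched roughly as below, from the point of view of the system that receives an OPEN on a new connection while an older connection to the same peer exists. The names `Keep`, `bgp_id_to_int` and `resolve_collision` are made up for illustration. How a collision with an Established session is handled when configuration allows it is an assumption here (the sketch simply falls back to the identifier comparison).

```python
import ipaddress
from enum import Enum


class Keep(Enum):
    """Which of the two colliding connections survives (hypothetical names)."""
    EXISTING = "existing"
    NEW = "new"


def bgp_id_to_int(bgp_id: str) -> int:
    """Treat a dotted-quad BGP Identifier as a 4-octet unsigned integer."""
    return int(ipaddress.IPv4Address(bgp_id))


def resolve_collision(local_id: str,
                      remote_id: str,
                      existing_established: bool = False,
                      allow_collision_in_established: bool = False) -> Keep:
    # By default a collision with an Established session closes the new
    # connection; only configuration can change that.
    if existing_established and not allow_collision_in_established:
        return Keep.EXISTING
    # Assumption: when allowed by configuration, fall back to the
    # identifier comparison used for connections that are not Established.
    # Lower local ID: give way to the connection the remote side initiated.
    if bgp_id_to_int(local_id) < bgp_id_to_int(remote_id):
        return Keep.NEW
    # Otherwise keep the existing connection and close the new one.
    return Keep.EXISTING
```

With default settings, `resolve_collision("10.0.0.1", "10.0.0.2", existing_established=True)` keeps the existing session, which is the case that matters when the old session is stuck but still Established on the peer.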

As I mentioned in the bracketed comment, the last time we investigated this
topic we concluded that most BGP implementations do not follow the RFC in
that respect. We were even considering adding a knob "rfc-compliant" to bgp
:)

But I 100% agree that we should do a bit more "investigation" into what is
wrong before resetting the session. Otherwise the next session may
experience the same phenomenon.

Ideally it would be best to use a BGP diagnostic message to exchange some
parameters before resetting a session (but we are not there yet with
this communication channel). Ideally it should run on a different port.

Many thx,
R.



>
> I have not seen an analysis of exactly what has gone wrong but cannot help
> thinking that it could well be a bug, in TCP or BGP, in the peer in which
> case expecting rational behaviour of it might be expecting too much.
>
> A different approach, leading on from this idea, would be to reserve
> another TCP port number for BGP use which BGP could use to check out
> whether or not it can establish a TCP connection with the peer.  If not,
> then the peer is beyond help.  (TCP ports are a scarce resource but then
> functioning BGP is rather important to the health of the Internet).  If it
> can establish a TCP connection, then there are plenty of options available.
>
> Tom Petch
>
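
The reachability check suggested in the quoted text could look roughly like the sketch below. It only tests whether a TCP handshake to a given port completes and says nothing about the health of the BGP process. No second port is reserved for BGP today, so the port number in the usage note is a placeholder, and the function name `tcp_reachable` is made up for illustration.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full three-way handshake;
        # the socket is closed again as soon as the handshake succeeds.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

For example, `tcp_reachable("192.0.2.1", 1790)` would probe a hypothetical check port on a documentation address. A False result is the "beyond help" case, a True result leaves the other options open.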


> /* When we discussed eBGP peering between VRFs of the same PE it was
> confirmed that most if not all implementations (at that time) did not
> check for a different BGP_ID when accepting an OPEN. */
>
> If the new session succeeds and we sync routes, we are good. The old
> session can be terminated. No data plane disruption at all.
>
> If not, it would indeed be a sign that the peer's BGP is not doing that
> well, and perhaps confirmation that in some deployment situations it would
> be good to remove it from the picture after draining its routes from our
> peers.
>
> Best,
> R.
>
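
The decision described in the quoted text above can be written down as a small sketch. The enum and function names are invented for illustration, and "routes synced" is taken to mean that the new session has finished its initial route exchange, which is an assumption about how an implementation would detect it.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes after trying a parallel session."""
    CLOSE_OLD_SESSION = "close old session, keep forwarding"
    DRAIN_PEER = "drain the peer's routes, then remove the peer"


def after_parallel_open(new_session_established: bool,
                        routes_synced: bool) -> Action:
    # New session is up and carries the routes: the stuck session can be
    # torn down without touching the data plane.
    if new_session_established and routes_synced:
        return Action.CLOSE_OLD_SESSION
    # Otherwise the peer's BGP is likely unhealthy: drain first, then drop.
    return Action.DRAIN_PEER
```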