Re: [Idr] I-D Action: draft-wang-idr-bgp-error-enhance-00.txt

"Wanghaibo (Rainsword)" <rainsword.wang@huawei.com> Wed, 27 October 2021 10:06 UTC

From: "Wanghaibo (Rainsword)" <rainsword.wang@huawei.com>
To: Robert Raszuk <robert@raszuk.net>, "shenming (A)" <shenming2@huawei.com>, "Dongjie (Jimmy)" <jie.dong@huawei.com>
CC: "idr@ietf. org" <idr@ietf.org>
Date: Wed, 27 Oct 2021 10:05:53 +0000
Message-ID: <b028c749a89d4e82b7c4693b229e3b7a@huawei.com>
References: <163507718592.16183.4414540420076189232@ietfa.amsl.com> <CAOj+MMG=Ve+_BuOGY6tAU-CjY5GUg2uEHPtXO_HeSVFsBdDerQ@mail.gmail.com>
In-Reply-To: <CAOj+MMG=Ve+_BuOGY6tAU-CjY5GUg2uEHPtXO_HeSVFsBdDerQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/wtIPdVouNfX387SPQOsRr56HjBo>
Subject: Re: [Idr] I-D Action: draft-wang-idr-bgp-error-enhance-00.txt

Hi Robert,

Thanks for your comments.
Our purpose is to minimize the impact of malformed packets on running BGP neighbors. Overly strict processing rules do not solve the problem.
Although RFC 7606 revised some handling measures, it still contains some strict policies that we hope to revise.
In addition, some exceptions are currently not clearly defined. We hope to clarify the handling rules for these exceptions.
The rules proposed here may not be the most appropriate ones. We welcome discussion and suggestions.

My responses are inline, marked with [Haibo].

Regards,
Haibo

From: Robert Raszuk [mailto:robert@raszuk.net]
Sent: Sunday, October 24, 2021 9:47 PM
To: Wanghaibo (Rainsword) <rainsword.wang@huawei.com>; shenming (A) <shenming2@huawei.com>; Dongjie (Jimmy) <jie.dong@huawei.com>
Cc: idr@ietf. org <idr@ietf.org>
Subject: Re: I-D Action: draft-wang-idr-bgp-error-enhance-00.txt

Dear Authors,

Below are my comments after review of your proposal.

1) There is no such thing as "MP_REACH_UNLRI" ... Please swap all occurrences to "MP_UNREACH_NLRI". Likewise please adjust all "UNLRI" or define a priori what you mean by this abbreviation.

=>[Haibo] I’m sorry, this is my mistake. I’ll correct it in the next update. Thanks for your comments.

2)  Section 3.1 -

You say:

"Then we may also try to continue parse the next update packet if we can correctly find it."

I don't think this is a good suggestion for the Standards track document.

=>[Haibo] The description here may not be appropriate and we intend to revise it as follows:

[RFC7606] defines the conditions under which the NLRI field or the Withdrawn Routes field SHALL be considered "syntactically incorrect", but the error handling is not clearly described.
For handling these "syntactically incorrect" cases, we recommend the following:
If the attributes and some of the prefixes in the NLRI or MP_REACH_NLRI can be parsed correctly, those prefixes can be used for BGP processing, and the remaining prefixes that cannot be parsed are dropped.
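
As a rough illustration of the "keep what parses, drop the rest" idea, here is a minimal Python sketch. It is not taken from the draft. It assumes a plain IPv4 NLRI field encoded as (length, prefix) pairs as in RFC 4271, and the function name and return shape are hypothetical.

```python
import ipaddress
from typing import List, Tuple


def parse_nlri_best_effort(data: bytes) -> Tuple[List[str], int]:
    """Parse IPv4 (length, prefix) pairs until the first unparseable one.

    Returns the prefixes parsed so far and the number of trailing bytes
    that were dropped because they could not be parsed.
    """
    prefixes: List[str] = []
    offset = 0
    while offset < len(data):
        bit_len = data[offset]
        byte_len = (bit_len + 7) // 8
        # A length above 32 bits, or a prefix that overruns the field,
        # means the remainder of the field cannot be trusted.
        if bit_len > 32 or offset + 1 + byte_len > len(data):
            break
        raw = data[offset + 1 : offset + 1 + byte_len]
        padded = raw.ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(padded)}/{bit_len}")
        offset += 1 + byte_len
    return prefixes, len(data) - offset


# Two well-formed prefixes followed by an impossible length octet (40).
good, dropped = parse_nlri_best_effort(bytes([24, 10, 0, 1, 16, 192, 168, 40, 1]))
```

In this example the first two prefixes are kept and the last two bytes are dropped. Whether the kept prefixes should really be installed is exactly the point under discussion.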

3)  Section 3.2 -

I disagree with the suggestion. If a speaker is sending more than one MP_REACH_NLRI or MP_UNREACH_NLRI it should be cut out as soon as possible from causing more damage.

=>[Haibo] These problems can be caused by software bugs. Interrupting the neighbor session does not help solve the problem.
Our intent here is to suggest a best-effort approach that keeps BGP running stably and reduces the impact on normal services.
For this case, we recommend defining a firm rule for processing it.
We can discuss whether to use the rules proposed here or other rules, such as “attribute discard” or “treat-as-withdraw”.
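
Whatever rule is chosen, detecting the condition is the easy part. The sketch below is illustrative only: it assumes the path attributes of one UPDATE have already been split into a list of type codes, and the function name is hypothetical. The type codes 14 and 15 are those assigned in RFC 4760.

```python
from collections import Counter
from typing import Iterable, List

MP_REACH_NLRI = 14    # attribute type code from RFC 4760
MP_UNREACH_NLRI = 15  # attribute type code from RFC 4760


def duplicated_mp_attributes(type_codes: Iterable[int]) -> List[int]:
    """Return the MP attribute type codes that occur more than once."""
    counts = Counter(type_codes)
    return [code for code in (MP_REACH_NLRI, MP_UNREACH_NLRI) if counts[code] > 1]


# ORIGIN (1), AS_PATH (2), then MP_REACH_NLRI twice and MP_UNREACH_NLRI once.
duplicates = duplicated_mp_attributes([1, 2, 14, 14, 15])
```

The open question is what the receiver does once this list is non-empty: reset the session, discard the attribute, or treat the routes as withdrawn.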

4) Section 4.1 -

I am not sure why there is any need to enumerate the same specific special IP addresses and not to list others. For example if I receive IPv4 next hop as 127.0.0.1 is this cool ?

I recommend editing this:  "f) The IP address is not a invalid ip address." I think you mean to say:  "f) The IP address is not a valid ip address."

=>[Haibo] Here we consider only the IP addresses that, by protocol definition, should not be used as next hops.
Other special cases, e.g. a next-hop address of 127.0.0.1, are service errors rather than protocol errors, so they are not described here.
Item f) is my typo; thanks for the correction.

5) Section 4.2 -

I don't think this is a good suggestion. Likewise, in the case of the same prefix being present in both MP_REACH_NLRI and MP_UNREACH_NLRI, the suggestion to "firstly process the Prefixes in the MP_REACH_NLRI then process the Prefixes in the MP_REACH_UNLRI" is not good.

=>[Haibo] This case is not defined in RFC4271, RFC4760, or RFC7606. In practice, all vendors can handle this case without interrupting the session.
However, the processing results may differ between implementations. We recommend defining a clear rule to ensure route consistency.
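
To make the consistency concern concrete, the following sketch shows how the processing order alone changes the outcome when one prefix appears in both attributes. It is a toy model: the RIB is just a dictionary from prefix to next hop, and `apply_update` and its parameters are hypothetical names, not anything defined in the draft.

```python
from typing import Dict, Iterable


def apply_update(
    rib: Dict[str, str],
    reach: Dict[str, str],
    unreach: Iterable[str],
    reach_first: bool,
) -> None:
    """Apply announcements and withdrawals to rib in the chosen order."""

    def announce() -> None:
        for prefix, next_hop in reach.items():
            rib[prefix] = next_hop

    def withdraw() -> None:
        for prefix in unreach:
            rib.pop(prefix, None)

    steps = (announce, withdraw) if reach_first else (withdraw, announce)
    for step in steps:
        step()


reach = {"2001:db8::/32": "2001:db8::1"}
unreach = ["2001:db8::/32"]

rib_a: Dict[str, str] = {}
apply_update(rib_a, reach, unreach, reach_first=True)   # route ends up withdrawn

rib_b: Dict[str, str] = {}
apply_update(rib_b, reach, unreach, reach_first=False)  # route ends up installed
```

Two implementations that pick different orders end up with different routing tables from the same UPDATE.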

6) Section 4.3 -

How can you treat it as withdraw something which is part of the "malformed prefix-SID attribute" ?

=>[Haibo] Although the SID is carried in the PREFIX_SID attribute, it is necessary for SR tunnel establishment.
If this information is incorrect, the SR tunnel is effectively unavailable. Therefore, we do not recommend processing the route with only this attribute discarded.

7) Sections 4.4 & 4.5 & 4.6 -

Why are you again suggesting to check for and detect special cases of *only* all-zero addresses? There are lots of special IPv4, and even more IPv6, addresses to detect. I am not sure if we need to educate implementers about those in the BGP spec.

=>[Haibo] All of these attribute values here indicate BGP identifiers. RFC 6286 relaxes the definition of BGP identifiers to 4-octet, unsigned, non-zero integers.
Therefore, we only need to ensure that the BGP ID is a non-zero value. Then, the BGP ID is valid.
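
A check along these lines is very small. The sketch below assumes the identifier arrives as 4 raw octets in network byte order; the function name is hypothetical.

```python
import struct


def is_valid_bgp_identifier(raw: bytes) -> bool:
    """Accept any 4-octet, unsigned, non-zero value, per the RFC 6286 definition."""
    if len(raw) != 4:
        return False
    (value,) = struct.unpack("!I", raw)  # network byte order, unsigned 32-bit
    return value != 0


ok = is_valid_bgp_identifier(bytes([0, 0, 0, 1]))  # non-zero, so valid
bad = is_valid_bgp_identifier(bytes(4))            # all zero, so invalid
```

No comparison against lists of special IPv4 addresses is involved, because the value is treated as an integer rather than as an address.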

Conclusion:

If the WG agrees that there are some valid suggestions in your proposal, we should issue an RFC7606bis and not a separate draft updating it, as you are suggesting. So far, to me, it looks like there is no substance to even go for a -bis version of 7606.

=> [Haibo] We may first discuss which problem scenarios need to be modified and supplemented. Then we'll see whether to write a 7606bis or a separate draft.

Yes, the BGP-4 protocol running on TCP is not bulletproof when it comes to handling bad implementations or malicious protocol attacks. Your figure 1 illustrates this well. But even with all the suggestions listed, there still remain lots of attack vectors if someone has access to inject bad BGP packets into your network.

I think we should consider using a new transport which no longer runs on TCP and essentially not only treats all SAFIs as fully independent streams but also cuts interdependencies between all NLRI, even within a given SAFI.

The good news is that some proposals in this direction are starting to pop up in this WG, IMO already opening new doors for the new (hopefully much more robust) BGP version.

=> [Haibo] We are also involved in investigating updates to the BGP transport mechanism.
At the same time, it seems reasonable to polish the existing BGP error handling mechanism to help the current BGP protocol operate in live networks.


Many thx,
Robert



On Sun, Oct 24, 2021 at 2:06 PM <internet-drafts@ietf.org> wrote:

A New Internet-Draft is available from the on-line Internet-Drafts directories.


        Title           : Revised Error Handling for BGP Messages
        Authors         : Haibo Wang
                          Ming Shen
                          Jie Dong
        Filename        : draft-wang-idr-bgp-error-enhance-00.txt
        Pages           : 8
        Date            : 2021-10-24

Abstract:
   This document supplements and revises RFC7606.  According to RFC
   7606, when an UPDATE packet received from a neighbor contains an
   attribute of incorrect format, the BGP session cannot be reset
   directly.  Instead, the BGP session must be reset based on the
   specific problem.  Error packets must minimize the impact on routes
   and do not affect the correctness of the protocol.  Different error
   handling methods are used.  The error handling methods include
   discarding attributes, withdrawing routes, disabling the address
   family, and resetting sessions.

   RFC 7606 specifies the error handling methods of some existing
   attributes and provides guidance for error handling of new
   attributes.

   This document supplements the error handling methods for common
   attributes that are not specified in RFC7606, and provides
   suggestions for revising the error handling methods for some
   attributes.  The general principle remains unchanged: Maintain
   established BGP sessions and keep valid routes updated.  However,
   discard or delete incorrect attributes or packets to minimize the
   impact on the current session.



The IETF datatracker status page for this draft is:
https://datatracker.ietf.org/doc/draft-wang-idr-bgp-error-enhance/

There is also an htmlized version available at:
https://datatracker.ietf.org/doc/html/draft-wang-idr-bgp-error-enhance-00


Internet-Drafts are also available by anonymous FTP at:
ftp://ftp.ietf.org/internet-drafts/

