Re: [Idr] Please discuss the use cases for draft-xie-idr-mpbgp-extention-4map6

Jeffrey Haas <jhaas@pfrc.org> Mon, 22 January 2024 22:33 UTC

Date: Mon, 22 Jan 2024 17:32:42 -0500
From: Jeffrey Haas <jhaas@pfrc.org>
To: Chongfeng Xie <chongfeng.xie@foxmail.com>
Cc: Susan Hares <shares@ndzh.com>, idr <idr@ietf.org>, draft-xie-idr-mpbgp-extension-4map6@ietf.org
Message-ID: <20240122223242.GA29681@pfrc.org>
References: <BYAPR08MB487294F5C1EE87A8184EDC8BB3AEA@BYAPR08MB4872.namprd08.prod.outlook.com> <tencent_CBB12F958C85FDF962D76180EB1C51662408@qq.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <tencent_CBB12F958C85FDF962D76180EB1C51662408@qq.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/sdfebr7zwMpwhszAxQpWM-yjHTI>
Subject: Re: [Idr] Please discuss the use cases for draft-xie-idr-mpbgp-extention-4map6

Chongfeng and other authors,

Thank you for your patience in waiting for my reply.

I believe the use case for using BGP as a mechanism to distribute IPv4/IPv6
mapping advertisements is clear.  I will be restating some of the procedures
in your document to ensure that I understand them correctly.

I have several items of concern in your proposal from a BGP signaling
perspective.  The issues will be discussed as several items:

* Scaling Considerations
* FIB Implications
* Route Selection and Route Propagation Considerations
* IPv4 BGP Considerations at the Ingress
* Routing Security Implications
* Miscellaneous Comments

-----

Setup:

The mapping database is summarized as a table of:

IPv4 Prefix/length, IPv6 Prefix/length, Distance metric.

The key of this table is IPv4 Prefix/length.  That is, only a single entry
is permitted to be present for a given IPv4 Prefix/length.

Signaling the mapping database in BGP:

IPv6/unicast BGP routes are distributed such that for a given IPv6 mapping
prefix, the BGP NLRI is composed from:

M6 - IPv6 mapping prefix network
M6len - Prefix length of M6
N4 - IPv4 network that is to be mapped
N4len - Prefix length of N4

From your draft's diagram:

    |--------L2--------|
    +------------------+------------------+-------------+
    |  IPv6 Mapping    |   IPv4           |  ...0000... |
    |  Prefix of PE2   |   address prefix |             |
    +------------------+------------------+-------------+
    |-----------------L1------------------|
    Figure 5:Structure of IPv6 prefix in NLRI

The total prefix length for the above is carried as the first octet of the
NLRI, as is the usual practice for BGP NLRI.
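
A minimal sketch of that composition, assuming the N4len significant bits of
N4 are placed immediately after the M6len bits of M6, as Figure 5 suggests.
The function name and the example prefixes are hypothetical and are not taken
from the draft.

```python
import ipaddress

def compose_mapping_prefix(m6: str, n4: str) -> ipaddress.IPv6Network:
    """Build the IPv6 NLRI prefix from M6/M6len and N4/N4len (assumed layout)."""
    m6_net = ipaddress.IPv6Network(m6)
    n4_net = ipaddress.IPv4Network(n4)
    m6len, n4len = m6_net.prefixlen, n4_net.prefixlen
    total_len = m6len + n4len  # L1 in Figure 5; M6len is L2
    if total_len > 128:
        raise ValueError("M6len + N4len exceeds 128 bits")
    # Keep only the N4len significant bits of the IPv4 network.
    n4_bits = int(n4_net.network_address) >> (32 - n4len)
    # Place those bits right after the mapping prefix; trailing bits stay zero.
    value = int(m6_net.network_address) | (n4_bits << (128 - total_len))
    return ipaddress.IPv6Network((value, total_len))

# Documentation prefixes used purely for illustration.
print(compose_mapping_prefix("2001:db8:100::/48", "192.0.2.0/24"))
```

With these inputs the result is 2001:db8:100:c000:200::/72: a 48-bit mapping
prefix followed by the 24 significant bits of 192.0.2.0.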

The M6len field is carried in the proposed Path Attribute's "Length of IPv6
Mapping Prefix" field:

    +---------------------------------------------------+
    |     Length of IPv6 Mapping Prefix(1 octet)        |
    +---------------------------------------------------+
    |     Forwarding Type(1 octet)                      |
    +---------------------------------------------------+
    |     Address Origin Type(1 octet)                  |
    +---------------------------------------------------+
    |     IPv4 Original ASN (4 octets)                  |
    +---------------------------------------------------+

    Figure 4:Encoding of the 4map6 attribute
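
As a rough illustration of the Figure 4 layout, the attribute value can be
packed as three 1-octet fields followed by a 4-octet ASN in network byte
order.  This is only a sketch: the function names and the field values used
below are hypothetical, and the Path Attribute header (flags, type code,
length) is omitted.

```python
import struct

# Figure 4 layout: three 1-octet fields, then a 4-octet ASN, big-endian.
FOURMAP6_VALUE = struct.Struct("!BBBI")

def encode_4map6(m6len: int, forwarding_type: int,
                 origin_type: int, original_asn: int) -> bytes:
    """Pack the 7-octet attribute value (header not included)."""
    return FOURMAP6_VALUE.pack(m6len, forwarding_type, origin_type, original_asn)

def decode_4map6(value: bytes) -> tuple:
    """Unpack the attribute value, rejecting any other length."""
    if len(value) != FOURMAP6_VALUE.size:
        raise ValueError("4map6 attribute value must be 7 octets")
    return FOURMAP6_VALUE.unpack(value)

# Hypothetical values: M6len 48, type codes 1 and 2, documentation ASN 64496.
wire = encode_4map6(48, 1, 2, 64496)
print(wire.hex(), decode_4map6(wire))
```

A receiver would use the decoded M6len to find where the IPv4 bits begin
inside the NLRI.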

-----

Scaling Considerations:

draft-ietf-v6ops-framework-md-ipv6only-underlay provides significant
commentary on situations where this technology may be deployed.  For some of
those scenarios, it might be clear as to when a provider originates a
mapping prefix for a given IPv4 network.  As an example, that prefix is
serviced in a given data center.

In situations where a given set of IPv4 networks are advertised once as
mapping prefixes, the scaling is no worse than if the routes were advertised
in IPv4 unicast natively.

There may be situations where a given IPv4 prefix is advertised as a mapping
prefix by more than one provider using Network Specific Prefixes (NSPs).
For some portions of the BGP-speaking IPv6 networks that may or may not
understand this feature, the distinct prefix count will go up.  When this
feature does not result in the consumption of FIB resources, the impact is
solely on consumed RIB resources. 

This is the same situation a provider using Internet-in-a-VRF scenarios in a
Layer 3 VPN would see if the Internet view were advertised from more than
one VRF with distinct Route Distinguishers.  Thus, we understand the scale
impacts.

The scale impacts look somewhat more frightening than they may be in
actuality since the desired deployment model is likely a subset of the
Internet IPv4 address space being distributed rather than the entirety of
the Internet routing table.  As an example, only routes associated with
local data centers may be so distributed.

My motivation for raising this point is to make the consequences clear in
the event of accidental full redistribution of Internet routes.

-----

FIB Implications:

A sentence in the draft seems to indicate that the intent is that mapping
prefixes do not need to consume FIB resources.  From Section 4.2, Receiving
Mapping Rule advertisement by P router:

"It should be noted that this process does not change or affect the IPv6 FIB
table of the P router."

One interpretation of this sentence is that you intend that, when an IPv6
BGP Speaker receives a mapping prefix, perhaps identified by having the
4map6 Path Attribute, it should not install that IPv6/unicast NLRI in
the FIB.  Is that your intent?

If so, there are two issues with this desired procedure:
1. Only routers that understand this feature would be able to perform the
   optimization to avoid installing the mapping prefix into their IPv6 FIB.
2. As we had discussed in our previous emails, if the 4map6 Path Attribute
   were accidentally (or intentionally) removed, IPv6 FIB resources would be used.

It should be noted that many routers currently size the space of their IPv4
and IPv6 FIBs differently.  Significant redistribution of IPv4 Internet into
this mechanism can result in IPv6 FIB exhaustion.  If more than one service
provider distributes mapping prefixes for the same IPv4 networks, this issue
is compounded.

This is perhaps not the expected operational model or impact.  If that's the
case, some text emphasizing the scaling considerations may be helpful.
(If it hurts, don't do that!)

-----

Route Selection and Route Propagation Considerations:

I suspect Section 3.3 needs to be deleted, but that depends on whether my
understanding of your intent in the next sections is correct.

Section 4.2 describes the behavior of a BGP Speaker that understands these
procedures and is a P router (not an egress): once a mapping prefix has been
received, the mapping database entry is extracted and compared against the
currently installed value.  As noted in my text above, that comparison is
keyed on the IPv4 prefix.

The draft then says:

"Advertise the updated content of the entry found in the form of
MP_REACH_NLRI update information to IPv6 peer routers."

If I am understanding this correctly, your intention is that the receiving
BGP Speaker will propagate only BGP Routes that are the best entry in the
mapping database.  Is this correct?

If so, this is a violation of typical BGP protocol procedures.

The intention may be that the mapping database entry selection procedure is
considered as an extension to the BGP Decision Process (route selection).
If so, such changes are usually hazardous and require very careful
consideration for deployment.  Inconsistently doing route selection within
an AS can result in forwarding loops.

My belief is that the intent for this feature is for partial deployment in a
controlled environment.  Such route selection changes can only be done in a
fully upgraded deployment.

Additionally, there is a problem if the mapping prefixes escape to the
IPv6 Internet routing table, either accidentally or even through selective
redistribution.  At the moment, the design of this feature does not provide
enough safety for general purpose BGP routers to know if they are in a
deployment that should run these procedures.

The more general violation of BGP procedures would be the case where the
same IPv4 network was carried more than one way as a mapping prefix.  In
such cases, using my notation from above:

NLRI 1: M6-1/M6-1len N4/N4len
NLRI 2: M6-2/M6-2len N4/N4len

Normally these NLRI would be dealt with independently: for BGP routes, the
NLRI is the key.  However, if the intent is that route selection and
redistribution depend solely on N4/N4len, we have problems when some portions
of the network understand the feature and selectively propagate BGP Routes
based on that criterion, and others do not.
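
The mismatch between the two keys can be sketched as follows, assuming the
NLRI layout of Figure 5 and that M6len is learned from the 4map6 Path
Attribute.  The helper name and the example prefixes are hypothetical.

```python
import ipaddress

def split_mapping_prefix(nlri: str, m6len: int):
    """Recover (M6/M6len, N4/N4len) from a mapping-prefix NLRI."""
    net = ipaddress.IPv6Network(nlri)
    n4len = net.prefixlen - m6len
    if not 0 <= n4len <= 32:
        raise ValueError("NLRI length is inconsistent with M6len")
    value = int(net.network_address)
    # Leading M6len bits form the mapping prefix.
    m6_value = (value >> (128 - m6len)) << (128 - m6len)
    # The next N4len bits are the significant bits of the IPv4 network.
    n4_bits = (value >> (128 - net.prefixlen)) & ((1 << n4len) - 1)
    m6 = ipaddress.IPv6Network((m6_value, m6len))
    n4 = ipaddress.IPv4Network((n4_bits << (32 - n4len), n4len))
    return m6, n4

# Two providers map the same IPv4 network under different mapping prefixes.
routes = [("2001:db8:100:c000:200::/72", 48),
          ("2001:db8:200:c000:200::/72", 48)]

bgp_keys = {ipaddress.IPv6Network(nlri) for nlri, _ in routes}
mapping_keys = {split_mapping_prefix(nlri, m6len)[1] for nlri, m6len in routes}
print(len(bgp_keys), len(mapping_keys))
```

A speaker that does not understand the feature sees two independent
IPv6/unicast routes, while a speaker keyed on N4/N4len sees two candidates
for a single mapping database entry.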

The simpler and more correct BGP procedure is to treat the mapping prefixes
in the same fashion as Layer 3 VPN routes: Only the installing ingress
device will carry out these procedures.

-----

IPv4 BGP Considerations at the Ingress:

At an ingress, there's a need to install an IPv4 FIB entry corresponding
with the mapping prefix routes.  I.e. there will be some N4/N4len that has
been selected from the mapping database procedures to use as the ingress.

In the circumstances that the ingress PE has no IPv4 route for this
destination, the procedure is easy.  But what happens if there's something
else, say from BGP itself?

Again, using the analogy of Layer 3 VPNs, the usual desired procedure is
that the installed route carry its original BGP properties.  This was the
motivation for having the attr-set attached to the route carrying these
things.

The underlying motivation is that when a route is installed at the ingress,
it can be appropriately selected from among competing routes. Perhaps it
might be redistributed further into the network.

What we CANNOT have is a route that is considered a "BGP" route, but has
been stripped of all of its properties, being propagated further into
Internet BGP.

-----

Routing Security Implications:

draft-ietf-v6ops-framework-md-ipv6only-underlay has some excellent points
regarding the security impacts of this feature.  We'll want some portions of
that present also in this document, or at least a pointer to that as the
source of the security rules.

As noted above, if we don't adequately pass routing information from the
egress to the ingress router, perhaps via attr-set, we have a significant
possibility of introducing an Internet route hijack.

-----

Miscellaneous Comments:

The Address Origin Type field could use some clarification.  I believe the
intent is that if the field's value is Relay, this is a mapping prefix
generated for an IPv4 BGP route redistributed as a mapping prefix?

All field entries in the Path Attribute need appropriate IANA Considerations
registries.

In the draft, Distance to the Egress appears to correspond to the
"calculated BGP AS_PATH length" used in BGP route selection.  Correct?
If so, I would discourage you from trying to use only this metric in
selecting entries in the mapping database.  Long experience with BGP has
shown that operators may want to use any number of BGP tie-breaking
considerations to choose routes. For example, maybe BGP Communities!  Once
the procedures for route distribution are clearer, let's revisit this
distance metric.

The IPv4 Original ASN field isn't clear in its use.  I suspect the IPv4 BGP
considerations section above overlaps its desired use.  If the original
AS_PATH from the route redistributed from the egress is passed in the BGP
attr-set Path Attribute, the Original ASN field may become unnecessary.

-- Jeff








On Sun, Nov 12, 2023 at 03:46:18PM +0800, Chongfeng Xie wrote:
> 
> Hi Sue and all,
> 
> This use case is about IPv6-only deployment in multi-domain networks. It has been illustrated in Sections 3, 4, and 5 of draft-ietf-v6ops-framework-md-ipv6only-underlay. I'd like to highlight the key points below:
> 
> 1. In this case, IPv6-only is used for service data forwarding, including IPv6 and IPv4 service data. For an IPv4 packet that arrives at the edge of the IPv6-only network, the PE will convert it into an IPv6 packet by encapsulation or translation, so IPv4-related features are required to be maintained in the PE at the network edge. Currently, all the mobile operating systems, such as Android and iOS, support translation-based IPv6-only mode, so translation needs to be considered as well.
> 
> 2. In order to support IPv4/IPv6 packet conversion in encapsulation or translation, stateless address mapping is adopted to transform the IPv4 source and destination addresses of IPv4 packets into IPv6 source and destination addresses, and vice versa. Stateless IPv4/IPv6 mapping means adding an IPv6 mapping prefix directly to the IPv4 address to generate its corresponding IPv6 address, so this is a 1:1 mapping. This stateless address mapping works for both encapsulation and translation; it also has many advantages for network operation and management.
> 
> 3. To meet the requirements of 1 and 2 above, PEs are required to exchange information such as IPv4 address blocks, IPv6 address mapping prefixes, and IPv4/IPv6 packet conversion methods (i.e. encapsulation or translation). This is what is being done in the 4map6 draft.
> 
> RFC5549 was mentioned in the idr session on Friday. RFC5549 specifies the extensions to allow advertising IPv4 NLRI or VPN-IPv4 NLRI with a Next Hop address that belongs to IPv6. However, it does not provide IPv6 address mapping prefixes for the aforementioned stateless encapsulation mode, nor does it support translation. Therefore, it does not meet the overall requirements of the above use case.
> 
> Best regards
> Chongfeng
> 
> 
> 
> chongfeng.xie@foxmail.com
>  
> From: Susan Hares
> Date: 2023-11-10 18:13
> To: idr@ietf.org
> Subject: [Idr] Please discuss the use cases for draft-xie-idr-mpbgp-extention-4map6
> Chong Feng and Ketan: 
>  
> Would you please explain the two use cases on the list?  
>  
> We need to discuss the questions regarding use cases regarding: 
>  
> Is the use case clearly described? 
> Do you think the 4 to 6 mapping is handled by another technology? 
>  
> Sue  
