Re: [Tsv-art] Tsvart early review of draft-ietf-rtgwg-net2cloud-problem-statement-22

Łukasz Bromirski <lukasz.bromirski@gmail.com> Fri, 14 April 2023 22:38 UTC

From: Łukasz Bromirski <lukasz.bromirski@gmail.com>
Date: Sat, 15 Apr 2023 00:37:57 +0200
In-Reply-To: <PH0PR13MB49229EDCFEC1D54173EA590585999@PH0PR13MB4922.namprd13.prod.outlook.com>
Cc: David Black <david.black@dell.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>, "draft-ietf-rtgwg-net2cloud-problem-statement.all@ietf.org" <draft-ietf-rtgwg-net2cloud-problem-statement.all@ietf.org>, "rtgwg@ietf.org" <rtgwg@ietf.org>
To: Linda Dunbar <linda.dunbar@futurewei.com>
References: <168055635654.11507.17750417804419163710@ietfa.amsl.com> <PH0PR13MB49229EDCFEC1D54173EA590585999@PH0PR13MB4922.namprd13.prod.outlook.com>
X-Mailer: Apple Mail (2.3731.500.231)
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/M_VinmMtvtpAxsucxcKvP3QBbR4>
Subject: Re: [Tsv-art] Tsvart early review of draft-ietf-rtgwg-net2cloud-problem-statement-22

Hi Linda, Group,

Let me offer some points related to the latest version of the draft:

1. "DSVPN" - this is a Huawei-specific term describing VPNs that allow dynamic connections between spokes, which is itself a 1:1 copy of Cisco DMVPN, down to the use of NHRP and mGRE (https://support.huawei.com/enterprise/en/doc/EDOC1100112360/a485316c/overview-of-dsvpn). Shouldn't we avoid vendor-specific product/solution names in RFC documents?

It's called out again later, in section 4.2, alongside Cisco's DMVPN (which is itself not defined anywhere).

2. 

"3.1: [...] Cloud GWs need to peer with a larger variety of parties, via private circuits or IPsec over public internet."

As far as I understand, the whole of section 3.1 tries to underline the need for a flexible/resilient BGP implementation, and I agree with that. However, I'd argue that a lot of cloud-based connections happen via BGP over the internet directly, not necessarily through private circuits or IPsec. Section 4.2 of the draft even mentions some examples of that use case.

The document focuses heavily on only two types of connection - MPLS VPN or IPsec. The actual use case of connecting your workload to the cloud can easily be addressed by any type of overlay routing, like GRE or VXLAN/GENEVE terminated on the virtual cloud gateway.

"When inbound routes exceed the maximum routes threshold for a peer, the current common practice is generating out of band alerts (e.g., Syslog) via management system to the peer, or terminating the BGP session (with cease notification messages [RFC 4486] being sent)."

For completeness' sake, shouldn't we explicitly state what the action is in the first case? Typically, the additional routes above the threshold are ignored, and this in turn may lead to other reachability problems.
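To make the two outcomes concrete, here is a minimal Python sketch of the behaviour discussed above. It is only an illustration: the `Peer` class, its field names and the `warning_only` switch are hypothetical, and it assumes that "alert only" mode ignores routes above the threshold, which is the typical behaviour I describe but which varies between real implementations.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Hypothetical model of a BGP peer with a maximum-routes threshold."""
    max_prefixes: int
    warning_only: bool  # True: alert and ignore extras; False: tear the session down
    accepted: set = field(default_factory=set)
    alerts: list = field(default_factory=list)
    session_up: bool = True

    def receive(self, prefix: str) -> str:
        if not self.session_up:
            return "dropped: session down"
        # Known prefixes, or new ones below the threshold, are accepted.
        if prefix in self.accepted or len(self.accepted) < self.max_prefixes:
            self.accepted.add(prefix)
            return "accepted"
        if self.warning_only:
            # Assumed behaviour: raise an out-of-band alert and ignore the route.
            # The ignored prefix is now unreachable through this peer.
            self.alerts.append(f"max-prefix exceeded, ignoring {prefix}")
            return "ignored"
        # Otherwise the session is terminated and every route learned from it is lost.
        self.session_up = False
        self.accepted.clear()
        return "session terminated"


if __name__ == "__main__":
    routes = ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"]
    for mode in (True, False):
        peer = Peer(max_prefixes=2, warning_only=mode)
        print(mode, [peer.receive(r) for r in routes], sorted(peer.accepted))
```

In the first mode 10.0.2.0/24 is silently missing from the table, which is exactly the reachability problem the text should spell out; in the second mode all three prefixes are gone.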

"3.4.1: [...] Therefore, the edge Cloud that is the closest doesn't contribute much to the overall latency."

How is that a problem?

"4.3: [...] However, traditional MPLS-based VPN solutions are sub-optimized for dynamically connecting to workloads/applications in cloud DCs."

The whole section says existing MPLS VPNs and/or IPsec tunnels are being used to connect to Cloud DCs. So how exactly are the "traditional MPLS-based VPNs" "sub-optimized" if, at the same time, they're the exact means the document mentions for solving the problem?

"4.3. [...] The existing MPLS VPN provider might not have PEs at the new location. Deploying PEs routers at new locations is not trivial, which defeats one of the benefits of Clouds' geographically diverse locations allowing workloads to be as close to their end-users as possible."

Reading this literally, I'd say that any SP offering MPLS VPNs will in any case be more flexible in terms of reach (if it covers the given geography) than the pretty much fixed and limited number of cloud DCs available. However, I sense the intent here was to underline the role of "agile" DCs set up by, for example, the "cloud" stacks of 5G services (and similar services). If so, that would likely require some clarification to be well understood.

"4.3. [...] As MPLS VPNs provide more secure and higher quality services, choosing a PE closest to the Cloud GW for the IPsec tunnel is desirable to minimize the IPsec tunnel distance over the public Internet."

MPLS VPNs provide more secure and higher quality services.... than what?

"4.3. [...] As multiple Cloud DCs are interconnected by the Cloud provider's own internal network, the Cloud GW BGP session might advertise all of the prefixes of the enterprise's VPC, regardless of which Cloud DC a given prefix is actually in. This can result in inefficient routing for the end-to-end data path."

That's true, but either we praise the use of anycast (earlier in the doc) or claim it's inferior to polluting the routing table (by announcing more prefixes) or limiting visibility (by announcing fewer prefixes). You can't really have it both ways.

"5. As described in [Int-tunnels], IPsec tunnels can introduce MTU problems. This document assumes that endpoints manage the appropriate MTU sizes, therefore, not requiring VPN PEs to perform the fragmentation when encapsulating user payloads in the IPsec packets."

Well, typically no. It's 2023, and while PMTUD is still broken in parts of the internet that are abusively controlled or censored, the real problem here is with networks that run above the typical 1500 bytes, which is common in virtual environments and was likely the reason that text was put in place. Maybe underlining this would make sense in this paragraph?
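To put rough numbers on the overhead, here is a small Python sketch of ESP tunnel-mode arithmetic. The function names are made up for illustration, and the defaults are assumptions: an IPv4 outer header (20 bytes), no NAT-T/UDP encapsulation, and AES-GCM with an 8-byte IV and a 16-byte ICV. Other ciphers or an IPv6 outer header will give different figures.

```python
def esp_tunnel_size(inner_len, outer_ip=20, esp_hdr=8, iv=8, icv=16, align=4):
    """On-the-wire size of an inner IP packet after ESP tunnel-mode encapsulation."""
    trailer = 2  # pad-length byte + next-header byte
    # The inner packet plus trailer is padded up to a multiple of `align` bytes.
    pad = -(inner_len + trailer) % align
    return outer_ip + esp_hdr + iv + inner_len + pad + trailer + icv


def max_inner_packet(path_mtu, **overheads):
    """Largest inner packet that fits into path_mtu without fragmentation."""
    size = path_mtu
    while size > 0 and esp_tunnel_size(size, **overheads) > path_mtu:
        size -= 1
    return size


if __name__ == "__main__":
    print(esp_tunnel_size(1500))   # 1556: a full-size packet no longer fits in 1500
    print(max_inner_packet(1500))  # 1446
    print(esp_tunnel_size(9000))   # 9056: jumbo frame from a virtual environment
    print(max_inner_packet(9000))  # 8946
```

Under these assumptions, a workload sending 9000-byte frames towards a tunnel whose underlay only carries 1500 bytes has to shrink its packets to 1446 bytes or have them fragmented somewhere, which is the case the paragraph should call out.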

"5.2. IPSec" -> "IPsec"

"5.2. IPSec encap & decap are very processing intensive, which can degrade router performance. NAT also adds to the performance burden."

That's why IPsec is nowadays executed in hardware, or in a "hardware-accelerated" software path (like QAT for pure-x86 workloads), and so, typically, is NAT on the enterprise gear that qualifies as the "PE" so often mentioned in this document.

"5.2. [...] When enterprise CPEs or gateways are far away from cloud DC gateways or across country/continent boundaries, performance of IPsec tunnels over the public Internet can be problematic and unpredictable."

...compared to what? Pure IP routing between the same IPs?

"7. [...] via Public IP ports which are exposed"

Wouldn't it make sense to use "interfaces" here? "Ports" has a TCP/UDP layer 4 connotation.

"7. [...] Potential risk of augmenting the attack surface with inter-Cloud DC connection by means of identity spoofing, man-in-the-middle, eavesdropping or DDoS attacks. One example of mitigating such attacks is using DTLS to authenticate and encrypt MPLS-in-UDP encapsulation (RFC 7510)."

How is it different from the protection offered by IPsec?

"7. [...] When IPsec tunnels established from enterprise on-premises CPEs are terminated at the Cloud DC gateway where the workloads or applications are hosted, traffic to/from an enterprise's workload can be exposed to others behind the data center gateway (e.g., exposed to other organizations that have workloads in the same data center).

To ensure that traffic to/from workloads is not exposed to unwanted entities, IPsec tunnels may go all the way to the workload (servers, or VMs) within the DC."

How would that problem statement be different from the DTLS solution/protection at the beginning of the section?

-- 
./

> On 14 Apr 2023, at 19:24, Linda Dunbar <linda.dunbar@futurewei.com> wrote:
> 
> David, 
> We really appreciate your review and comments. Please see below for the resolutions. 
> Sorry for the delayed response. I missed yours when I was going through the comments from other reviewers. 
>  
> The revision -23  https://datatracker.ietf.org/doc/draft-ietf-rtgwg-net2cloud-problem-statement/ has addressed the comments from OpsDIR, RTGDIR, DNSDIR and GENART. Changes to your comments will be reflected in the -24 revision.
>  
> Linda
> -----Original Message-----
> From: David Black via Datatracker <noreply@ietf.org>
> Sent: Monday, April 3, 2023 4:13 PM
> To: tsv-art@ietf.org
> Cc: draft-ietf-rtgwg-net2cloud-problem-statement.all@ietf.org; rtgwg@ietf.org
> Subject: Tsvart early review of draft-ietf-rtgwg-net2cloud-problem-statement-22
>  
> Reviewer: David Black
> Review result: Not Ready
>  
> Transport Area Review:
>  
>         Dynamic Networks to Hybrid Cloud DCs: Problem Statement and
>                            Mitigation Practices
>               draft-ietf-rtgwg-net2cloud-problem-statement-22
>  
> Reviewer: David L. Black (david.black@dell.com)
> Date: April 3, 2023
> Result: Not Ready
>  
> From a Transport Area perspective, there's not a lot of relevant content in this draft.
> Section 5 mentions IPsec tunnels, which raise the usual transport-related concerns in dealing with tunnels.  Those concerns can be primarily addressed by citing appropriate references, e.g., MTU concerns are discussed in the tunnels draft in the intarea WG, and ECN propagation is covered by RFC 6040 plus the related update draft for shim headers in the TSVWG working group.  I don't see any serious problems here.
> [Linda] For the MTU problems introduced by IPsec tunnels, how about adding the following sentences?
> As described in [Int-tunnels], IPsec tunnels can introduce MTU problems. This document assumes that endpoints manage the appropriate MTU sizes, therefore, not requiring VPN PEs to perform the fragmentation when encapsulating user payloads in the IPsec packets.
>  
> IPsec tunnels run over the public internet, which doesn’t support ECN. Why does RFC 6040 need to be mentioned?
>  
>  
> OTOH, from a broader perspective, the draft is not a coherent problem statement - it discusses a plethora of technologies ranging from MPLS to DNS, often without making any connections among them (e.g., section 6 identifies policy management as a requirement, but there's no discussion of policies that require management elsewhere in the draft).
> [Linda] This document describes the network-related problems enterprises face when interconnecting their branch offices with dynamic workloads in third-party data centers (a.k.a. Cloud DCs) and some mitigation practices. It is a list of technologies ranging from VPN to DNS. 
>  
>  
> I'm not even sure what the scope of the draft is, e.g.:
>  
> a) The abstract states that the draft is "mainly for enterprises that already have traditional MPLS services and are interested in leveraging those networks," but section
> 3.4 discusses 5G Edge Clouds, which are rather unlikely to use MPLS.
> [Linda] The document is mainly for enterprises that already have traditional VPN services and are interested in leveraging those networks (instead of altogether abandoning them). MPLS (which is now replaced by VPN) is just one example.
>  
>  
> b) There are at least three roles for BGP in this draft that are not disambiguated - IGP, EGP, and VPN routing protocol for MPLS-based VPNs, e.g., EVPN.  Section 4 would be a good place to clarify this by describing the Gateway interfaces in detail, including the role of BGP.
> [Linda] Connecting to the Cloud needs BGP, but doesn’t run an IGP or EVPN.
> The intent of the draft is to identify future work in BGP.
>  
> In its current form, I don't understand the target audience or purpose of this draft, especially the head-spinning mixture of topics in section 3, so I cannot recommend IETF publication of the draft in its current form.
> [Linda] The intent of the document is to lay out current mitigation methods and additional work on extensions to BGP, such as https://datatracker.ietf.org/doc/draft-ietf-idr-sdwan-edge-discovery/
>  
> Perhaps the draft ought to be focused and organized around extending and/or using MPLS and MPLS-based VPNs - much of the material in Sections 4 and 5 would be applicable, and some of the worst of section 3's distractions (e.g., 5G, DNS) could be avoided or at least scoped to the relevant VPN technologies.
> [Linda] DNS issues introduced by connecting to Cloud DCs were strongly requested by DNSOps and OpsDIRs. 
>  
> Thank you very much
> Linda
>  
>  
>  