RE: comments on draft-ietf-rtgwg-net2cloud-problem-statement-01

Linda Dunbar <ldunbar@futurewei.com> Mon, 17 June 2019 18:01 UTC

From: Linda Dunbar <ldunbar@futurewei.com>
To: Chris Bowers <chrisbowers.ietf@gmail.com>, RTGWG <rtgwg@ietf.org>
Subject: RE: comments on draft-ietf-rtgwg-net2cloud-problem-statement-01
Date: Mon, 17 Jun 2019 18:01:31 +0000

Chris,

Thank you very much for the detailed review and comments.
Changes to address your comments are inserted below. Attached is the changed document with the ChangeBar enabled.  Please let us know if you have more suggestions.


From: rtgwg <rtgwg-bounces@ietf.org> On Behalf Of Chris Bowers
Sent: Friday, June 14, 2019 4:06 PM
To: RTGWG <rtgwg@ietf.org>
Subject: comments on draft-ietf-rtgwg-net2cloud-problem-statement-01

RTGWG and authors,

I think draft-ietf-rtgwg-net2cloud-problem-statement-01 covers an important and interesting topic. I think it needs some more work before we consider it for WG LC and publication. In general, I think the current text is not clear enough about many of the details of the networking scenarios discussed. These details are important for drawing conclusions about how to address the problems presented. Below are some specific comments on the draft.

Thanks,
Chris

=============
The title “Seamless Interconnect Underlay to Cloud Overlay Problem Statement” is not very clear.
The terms underlay and overlay need lots of context. The abstract doesn’t mention overlay or underlay, but provides a pretty good description of the problems being discussed, i.e., connecting enterprises to cloud DCs.

[Linda] This document discusses the issues associated with connecting enterprises to their workloads/applications instantiated in multiple third-party data centers (a.k.a. Cloud DCs). Very often, the actual Cloud DCs that host the workloads/applications can be transient.
Do you think a title along the lines of “Dynamic Networks to Connect to Cloud DCs” is more appropriate? Or simply “Dynamic Networks to Cloud DCs”?


=============
Abstract:

Existing:
This document also
   describes some of the (network) problems that many enterprises face

Proposed:
This document also
   describes some of the network problems that many enterprises face

[Linda] changed per your comment.

=============
Throughout:
“&” is used instead of “and” in many places.  I don’t think “&” should be used.

 [Linda] changed per your comment.

=============
Table of contents:

   10. Security Considerations.......................................17
   Solution drafts resulting from this work will address security
   concerns inherent to the solution(s), including both protocol
   aspects and the importance (for example) of securing workloads in
   cloud DCs and the use of secure interconnection mechanisms........17

Something is causing the text of the security considerations section to show up in the table of contents.
 [Linda] changed.

=============
Section 1.1:
Existing text:

   In addition, it is an uptrend with more enterprises instantiating
   their apps & workloads in different cloud DCs to maximize the
   benefits of geographical proximity, elasticity and special features
   offered by different cloud DCs.

The use of “uptrend” here is awkward. It sounds like marketing copy. Also, is the assertion that enterprises will be using multiple, geographically diverse cloud DCs from the same provider or from different providers?
[Linda] How about changing to the following?

In addition, more enterprises are moving towards hybrid cloud DCs, i.e., cloud DCs owned or operated by different Cloud operators, to maximize the benefits of geographical proximity, elasticity and special features offered by different cloud DCs.

============
Section 2

Existing text:
   Hybrid Clouds: Hybrid Clouds (usually plural) refer to enterprises
               using their own premises DCs in addition to Cloud
               services provided by multiple cloud operators.  For
               example, an enterprise not only have applications
               running in their own DCs, but also have applications
               hosted in multiple third party cloud DCs ((AWS, Azure,
               Google, Salesforces, SAP, etc).  . ONUG also has a
               notion of heterogeneous cloud, refers to enterprises
               does not have its own DC, only uses services by 3rd
               party cloud operators.

This definition of hybrid cloud above implies that any hybrid cloud must also be a heterogeneous cloud. I would rewrite the first sentence as
“Hybrid Cloud refers to an enterprise using its own on-premises DCs in addition to Cloud
               services provided by one or more cloud operators.”

[Linda] thanks for the suggestion. Changed accordingly.

The last sentence about ONUG’s notion of heterogeneous cloud is very confusing here.

[Linda] Removed

=============
Section 2

Existing text:
VPC:        Virtual Private Cloud. A service offered by Cloud DC
               operators to allocate logically-isolated cloud
               resources, including compute, networking and storage.

It seems to me that Virtual Private Cloud needs a much more detailed definition or description. For example, does the VPC use public or private address space? Later on in section 3.1 there is mention of “transit gateways”. Perhaps a more complete description of the VPC would describe transit gateways.

=============
Section 3.1

Existing text:

     - Internet gateway for any external entities to reach the
        workloads hosted in AWS Cloud DC via the Internet.

It is not clear what this option refers to. Is it the ability for the enterprise to SSH and SCP into their server instances in the AWS cloud at public IP addresses over the Internet? Or is it the ability of, for example, a customer of the enterprise to access an application on a web server run by the enterprise? Or is it access from the Internet to the VPC private address space mediated by NAT? This should be clarified.

[Linda] This is referring to the AWS Internet Gateway.
How about changing to “AWS Internet Gateway allows communication between instances in AWS VPC and the internet”?

Here is the direct quote from AWS documentation:

Internet Gateways
An internet gateway is a horizontally scaled, redundant, and highly available VPC component that allows communication between instances in your VPC and the internet. It therefore imposes no availability risks or bandwidth constraints on your network traffic.
An internet gateway serves two purposes: to provide a target in your VPC route tables for internet-routable traffic, and to perform network address translation (NAT) for instances that have been assigned public IPv4 addresses.
An internet gateway supports IPv4 and IPv6 traffic.
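
To make the two purposes in the quote concrete, here is a minimal sketch in Python. It is only an illustration of a route-table target plus one-to-one NAT, not AWS's implementation or API. The route table, the "igw" and "local" target names, the NAT mapping and all addresses are hypothetical.

```python
import ipaddress

# Hypothetical VPC route table: destination prefix -> target name.
ROUTE_TABLE = {
    ipaddress.ip_network("10.0.0.0/16"): "local",
    ipaddress.ip_network("0.0.0.0/0"): "igw",
}

# Hypothetical 1:1 mapping: instance private address -> assigned public IPv4 address.
NAT_MAP = {
    ipaddress.ip_address("10.0.1.5"): ipaddress.ip_address("203.0.113.10"),
}


def lookup_target(dst):
    """Return the target of the longest matching prefix for dst, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ROUTE_TABLE if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTE_TABLE[best]


def forward_outbound(src, dst):
    """Return (target, source address after translation), or None if dropped."""
    target = lookup_target(dst)
    if target is None:
        return None
    src_addr = ipaddress.ip_address(src)
    if target == "igw":
        public = NAT_MAP.get(src_addr)
        if public is None:
            # No public address assigned, so the gateway cannot translate.
            return None
        return target, str(public)
    return target, str(src_addr)
```

In this sketch an instance without an entry in NAT_MAP simply cannot use the gateway, which mirrors the quoted statement that NAT is performed for instances that have been assigned public IPv4 addresses.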

============
Section 3.1

Existing text:
Via Direct Connect, an AWS Transit
        Gateway can be used to interconnect multiple VPCs in different
        Availability Zones.

The “transit gateway” needs a clearer description.  It is also not clear what the transit gateway has to do with the Direct Connect option.

[Linda] It is referring to AWS Transit Gateway, which is described in detail in the AWS documentation: https://aws.amazon.com/transit-gateway/
Transit Gateway acts as a hub that controls how traffic is routed among all the connected networks, which act like spokes.
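
A toy model of the hub-and-spoke idea, as a sketch only: the Hub class, the spoke names and the address ranges are hypothetical, the attached ranges are assumed not to overlap, and nothing here reflects the actual AWS Transit Gateway API.

```python
import ipaddress


class Hub:
    """Toy hub: every attached network (spoke) reaches the others only via the hub."""

    def __init__(self):
        self.attachments = {}  # spoke name -> ip_network

    def attach(self, name, cidr):
        """Attach a spoke network; CIDRs are assumed not to overlap."""
        self.attachments[name] = ipaddress.ip_network(cidr)

    def route(self, src_spoke, dst_ip):
        """Return the hop list from src_spoke to the spoke owning dst_ip, or None."""
        if src_spoke not in self.attachments:
            return None
        addr = ipaddress.ip_address(dst_ip)
        for name, net in self.attachments.items():
            if addr in net:
                if name == src_spoke:
                    return [src_spoke]  # destination is inside the same spoke
                return [src_spoke, "hub", name]
        return None  # no attached network owns this address
```

For example, with spokes "vpc-a" (10.1.0.0/16) and "vpc-b" (10.2.0.0/16) attached, route("vpc-a", "10.2.3.4") passes through the hub rather than going spoke to spoke directly.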

============
Section 3.1

Existing text:

   CPEs at one Enterprise branch office are connected to the Internet
   to reach AWS's vGW via IPsec tunnels. Other ports of such CPEs are
   connected to AWS DirectConnect via a private network (without any
   encryption).

Proposed text:
As an example, some branch offices of an enterprise can connect over the Internet to reach AWS's vGW via IPsec tunnels.
Other branch offices of the same enterprise can connect to AWS DirectConnect via a private network (without any
   encryption).

[Linda] thank you very much for the suggestion. Changed accordingly.

=============
Figure 1.

Figure 1 needs more description and detail

What are TN-1 and TN-2?  Are they “Tenant Networks” or something else?  Are they all part of the same VPC or do they represent different VPCs?

If the point of figure 1 is to show that a single enterprise can connect to the same set of resources with some branches using IPsec tunnels and other branches using Direct Connect (since TN-1 and TN-2 are repeated in each instance), then perhaps it would be better to just represent those resources as a single instance, instead of multiple instances with the same names.

Where is the “customer gateway” physically located in the Direct connect case?

[Linda] Modified the figure per your suggestion, and added the following explanation:

The figure below shows an example in which some tenants’ workloads are accessible via a virtual router connected by the AWS Internet Gateway, some are accessible via AWS vGW, and others are accessible via AWS Direct Connect. The vR1 can have its own IPsec capability for secure tunnels over the Internet, to avoid paying the additional price for the IPsec features provided by AWS vGW. Some tenants can deploy separate virtual routers, e.g. vR1 and vR2, one for Internet traffic and one for traffic from the secure channels from vGW and DirectConnect. Others may have one virtual router connecting to both types of traffic. The Customer Gateway can be a customer-owned router or ports physically connected to the AWS Direct Connect GW.

=============
Section 3.2

Existing Text:

   According to Gartner, by 2020 "hybrid will be the most common usage
   of the cloud" as more enterprises see the benefits of integrating
   public and private cloud infrastructures.

I personally don’t think that this reference to a Gartner report is very useful. By the time this draft is published, it will likely already be 2020. However, if you do want to use the reference, then it needs a citation in the References section so that someone can go look it up.

[Linda] removed the reference.


========
The division of the material in Section 3.1 “Interconnect to Cloud DCs” and Section 3.2 “Interconnect to Hybrid Cloud DCs” is confusing and seems somewhat arbitrary. The content of section 3.1 seems like it mainly applies to Hybrid Cloud DCs. At the same time, the observation in section 3.2 that “some enterprises prefer to instantiate their own virtual  CPEs/routers inside the Cloud DC to connect the workloads within the Cloud DC” doesn’t seem specific to Hybrid Cloud DCs. I would suggest reorganizing the content of these two sections.

[Linda] Section 3.1 is mainly about the same workloads being accessible via multiple connections (Internet, Direct Connect, etc.). It is important for enterprises to be able to observe the specific behaviors when connected via different connections.
How about changing the Section 3.1 title to “Multiple Connections to Workloads in a Cloud DC”?


========
Section 3.3 mentions three different approaches to interconnect workloads among different Cloud DCs. However, most of the discussion is about the third option (establishing direct tunnels among different VPCs via client's own virtual routers instantiated within Cloud DCs). It would be good to provide more detail on the first two options. Presumably the first option (utilizing Cloud DC provided transit gateways) is reasonable if the enterprise is using only one cloud provider. The current text is pretty dismissive of this option. The second option (hairpin all the traffic through the customer gateway) is not very clearly explained. If these different approaches are going to be discussed, there needs to be more detail.

[Linda] Added the following text to describe the issues associated with each of the bullets listed:

Approach a) usually does not work if Cloud DCs are owned and managed by different Cloud providers.
Approach b) adds transmission delay and incurs cost when traffic exits the Cloud DCs.
For Approach c), DMVPN or DSVPN uses NHRP (Next Hop Resolution Protocol) [RFC2735] so that spoke nodes can register their IP addresses and WAN ports with the hub node. The IETF ION (Internetworking over NBMA (non-broadcast multiple access)) WG standardized NHRP for network address resolution in connection-oriented NBMA networks (such as ATM) more than two decades ago.
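
The registration step described for Approach c) can be pictured with the following sketch. It only models the idea of spokes registering an address mapping with a hub and other spokes resolving it. It is not the NHRP message format or state machine, and the HubRegistry class, its method names and all addresses are hypothetical.

```python
class HubRegistry:
    """Toy hub-side table: spoke tunnel address -> spoke WAN (underlay) address."""

    def __init__(self):
        self._table = {}

    def register(self, tunnel_ip, wan_ip):
        """Record, or refresh, the WAN address a spoke is reachable at."""
        self._table[tunnel_ip] = wan_ip

    def resolve(self, tunnel_ip):
        """Return the registered WAN address for a spoke, or None if unknown."""
        return self._table.get(tunnel_ip)


# Two hypothetical spokes register with the hub.
registry = HubRegistry()
registry.register("172.16.0.2", "198.51.100.20")
registry.register("172.16.0.3", "203.0.113.30")

# A spoke asks the hub where to send traffic for 172.16.0.3.
next_hop = registry.resolve("172.16.0.3")
```

Once a spoke has resolved a peer's WAN address this way, it can build a direct tunnel to that peer instead of relaying through the hub, which is the behavior Approach c) relies on.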

========
Section 3.3
Existing text:
   There are many differences between virtual routers in Public Cloud
   DCs and the nodes in an NBMA network. NHRP & DSVPN are not cannot be
   used for registering virtual routers in Cloud DCs unless an
   extension of such protocols is developed for that purpose.

The current text simply asserts that NHRP and DSVPN cannot be used for this purpose. It seems like more detail is needed in the text. Does this conclusion also apply to DMVPN?
[Linda] Yes. Changed the text to the following:
NHRP cannot be used for registering virtual routers in Cloud DCs unless an extension of the protocol is developed for that purpose. Therefore, DMVPN and/or DSVPN cannot be used directly for connecting workloads in hybrid Cloud DCs.

========
Section 4
Existing text:
     - High availability at any time, whatever the duration of the
        connection to the cloud DC.
        Many enterprises include cloud infrastructures in their
        disaster recovery strategy, e.g., by enforcing periodic backup
        policies within the cloud, or by running backup applications in
        the Cloud, etc. Therefore, the connection to the cloud DCs may
        not be permanent, but rather needs to be on-demand.

This requirement is confusing. Is the requirement for the network connectivity to be highly available, or is the requirement that it be on-demand to support high availability in a cost-effective manner?
[Linda] Both. Changed to the following:

  *   High availability to access all workloads in the desired cloud DCs.


=========
Section 4
     - Elasticity and mobility, to instantiate additional applications
        at Cloud DCs when end-users' usages increase and shut down
        applications at locations when there are fewer end-users.
        Some enterprises have front-end web portals running in cloud
        DCs and database servers in their on-premises DCs. Those Front-
        end web portals need to be reachable from the public Internet.
        The backend connection to the sensitive data in database
        servers hosted in the on-premises DCs might need secure
        connections.

This seems like two different requirements in the same bullet point.

[Linda] changed the text to the following:


  *   Elasticity: prompt connection to newly instantiated applications at Cloud DCs when end-users’ usage increases, and prompt release of connections after applications at a location are removed when demand changes.


=============
Section 5. Problems with MPLS-based VPNs extending to Hybrid Cloud DCs

Existing text:
     - Most of the cloud DCs do not expose their internal networks, so
        the MPLS-based VPNs can only reach Cloud DC's Gateways, not to
        the workloads hosted inside.

This assertion seems to contradict the description of the AWS Direct Connect option described in Section 3.1.
If this is true, please provide more detail about why it is true in the context of a more complete description of VPCs.

[Linda] added this paragraph to explain:
Even with AWS DirectConnect, the connection only reaches the AWS DirectConnect Gateway.

=============

Section 5
There is something wrong with the formatting of the last list item.

In addition to the formatting, this last list item beginning with “Many cloud DCs use an overlay to connect their gateways ..” is very confusing. This should be expanded into a section with a full explanation and a figure to explain the problem, as opposed to just a bullet item.

[Linda] changed the bullet to “-  Extensive usage of Overlay by Cloud DCs”, and added the explanation.

=============
Figure 2.

Where is the “customer gateway” physically located in the Direct connect case?

What do TN-1, TN-2, TN-3, … TN-6 represent exactly?  Are they all part of the same VPC or different VPCs?

[Linda] Added the explanation. TN = Tenant Applications/workloads.
========