Re: [Tsv-art] TSV-ART review of draft-ietf-nvo3-mcast-framework-09

Linda Dunbar <linda.dunbar@huawei.com> Thu, 05 October 2017 19:56 UTC

From: Linda Dunbar <linda.dunbar@huawei.com>
To: Colin Perkins <csp@csperkins.org>
CC: "draft-ietf-nvo3-mcast-framework@ietf.org" <draft-ietf-nvo3-mcast-framework@ietf.org>, IETF Discussion <ietf@ietf.org>, "tsv-art@ietf.org" <tsv-art@ietf.org>, "tsv-ads@tools.ietf.org" <tsv-ads@tools.ietf.org>
Thread-Topic: TSV-ART review of draft-ietf-nvo3-mcast-framework-09
Thread-Index: AQHTMyz2QyUp7jiWVEWK2U7HbvpcuqLBFsfQgATVlgCAD8mQ4A==
Date: Thu, 05 Oct 2017 19:55:51 +0000
Message-ID: <4A95BA014132FF49AE685FAB4B9F17F6594A6E3D@SJCEML702-CHM.china.huawei.com>
References: <21AF4F15-19E9-4E34-89C7-8E3E22017878@csperkins.org> <4A95BA014132FF49AE685FAB4B9F17F65949DEE7@SJCEML702-CHM.china.huawei.com> <8F309687-6224-4991-A346-F021ECF11B7A@csperkins.org>
In-Reply-To: <8F309687-6224-4991-A346-F021ECF11B7A@csperkins.org>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_4A95BA014132FF49AE685FAB4B9F17F6594A6E3DSJCEML702CHMchi_"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/HKRCkPEDgMkdBWewv9mojtdlSrQ>
Subject: Re: [Tsv-art] TSV-ART review of draft-ietf-nvo3-mcast-framework-09
Precedence: list

Replies to your comments are inserted below (in purple).

Linda

From: Colin Perkins [mailto:csp@csperkins.org]
Sent: Monday, September 25, 2017 6:09 AM
To: Linda Dunbar <linda.dunbar@huawei.com>
Cc: draft-ietf-nvo3-mcast-framework@ietf.org; IETF Discussion <ietf@ietf.org>; tsv-art@ietf.org; tsv-ads@tools.ietf.org
Subject: Re: TSV-ART review of draft-ietf-nvo3-mcast-framework-09

Linda,

Please see inline.
Colin

On 22 Sep 2017, at 17:49, Linda Dunbar <linda.dunbar@huawei.com> wrote:

Colin,

Thank you very much for reviewing the document.
Replies to your comments are inserted below:

-----Original Message-----
From: Colin Perkins [mailto:csp@csperkins.org]
Sent: Thursday, September 21, 2017 5:57 PM
To: draft-ietf-nvo3-mcast-framework@ietf.org; IETF Discussion <ietf@ietf.org>
Cc: tsv-art@ietf.org; tsv-ads@tools.ietf.org
Subject: TSV-ART review of draft-ietf-nvo3-mcast-framework-09

Hi,

I’ve reviewed this document as part of the transport area review team's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised. When done at the time of IETF Last Call, the authors should consider this review together with any other last-call comments they receive. Please always CC tsv-art@ietf.org if you reply to or forward this review.

Summary:
This draft is on the right track but has open issues, described in the review.

Comments:
Overall this seems like a reasonably clearly written draft that describes the problem space well, and outlines reasonable possible solutions. It seems to be on the right track, but there are a couple of transport-related issues that ought to be highlighted.

The major transport-related issue would seem to be congestion. Section 3.2 discusses this in terms of the load on the network generated by having to send multiple copies of packets when emulating multicast. This is good, and well written. However, it may be appropriate to explicitly mention that generating multiple copies of the packets can cause congestion that would not be present if native multicast were used (to be clear, this is not suggesting a new problem or requiring new solutions, just asking for an explicit statement that the replication can cause network congestion).
[Linda] The whole purpose of the paragraph is to emphasize the issue of generating multiple copies of packets. The draft emphasizes that this replication in a DC wastes more bandwidth than an MPLS VPLS service (which also uses replication) because the amount of replication is much higher:
This method requires multiple copies of the same packet to all NVEs that participate in the VN.  If, for example, a tenant subnet is spread across 50 NVEs, the packet would have to be replicated 50 times at the source NVE.  This also creates an issue with the forwarding performance of the NVE.
---
Therefore, the Multicast VPN solution may not scale in DC environment with dynamic attachment of Virtual Networks to NVEs and greater number of NVEs for each virtual network.

Yes, I understand. However, as I said, it would be appropriate to explicitly mention that generating multiple copies of the packets can cause congestion that would not be present if native multicast were used, rather than describe congestion while avoiding saying the word “congestion”.

[Linda] Actually, sending more copies of the same packet, especially multicast packets, doesn’t necessarily cause congestion (in fact it rarely does). Congestion only occurs when link utilization approaches 100%, since most switches/routers today can forward at wire speed. The rule of thumb when deploying a network is 50% link utilization. Application-based multicast is less than 2% of total traffic. Anyway, to make you happy, I inserted one sentence in the following original paragraph (in purple):
This method requires multiple copies of the same packet to all NVEs that participate in the VN.  If, for example, a tenant subnet is spread across 50 NVEs, the packet would have to be replicated 50 times at the source NVE.  Obviously, this approach creates more traffic to the network that can cause congestion when the network load is high. This also creates an issue with the forwarding performance of the NVE.
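
To make the replication arithmetic in the paragraph above concrete, here is a minimal sketch that counts how many copies of one tenant multicast packet cross each underlay link, with source replication versus native multicast. The topology (a single uplink from the source NVE to a spine that reaches 50 egress NVEs), the node names, and the `link_load` helper are all hypothetical and are not taken from the draft.

```python
from collections import Counter

def link_load(paths, native_multicast):
    """Count packet copies per underlay link for one tenant multicast packet.

    paths: one path per egress NVE, each a list of node names that starts
           at the source NVE (hypothetical topology, for illustration only).
    """
    load = Counter()
    for path in paths:
        for link in zip(path, path[1:]):
            if native_multicast:
                # Native multicast: at most one copy per link; the network
                # replicates at branch points.
                load[link] = 1
            else:
                # Source replication: one unicast copy per egress NVE
                # crosses every link on that NVE's path.
                load[link] += 1
    return load

# Hypothetical fabric: source NVE -> spine -> 50 egress NVEs.
paths = [["nve-src", "spine", f"nve-{i}"] for i in range(50)]
replicated = link_load(paths, native_multicast=False)
native = link_load(paths, native_multicast=True)

uplink = ("nve-src", "spine")
print(replicated[uplink], native[uplink])  # 50 1
```

In this toy topology the source NVE's uplink carries 50 copies under source replication and one under native multicast. Whether that extra load turns into congestion depends on how much headroom the link has, which is the point under discussion.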

On a related note, the penultimate paragraph of Section 3.3 could usefully mention that overload of the MSN could result in packet loss that will appear as congestion to the endpoints.

[Linda] At the end of the section, we have the following text describing the overload (a.k.a. scaling issue) of this approach. Is it good enough?
However, there remain issues with multiple copies of the same packet on links that are common to the paths from the MSN to each of the egress NVEs.  Additional issues that are introduced with this method include the availability of the MSN, methods to scale the services offered by the MSN, and the sub-optimality of the delivery paths.

No, since it says nothing about the impact of this on the endpoints. I think you need an explicit statement that “Overload of the MSN could result in packet loss that will appear as congestion to the endpoints”. The point is to explain how these effects are visible to applications using multicast.

[Linda] I think it is more appropriate to say that “Additional issues that are introduced with this method include the availability of the MSN, methods to scale the services offered by the MSN, and the sub-optimality of the delivery paths” than “… congestion to the endpoints”. If the MSN is not functioning correctly due to overload, there could be many issues. It is beyond the scope of this document to describe what would “appear to the endpoints”.

One congestion related issue that is not discussed, and potentially affects Sections 3.2 and 3.3, is that multicast congestion control algorithms based on asynchronous layered coding (ALC) [RFC5775] perform rate adaptation by being able to prune back the rate sent across certain branches of the multicast distribution tree. An example is the FLUTE protocol [RFC6726]. While I expect these congestion control protocols are safe to use in source-replicated or MSN-replicated scenarios, they’ll certainly have sub-optimal performance in such overlays, and will likely work better if native multicast is used in the overlay. It might be appropriate to highlight this, since the large-scale presence of applications using such congestion control schemes may drive the choice of multicast support mechanism in the overlay.

[Linda] “Congestion Control” for multicast traffic is out of the scope of this draft. Maybe there should be a separate ID to describe the “congestion control for multicast traffic”.

Defining multicast congestion control algorithms is, of course, out of scope.

Noting the potential impact of NVO3 on existing IETF standards for multicast congestion control cannot be out of scope, however. I suggest adding text to say something like: “The use of mechanisms that emulate multicast using multiple unicast flows, as described in Sections 3.2 and 3.3, has the potential to interact poorly with applications using multicast congestion control schemes, such as those built on Asynchronous Layered Coding [RFC5775], that rely on being able to prune back branches of the multicast distribution tree for rate adaptation. Care should be taken in the choice of multicast mechanisms for deployments that need to support such applications.”

The draft makes no mention of ECN. It could usefully cite draft-ietf-tsvwg-ecn-encap-guidelines-09 and highlight that the encapsulation mechanism chosen needs to support ECN if the multicast flows being encapsulated make use of ECN. Similarly, if the multicast traffic sets the DSCP (DiffServ) bits, this may need support from the overlay. Both points could potentially be noted after the first paragraph of Section 3, where encapsulation options are listed.

[Linda] Again, “Congestion Control” is out of the scope of this draft. To discuss ECN for multicast traffic, there should be a separate ID.

This is not about multicast congestion control, it’s about correctly supporting IP features in the multicast overlay, so others can define multicast congestion control mechanisms.

If multicast applications that use ECN are to be supported, then the NVO3 overlay needs to support ECN. To do that, the tunnelling encapsulation used to deliver multicast traffic needs to support [draft-ietf-tsvwg-ecn-encap-guidelines-09]. This draft would seem the logical place for a statement to that effect. Similarly, if multicast applications that set the DSCP bits in the IP header are to be used, this likely also needs support in the tunnelling encapsulation, and this draft should highlight that such support is needed.

Note: I’m not suggesting that the tunnelling encapsulation needs to support ECN. I’m suggesting the draft needs to say that *if* the tunnelling encapsulation wants to support applications using ECN, then it must do so according to the ECN encapsulation guidelines (similarly, for DSCP).
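
As an illustration of what "supporting ECN in the encapsulation" involves, the sketch below models how a tunnel ingress and egress could propagate the two-bit ECN field. It assumes the overlay behaves like an IP-in-IP tunnel following the normal-mode rules of RFC 6040; that is an assumption made for this example, not something the draft under review specifies, and the function names are hypothetical.

```python
# ECN codepoints as defined for the IP header (RFC 3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

def encapsulate(inner_ecn):
    """Tunnel ingress, normal mode: copy the inner ECN field to the outer header."""
    return inner_ecn

def decapsulate(inner_ecn, outer_ecn):
    """Tunnel egress: combine inner and outer ECN fields.

    Returns the ECN field of the forwarded packet, or None if the packet
    must be dropped (assumed RFC 6040-style behaviour).
    """
    if outer_ecn == CE:
        # Congestion was marked inside the tunnel. An inner packet that is
        # not ECN-capable cannot carry the mark, so it is dropped instead.
        return None if inner_ecn == NOT_ECT else CE
    if inner_ecn == ECT0 and outer_ecn == ECT1:
        # An ECT(1) outer marking is propagated onto an ECT(0) inner packet.
        return ECT1
    # Otherwise the inner marking is preserved.
    return inner_ecn

# A CE mark applied in the underlay reaches an ECN-capable multicast receiver.
outer = encapsulate(ECT0)
print(decapsulate(ECT0, CE) == CE)   # True
print(decapsulate(NOT_ECT, CE))      # None (drop)
```

An encapsulation that simply discards the outer header at the egress would lose the CE mark in the first case, and the congestion signal would never reach the application.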

Editorial Nit:
Section 1.1: Please expand the acronym “TS” on first use (it’s not expanded until Section 1.3, but is used in Section 1.1).
[Linda] This draft assumes all the terminology specified by the NVO3 architecture (RFC 8014), where TS (Tenant System) is defined.

I agree. However, it’s still good practice to expand acronyms on first use.

[Linda] Actually, the first paragraph of Section 1.3 (Terminology Clarification) has:
In this document, the terms host, tenant system (TS) and virtual machine (VM) are used interchangeably to represent an end station that originates or consumes data packets.

And the definition of TS is right there in Section 1.3 (Terminology Clarification):
TS: Tenant system

Linda

--
Colin Perkins
https://csperkins.org/
