Re: [nvo3] Pick up the comment resolutions to Tsvart last call review of draft-ietf-nvo3-vmm-04
"B. Khasnabish" <vumip1@gmail.com> Fri, 16 August 2019 13:55 UTC
From: "B. Khasnabish" <vumip1@gmail.com>
Date: Fri, 16 Aug 2019 09:54:47 -0400
Message-ID: <CANtnpwjXKdhr5fEM0+WCtjvoJHCmc4Fg0ABCsvN7yf_ndnohCg@mail.gmail.com>
To: Linda Dunbar <linda.dunbar@futurewei.com>
Cc: Bob Briscoe <ietf@bobbriscoe.net>, "nvo3@ietf.org" <nvo3@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/zm4mgo1Kyc2_dPG-_TZ33FOdu9M>
Subject: Re: [nvo3] Pick up the comment resolutions to Tsvart last call review of draft-ietf-nvo3-vmm-04
Hi Linda,

Thanks, and I think these adequately answer Bob's concerns.

Can we set a timeline to resolve these in order to move this draft forward?

Once again, many thanks in advance.

Best,
Bhumip
+1-781-752-8003 (m) | vumip1@gmail.com
https://about.me/bhumip

On Tue, Aug 6, 2019 at 4:40 PM Linda Dunbar <linda.dunbar@futurewei.com> wrote:

> Bob,
>
> The NVO3 Chairs have designated me as the editor for this draft, which has
> been stalled in the TSVART LC review process for almost a year.
>
> The answers to your questions are marked by [Linda 2019] (hopefully you
> still remember them). Please let me know if they are good enough to move
> forward.
>
> -----------------------------------------------
>
> Re: [nvo3] [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
>
> Bob Briscoe <ietf@bobbriscoe.net> Tue, 18 September 2018 23:03 UTC
>
> > Linda,
> >
> > Until we can all understand the answers to the following two questions,
> > I don't think we can discuss what track this draft ought to be on, let
> > alone move on to your responses to all my other points.
> >
> > 1/ Applicability
> >
> > You say this draft solely applies to connections with both ends within
> > the controlled DC environment. But the draft says it's about
> > multi-tenant DCs. Are there any multi-tenant DCs that restrict all VMs
> > to only communicate with other VMs within the same controlled DC
> > environment?
>
> [Linda 2019] Yes. Many Cloud DCs host applications for multiple tenants.
> E.g., multiple departments of one organization instantiate their workloads
> in one Cloud DC. There are communications within one tenant (workloads
> belonging to one department), and communications among hosts belonging to
> different tenants (applications belonging to different departments).
>
> > 2/ Purpose of publishing as an RFC
> >
> > When I said:
> > > #. The introduction does not say what the purpose of publishing this
> > > draft is.
> > you responded:
> > > [Linda] The first paragraph on Page 3 has the description why VM
> > > Mobility is needed.
> >
> > Whether VM Mobility is needed was not my question. My question was what
> > is the purpose of the IETF publishing an RFC about VM Mobility? And
> > particularly, what is /this/ RFC intended to achieve?
>
> [Linda 2019] This RFC is intended to describe VMs (or applications) being
> moved from one rack to another and to describe the impact on the overlay
> networks.
>
> > Are the authors trying to argue for a particular approach vs. others?
> > Are you trying to write a tutorial? Are you trying to give the pros and
> > cons of different approaches? Are you trying to give advice on good
> > practice (with the implication that alternative practices are less
> > good)? Are you trying to clarify ideas by writing them down? Are you
> > trying to outline the implications of VM Mobility for other protocols
> > being developed within the NVO WG?
>
> [Linda 2019] NVO3 is developing protocols for IP overlay networks within
> data centers that host applications belonging to different tenants (or
> organizations). Applications don't stay in one place, as they do in
> traditional data centers. That is why it is necessary for NVO3 to have a
> document describing the behavior of VM mobility and its impact on the
> overlay networks.
>
> > Bob
> >
> > On 10/09/18 19:16, Linda Dunbar wrote:
> > > Bob,
> > > Thank you very much for reviewing the draft and providing in-depth
> > > comments. I am very sorry for the delayed response due to traveling.
> > > Replies to your comments are inserted below marked by [Linda]: > > > -----Original Message----- > > > From: Bob Briscoe [mailto:ietf@bobbriscoe.net] > > > Sent: Monday, September 03, 2018 9:45 PM > > > To: tsv-art@ietf.org > > > Cc: nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org > > > Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04 > > > Reviewer: Bob Briscoe > > > Review result: Not Ready > > > I have been selected as the Transport Directorate reviewer for this > > > draft. The Transport Directorate seeks to review all transport or > > > transport-related drafts as they pass through IETF last call and IESG > > > review, and sometimes on special request. The purpose of the review is > > > to provide assistance to the Transport ADs. For more information about > > > the Transport Directorate Reviews and the Transport Area Review Team, > > > please see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews > > > In this case, very very few of the review comments relate to transport > > > issues, although the greatest issue concerns a desire that the network > > > could pause or stop connections during L3 VM Mobility, which is > > > certainly a transport issue. > > > [Linda] There is “Hot Migration” with transport service continuing, > > > and there is a “Cold Migration”, which is a common practice in many > > > data centers, which stop the task running on the old place and move to > > > the new place before restart as described in the Task Migration. > > > Is it helpful to add this description to the draft? > > > ==Summary== > > > The technical aspects of the draft concerning L2 VM mobility (within a > > > subnet) seem sound. However, this is only part of the draft, which has > > > the following > > > issues: > > > #. The introduction does not say what the purpose of publishing this > > > draft is. 
> > > It seems that, rather than describing a specific protocol or > > > protocols, it intends to describe the overall system procedure that > > > would typically be used in DCs for VM mobility. It is tagged as a BCP, > > > but it does not say who needs this BCP, why it is useful for the IETF > > > to publish this BCP, how wide the authors' knowledge is of current > > > practice (given DCs are private), or why this is a BCP rather than a > > > protocol spec. > > > [Linda] The first paragraph on Page 3 has the description why VM > > > Mobility is needed. Is it helpful to move this paragraph to the > > > beginning of the Introduction Section? > > > /“//Virtualization which is being used in almost all of today’s data/ > > > /centers enables many virtual machines to run on a single physical/ > > > /computer or compute server. Virtual machines (VM) need hypervisor/ > > > /running on the physical compute server to provide them shared/ > > > /processor/memory/storage. Network connectivity is provided by the/ > > > /network virtualization edge (NVE) [RFC8014]. Being able to move VMs/ > > > /dynamically, or live migration, from one server to another allows for/ > > > /dynamic load balancing or work distribution and thus it is a highly/ > > > /desirable feature [RFC7364].//”/ > > > The draft starts out (S.3) as if it intends to say what a good VM > > > Mobility protocol should or shouldn't do, but the rest of the document > > > doesn't give any reasoning for these recommendations, it just asserts > > > what appears to be one view of how a whole VM Mobility system works, > > > sometimes referring to one example protocol RFC for a component part, > > > but more often with no references or details. > > > [Linda] Is it helpful to move the paragraph above to the beginning of > > > the Introduction Section? So that audience is aware of why VM Mobility > > > is needed. And then follow up with what a good VM Mobility protocol > > > should or shouldn't do? > > > #. 
It does not seem as if the NVO WG has discussed the purpose of > > > using normative text in this draft. See detailed comments. > > > [Linda] The “Intended status” of the draft is “Best Current Practice”. > > > So all the text are not “normative”. Is it Okay? > > > #. The draft silently slips back and forth between VM mobility and VM > > > redundancy, without recognizing the differences. See detailed comments. > > > [Linda] There is only one usage of “redundancy” in the entire > > > document, used under the context of “Hot standby option”, indicating > > > the “redundancy” of “the VMs in both primary and secondary domains > > > have identical information and can provide services simultaneously as > > > in load-share mode of operation” being expensive. > > > #. Please adopt different terminology than "source NVE" and > > > "destination NVE", which are really poor choices of terms for an > > > intermediate node. See detailed comments. Why not use "old NVE" and > > > "new NVE", which is what you mean? > > > [Linda] Thanks for the suggestion. We will change to “Old NVE”, and > > > “new NVE”. > > > #. Applicability is fairly clearly outlined, but it is not clear > > > whether hosts corresponding with the mobile VMs are part of the same > > > controlled environment or on the uncontrolled public Internet. See > > > detailed comments. > > > [Linda] “Hosts” are the App running on the VM. It is the under the > > > same controlled environment. Not on uncontrolled public internet. > > > #. Section 4.2.1 on L3 VM mobility reads like some potential > > > half-thought-through ideas on how to solve L3 mobility, rather than > > > current practice, let alone best current practice. Either current > > > practice should be described instead, or the scope of the draft should > > > be narrowed solely to L2 VM mobility. See detailed comments. > > > [Linda] This is refereeing to “Cold Migration”, which is a common > > > practice in many data centers. 
> > > # The VM's file system is described as state that moves with the VM > > > (S.6), but VM mobility solutions often move the VM but stitch it back > > > to its (unmoved) storage. Conversely, the storage can also move > > > independent of the VM. > > > [Linda] It depends. When a VM move to a different zone, the > > > storage/file can becomes inaccessible. > > > #. The draft omits some of the security, transport and management > > > aspects of VM mobility. See detailed comments. > > > [Linda] Can you provide some text? > > > #. The draft reads as if different sections have been written by > > > different authors and no-one has edited the whole to give it a > > > coherent structure, or to ensure consistency (both technical and > > > editorial) between the parts. See detailed comments. > > > [Linda] we can improve. > > > #. The quality of the English grammar does not allow a reviewer to > > > concentrate on the technical aspects rather than the English. It would > > > have been useful if one of the English-speaking co-authors had > > > improved the English before submission for review. See detailed comments. > > > [Linda] can you help? Becoming a co-author to improve? > > > ==Detailed Comments== > > > ===#. Normative statements=== > > > In the body of the document, there is just one occurrence of normative > > > text (actually two "MUST"s, but both state a common requirement - just > > > written separately for IPv4 and IPv6). This merely serves to imply > > > that everything else the document says is less important or optional, > > > which was probably not the intention. > > > [Linda] The goal is to indicate any solution in moving the VM “MUST” > > > follow this rule. They make sense, aren’t they? > > > At the start there is a requirements section, which states what a VM > > > Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is > > > intended as a set of goals for the rest of the document. 
If so, these > > > "SHOULDs" are not intended to apply to implementations, so they ought > > > not to be capitalized. > > > [Linda] okay, will change. > > > The first requirement, "Data center network SHOULD support virtual > > > machine mobility in IPv6", is written as a requirement on all DC > > > networks, not on implementations. I assume this was intended to read > > > as "Data center network virtual machine mobility protocols SHOULD > > > support IPv6". Even then, it doesn't really add anything to say VM > > > mobility should support v6 and it should support v4. A L2 solution > > > won't. While undoubtedly, a L3 solution will at least support one of > them. > > > [Linda]Agree. Will change it to “Data center that support IPv6 address > > > should …” > > > I'm not sure that 'protocol' is the right word anyway; I think 'VM > > > Mobility procedure' would be a better phrase, because it includes > > > steps such as suspending the VM, which is more than a protocol. > > > [Linda] yes. Will change to “Procedure”. > > > The requirement "Virtual machine mobility protocol MAY support host > > > routes to accomplish virtualization", is not followed up at all in the > > > rest of the draft. > > > Even if this requirement stays, the last 3 words should be deleted. > > > [Linda] will change to “Host Route can be used to support the Virtual > > > Machine Mobility Procedure.” > > > By the end of the draft, the solution falls far short of the most > > > relevant "Requirements" anyway, so one assumes the title of the > > > section ought to have been "Goals". Specifically, even in the simpler > > > case of L2 VM mobility, S.4.1 says that triangular routing and > > > tunnelling persist "until a neighbour cache entry times out". 
A cache > > > timeout is about 10 orders of magnitude longer than the requirement to > > > only persist "while handling packets in flight", which would be a few > > > milliseconds at most (the time for packets to clear the network that > > > were already launched into flight when the old VM stopped). > > > Whatever, it would be preferable for the draft to give rationale for > > > these requirements, rather than just assert them. This would help to > > > shed light on the merits of the different trade offs that solutions > > > choose. > > > [Linda] Agree, will add. > > > ===#. Mobility vs. Redundancy=== > > > Redundancy and mobility have a lot of similarities, but they have > > > different goals. With mobility, it is necessary to know the exact > > > instant when one set of state is identical to the other so it can hand > > > over. With redundancy, the aim is to keep two (or more) sets of state > > > evolving through the same sequence of changes, but there is no need to > > > know the point at which one is the same as the other was at a certain > > > point. > > > [Linda] Agree with what you said. There is only one usage of > > > “redundancy” in the entire document, used under the context of “Hot > > > standby option”, indicating the “redundancy” of “the VMs in both > > > primary and secondary domains have identical information and can > > > provide services simultaneously as in load-share mode of operation” > > > being expensive. > > > The draft slips from mobility to resilience in the following places: > > > * S.2. Terminology: Warm VM Mobility is defined without any ending, as > > > if it is permanent replication. * S.7. "Handling of Hot, Warm and Cold > > > Virtual Machine Mobility" is actually all about redundancy, and > > > doesn't address mobility explicitly. > > > [Linda] Will add the definition “Hot Migration”, “cold migration”, and > > > “warm migration”. > > > ===#. 
Terminology=== > > > Packets run from the source at A to the destination at B via NVE1, > > > then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the > > > destination NVE. > > > In future, no-one will thank you for the apparent contradictions when > > > they continually stumble over phrases like this one in S.4.1: "...send > > > their packets to the source NVE". > > > The term "packets in flight" is used incorrectly to refer to all the > > > packets sent to the old NVE after the VM has moved, even if they were > > > launched into flight long after the old VM stopped receiving packets. > > > [Linda] thank for the comments. Will change. > > > BTW, I think s/before/after/ in: "that have old ARP or neighbor cache > > > entry before VM or task migration". > > > I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because > > > "based" > > > sounds (to me) like the mobility control protocol is over (i.e. based > > > on) IP. > > > ===#. Applicability=== > > > In section 4.2 it says that the protocol mostly used as the IP based > > > task migration protocol is ILA. This implies that all hosts > > > corresponding with the mobile VMs are either part of the same > > > controlled environment, or they are proxied via nodes that are part of > > > the same controlled environment (I only have passing knowledge of ILA, > > > but I understand that it depends on ILA routers on the path). If I am > > > correct, this aspect of scope needs to be made clear from the start. > > > Also under the heading of applicabiliy, the sentence "Since migrations > > > should be relatively rare events" appears very late in the document > > > (S.4.2.1). The assumed level of churn ought to be stated nearer the > start. > > > [Linda] yes, under the same controlled environment. > > > ===#. L3 Mobility=== > > > L2 VM mobility is independent of the application, because resolution > > > of L2 mappings is delegated to the stack. 
In contrast, L3 VM mobility > > > is only feasible under certain conditions, because an application > > > needs an IP address to open a socket (resolution of DNS names is not > > > delegated to the stack, and apps can use IP addresses directly anyway). > > > Examples of the 'certain conditions': > > > a) /All/ applications used in the whole DC load balancing scheme > > > contain IP address migration logic for /all/ their connections; b) VMs > > > running solely applications that support IP address migration register > > > this fact with the NVA, and it only select such VMs for mobility. c) > > > An abstraction is layered over /all/ the IP addresses exposed to > > > applications (at both ends) so that the IP addresses that applications > > > use are solely identifiers (e.g. ILA, LISP, HIP), not also locators. > > > The introduction says the draft is about VM mobility in a multi-tenant > > > DC, so the DC admin will not know the range of applications being > > > used. This excludes condition (a) above. When the draft says "...if > > > all applications running are known to handle this gracefully...", it > > > doesn't quantify just how restrictive this condition is, and it gives > > > no explanation of how this knowledge might be 'known' or which > > > function within the system 'knows' it. > > > S.4.2.1 contains what seems like plenty of arm-waving. > > > * "TCP connections could be automatically closed in the network stack > > > during a migration event." > > > o There is no TCP connection state in the network stack. > > > o Even if the network starts to drop every packet, the TCP > > > connection > > > state persists in the end-points for a duration of the order > > > of 30-90 > > > minutes (OS-dependent) before TCP deems the connection is > > > broken. o > > > Other transport protocols have similar designs (including the > > > app-layer > > > of protocols over UDP). 
> > > * "More involved approach to connection migration": > > > o pausing the connection [does this refer to an actual feature > > > of any > > > L4 protocol?] o packaging connection state and sending to > > > target [does > > > this assume logic written into the application, or is this > > > assuming the > > > stack handles this and the app is restricted to using some form > of > > > separate identifier/locator addresses?] o instantiating > connection > > > state in the peer stack [ditto?]. > > > There's some arm-waving in S.7 too: > > > "Cold Virtual Machine mobility is facilitated by the VM initially > > > sending an ARP or Neighbor Discovery message at the destination NVE > > > but the source NVE not receiving any packets inflight." > > > [How is it arranged for the source NVE not to receive any packets > > > in flight?] > > > And in S.7: > > > "In hot > > > standby option, regarding TCP connections, one option is to start > > > with and maintain TCP connections to two different VMs at the same > > > time." > > > [This sounds like resilience logic has been written into the > > > application, > > > which would be a special case but not something VM mobility > > > infrastructure > > > could depend on.] > > > [Linda] will add. > > > ===#. Gaps=== > > > #. Security Considerations: repeats issues in other drafts that are > > > not specific to mobility, but it does not mention any security issues > > > specifically due to VM mobility. It says that address spoofing may > > > arise in a DC (sort-of implying it is worse than in non-DC > > > environments, but not saying why). The handshake at the start of a > > > connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing. > > > So L3 VM mobility would be more vulnerable to source address spoofing > > > in cases where the mobile VM was the connection initiator and there > > > was not a new handshake after the move. 
However, this draft does not > > > contain any detailed mobility protocols, so it is not possible to > > > identify any specific security flaws. > > > #. Transport Issues: Effect of delay on the transport: Cold mobility > > > introduces significant delay, and other forms less, but still some > > > delay. It should be pointed out that some applications (e.g. > > > real-time) will therefore not be useful if subjected to VM mobility. > > > Similarly, even a short period of delay will drive most congestion > > > controls to severely reduce throughput. These points might be > > > self-evident, but perhaps they should be stated explicitly. > > > BTW, in the L3 VM mobility case, the draft often refers to TCP > > > connections, but the address bindings of any transport protocols would > > > have to be migrated due to VM mobility (e.g. SCTP; sequences of > > > datagrams over UDP; streams over UDP such as with RTP, QUIC). > > > #. Management Issues: perhaps the draft ought to recommend statistics > > > gathering (e.g. time taken, amount of duplicate data) to aid a DC's > > > future decisions on the cost-benefit of moving a VM. The OPSDIR review > > > says a BCP does not /have/ to describe management issues, but this > > > document seems to describe a whole system procedure, not just a > > > protocol, which then surely includes the management plane. > > > [Linda] can you become a co-author and add those in? > > > ===#. Incoherent Structure=== > > > S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk > > > about tasks moving, but this is not the distinguishing aspect of these > > > two sections (anyway, S.2. says "the draft uses task and VM > > > interchangeably"): * "4.1 VM Migration" is about "L2 VM Mobility" so > > > this ought to be the section heading, * > > > "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the > > > section heading. 
It would also help not to switch from VM to task > > > across these sections > > > - it's just a distraction. > > > S.4.1 needs better signposting of where each sub-case ends > > > (Subsections might be useful to solve this): * IPv4 * end-user client > > > * 2 paras starting "All NVEs communicating with this virtual > > > machine..." [Not clear that the end-user case has ended and we have > > > returned to the general IPv4 case?] * IPv6 [Strictly, it still hasn't > > > said whether the end-user client case has ended.] [Also, it doesn't > > > explain why there is no need for an end-user client case under IPv6?] > > > Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas > > > Sections 7 & > > > 8 seem to be restricted to L2. > > > The draft vacillates over what to do with packets arriving at the old > > > NVE in the L3 case (see also L3 mobility above): * S4.2 first says > > > packets are dropped, possibly with an ICMP error message; > > > o then later it says they are silently dropped; > > > o then in the very next sentence it says either silently drop them > > > or forward > > > them to the new location > > > * S.5 says they should not be lost, but instead delivered to the > > > destination hypervisor > > > o then it describes how they are tunnelled (which is not the same as > > > "forwarding"). > > > The order in which all the stages of mobilty are given is jumbled up > > > across sections that also appear in arbitrary order: * S.5 prepares, > > > establishes uses then stops a tunnel, but it doesn't say where the > > > other stages fit between these steps > > > o When tunneling packets, it talks about the *migrating* VM > > > not the > > > *migrated* VM, which implies tunnelling has started before the > > > new VM > > > is running. Does this imply there is a huge buffer? o It says > > > "Stop > > > Tunneling Packets - When source NVE stops receiving packets > > > destined > > > to..." 
but it is never clear when a source has stopped sending > > > packets > > > to a destination, unless it explicitly closes the connection > > > (e.g. with > > > a FIN in the case of TCP). Often there are long gaps between > > > packets, > > > because many flows are 'thin' (meaning the application > > > frequently has > > > nothing to send). These gaps can last for milliseconds, hours > > > or even > > > days without any implication that the connection has ended. > > > * Then S.6. describes moving state, but doesn't say that this is not > > > after the previous tunnelling steps (or where it fits within those > > > steps). * Then S.7 describes hot, warm and cold mobility, but doesn't > > > lay out the tunnelling or steps to move state in each case. * Then S.8 > > > says it's about VM life-cycle, but just gives the very first 3 steps > > > for allocation of resources to a VM, then abruptly ends, without even > > > starting the VM, let alone getting to move it. > > > S.5 exhibits another inconsistency by talking about the hypervisor, > > > not the NVE. > > > ==#. Nits== > > > Nits with the English are too numerous to mention them all. Below are > > > pointers to general problems as well as some individual instances. > > > S.4 > > > "Layer 2 and Layer 3 protocols are described next. In the following > > > sections, we examine more advanced features." > > > s/following/subsequent/ > > > S.4.1 > > > Expand WSC, MSC and NVA on first use. > > > s/the VM moves in the same link/the VM moves in the same subnet/ > > > "i.e. end-user clients ask for the same MAC address upon migration. > > > [...] to ensure that the same IPv4 address is assigned to the VM." I > > > think s/IPv4/MAC/ was intended? > > > " All NVEs communicating with this virtual machine uses the old ARP > > > entry. If any VM in those NVEs need to talk to the new VM in the > > > destination NVE, it uses the old ARP entry." > > > Repetition: these 2 sentences say the same. 
> > > (The mistake is also repeated when these 2 sentences are repeated for
> > > IPv6).
> > > S.4.2.1
> > > s/Push the new mapping to hosts./Push the new mapping to communicating
> > > hosts./
> > > S.5.
> > > The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling
> > > packets" only differ in the words "IPv4"/"IPv6". So in each case a
> > > single para could be given for IP (irrespective of whether v4 or v6).
> > > Thank you very much.
> > > Linda Dunbar
> >
> > --
> > ________________________________________________________________
> > Bob Briscoe                               http://bobbriscoe.net/