Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Behcet Sarikaya <sarikaya2012@gmail.com> Fri, 14 September 2018 14:25 UTC

Return-Path: <sarikaya2012@gmail.com>
X-Original-To: tsv-art@ietfa.amsl.com
Delivered-To: tsv-art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0B605130E82; Fri, 14 Sep 2018 07:25:41 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.749
X-Spam-Level:
X-Spam-Status: No, score=-1.749 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=no autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fFbgVWuPW_Xp; Fri, 14 Sep 2018 07:25:35 -0700 (PDT)
Received: from mail-wr1-x42f.google.com (mail-wr1-x42f.google.com [IPv6:2a00:1450:4864:20::42f]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 081A2130F16; Fri, 14 Sep 2018 07:25:35 -0700 (PDT)
Received: by mail-wr1-x42f.google.com with SMTP id v90-v6so10844855wrc.0; Fri, 14 Sep 2018 07:25:34 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:reply-to:from:date:message-id :subject:to:cc; bh=/dZBP+R9PZ5UVh2JWYxbDVtkF7rm4O0o0i3nqvqvJU0=; b=gWrH4h3Odww3UZQPncSzjwzZAbWNdcvxCfAFWsV0QXWQJ6VK+W3VlGQnnKeBBIWZkQ l5h2LFE0nesU71Mx9+OzgqnbRvxW65phVfvyzIVdT5krms2caByJvydH8B+ESnVW/2+6 q4cCPJRgQUXgH8qGK/Xk21b7z/5jIC/uI90Z2FNqaehrR00OXxdPcs07CA7rS6H2wVkE 7y/ol9fc9bakxbGCZq108VQhAOEobxAYH6jRsYMVC6G/ONuxj8WmDv4Eoj0grM2BKUPR zr/9lQ5FDVJ/gQiSr4I92fGOcovbAjOzesDjNVpCDOBuFb8l9BSi4TQdFQbdvnFXARAG xYcw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:reply-to :from:date:message-id:subject:to:cc; bh=/dZBP+R9PZ5UVh2JWYxbDVtkF7rm4O0o0i3nqvqvJU0=; b=nuWu+TC9+IlO4lQWeFqHRENdpg+3w3qTu75y68gARMe026heyGYbK1rL2ZBU+DDNQc ydtRZDf47CYWkIw8VKQ8Tzx0tC26Sp9eCysHrjI3rdD7O8wr/GJDrBRPbtuzEFxRVwny dV3V7+dzTMUmFbhm6N+utkfxVQxiRsrNkxD4To71bSSI8SkZcuCq1a2eL1W0oj3dkarW k1uU518NmOtqnfDjaxSUUANv4QmKdqb8oBM+BVnGDn/fP7+hl8JlYYN3hj4A6m+dYn+v kDI5HChUaEegpDvc+/OOp1J+CMxOqsptz8pQ2i+bbY9ZZou9LllTQty9NWblOuZn0Vhc T7FA==
X-Gm-Message-State: APzg51CLU9h7a2qpVZfx7ZIKWPVqvr/ZJ+nWWJoiT5wLqsKPUZmMxDPC /zpshY3FOVYyxrkSyVBYjzwb8TyLSw933kiYwcc=
X-Google-Smtp-Source: ANB0VdZ3oQ7BbOZMZGNQmz/8NR03vZ/9iNaTCDW5XlweGOK6+9/zAPSUwvGeY1++y8gT3xsQiPcierWWm2XIpohGejw=
X-Received: by 2002:a5d:4512:: with SMTP id s18-v6mr10298909wrq.82.1536935133311; Fri, 14 Sep 2018 07:25:33 -0700 (PDT)
MIME-Version: 1.0
References: <153602909285.13281.13763046029400746910@ietfa.amsl.com> <4A95BA014132FF49AE685FAB4B9F17F66B139743@sjceml521-mbs.china.huawei.com> <CE03DB3D7B45C245BCA0D2432779493630287A13@MX307CL04.corp.emc.com>
In-Reply-To: <CE03DB3D7B45C245BCA0D2432779493630287A13@MX307CL04.corp.emc.com>
Reply-To: sarikaya@ieee.org
From: Behcet Sarikaya <sarikaya2012@gmail.com>
Date: Fri, 14 Sep 2018 09:25:21 -0500
Message-ID: <CAC8QAcc=BxW8Zd3u7Ko8rWXemJ-XiL7uU8Gt-ZVyEEMAqd3G9g@mail.gmail.com>
To: David.Black@dell.com
Cc: Linda Dunbar <linda.dunbar@huawei.com>, ietf@bobbriscoe.net, tsv-art@ietf.org, NVO3 <nvo3@ietf.org>, IETF <ietf@ietf.org>, draft-ietf-nvo3-vmm.all@ietf.org
Content-Type: multipart/alternative; boundary="000000000000a9f8e30575d5980f"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/NKCRz6yo1T8YJx89nnyZXMeg3bI>
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
X-BeenThere: tsv-art@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Transport Area Review Team <tsv-art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tsv-art>, <mailto:tsv-art-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tsv-art/>
List-Post: <mailto:tsv-art@ietf.org>
List-Help: <mailto:tsv-art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tsv-art>, <mailto:tsv-art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 14 Sep 2018 14:25:49 -0000

On Thu, Sep 13, 2018 at 8:03 PM Black, David <David.Black@dell.com> wrote:

> > #. It does not seem as if the NVO WG has discussed the purpose of using
> normative text in this draft. See detailed comments.
>
>
>
> > [Linda] The “Intended status” of the draft is “Best Current Practice”.
> So all the text are not “normative”. Is it Okay?
>
>
>
> Not really – this draft might be better targeted as “Informational” as it
> is not a comprehensive review of current practice (best or otherwise) nor
> an overall set of recommendations, e.g., as Bob wrote “it just asserts what
> appears to be one view of how a whole VM Mobility system works.”
>
>
>


At present we have no intention of changing the intended status because
that decision should be deferred until IESG Review where we expect to
receive an authoritative view.

Behcet

> Thanks, --David
>
>
>
> *From:* Tsv-art [mailto:tsv-art-bounces@ietf.org] *On Behalf Of *Linda
> Dunbar
> *Sent:* Monday, September 10, 2018 2:17 PM
> *To:* Bob Briscoe; tsv-art@ietf.org
> *Cc:* nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org
> *Subject:* Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
>
>
>
> Bob,
>
>
>
> Thank you very much for reviewing the draft and provided in-depth
> comments. I am very sorry for the delayed response due to traveling.
>
>
>
> Replies to your comments are inserted below marked by [Linda]:
>
>
>
>
>
> -----Original Message-----
> From: Bob Briscoe [mailto:ietf@bobbriscoe.net <ietf@bobbriscoe.net>]
> Sent: Monday, September 03, 2018 9:45 PM
> To: tsv-art@ietf.org
> Cc: nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org
> Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>
>
>
> Reviewer: Bob Briscoe
>
> Review result: Not Ready
>
>
>
> I have been selected as the Transport Directorate reviewer for this draft.
> The Transport Directorate seeks to review all transport or
> transport-related drafts as they pass through IETF last call and IESG
> review, and sometimes on special request. The purpose of the review is to
> provide assistance to the Transport ADs. For more information about the
> Transport Directorate Reviews and the Transport Area Review Team, please
> see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>
>
>
> In this case, very very few of the review comments relate to transport
> issues, although the greatest issue concerns a desire that the network
> could pause or stop connections during L3 VM Mobility, which is certainly a
> transport issue.
>
>
>
> [Linda] There is “Hot Migration” with transport service continuing, and
> there is a “Cold Migration”, which is a common practice in many data
> centers, which stop the task running on the old place and move to the new
> place before restart as described in the Task Migration.
>
> Is it helpful to add this description to the draft?
>
>
>
>
>
> ==Summary==
>
>
>
> The technical aspects of the draft concerning L2 VM mobility (within a
> subnet) seem sound. However, this is only part of the draft, which has the
> following
>
> issues:
>
>
>
> #. The introduction does not say what the purpose of publishing this draft
> is.
>
> It seems that, rather than describing a specific protocol or protocols, it
> intends to describe the overall system procedure that would typically be
> used in DCs for VM mobility. It is tagged as a BCP, but it does not say who
> needs this BCP, why it is useful for the IETF to publish this BCP, how wide
> the authors' knowledge is of current practice (given DCs are private), or
> why this is a BCP rather than a protocol spec.
>
>
>
> [Linda] The first paragraph on Page 3 has the description why VM Mobility
> is needed. Is it helpful to move this paragraph to the beginning of the
> Introduction Section?
>
> *“Virtualization which is being used in almost all of today’s data*
>
> *centers enables many virtual machines to run on a single physical*
>
> *computer or compute server. Virtual machines (VM) need hypervisor*
>
> *running on the physical compute server to provide them shared*
>
> *processor/memory/storage. Network connectivity is provided by the*
>
> *network virtualization edge (NVE) [RFC8014]. Being able to move VMs*
>
> *dynamically, or live migration, from one server to another allows for*
>
> *dynamic load balancing or work distribution and thus it is a highly*
>
> *desirable feature [RFC7364].”*
>
>
>
>
>
> The draft starts out (S.3) as if it intends to say what a good VM Mobility
> protocol should or shouldn't do, but the rest of the document doesn't give
> any reasoning for these recommendations, it just asserts what appears to be
> one view of how a whole VM Mobility system works, sometimes referring to
> one example protocol RFC for a component part, but more often with no
> references or details.
>
>
>
> [Linda] Is it helpful to move the paragraph above to the beginning of the
> Introduction Section? So that audience is aware of why VM Mobility is
> needed. And then follow up with what a good VM Mobility protocol should or
> shouldn't do?
>
>
>
> #. It does not seem as if the NVO WG has discussed the purpose of using
> normative text in this draft. See detailed comments.
>
>
>
> [Linda] The “Intended status” of the draft is “Best Current Practice”. So
> all the text are not “normative”. Is it Okay?
>
>
>
> #. The draft silently slips back and forth between VM mobility and VM
> redundancy, without recognizing the differences. See detailed comments.
>
>
>
> [Linda] There is only one usage of “redundancy” in the entire document,
> used under the context of “Hot standby option”, indicating  the
> “redundancy” of “the VMs in both primary and secondary domains have
> identical information and can provide services simultaneously as in
> load-share mode of operation” being expensive.
>
>
>
> #. Please adopt different terminology than "source NVE" and "destination
> NVE", which are really poor choices of terms for an intermediate node. See
> detailed comments. Why not use "old NVE" and "new NVE", which is what you
> mean?
>
> [Linda] Thanks for the suggestion. We will change to “Old NVE”, and “new
> NVE”.
>
>
>
> #. Applicability is fairly clearly outlined, but it is not clear whether
> hosts corresponding with the mobile VMs are part of the same controlled
> environment or on the uncontrolled public Internet. See detailed comments.
>
> [Linda] “Hosts” are the App running on the VM. It is the under the same
> controlled environment. Not on uncontrolled public internet.
>
>
>
>
>
> #. Section 4.2.1 on L3 VM mobility reads like some potential
> half-thought-through ideas on how to solve L3 mobility, rather than current
> practice, let alone best current practice. Either current practice should
> be described instead, or the scope of the draft should be narrowed solely
> to L2 VM mobility. See detailed comments.
>
> [Linda] This is refereeing to “Cold Migration”, which is a common practice
> in many data centers.
>
>
>
> # The VM's file system is described as state that moves with the VM (S.6),
> but VM mobility solutions often move the VM but stitch it back to its
> (unmoved) storage. Conversely, the storage can also move independent of the
> VM.
>
> [Linda] It depends. When a VM move to a different zone, the storage/file
> can becomes inaccessible.
>
>
>
> #. The draft omits some of the security, transport and management aspects
> of VM mobility. See detailed comments.
>
> [Linda] Can you provide some text?
>
>
>
> #. The draft reads as if different sections have been written by different
> authors and no-one has edited the whole to give it a coherent structure, or
> to ensure consistency (both technical and editorial) between the parts. See
> detailed comments.
>
>
>
> [Linda] we can improve.
>
>
>
>
>
> #. The quality of the English grammar does not allow a reviewer to
> concentrate on the technical aspects rather than the English. It would have
> been useful if one of the English-speaking co-authors had improved the
> English before submission for review. See detailed comments.
>
> [Linda] can you help?  Becoming a co-author to improve?
>
>
>
> ==Detailed Comments==
>
>
>
> ===#. Normative statements===
>
>
>
> In the body of the document, there is just one occurrence of normative
> text (actually two "MUST"s, but both state a common requirement - just
> written separately for IPv4 and IPv6). This merely serves to imply that
> everything else the document says is less important or optional, which was
> probably not the intention.
>
> [Linda] The goal is to indicate any solution in moving the VM “MUST”
> follow this rule. They make sense, aren’t they?
>
>
>
> At the start there is a requirements section, which states what a VM
> Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as
> a set of goals for the rest of the document. If so, these "SHOULDs" are not
> intended to apply to implementations, so they ought not to be capitalized.
>
>
>
> [Linda] okay, will change.
>
>
>
>
>
> The first requirement, "Data center network SHOULD support virtual machine
> mobility in IPv6", is written as a requirement on all DC networks, not on
> implementations. I assume this was intended to read as "Data center network
> virtual machine mobility protocols SHOULD support IPv6". Even then, it
> doesn't really add anything to say VM mobility should support v6 and it
> should support v4. A L2 solution won't. While undoubtedly, a L3 solution
> will at least support one of them.
>
> [Linda]Agree. Will change it to “Data center that support IPv6 address
> should …”
>
>
>
> I'm not sure that 'protocol' is the right word anyway; I think 'VM
> Mobility procedure' would be a better phrase, because it includes steps
> such as suspending the VM, which is more than a protocol.
>
> [Linda] yes. Will change to “Procedure”.
>
>
>
> The requirement "Virtual machine mobility protocol MAY support host routes
> to accomplish virtualization", is not followed up at all in the rest of the
> draft.
>
> Even if this requirement stays, the last 3 words should be deleted.
>
>
>
> [Linda] will change to “Host Route can be used to support the Virtual
> Machine Mobility Procedure.”
>
>
>
> By the end of the draft, the solution falls far short of the most relevant
> "Requirements" anyway, so one assumes the title of the section ought to
> have been "Goals". Specifically, even in the simpler case of L2 VM
> mobility, S.4.1 says that triangular routing and tunnelling persist "until
> a neighbour cache entry times out". A cache timeout is about 10 orders of
> magnitude longer than the requirement to only persist "while handling
> packets in flight", which would be a few milliseconds at most (the time for
> packets to clear the network that were already launched into flight when
> the old VM stopped).
>
>
>
> Whatever, it would be preferable for the draft to give rationale for these
> requirements, rather than just assert them. This would help to shed light
> on the merits of the different trade offs that solutions choose.
>
>
>
> [Linda] Agree, will add.
>
>
>
> ===#. Mobility vs. Redundancy===
>
>
>
> Redundancy and mobility have a lot of similarities, but they have
> different goals. With mobility, it is necessary to know the exact instant
> when one set of state is identical to the other so it can hand over. With
> redundancy, the aim is to keep two (or more) sets of state evolving through
> the same sequence of changes, but there is no need to know the point at
> which one is the same as the other was at a certain point.
>
> [Linda] Agree with what you said. There is only one usage of “redundancy”
> in the entire document, used under the context of “Hot standby option”,
> indicating  the “redundancy” of  “the VMs in both primary and secondary
> domains have identical information and can provide services simultaneously
> as in load-share mode of operation” being expensive.
>
>
>
> The draft slips from mobility to resilience in the following places:
>
> * S.2. Terminology: Warm VM Mobility is defined without any ending, as if
> it is permanent replication. * S.7. "Handling of Hot, Warm and Cold Virtual
> Machine Mobility" is actually all about redundancy, and doesn't address
> mobility explicitly.
>
>
>
> [Linda] Will add the definition “Hot Migration”, “cold migration”, and
> “warm migration”.
>
>
>
> ===#. Terminology===
>
>
>
> Packets run from the source at A to the destination at B via NVE1, then
> via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
> destination NVE.
>
> In future, no-one will thank you for the apparent contradictions when they
> continually stumble over phrases like this one in S.4.1: "...send their
> packets to the source NVE".
>
>
>
> The term "packets in flight" is used incorrectly to refer to all the
> packets sent to the old NVE after the VM has moved, even if they were
> launched into flight long after the old VM stopped receiving packets.
>
>
>
> [Linda] thank for the comments. Will change.
>
>
>
> BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
> entry before VM or task migration".
>
>
>
> I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based"
>
> sounds (to me) like the mobility control protocol is over (i.e. based on)
> IP.
>
>
>
> ===#. Applicability===
>
>
>
> In section 4.2 it says that the protocol mostly used as the IP based task
> migration protocol is ILA. This implies that all hosts corresponding with
> the mobile VMs are either part of the same controlled environment, or they
> are proxied via nodes that are part of the same controlled environment (I
> only have passing knowledge of ILA, but I understand that it depends on ILA
> routers on the path). If I am correct, this aspect of scope needs to be
> made clear from the start.
>
>
>
> Also under the heading of applicabiliy, the sentence "Since migrations
> should be relatively rare events" appears very late in the document
> (S.4.2.1). The assumed level of churn ought to be stated nearer the start.
>
>
>
> [Linda] yes, under the same controlled environment.
>
>
>
> ===#. L3 Mobility===
>
> L2 VM mobility is independent of the application, because resolution of L2
> mappings is delegated to the stack. In contrast, L3 VM mobility is only
> feasible under certain conditions, because an application needs an IP
> address to open a socket (resolution of DNS names is not delegated to the
> stack, and apps can use IP addresses directly anyway).
>
>
>
> Examples of the 'certain conditions':
>
> a) /All/ applications used in the whole DC load balancing scheme contain
> IP address migration logic for /all/ their connections; b) VMs running
> solely applications that support IP address migration register this fact
> with the NVA, and it only select such VMs for mobility. c) An abstraction
> is layered over /all/ the IP addresses exposed to applications (at both
> ends) so that the IP addresses that applications use are solely
> identifiers  (e.g. ILA, LISP, HIP), not also locators.
>
>
>
> The introduction says the draft is about VM mobility in a multi-tenant DC,
> so the DC admin will not know the range of applications being used. This
> excludes condition (a) above. When the draft says "...if all applications
> running are known to handle this gracefully...", it doesn't quantify just
> how restrictive this condition is, and it gives no explanation of how this
> knowledge might be 'known' or which function within the system 'knows' it.
>
>
>
> S.4.2.1 contains what seems like plenty of arm-waving.
>
> * "TCP connections could be automatically closed in the network stack
> during a migration event."
>
>         o There is no TCP connection state in the network stack.
>
>         o Even if the network starts to drop every packet, the TCP
> connection
>
>         state persists in the end-points for a duration of the order of
> 30-90
>
>         minutes (OS-dependent) before TCP deems the connection is broken. o
>
>         Other transport protocols have similar designs (including the
> app-layer
>
>         of protocols over UDP).
>
> * "More involved approach to connection migration":
>
>         o pausing the connection [does this refer to an actual feature of
> any
>
>         L4 protocol?] o packaging connection state and sending to target
> [does
>
>         this assume logic written into the application, or is this
> assuming the
>
>         stack handles this and the app is restricted to using some form of
>
>         separate identifier/locator addresses?] o instantiating connection
>
>         state in the peer stack [ditto?].
>
>
>
> There's some arm-waving in S.7 too:
>
>   "Cold Virtual Machine mobility is facilitated by the VM initially
>
>    sending an ARP or Neighbor Discovery message at the destination NVE
>
>    but the source NVE not receiving any packets inflight."
>
>    [How is it arranged for the source NVE not to receive any packets in
> flight?]
>
>
>
> And in S.7:
>
>   "In hot
>
>    standby option, regarding TCP connections, one option is to start
>
>    with and maintain TCP connections to two different VMs at the same
>
>    time."
>
>    [This sounds like resilience logic has been written into the
> application,
>
>    which would be a special case but not something VM mobility
> infrastructure
>
>    could depend on.]
>
>
>
> [Linda] will add.
>
>
>
> ===#. Gaps===
>
> #. Security Considerations: repeats issues in other drafts that are not
> specific to mobility, but it does not mention any security issues
> specifically due to VM mobility. It says that address spoofing may arise in
> a DC (sort-of implying it is worse than in non-DC environments, but not
> saying why). The handshake at the start of a connection (e.g. TCP, SCTP,
> QUIC) checks for source address spoofing. So L3 VM mobility would be more
> vulnerable to source address spoofing in cases where the mobile VM was the
> connection initiator and there was not a new handshake after the move.
> However, this draft does not contain any detailed mobility protocols, so it
> is not possible to identify any specific security flaws.
>
>
>
> #. Transport Issues: Effect of delay on the transport: Cold mobility
> introduces significant delay, and other forms less, but still some delay.
> It should be pointed out that some applications (e.g. real-time) will
> therefore not be useful if subjected to VM mobility. Similarly, even a
> short period of delay will drive most congestion controls to severely
> reduce throughput. These points might be self-evident, but perhaps they
> should be stated explicitly.
>
>
>
> BTW, in the L3 VM mobility case, the draft often refers to TCP
> connections, but the address bindings of any transport protocols would have
> to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over
> UDP; streams over UDP such as with RTP, QUIC).
>
>
>
> #. Management Issues: perhaps the draft ought to recommend statistics
> gathering (e.g. time taken, amount of duplicate data) to aid a DC's future
> decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP
> does not /have/ to describe management issues, but this document seems to
> describe a whole system procedure, not just a protocol, which then surely
> includes the management plane.
>
>
>
> [Linda] can you become a co-author and add those in?
>
>
>
> ===#. Incoherent Structure===
>
>
>
> S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
> about tasks moving, but this is not the distinguishing aspect of these two
> sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
> * "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
> section heading, *
>
> "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
> section heading. It would also help not to switch from VM to task across
> these sections
>
> - it's just a distraction.
>
>
>
> S.4.1 needs better signposting of where each sub-case ends (Subsections
> might be useful to solve this): * IPv4 * end-user client * 2 paras starting
> "All NVEs communicating with this virtual machine..." [Not clear that the
> end-user case has ended and we have returned to the general IPv4 case?] *
> IPv6 [Strictly, it still hasn't said whether the end-user client case has
> ended.] [Also, it doesn't explain why there is no need for an end-user
> client case under IPv6?] Sections 5 & 6 seem to be about either L2 or L3
> mobility, whereas Sections 7 &
>
> 8 seem to be restricted to L2.
>
>
>
> The draft vacillates over what to do with packets arriving at the old NVE
> in the L3 case (see also L3 mobility above): * S4.2 first says packets are
> dropped, possibly with an ICMP error message;
>
>   o then later it says they are silently dropped;
>
>   o then in the very next sentence it says either silently drop them or
> forward
>
>   them to the new location
>
> * S.5 says they should not be lost, but instead delivered to the
> destination hypervisor
>
>   o then it describes how they are tunnelled (which is not the same as
>
>   "forwarding").
>
>
>
> The order in which all the stages of mobilty are given is jumbled up
> across sections that also appear in arbitrary order: * S.5 prepares,
> establishes uses then stops a tunnel, but it doesn't say where the other
> stages fit between these steps
>
>         o When tunneling packets, it talks about the *migrating* VM not the
>
>         *migrated* VM, which implies tunnelling has started before the new
> VM
>
>         is running. Does this imply there is a huge buffer? o It says "Stop
>
>         Tunneling Packets - When source NVE stops receiving packets
> destined
>
>         to..." but it is never clear when a source has stopped sending
> packets
>
>         to a destination, unless it explicitly closes the connection (e.g.
> with
>
>         a FIN in the case of TCP). Often there are long gaps between
> packets,
>
>         because many flows are 'thin' (meaning the application frequently
> has
>
>         nothing to send). These gaps can last for milliseconds, hours or
> even
>
>         days without any implication that the connection has ended.
>
> * Then S.6. describes moving state, but doesn't say that this is not after
> the previous tunnelling steps (or where it fits within those steps). * Then
> S.7 describes hot, warm and cold mobility, but doesn't lay out the
> tunnelling or steps to move state in each case. * Then S.8 says it's about
> VM life-cycle, but just gives the very first 3 steps for allocation of
> resources to a VM, then abruptly ends, without even starting the VM, let
> alone getting to move it.
>
>
>
> S.5 exhibits another inconsistency by talking about the hypervisor, not
> the NVE.
>
>
>
> ==#. Nits==
>
>
>
> Nits with the English are too numerous to mention them all. Below are
> pointers to general problems as well as some individual instances.
>
>
>
> S.4
>
>   "Layer 2 and Layer 3 protocols are described next.  In the following
>
>    sections, we examine more advanced features."
>
>         s/following/subsequent/
>
>
>
> S.4.1
>
> Expand WSC, MSC and NVA on first use.
>
>
>
> s/the VM moves in the same link/the VM moves in the same subnet/
>
>
>
> "i.e. end-user clients ask for the same MAC address upon migration. [...]
> to ensure that the same IPv4 address is assigned to the VM." I think
> s/IPv4/MAC/ was intended?
>
>
>
> "  All NVEs communicating with this virtual machine uses the old ARP
>
>    entry.  If any VM in those NVEs need to talk to the new VM in the
>
>    destination NVE, it uses the old ARP entry."
>
> Repetition: these 2 sentences say the same. (The mistake is also repeated
> when these 2 sentences are repeated for IPv6).
>
>
>
> S.4.2.1
>
> s/Push the new mapping to hosts./Push the new mapping to communicating
> hosts./
>
>
>
> S.5.
>
> The IPv4/IPv6 pairs of paras for "tunnel estabilshment" and "tunneling
> packets"
>
> only differ in the words "IPv4"/"IPv6". So in each case a single para
> could be given for IP (irrespective of whether v4 or v6).
>
>
>
> Thank you very much.
>
>
>
> Linda Dunbar
>
>
>
>
>
>
>