Re: [nvo3] Pick up the comment resolutions to Tsvart last call review of draft-ietf-nvo3-vmm-04

"B. Khasnabish" <vumip1@gmail.com> Fri, 16 August 2019 13:55 UTC

References: <MN2PR13MB358289A29478BF08AC8FFF3B85D50@MN2PR13MB3582.namprd13.prod.outlook.com>
In-Reply-To: <MN2PR13MB358289A29478BF08AC8FFF3B85D50@MN2PR13MB3582.namprd13.prod.outlook.com>
From: "B. Khasnabish" <vumip1@gmail.com>
Date: Fri, 16 Aug 2019 09:54:47 -0400
Message-ID: <CANtnpwjXKdhr5fEM0+WCtjvoJHCmc4Fg0ABCsvN7yf_ndnohCg@mail.gmail.com>
To: Linda Dunbar <linda.dunbar@futurewei.com>
Cc: Bob Briscoe <ietf@bobbriscoe.net>, "nvo3@ietf.org" <nvo3@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000942b8705903c576d"
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/zm4mgo1Kyc2_dPG-_TZ33FOdu9M>
Subject: Re: [nvo3] Pick up the comment resolutions to Tsvart last call review of draft-ietf-nvo3-vmm-04

Hi Linda,

Thanks, and I think these adequately answer Bob's concerns.

Can we set a timeline to resolve these in order to move this draft
forward?!

Once again, many thanks in advance.

Best

Bhumip

 +1-781-752-8003 (m) | vumip1@gmail.com
https://about.me/bhumip

                   __o
             _ `\ <, _
.......... ( • ) / ( • ) ......................




On Tue, Aug 6, 2019 at 4:40 PM Linda Dunbar <linda.dunbar@futurewei.com>
wrote:

> Bob,
>
> The NVO3 chairs have designated me as the editor for this draft, which has
> been stalled in the TSVART LC review process for almost a year.
>
> The answers to your questions are marked by [Linda 2019] (hopefully you
> still remember them). Please let me know if they are good enough to move
> forward.
>
> -----------------------------------------------
>
> Re: [nvo3] [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
>
> Bob Briscoe <ietf@bobbriscoe.net> Tue, 18 September 2018 23:03 UTC
>
> Linda,
>
>
>
> Until we can all understand the answers to the following two questions,
>
> I don't think we can discuss what track this draft ought to be on, let
>
> alone move on to your responses to all my other points.
>
>
>
> 1/ Applicability
>
>
>
> You say this draft solely applies to connections with both ends within
>
> the controlled DC environment. But the draft says it's about
>
> multi-tenant DCs. Are there any multi-tenant DCs that restrict all VMs
>
> to only communicate with other VMs within the same controlled DC
>
> environment?
>
>
>
> [Linda 2019] Yes. Many cloud DCs host applications for multiple tenants.
> E.g., multiple departments of one organization instantiate their workloads
> in one cloud DC. There are communications within one tenant (workloads
> belonging to one department), and communications among hosts belonging to
> different tenants (applications belonging to different departments).
>
>
>
>
>
> 2/ Purpose of publishing as an RFC
>
>
>
> When I said:
>
> > #. The introduction does not say what the purpose of publishing this
>
> > draft is.
>
> you responded:
>
> > [Linda] The first paragraph on Page 3 has the description why VM
>
> > Mobility is needed.
>
>
>
> Whether VM Mobility is needed was not my question. My question was what
>
> is the purpose of the IETF publishing an RFC about VM Mobility? And
>
> particularly, what is /this/ RFC intended to achieve?
>
>
>
> [Linda 2019] This RFC is intended to describe VMs (or applications) being
> moved from one rack to another and to describe the impact on the overlay
> networks.
>
>
>
> Are the authors trying to argue for a particular approach vs. others?
>
> Are you trying to write a tutorial? Are you trying to give the pros and
>
> cons of different approaches? Are you trying to give advice on good
>
> practice (with the implication that alternative practices are less
>
> good)? Are you trying to clarify ideas by writing them down? Are you
>
> trying to outline the implications of VM Mobility for other protocols
>
> being developed within the NVO WG?
>
>
>
> [Linda 2019] NVO3 is developing protocols for IP overlay networks within
> data centers that host applications belonging to different tenants (or
> organizations). Applications don’t stay in one place, as they do in
> traditional data centers. That is why it is necessary for NVO3 to have a
> document describing the behavior of VM mobility and its impact on the
> overlay networks.
>
>
>
>
>
>
>
>
>
> Bob
>
>
>
> On 10/09/18 19:16, Linda Dunbar wrote:
>
> > Bob,
>
> > Thank you very much for reviewing the draft and providing in-depth
> > comments. I am very sorry for the delayed response due to traveling.
> > Replies to your comments are inserted below marked by [Linda]:
>
> > -----Original Message-----
>
> > From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
>
> > Sent: Monday, September 03, 2018 9:45 PM
>
> > To: tsv-art@ietf.org
>
> > Cc: nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org
>
> > Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>
> > Reviewer: Bob Briscoe
>
> > Review result: Not Ready
>
> > I have been selected as the Transport Directorate reviewer for this
>
> > draft. The Transport Directorate seeks to review all transport or
>
> > transport-related drafts as they pass through IETF last call and IESG
>
> > review, and sometimes on special request. The purpose of the review is
>
> > to provide assistance to the Transport ADs. For more information about
>
> > the Transport Directorate Reviews and the Transport Area Review Team,
>
> > please see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>
> > In this case, very very few of the review comments relate to transport
>
> > issues, although the greatest issue concerns a desire that the network
>
> > could pause or stop connections during L3 VM Mobility, which is
>
> > certainly a transport issue.
>
> > [Linda] There is “Hot Migration”, in which the transport service
> > continues, and there is “Cold Migration”, a common practice in many
> > data centers, which stops the task running at the old place and moves
> > it to the new place before restarting it, as described in the Task
> > Migration section.
> > Is it helpful to add this description to the draft?
>
> > ==Summary==
>
> > The technical aspects of the draft concerning L2 VM mobility (within a
>
> > subnet) seem sound. However, this is only part of the draft, which has
>
> > the following
>
> > issues:
>
> > #. The introduction does not say what the purpose of publishing this
>
> > draft is.
>
> > It seems that, rather than describing a specific protocol or
>
> > protocols, it intends to describe the overall system procedure that
>
> > would typically be used in DCs for VM mobility. It is tagged as a BCP,
>
> > but it does not say who needs this BCP, why it is useful for the IETF
>
> > to publish this BCP, how wide the authors' knowledge is of current
>
> > practice (given DCs are private), or why this is a BCP rather than a
>
> > protocol spec.
>
> > [Linda] The first paragraph on Page 3 has the description why VM
>
> > Mobility is needed. Is it helpful to move this paragraph to the
>
> > beginning of the Introduction Section?
>
> > “Virtualization which is being used in almost all of today’s data
> > centers enables many virtual machines to run on a single physical
> > computer or compute server. Virtual machines (VM) need hypervisor
> > running on the physical compute server to provide them shared
> > processor/memory/storage. Network connectivity is provided by the
> > network virtualization edge (NVE) [RFC8014]. Being able to move VMs
> > dynamically, or live migration, from one server to another allows for
> > dynamic load balancing or work distribution and thus it is a highly
> > desirable feature [RFC7364].”
>
> > The draft starts out (S.3) as if it intends to say what a good VM
>
> > Mobility protocol should or shouldn't do, but the rest of the document
>
> > doesn't give any reasoning for these recommendations, it just asserts
>
> > what appears to be one view of how a whole VM Mobility system works,
>
> > sometimes referring to one example protocol RFC for a component part,
>
> > but more often with no references or details.
>
> > [Linda] Is it helpful to move the paragraph above to the beginning of
> > the Introduction section, so that the audience is aware of why VM
> > Mobility is needed, and then follow up with what a good VM Mobility
> > protocol should or shouldn't do?
>
> > #. It does not seem as if the NVO WG has discussed the purpose of
>
> > using normative text in this draft. See detailed comments.
>
> > [Linda] The “Intended status” of the draft is “Best Current Practice”,
> > so none of the text is “normative”. Is that okay?
>
> > #. The draft silently slips back and forth between VM mobility and VM
>
> > redundancy, without recognizing the differences. See detailed comments.
>
> > [Linda] There is only one usage of “redundancy” in the entire
>
> > document, used under the context of “Hot standby option”, indicating
>
> > the “redundancy” of “the VMs in both primary and secondary domains
>
> > have identical information and can provide services simultaneously as
>
> > in load-share mode of operation” being expensive.
>
> > #. Please adopt different terminology than "source NVE" and
>
> > "destination NVE", which are really poor choices of terms for an
>
> > intermediate node. See detailed comments. Why not use "old NVE" and
>
> > "new NVE", which is what you mean?
>
> > [Linda] Thanks for the suggestion. We will change to “old NVE” and
> > “new NVE”.
>
> > #. Applicability is fairly clearly outlined, but it is not clear
>
> > whether hosts corresponding with the mobile VMs are part of the same
>
> > controlled environment or on the uncontrolled public Internet. See
>
> > detailed comments.
>
> > [Linda] “Hosts” are the apps running on the VMs. They are under the
> > same controlled environment, not on the uncontrolled public Internet.
>
> > #. Section 4.2.1 on L3 VM mobility reads like some potential
>
> > half-thought-through ideas on how to solve L3 mobility, rather than
>
> > current practice, let alone best current practice. Either current
>
> > practice should be described instead, or the scope of the draft should
>
> > be narrowed solely to L2 VM mobility. See detailed comments.
>
> > [Linda] This is referring to “Cold Migration”, which is a common
> > practice in many data centers.
>
> > # The VM's file system is described as state that moves with the VM
>
> > (S.6), but VM mobility solutions often move the VM but stitch it back
>
> > to its (unmoved) storage. Conversely, the storage can also move
>
> > independent of the VM.
>
> > [Linda] It depends. When a VM moves to a different zone, the
> > storage/files can become inaccessible.
>
> > #. The draft omits some of the security, transport and management
>
> > aspects of VM mobility. See detailed comments.
>
> > [Linda] Can you provide some text?
>
> > #. The draft reads as if different sections have been written by
>
> > different authors and no-one has edited the whole to give it a
>
> > coherent structure, or to ensure consistency (both technical and
>
> > editorial) between the parts. See detailed comments.
>
> > [Linda] We can improve this.
>
> > #. The quality of the English grammar does not allow a reviewer to
>
> > concentrate on the technical aspects rather than the English. It would
>
> > have been useful if one of the English-speaking co-authors had
>
> > improved the English before submission for review. See detailed comments.
>
> > [Linda] Can you help, by becoming a co-author to improve it?
>
> > ==Detailed Comments==
>
> > ===#. Normative statements===
>
> > In the body of the document, there is just one occurrence of normative
>
> > text (actually two "MUST"s, but both state a common requirement - just
>
> > written separately for IPv4 and IPv6). This merely serves to imply
>
> > that everything else the document says is less important or optional,
>
> > which was probably not the intention.
>
> > [Linda] The goal is to indicate that any solution for moving the VM
> > “MUST” follow this rule. These make sense, don’t they?
>
> > At the start there is a requirements section, which states what a VM
>
> > Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is
>
> > intended as a set of goals for the rest of the document. If so, these
>
> > "SHOULDs" are not intended to apply to implementations, so they ought
>
> > not to be capitalized.
>
> > [Linda] okay, will change.
>
> > The first requirement, "Data center network SHOULD support virtual
>
> > machine mobility in IPv6", is written as a requirement on all DC
>
> > networks, not on implementations. I assume this was intended to read
>
> > as "Data center network virtual machine mobility protocols SHOULD
>
> > support IPv6". Even then, it doesn't really add anything to say VM
>
> > mobility should support v6 and it should support v4. A L2 solution
>
> > won't. While undoubtedly, a L3 solution will at least support one of them.
>
> > [Linda] Agree. Will change it to “Data centers that support IPv6
> > addresses should …”
>
> > I'm not sure that 'protocol' is the right word anyway; I think 'VM
>
> > Mobility procedure' would be a better phrase, because it includes
>
> > steps such as suspending the VM, which is more than a protocol.
>
> > [Linda] yes. Will change to “Procedure”.
>
> > The requirement "Virtual machine mobility protocol MAY support host
>
> > routes to accomplish virtualization", is not followed up at all in the
>
> > rest of the draft.
>
> > Even if this requirement stays, the last 3 words should be deleted.
>
> > [Linda] will change to “Host Route can be used to support the Virtual
>
> > Machine Mobility Procedure.”
>
> > By the end of the draft, the solution falls far short of the most
>
> > relevant "Requirements" anyway, so one assumes the title of the
>
> > section ought to have been "Goals". Specifically, even in the simpler
>
> > case of L2 VM mobility, S.4.1 says that triangular routing and
>
> > tunnelling persist "until a neighbour cache entry times out". A cache
>
> > timeout is about 10 orders of magnitude longer than the requirement to
>
> > only persist "while handling packets in flight", which would be a few
>
> > milliseconds at most (the time for packets to clear the network that
>
> > were already launched into flight when the old VM stopped).
>
> > Whatever, it would be preferable for the draft to give rationale for
>
> > these requirements, rather than just assert them. This would help to
>
> > shed light on the merits of the different trade offs that solutions
>
> > choose.
>
> > [Linda] Agree, will add.
>
> > ===#. Mobility vs. Redundancy===
>
> > Redundancy and mobility have a lot of similarities, but they have
>
> > different goals. With mobility, it is necessary to know the exact
>
> > instant when one set of state is identical to the other so it can hand
>
> > over. With redundancy, the aim is to keep two (or more) sets of state
>
> > evolving through the same sequence of changes, but there is no need to
>
> > know the point at which one is the same as the other was at a certain
>
> > point.
>
> > [Linda] Agree with what you said. There is only one usage of
>
> > “redundancy” in the entire document, used under the context of “Hot
>
> > standby option”, indicating  the “redundancy” of  “the VMs in both
>
> > primary and secondary domains have identical information and can
>
> > provide services simultaneously as in load-share mode of operation”
>
> > being expensive.
>
> > The draft slips from mobility to resilience in the following places:
>
> > * S.2. Terminology: Warm VM Mobility is defined without any ending, as
> >   if it is permanent replication.
> > * S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is
> >   actually all about redundancy, and doesn't address mobility
> >   explicitly.
>
> > [Linda] Will add the definition “Hot Migration”, “cold migration”, and
>
> > “warm migration”.
>
> > ===#. Terminology===
>
> > Packets run from the source at A to the destination at B via NVE1,
>
> > then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
>
> > destination NVE.
>
> > In future, no-one will thank you for the apparent contradictions when
>
> > they continually stumble over phrases like this one in S.4.1: "...send
>
> > their packets to the source NVE".
>
> > The term "packets in flight" is used incorrectly to refer to all the
>
> > packets sent to the old NVE after the VM has moved, even if they were
>
> > launched into flight long after the old VM stopped receiving packets.
>
> > [Linda] Thanks for the comments. Will change.
>
> > BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
>
> > entry before VM or task migration".
>
> > I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because
> > "based" sounds (to me) like the mobility control protocol is over
> > (i.e. based on) IP.
>
> > ===#. Applicability===
>
> > In section 4.2 it says that the protocol mostly used as the IP based
>
> > task migration protocol is ILA. This implies that all hosts
>
> > corresponding with the mobile VMs are either part of the same
>
> > controlled environment, or they are proxied via nodes that are part of
>
> > the same controlled environment (I only have passing knowledge of ILA,
>
> > but I understand that it depends on ILA routers on the path). If I am
>
> > correct, this aspect of scope needs to be made clear from the start.
>
> > Also under the heading of applicability, the sentence "Since migrations
> > should be relatively rare events" appears very late in the document
> > (S.4.2.1). The assumed level of churn ought to be stated nearer the
> > start.
>
> > [Linda] yes, under the same controlled environment.
>
> > ===#. L3 Mobility===
>
> > L2 VM mobility is independent of the application, because resolution
>
> > of L2 mappings is delegated to the stack. In contrast, L3 VM mobility
>
> > is only feasible under certain conditions, because an application
>
> > needs an IP address to open a socket (resolution of DNS names is not
>
> > delegated to the stack, and apps can use IP addresses directly anyway).
>
> > Examples of the 'certain conditions':
>
> > a) /All/ applications used in the whole DC load balancing scheme
> >    contain IP address migration logic for /all/ their connections;
> > b) VMs running solely applications that support IP address migration
> >    register this fact with the NVA, and it only selects such VMs for
> >    mobility.
> > c) An abstraction is layered over /all/ the IP addresses exposed to
> >    applications (at both ends) so that the IP addresses that
> >    applications use are solely identifiers (e.g. ILA, LISP, HIP), not
> >    also locators.
>
> > The introduction says the draft is about VM mobility in a multi-tenant
>
> > DC, so the DC admin will not know the range of applications being
>
> > used. This excludes condition (a) above. When the draft says "...if
>
> > all applications running are known to handle this gracefully...", it
>
> > doesn't quantify just how restrictive this condition is, and it gives
>
> > no explanation of how this knowledge might be 'known' or which
>
> > function within the system 'knows' it.
>
> > S.4.2.1 contains what seems like plenty of arm-waving.
>
> > * "TCP connections could be automatically closed in the network stack
> >   during a migration event."
> >         o There is no TCP connection state in the network stack.
> >         o Even if the network starts to drop every packet, the TCP
> >           connection state persists in the end-points for a duration
> >           of the order of 30-90 minutes (OS-dependent) before TCP
> >           deems the connection is broken.
> >         o Other transport protocols have similar designs (including
> >           the app-layer of protocols over UDP).
>
> > * "More involved approach to connection migration":
> >         o pausing the connection [does this refer to an actual feature
> >           of any L4 protocol?]
> >         o packaging connection state and sending to target [does this
> >           assume logic written into the application, or is this
> >           assuming the stack handles this and the app is restricted to
> >           using some form of separate identifier/locator addresses?]
> >         o instantiating connection state in the peer stack [ditto?].
>
> > There's some arm-waving in S.7 too:
>
> >   "Cold Virtual Machine mobility is facilitated by the VM initially
> >    sending an ARP or Neighbor Discovery message at the destination NVE
> >    but the source NVE not receiving any packets inflight."
> >    [How is it arranged for the source NVE not to receive any packets
> >    in flight?]
>
> > And in S.7:
>
> >   "In hot standby option, regarding TCP connections, one option is to
> >    start with and maintain TCP connections to two different VMs at the
> >    same time."
> >    [This sounds like resilience logic has been written into the
> >    application, which would be a special case but not something VM
> >    mobility infrastructure could depend on.]
>
> > [Linda] will add.
>
> > ===#. Gaps===
>
> > #. Security Considerations: repeats issues in other drafts that are
>
> > not specific to mobility, but it does not mention any security issues
>
> > specifically due to VM mobility. It says that address spoofing may
>
> > arise in a DC (sort-of implying it is worse than in non-DC
>
> > environments, but not saying why). The handshake at the start of a
>
> > connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing.
>
> > So L3 VM mobility would be more vulnerable to source address spoofing
>
> > in cases where the mobile VM was the connection initiator and there
>
> > was not a new handshake after the move. However, this draft does not
>
> > contain any detailed mobility protocols, so it is not possible to
>
> > identify any specific security flaws.
>
> > #. Transport Issues: Effect of delay on the transport: Cold mobility
>
> > introduces significant delay, and other forms less, but still some
>
> > delay. It should be pointed out that some applications (e.g.
>
> > real-time) will therefore not be useful if subjected to VM mobility.
>
> > Similarly, even a short period of delay will drive most congestion
>
> > controls to severely reduce throughput. These points might be
>
> > self-evident, but perhaps they should be stated explicitly.
>
> > BTW, in the L3 VM mobility case, the draft often refers to TCP
>
> > connections, but the address bindings of any transport protocols would
>
> > have to be migrated due to VM mobility (e.g. SCTP; sequences of
>
> > datagrams over UDP; streams over UDP such as with RTP, QUIC).
>
> > #. Management Issues: perhaps the draft ought to recommend statistics
>
> > gathering (e.g. time taken, amount of duplicate data) to aid a DC's
>
> > future decisions on the cost-benefit of moving a VM. The OPSDIR review
>
> > says a BCP does not /have/ to describe management issues, but this
>
> > document seems to describe a whole system procedure, not just a
>
> > protocol, which then surely includes the management plane.
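
A minimal sketch of the kind of statistics gathering suggested above, assuming one record per migration holding the time taken and the amount of duplicate data. The names `MigrationRecord` and `summarize`, and the choice of fields, are hypothetical illustrations and are not taken from the draft.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class MigrationRecord:
    """One completed VM move (hypothetical record layout)."""
    vm_id: str
    started_at: float      # seconds, any monotonic clock
    finished_at: float     # seconds, same clock as started_at
    duplicate_bytes: int   # data delivered to both the old and the new location

    @property
    def duration(self) -> float:
        return self.finished_at - self.started_at


def summarize(records: List[MigrationRecord]) -> Dict[str, float]:
    """Aggregate the figures a DC operator might weigh before the next move."""
    if not records:
        return {"count": 0, "mean_duration_s": 0.0, "total_duplicate_bytes": 0}
    return {
        "count": len(records),
        "mean_duration_s": mean(r.duration for r in records),
        "total_duplicate_bytes": sum(r.duplicate_bytes for r in records),
    }


history = [
    MigrationRecord("vm-a", started_at=10.0, finished_at=12.0, duplicate_bytes=100),
    MigrationRecord("vm-b", started_at=20.0, finished_at=24.0, duplicate_bytes=300),
]
summary = summarize(history)
```

Which figures are worth keeping is a policy choice for the operator; the two shown are only the examples named in the review.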
>
> > [Linda] can you become a co-author and add those in?
>
> > ===#. Incoherent Structure===
>
> > S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
> > about tasks moving, but this is not the distinguishing aspect of these
> > two sections (anyway, S.2. says "the draft uses task and VM
> > interchangeably"):
> > * "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
> >   section heading,
> > * "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be
> >   the section heading.
> > It would also help not to switch from VM to task across these sections
> > - it's just a distraction.
>
> > S.4.1 needs better signposting of where each sub-case ends
> > (Subsections might be useful to solve this):
> > * IPv4
> > * end-user client
> > * 2 paras starting "All NVEs communicating with this virtual
> >   machine..." [Not clear that the end-user case has ended and we have
> >   returned to the general IPv4 case?]
> > * IPv6 [Strictly, it still hasn't said whether the end-user client
> >   case has ended.] [Also, it doesn't explain why there is no need for
> >   an end-user client case under IPv6?]
>
> > Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
> > Sections 7 & 8 seem to be restricted to L2.
>
> > The draft vacillates over what to do with packets arriving at the old
> > NVE in the L3 case (see also L3 mobility above):
> > * S.4.2 first says packets are dropped, possibly with an ICMP error
> >   message;
> >   o then later it says they are silently dropped;
> >   o then in the very next sentence it says either silently drop them
> >     or forward them to the new location
> > * S.5 says they should not be lost, but instead delivered to the
> >   destination hypervisor
> >   o then it describes how they are tunnelled (which is not the same as
> >     "forwarding").
>
> > The order in which all the stages of mobility are given is jumbled up
> > across sections that also appear in arbitrary order:
> > * S.5 prepares, establishes, uses then stops a tunnel, but it doesn't
> >   say where the other stages fit between these steps
> >         o When tunneling packets, it talks about the *migrating* VM
> >           not the *migrated* VM, which implies tunnelling has started
> >           before the new VM is running. Does this imply there is a
> >           huge buffer?
> >         o It says "Stop Tunneling Packets - When source NVE stops
> >           receiving packets destined to..." but it is never clear when
> >           a source has stopped sending packets to a destination,
> >           unless it explicitly closes the connection (e.g. with a FIN
> >           in the case of TCP). Often there are long gaps between
> >           packets, because many flows are 'thin' (meaning the
> >           application frequently has nothing to send). These gaps can
> >           last for milliseconds, hours or even days without any
> >           implication that the connection has ended.
> > * Then S.6. describes moving state, but doesn't say that this is not
> >   after the previous tunnelling steps (or where it fits within those
> >   steps).
> > * Then S.7 describes hot, warm and cold mobility, but doesn't lay out
> >   the tunnelling or steps to move state in each case.
> > * Then S.8 says it's about VM life-cycle, but just gives the very
> >   first 3 steps for allocation of resources to a VM, then abruptly
> >   ends, without even starting the VM, let alone getting to move it.
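
The "Stop Tunneling Packets" concern above can be illustrated with a small sketch. It assumes, purely for illustration, that the old NVE decides a flow has ended by applying a fixed idle timeout; the draft does not specify such a rule, and the names `teardown_time` and `packets_after_teardown` are hypothetical.

```python
from typing import List, Optional


def teardown_time(arrivals: List[float], idle_timeout: float) -> Optional[float]:
    """Time at which an idle-timeout rule would stop tunnelling a live flow.

    `arrivals` are packet arrival times (seconds) at the old NVE.
    Returns None if no gap between consecutive packets exceeds the timeout.
    """
    times = sorted(arrivals)
    for earlier, later in zip(times, times[1:]):
        if later - earlier > idle_timeout:
            # The tunnel is torn down once the flow has been silent this long.
            return earlier + idle_timeout
    return None


def packets_after_teardown(arrivals: List[float], idle_timeout: float) -> int:
    """Count packets that arrive after the tunnel was (wrongly) torn down."""
    stop = teardown_time(arrivals, idle_timeout)
    if stop is None:
        return 0
    return sum(1 for t in arrivals if t > stop)


# A 'thin' flow: two packets, an hour of silence, then two more packets.
thin_flow = [0.0, 0.5, 3600.0, 3600.2]
# A busy flow with no long gaps.
bulk_flow = [0.0, 0.01, 0.02, 0.03]

stranded = packets_after_teardown(thin_flow, idle_timeout=60.0)
```

With a 60-second timeout the thin flow's later packets reach the old NVE after the tunnel is gone, even though the connection never ended. Any fixed timeout has this property for some flow, which is why silence alone does not show that a source has stopped sending.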
>
> > S.5 exhibits another inconsistency by talking about the hypervisor,
>
> > not the NVE.
>
> > ==#. Nits==
>
> > Nits with the English are too numerous to mention them all. Below are
>
> > pointers to general problems as well as some individual instances.
>
> > S.4
>
> >   "Layer 2 and Layer 3 protocols are described next.  In the following
>
> >    sections, we examine more advanced features."
>
> >         s/following/subsequent/
>
> > S.4.1
>
> > Expand WSC, MSC and NVA on first use.
>
> > s/the VM moves in the same link/the VM moves in the same subnet/
>
> > "i.e. end-user clients ask for the same MAC address upon migration.
>
> > [...] to ensure that the same IPv4 address is assigned to the VM." I
>
> > think s/IPv4/MAC/ was intended?
>
> > "  All NVEs communicating with this virtual machine uses the old ARP
>
> >    entry.  If any VM in those NVEs need to talk to the new VM in the
>
> >    destination NVE, it uses the old ARP entry."
>
> > Repetition: these 2 sentences say the same. (The mistake is also
>
> > repeated when these 2 sentences are repeated for IPv6).
>
> > S.4.2.1
>
> > s/Push the new mapping to hosts./Push the new mapping to communicating
>
> > hosts./
>
> > S.5.
>
> > The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling
> > packets" only differ in the words "IPv4"/"IPv6". So in each case a
> > single para could be given for IP (irrespective of whether v4 or v6).
>
> > Thank you very much.
>
> > Linda Dunbar
>
> >
>
> >
>
> > _______________________________________________
>
> > Tsv-art mailing list
>
> > Tsv-art@ietf.org
>
> > https://www.ietf.org/mailman/listinfo/tsv-art
>
>
>
> --
>
> ________________________________________________________________
>
> Bob Briscoe                               http://bobbriscoe.net/
>
>
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3
>