Re: [Tsv-art] [nvo3] Tsvart last call review of draft-ietf-nvo3-vmm-04

Bob Briscoe <ietf@bobbriscoe.net> Fri, 21 September 2018 09:34 UTC

To: sarikaya@ieee.org
Cc: tsv-art@ietf.org, NVO3 <nvo3@ietf.org>, IETF <ietf@ietf.org>, draft-ietf-nvo3-vmm.all@ietf.org
References: <153602909285.13281.13763046029400746910@ietfa.amsl.com> <4A95BA014132FF49AE685FAB4B9F17F66B139743@sjceml521-mbs.china.huawei.com> <7f3ceaff-db16-8eb9-a72c-aca219c7d90c@bobbriscoe.net> <CAC8QAcfVywTMOs=+B5UH5JwpsPPkiYZnb4YQzcqKMzedQsiMdw@mail.gmail.com> <c513d041-0c65-111d-9fd4-4474c52fa491@bobbriscoe.net> <CAC8QAcc-fr_-g8bPe812=udZVQk3d2E00mkJBwDbX3ZkdDceUw@mail.gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <9722da33-469b-38f6-b629-99a277fc864e@bobbriscoe.net>
Date: Fri, 21 Sep 2018 10:33:49 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
MIME-Version: 1.0
In-Reply-To: <CAC8QAcc-fr_-g8bPe812=udZVQk3d2E00mkJBwDbX3ZkdDceUw@mail.gmail.com>
Content-Type: multipart/alternative; boundary="------------8FF66E7C011528F6C1FCAFF2"
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/aNONGk27tticUgPsk1onLCQT9XA>
Subject: Re: [Tsv-art] [nvo3] Tsvart last call review of draft-ietf-nvo3-vmm-04
List-Id: Transport Area Review Team <tsv-art.ietf.org>

Behcet,

Linda made a load of responses to my review, some of which I disagree
with, so I would like to respond to them. I need responses to those two
questions first though, 'cos everything else depends on those.


Bob

On 20/09/18 15:30, Behcet Sarikaya wrote:
> Dear Bob,
> On Wed, Sep 19, 2018 at 9:53 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>
>     Behcet,
>
>     I would like to make significant responses to many of Linda's
>     responses, but until we get answers to the two pre-requisite
>     questions I've given, I can't be sure how to respond.
>
>     So rather than promising a new version with no prior discussion, I
>     believe it would be much more fruitful to engage in this
>     conversation. I'm trying to help.
>
>
> You already made a detailed review.
> Your two points are clarifications from your detailed review.
> When I said we will revise I meant we will revise based on your
> detailed review.
> After we post our revision you can do whatever you wish.
>
> Sincerely,
> Behcet
>
>     Cheers
>
>
>     Bob
>
>     On 19/09/18 15:46, Behcet Sarikaya wrote:
>>     Hi Bob,
>>
>>     Thank you for your comments.
>>     The authors are currently discussing your points and we will come
>>     up with a revision soon after the discussions are over.
>>
>>     Regards,
>>     Behcet
>>     On Tue, Sep 18, 2018 at 6:03 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>>         Linda,
>>
>>         Until we can all understand the answers to the following two
>>         questions, I don't think we can discuss what track this draft
>>         ought to be on, let alone move on to your responses to all my
>>         other points.
>>
>>         1/ Applicability
>>
>>         You say this draft solely applies to connections with both
>>         ends within the controlled DC environment. But the draft says
>>         it's about multi-tenant DCs. Are there any multi-tenant DCs
>>         that restrict all VMs to only communicate with other VMs
>>         within the same controlled DC environment?
>>
>>         2/ Purpose of publishing as an RFC
>>
>>         When I said:
>>>         #. The introduction does not say what the purpose of
>>>         publishing this draft is.
>>         you responded:
>>>         [Linda] The first paragraph on Page 3 has the description
>>>         why VM Mobility is needed.
>>
>>         Whether VM Mobility is needed was not my question. My
>>         question was what is the purpose of the IETF publishing an
>>         RFC about VM Mobility? And particularly, what is /this/ RFC
>>         intended to achieve?
>>
>>         Are the authors trying to argue for a particular approach vs.
>>         others? Are you trying to write a tutorial? Are you trying to
>>         give the pros and cons of different approaches? Are you
>>         trying to give advice on good practice (with the implication
>>         that alternative practices are less good)? Are you trying to
>>         clarify ideas by writing them down? Are you trying to outline
>>         the implications of VM Mobility for other protocols being
>>         developed within the NVO3 WG?
>>
>>
>>
>>
>>         Bob
>>
>>         On 10/09/18 19:16, Linda Dunbar wrote:
>>>         Bob,
>>>         Thank you very much for reviewing the draft and providing
>>>         in-depth comments. I am very sorry for the delayed response
>>>         due to traveling.
>>>         Replies to your comments are inserted below marked by [Linda]:
>>>         -----Original Message-----
>>>         From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
>>>         Sent: Monday, September 03, 2018 9:45 PM
>>>         To: tsv-art@ietf.org
>>>         Cc: nvo3@ietf.org; ietf@ietf.org;
>>>         draft-ietf-nvo3-vmm.all@ietf.org
>>>         Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>>>         Reviewer: Bob Briscoe
>>>         Review result: Not Ready
>>>         I have been selected as the Transport Directorate reviewer
>>>         for this draft. The Transport Directorate seeks to review
>>>         all transport or transport-related drafts as they pass
>>>         through IETF last call and IESG review, and sometimes on
>>>         special request. The purpose of the review is to provide
>>>         assistance to the Transport ADs. For more information about
>>>         the Transport Directorate Reviews and the Transport Area
>>>         Review Team, please see
>>>         https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>>>         In this case, very very few of the review comments relate to
>>>         transport issues, although the greatest issue concerns a
>>>         desire that the network could pause or stop connections
>>>         during L3 VM Mobility, which is certainly a transport issue.
>>>         [Linda] There is “Hot Migration”, in which the transport
>>>         service continues, and there is “Cold Migration”, a common
>>>         practice in many data centers, which stops the task running
>>>         at the old place and moves it to the new place before
>>>         restarting it, as described under Task Migration.
>>>         Would it be helpful to add this description to the draft?
>>>         ==Summary==
>>>         The technical aspects of the draft concerning L2 VM mobility
>>>         (within a subnet) seem sound. However, this is only part of
>>>         the draft, which has the following
>>>         issues:
>>>         #. The introduction does not say what the purpose of
>>>         publishing this draft is.
>>>         It seems that, rather than describing a specific protocol or
>>>         protocols, it intends to describe the overall system
>>>         procedure that would typically be used in DCs for VM
>>>         mobility. It is tagged as a BCP, but it does not say who
>>>         needs this BCP, why it is useful for the IETF to publish
>>>         this BCP, how wide the authors' knowledge is of current
>>>         practice (given DCs are private), or why this is a BCP
>>>         rather than a protocol spec.
>>>         [Linda] The first paragraph on Page 3 has the description
>>>         why VM Mobility is needed. Is it helpful to move this
>>>         paragraph to the beginning of the Introduction Section?
>>>         “Virtualization which is being used in almost all of today’s
>>>         data centers enables many virtual machines to run on a
>>>         single physical computer or compute server. Virtual machines
>>>         (VM) need hypervisor running on the physical compute server
>>>         to provide them shared processor/memory/storage. Network
>>>         connectivity is provided by the network virtualization edge
>>>         (NVE) [RFC8014]. Being able to move VMs dynamically, or live
>>>         migration, from one server to another allows for dynamic
>>>         load balancing or work distribution and thus it is a highly
>>>         desirable feature [RFC7364].”
>>>         The draft starts out (S.3) as if it intends to say what a
>>>         good VM Mobility protocol should or shouldn't do, but the
>>>         rest of the document doesn't give any reasoning for these
>>>         recommendations, it just asserts what appears to be one view
>>>         of how a whole VM Mobility system works, sometimes referring
>>>         to one example protocol RFC for a component part, but more
>>>         often with no references or details.
>>>         [Linda] Would it be helpful to move the paragraph above to
>>>         the beginning of the Introduction section, so that the
>>>         audience is aware of why VM Mobility is needed, and then
>>>         follow up with what a good VM Mobility protocol should or
>>>         shouldn't do?
>>>         #. It does not seem as if the NVO3 WG has discussed the
>>>         purpose of using normative text in this draft. See detailed
>>>         comments.
>>>         [Linda] The “Intended status” of the draft is “Best Current
>>>         Practice”, so none of the text is “normative”. Is that okay?
>>>         #. The draft silently slips back and forth between VM
>>>         mobility and VM redundancy, without recognizing the
>>>         differences. See detailed comments.
>>>         [Linda] There is only one use of “redundancy” in the entire
>>>         document, in the context of the “Hot standby option”,
>>>         indicating that the “redundancy” of “the VMs in both
>>>         primary and secondary domains have identical information and
>>>         can provide services simultaneously as in load-share mode of
>>>         operation” is expensive.
>>>         #. Please adopt different terminology than "source NVE" and
>>>         "destination NVE", which are really poor choices of terms
>>>         for an intermediate node. See detailed comments. Why not use
>>>         "old NVE" and "new NVE", which is what you mean?
>>>         [Linda] Thanks for the suggestion. We will change to “old
>>>         NVE” and “new NVE”.
>>>         #. Applicability is fairly clearly outlined, but it is not
>>>         clear whether hosts corresponding with the mobile VMs are
>>>         part of the same controlled environment or on the
>>>         uncontrolled public Internet. See detailed comments.
>>>         [Linda] “Hosts” are the apps running on the VM. They are
>>>         within the same controlled environment, not on the
>>>         uncontrolled public Internet.
>>>         #. Section 4.2.1 on L3 VM mobility reads like some potential
>>>         half-thought-through ideas on how to solve L3 mobility,
>>>         rather than current practice, let alone best current
>>>         practice. Either current practice should be described
>>>         instead, or the scope of the draft should be narrowed solely
>>>         to L2 VM mobility. See detailed comments.
>>>         [Linda] This is referring to “Cold Migration”, which is a
>>>         common practice in many data centers.
>>>         # The VM's file system is described as state that moves with
>>>         the VM (S.6), but VM mobility solutions often move the VM
>>>         but stitch it back to its (unmoved) storage. Conversely, the
>>>         storage can also move independent of the VM.
>>>         [Linda] It depends. When a VM moves to a different zone, the
>>>         storage/files can become inaccessible.
>>>         #. The draft omits some of the security, transport and
>>>         management aspects of VM mobility. See detailed comments.
>>>         [Linda] Can you provide some text?
>>>         #. The draft reads as if different sections have been
>>>         written by different authors and no-one has edited the whole
>>>         to give it a coherent structure, or to ensure consistency
>>>         (both technical and editorial) between the parts. See
>>>         detailed comments.
>>>         [Linda] We can improve this.
>>>         #. The quality of the English grammar does not allow a
>>>         reviewer to concentrate on the technical aspects rather than
>>>         the English. It would have been useful if one of the
>>>         English-speaking co-authors had improved the English before
>>>         submission for review. See detailed comments.
>>>         [Linda] Can you help, perhaps by becoming a co-author to
>>>         improve it?
>>>         ==Detailed Comments==
>>>         ===#. Normative statements===
>>>         In the body of the document, there is just one occurrence of
>>>         normative text (actually two "MUST"s, but both state a
>>>         common requirement - just written separately for IPv4 and
>>>         IPv6). This merely serves to imply that everything else the
>>>         document says is less important or optional, which was
>>>         probably not the intention.
>>>         [Linda] The goal is to indicate that any solution for moving
>>>         the VM “MUST” follow this rule. They make sense, don't they?
>>>         At the start there is a requirements section, which states
>>>         what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I
>>>         think this is intended as a set of goals for the rest of the
>>>         document. If so, these "SHOULDs" are not intended to apply
>>>         to implementations, so they ought not to be capitalized.
>>>         [Linda] okay, will change.
>>>         The first requirement, "Data center network SHOULD support
>>>         virtual machine mobility in IPv6", is written as a
>>>         requirement on all DC networks, not on implementations. I
>>>         assume this was intended to read as "Data center network
>>>         virtual machine mobility protocols SHOULD support IPv6".
>>>         Even then, it doesn't really add anything to say VM mobility
>>>         should support v6 and it should support v4. An L2 solution
>>>         won't, while an L3 solution will undoubtedly support at
>>>         least one of them.
>>>         [Linda] Agree. Will change it to “Data centers that support
>>>         IPv6 addresses should …”
>>>         I'm not sure that 'protocol' is the right word anyway; I
>>>         think 'VM Mobility procedure' would be a better phrase,
>>>         because it includes steps such as suspending the VM, which
>>>         is more than a protocol.
>>>         [Linda] yes. Will change to “Procedure”.
>>>         The requirement "Virtual machine mobility protocol MAY
>>>         support host routes to accomplish virtualization", is not
>>>         followed up at all in the rest of the draft.
>>>         Even if this requirement stays, the last 3 words should be
>>>         deleted.
>>>         [Linda] will change to “Host Route can be used to support
>>>         the Virtual Machine Mobility Procedure.”
>>>         By the end of the draft, the solution falls far short of the
>>>         most relevant "Requirements" anyway, so one assumes the
>>>         title of the section ought to have been "Goals".
>>>         Specifically, even in the simpler case of L2 VM mobility,
>>>         S.4.1 says that triangular routing and tunnelling persist
>>>         "until a neighbour cache entry times out". A cache timeout
>>>         is several orders of magnitude longer than the requirement
>>>         to only persist "while handling packets in flight", which
>>>         would be a few milliseconds at most (the time for packets to
>>>         clear the network that were already launched into flight
>>>         when the old VM stopped).
>>>         Whatever, it would be preferable for the draft to give
>>>         rationale for these requirements, rather than just assert
>>>         them. This would help to shed light on the merits of the
>>>         different trade offs that solutions choose.
>>>         [Linda] Agree, will add.
>>>         ===#. Mobility vs. Redundancy===
>>>         Redundancy and mobility have a lot of similarities, but they
>>>         have different goals. With mobility, it is necessary to know
>>>         the exact instant when one set of state is identical to the
>>>         other so it can hand over. With redundancy, the aim is to
>>>         keep two (or more) sets of state evolving through the same
>>>         sequence of changes, but there is no need to know the point
>>>         at which one is the same as the other was at a certain point.
>>>         [Linda] Agree with what you said. There is only one use of
>>>         “redundancy” in the entire document, in the context of the
>>>         “Hot standby option”, indicating that the “redundancy” of
>>>         “the VMs in both primary and secondary domains have
>>>         identical information and can provide services
>>>         simultaneously as in load-share mode of operation” is
>>>         expensive.
>>>         The draft slips from mobility to resilience in the following
>>>         places:
>>>         * S.2. Terminology: Warm VM Mobility is defined without any
>>>           ending, as if it is permanent replication.
>>>         * S.7. "Handling of Hot, Warm and Cold Virtual Machine
>>>           Mobility" is actually all about redundancy, and doesn't
>>>           address mobility explicitly.
>>>         [Linda] Will add definitions of “Hot Migration”, “Cold
>>>         Migration” and “Warm Migration”.
>>>         ===#. Terminology===
>>>         Packets run from the source at A to the destination at B via
>>>         NVE1, then via NVE2. Please don't call NVE1 and NVE2 the
>>>         source NVE and the destination NVE.
>>>         In future, no-one will thank you for the apparent
>>>         contradictions when they continually stumble over phrases
>>>         like this one in S.4.1: "...send their packets to the source
>>>         NVE"..
>>>         The term "packets in flight" is used incorrectly to refer to
>>>         all the packets sent to the old NVE after the VM has moved,
>>>         even if they were launched into flight long after the old VM
>>>         stopped receiving packets.
>>>         [Linda] Thanks for the comments. Will change.
>>>         BTW, I think s/before/after/ in: "that have old ARP or
>>>         neighbor cache entry before VM or task migration".
>>>         I think: s/IP-based VM mobility/L3 VM mobility/ throughout,
>>>         because "based"
>>>         sounds (to me) like the mobility control protocol is over
>>>         (i.e. based on) IP.
>>>         ===#. Applicability===
>>>         In section 4.2 it says that the protocol mostly used as the
>>>         IP based task migration protocol is ILA. This implies that
>>>         all hosts corresponding with the mobile VMs are either part
>>>         of the same controlled environment, or they are proxied via
>>>         nodes that are part of the same controlled environment (I
>>>         only have passing knowledge of ILA, but I understand that it
>>>         depends on ILA routers on the path). If I am correct, this
>>>         aspect of scope needs to be made clear from the start.
>>>         Also under the heading of applicability, the sentence "Since
>>>         migrations should be relatively rare events" appears very
>>>         late in the document (S.4.2.1). The assumed level of churn
>>>         ought to be stated nearer the start.
>>>         [Linda] yes, under the same controlled environment.
>>>         ===#. L3 Mobility===
>>>         L2 VM mobility is independent of the application, because
>>>         resolution of L2 mappings is delegated to the stack. In
>>>         contrast, L3 VM mobility is only feasible under certain
>>>         conditions, because an application needs an IP address to
>>>         open a socket (resolution of DNS names is not delegated to
>>>         the stack, and apps can use IP addresses directly anyway).
>>>         Examples of the 'certain conditions':
>>>         a) /All/ applications used in the whole DC load balancing
>>>            scheme contain IP address migration logic for /all/ their
>>>            connections;
>>>         b) VMs running solely applications that support IP address
>>>            migration register this fact with the NVA, and it only
>>>            selects such VMs for mobility;
>>>         c) An abstraction is layered over /all/ the IP addresses
>>>            exposed to applications (at both ends) so that the IP
>>>            addresses that applications use are solely identifiers
>>>            (e.g. ILA, LISP, HIP), not also locators.
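
As a rough illustration of condition (c), the sketch below keeps a table
that maps a stable identifier to a current locator. It is not taken from
the draft or from ILA, LISP or HIP; the class name, method names and the
example identifier and locator strings are all hypothetical. It only
shows that the application keeps using the identifier while the mapping
system re-points it when the VM moves.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MappingTable:
    """Hypothetical identifier-to-locator table (illustration only)."""
    locators: Dict[str, str] = field(default_factory=dict)

    def register(self, identifier: str, locator: str) -> None:
        # Record where the identifier is currently reachable.
        self.locators[identifier] = locator

    def migrate(self, identifier: str, new_locator: str) -> str:
        # Re-point the identifier at its new locator; return the old one.
        old_locator = self.locators[identifier]
        self.locators[identifier] = new_locator
        return old_locator

    def resolve(self, identifier: str) -> str:
        # Look up the current locator for an identifier.
        return self.locators[identifier]


table = MappingTable()
table.register("vm-42", "nve-old.example")

# The application only ever holds the identifier, never the locator.
socket_peer = "vm-42"

previous = table.migrate("vm-42", "nve-new.example")
current = table.resolve(socket_peer)
```

Under this assumption the value the application holds never changes
across the move; only the table entry does. Conditions (a) and (b) have
no such indirection, so the application itself has to cope with the
address change.
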
>>>         The introduction says the draft is about VM mobility in a
>>>         multi-tenant DC, so the DC admin will not know the range of
>>>         applications being used. This excludes condition (a) above.
>>>         When the draft says "...if all applications running are
>>>         known to handle this gracefully...", it doesn't quantify
>>>         just how restrictive this condition is, and it gives no
>>>         explanation of how this knowledge might be 'known' or which
>>>         function within the system 'knows' it.
>>>         S.4.2.1 contains what seems like plenty of arm-waving.
>>>         * "TCP connections could be automatically closed in the
>>>         network stack during a migration event."
>>>             o There is no TCP connection state in the network stack.
>>>             o Even if the network starts to drop every packet, the
>>>               TCP connection state persists in the end-points for a
>>>               duration of the order of 30-90 minutes (OS-dependent)
>>>               before TCP deems the connection is broken.
>>>             o Other transport protocols have similar designs
>>>               (including the app-layer of protocols over UDP).
>>>         * "More involved approach to connection migration":
>>>             o pausing the connection [does this refer to an actual
>>>               feature of any L4 protocol?]
>>>             o packaging connection state and sending to target [does
>>>               this assume logic written into the application, or is
>>>               this assuming the stack handles this and the app is
>>>               restricted to using some form of separate
>>>               identifier/locator addresses?]
>>>             o instantiating connection state in the peer stack
>>>               [ditto?].
>>>         There's some arm-waving in S.7 too:
>>>           "Cold Virtual Machine mobility is facilitated by the VM
>>>            initially sending an ARP or Neighbor Discovery message at
>>>            the destination NVE but the source NVE not receiving any
>>>            packets inflight."
>>>            [How is it arranged for the source NVE not to receive any
>>>            packets in flight?]
>>>         And in S.7:
>>>           "In hot standby option, regarding TCP connections, one
>>>            option is to start with and maintain TCP connections to
>>>            two different VMs at the same time."
>>>            [This sounds like resilience logic has been written into
>>>            the application, which would be a special case but not
>>>            something VM mobility infrastructure could depend on.]
>>>         [Linda] will add.
>>>         ===#. Gaps===
>>>         #. Security Considerations: repeats issues in other drafts
>>>         that are not specific to mobility, but it does not mention
>>>         any security issues specifically due to VM mobility. It says
>>>         that address spoofing may arise in a DC (sort-of implying it
>>>         is worse than in non-DC environments, but not saying why).
>>>         The handshake at the start of a connection (e.g. TCP, SCTP,
>>>         QUIC) checks for source address spoofing. So L3 VM mobility
>>>         would be more vulnerable to source address spoofing in cases
>>>         where the mobile VM was the connection initiator and there
>>>         was not a new handshake after the move. However, this draft
>>>         does not contain any detailed mobility protocols, so it is
>>>         not possible to identify any specific security flaws.
>>>         #. Transport Issues: Effect of delay on the transport: Cold
>>>         mobility introduces significant delay, and other forms less,
>>>         but still some delay. It should be pointed out that some
>>>         applications (e.g. real-time) will therefore not be useful
>>>         if subjected to VM mobility. Similarly, even a short period
>>>         of delay will drive most congestion controls to severely
>>>         reduce throughput. These points might be self-evident, but
>>>         perhaps they should be stated explicitly.
>>>         BTW, in the L3 VM mobility case, the draft often refers to
>>>         TCP connections, but the address bindings of any transport
>>>         protocols would have to be migrated due to VM mobility (e.g.
>>>         SCTP; sequences of datagrams over UDP; streams over UDP such
>>>         as with RTP, QUIC).
>>>         #. Management Issues: perhaps the draft ought to recommend
>>>         statistics gathering (e.g. time taken, amount of duplicate
>>>         data) to aid a DC's future decisions on the cost-benefit of
>>>         moving a VM. The OPSDIR review says a BCP does not /have/ to
>>>         describe management issues, but this document seems to
>>>         describe a whole system procedure, not just a protocol,
>>>         which then surely includes the management plane.
>>>         [Linda] can you become a co-author and add those in?
>>>         ===#. Incoherent Structure===
>>>         S.4.1. happens to talk about VMs moving, while S.4.2.
>>>         happens to talk about tasks moving, but this is not the
>>>         distinguishing aspect of these two sections (anyway, S.2.
>>>         says "the draft uses task and VM interchangeably"):
>>>         * "4.1 VM Migration" is about "L2 VM Mobility" so this ought
>>>           to be the section heading,
>>>         * "4.2 Task Migration" is about "L3 VM Mobility" so this
>>>           ought to be the section heading.
>>>         It would also help not to switch from VM to task across
>>>         these sections - it's just a distraction.
>>>         S.4.1 needs better signposting of where each sub-case ends
>>>         (Subsections might be useful to solve this):
>>>         * IPv4
>>>         * end-user client
>>>         * 2 paras starting "All NVEs communicating with this virtual
>>>           machine..." [Not clear that the end-user case has ended
>>>           and we have returned to the general IPv4 case?]
>>>         * IPv6 [Strictly, it still hasn't said whether the end-user
>>>           client case has ended.] [Also, it doesn't explain why
>>>           there is no need for an end-user client case under IPv6?]
>>>         Sections 5 & 6 seem to be about either L2 or L3 mobility,
>>>         whereas Sections 7 & 8 seem to be restricted to L2.
>>>         The draft vacillates over what to do with packets arriving
>>>         at the old NVE in the L3 case (see also L3 mobility above):
>>>         * S.4.2 first says packets are dropped, possibly with an
>>>           ICMP error message;
>>>             o then later it says they are silently dropped;
>>>             o then in the very next sentence it says either silently
>>>               drop them or forward them to the new location
>>>         * S.5 says they should not be lost, but instead delivered to
>>>           the destination hypervisor
>>>             o then it describes how they are tunnelled (which is not
>>>               the same as "forwarding").
>>>         The order in which all the stages of mobility are given is
>>>         jumbled up across sections that also appear in arbitrary
>>>         order:
>>>         * S.5 prepares, establishes, uses, then stops a tunnel, but
>>>           it doesn't say where the other stages fit between these
>>>           steps
>>>             o When tunneling packets, it talks about the *migrating*
>>>               VM not the *migrated* VM, which implies tunnelling has
>>>               started before the new VM is running. Does this imply
>>>               there is a huge buffer?
>>>             o It says "Stop Tunneling Packets - When source NVE
>>>               stops receiving packets destined to..." but it is
>>>               never clear when a source has stopped sending packets
>>>               to a destination, unless it explicitly closes the
>>>               connection (e.g. with a FIN in the case of TCP). Often
>>>               there are long gaps between packets, because many
>>>               flows are 'thin' (meaning the application frequently
>>>               has nothing to send). These gaps can last for
>>>               milliseconds, hours or even days without any
>>>               implication that the connection has ended.
>>>         * Then S.6. describes moving state, but doesn't say that
>>>           this is not after the previous tunnelling steps (or where
>>>           it fits within those steps).
>>>         * Then S.7 describes hot, warm and cold mobility, but
>>>           doesn't lay out the tunnelling or steps to move state in
>>>           each case.
>>>         * Then S.8 says it's about VM life-cycle, but just gives the
>>>           very first 3 steps for allocation of resources to a VM,
>>>           then abruptly ends, without even starting the VM, let
>>>           alone getting to move it.
>>>         S.5 exhibits another inconsistency by talking about the
>>>         hypervisor, not the NVE.
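
The "Stop Tunneling Packets" point above can be illustrated with a small
sketch. It assumes, purely for illustration, that the old NVE decides a
flow has ended once no packet has arrived for a fixed idle timeout; the
draft does not specify this, and the function name, flow names and
timings are hypothetical. Under such a rule a thin flow that is still
open looks exactly like a closed one.

```python
def idle_flows(last_seen, now, idle_timeout):
    """Return the flows with no packet for longer than idle_timeout seconds."""
    return sorted(
        flow for flow, seen_at in last_seen.items()
        if now - seen_at > idle_timeout
    )


# Time (in seconds) at which the old NVE last saw a packet for each flow.
last_seen = {
    "closed-flow": 100.0,  # really finished (e.g. FIN exchanged)
    "thin-flow": 100.0,    # still open, the application just has nothing to send
    "busy-flow": 159.0,    # still open and active
}

# Ground truth that the old NVE cannot observe from packet arrivals alone.
still_open = {"thin-flow", "busy-flow"}

expired = idle_flows(last_seen, now=160.0, idle_timeout=30.0)
wrongly_expired = [flow for flow in expired if flow in still_open]
```

With these assumed values the idle rule expires both the closed flow and
the thin flow. Lengthening the timeout only moves the problem, since a
thin flow can stay silent for longer than any fixed value.
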
>>>         ==#. Nits==
>>>         Nits with the English are too numerous to mention them all.
>>>         Below are pointers to general problems as well as some
>>>         individual instances.
>>>         S.4
>>>           "Layer 2 and Layer 3 protocols are described next.  In the
>>>            following sections, we examine more advanced features."
>>>                 s/following/subsequent/
>>>         S.4.1
>>>         Expand WSC, MSC and NVA on first use.
>>>         s/the VM moves in the same link/the VM moves in the same subnet/
>>>         "i.e. end-user clients ask for the same MAC address upon
>>>         migration. [...] to ensure that the same IPv4 address is
>>>         assigned to the VM." I think s/IPv4/MAC/ was intended?
>>>         "  All NVEs communicating with this virtual machine uses the
>>>            old ARP entry.  If any VM in those NVEs need to talk to
>>>            the new VM in the destination NVE, it uses the old ARP
>>>            entry."
>>>         Repetition: these 2 sentences say the same. (The mistake is
>>>         also repeated when these 2 sentences are repeated for IPv6).
>>>         S.4.2.1
>>>         s/Push the new mapping to hosts./Push the new mapping to
>>>         communicating hosts./
>>>         S.5.
>>>         The IPv4/IPv6 pairs of paras for "tunnel establishment" and
>>>         "tunneling packets" only differ in the words "IPv4"/"IPv6".
>>>         So in each case a single para could be given for IP
>>>         (irrespective of whether v4 or v6).
>>>         Thank you very much.
>>>         Linda Dunbar
>>>
>>>
>>>         _______________________________________________
>>>         Tsv-art mailing list
>>>         Tsv-art@ietf.org
>>>         https://www.ietf.org/mailman/listinfo/tsv-art
>>
>>         -- 
>>         ________________________________________________________________
>>         Bob Briscoe                               http://bobbriscoe.net/
>>
>>
>>
>
>     -- 
>     ________________________________________________________________
>     Bob Briscoe                               http://bobbriscoe.net/
>
>
>
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/