Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Behcet,

I would like to make significant responses to many of Linda's responses, 
but until we get answers to the two pre-requisite questions I've given, 
I can't be sure how to respond.

So rather than promising a new version with no prior discussion, I 
believe it would be much more fruitful to engage in this conversation. 
I'm trying to help.

Cheers

Bob

On 19/09/18 15:46, Behcet Sarikaya wrote:
> Hi Bob,
>
> Thank you for your comments.
> The authors are currently discussing your points and we will come up 
> with a revision soon after the discussions are over.
>
> Regards,
> Behcet
> On Tue, Sep 18, 2018 at 6:03 PM Bob Briscoe <ietf@bobbriscoe.net 
> <mailto:ietf@bobbriscoe.net>> wrote:
>
>     Linda,
>
>     Until we can all understand the answers to the following two
>     questions, I don't think we can discuss what track this draft
>     ought to be on, let alone move on to your responses to all my
>     other points.
>
>     1/ Applicability
>
>     You say this draft solely applies to connections with both ends
>     within the controlled DC environment. But the draft says it's
>     about multi-tenant DCs. Are there any multi-tenant DCs that
>     restrict all VMs to only communicate with other VMs within the
>     same controlled DC environment?
>
>     2/ Purpose of publishing as an RFC
>
>     When I said:
>>     #. The introduction does not say what the purpose of publishing
>>     this draft is.
>     you responded:
>>     [Linda] The first paragraph on Page 3 has the description why VM
>>     Mobility is needed.
>
>     Whether VM Mobility is needed was not my question. My question was
>     what is the purpose of the IETF publishing an RFC about VM
>     Mobility? And particularly, what is /this/ RFC intended to achieve?
>
>     Are the authors trying to argue for a particular approach vs.
>     others? Are you trying to write a tutorial? Are you trying to give
>     the pros and cons of different approaches? Are you trying to give
>     advice on good practice (with the implication that alternative
>     practices are less good)? Are you trying to clarify ideas by
>     writing them down? Are you trying to outline the implications of
>     VM Mobility for other protocols being developed within the NVO WG?
>
>
>
>
>     Bob
>
>     On 10/09/18 19:16, Linda Dunbar wrote:
>>     Bob,
>>     Thank you very much for reviewing the draft and providing in-depth
>>     comments. I am very sorry for the delayed response due to traveling.
>>     Replies to your comments are inserted below marked by [Linda]:
>>     -----Original Message-----
>>     From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
>>     Sent: Monday, September 03, 2018 9:45 PM
>>     To: tsv-art@ietf.org <mailto:tsv-art@ietf.org>
>>     Cc: nvo3@ietf.org <mailto:nvo3@ietf.org>; ietf@ietf.org
>>     <mailto:ietf@ietf.org>; draft-ietf-nvo3-vmm.all@ietf.org
>>     <mailto:draft-ietf-nvo3-vmm.all@ietf.org>
>>     Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>>     Reviewer: Bob Briscoe
>>     Review result: Not Ready
>>     I have been selected as the Transport Directorate reviewer for
>>     this draft. The Transport Directorate seeks to review all
>>     transport or transport-related drafts as they pass through IETF
>>     last call and IESG review, and sometimes on special request. The
>>     purpose of the review is to provide assistance to the Transport
>>     ADs. For more information about the Transport Directorate Reviews
>>     and the Transport Area Review Team, please see
>>     https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>>     In this case, very very few of the review comments relate to
>>     transport issues, although the greatest issue concerns a desire
>>     that the network could pause or stop connections during L3 VM
>>     Mobility, which is certainly a transport issue.
>>     [Linda] There is “Hot Migration”, in which transport service
>>     continues, and there is “Cold Migration”, a common practice in
>>     many data centers, which stops the task running at the old place
>>     and moves it to the new place before restarting it, as described
>>     under Task Migration.
>>     Is it helpful to add this description to the draft?
>>     ==Summary==
>>     The technical aspects of the draft concerning L2 VM mobility
>>     (within a subnet) seem sound. However, this is only part of the
>>     draft, which has the following
>>     issues:
>>     #. The introduction does not say what the purpose of publishing
>>     this draft is.
>>     It seems that, rather than describing a specific protocol or
>>     protocols, it intends to describe the overall system procedure
>>     that would typically be used in DCs for VM mobility. It is tagged
>>     as a BCP, but it does not say who needs this BCP, why it is
>>     useful for the IETF to publish this BCP, how wide the authors'
>>     knowledge is of current practice (given DCs are private), or why
>>     this is a BCP rather than a protocol spec.
>>     [Linda] The first paragraph on Page 3 has the description why VM
>>     Mobility is needed. Is it helpful to move this paragraph to the
>>     beginning of the Introduction Section?
>>     “Virtualization which is being used in almost all of today’s data
>>     centers enables many virtual machines to run on a single physical
>>     computer or compute server. Virtual machines (VM) need hypervisor
>>     running on the physical compute server to provide them shared
>>     processor/memory/storage. Network connectivity is provided by the
>>     network virtualization edge (NVE) [RFC8014]. Being able to move VMs
>>     dynamically, or live migration, from one server to another allows for
>>     dynamic load balancing or work distribution and thus it is a highly
>>     desirable feature [RFC7364].”
>>     The draft starts out (S.3) as if it intends to say what a good VM
>>     Mobility protocol should or shouldn't do, but the rest of the
>>     document doesn't give any reasoning for these recommendations, it
>>     just asserts what appears to be one view of how a whole VM
>>     Mobility system works, sometimes referring to one example
>>     protocol RFC for a component part, but more often with no
>>     references or details.
>>     [Linda] Is it helpful to move the paragraph above to the
>>     beginning of the Introduction Section? So that audience is aware
>>     of why VM Mobility is needed. And then follow up with what a good
>>     VM Mobility protocol should or shouldn't do?
>>     #. It does not seem as if the NVO WG has discussed the purpose of
>>     using normative text in this draft. See detailed comments.
>>     [Linda] The “Intended status” of the draft is “Best Current
>>     Practice”. So all the text is not “normative”. Is that okay?
>>     #. The draft silently slips back and forth between VM mobility
>>     and VM redundancy, without recognizing the differences. See
>>     detailed comments.
>>     [Linda] There is only one usage of “redundancy” in the entire
>>     document, used under the context of “Hot standby option”,
>>     indicating the “redundancy” of “the VMs in both primary and
>>     secondary domains have identical information and can provide
>>     services simultaneously as in load-share mode of operation” being
>>     expensive.
>>     #. Please adopt different terminology than "source NVE" and
>>     "destination NVE", which are really poor choices of terms for an
>>     intermediate node. See detailed comments. Why not use "old NVE"
>>     and "new NVE", which is what you mean?
>>     [Linda] Thanks for the suggestion. We will change to “Old NVE”,
>>     and “new NVE”.
>>     #. Applicability is fairly clearly outlined, but it is not clear
>>     whether hosts corresponding with the mobile VMs are part of the
>>     same controlled environment or on the uncontrolled public
>>     Internet. See detailed comments.
>>     [Linda] “Hosts” are the apps running on the VM. They are under
>>     the same controlled environment, not on the uncontrolled public
>>     Internet.
>>     #. Section 4.2.1 on L3 VM mobility reads like some potential
>>     half-thought-through ideas on how to solve L3 mobility, rather
>>     than current practice, let alone best current practice. Either
>>     current practice should be described instead, or the scope of the
>>     draft should be narrowed solely to L2 VM mobility. See detailed
>>     comments.
>>     [Linda] This is referring to “Cold Migration”, which is a common
>>     practice in many data centers.
>>     #. The VM's file system is described as state that moves with the
>>     VM (S.6), but VM mobility solutions often move the VM but stitch
>>     it back to its (unmoved) storage. Conversely, the storage can
>>     also move independent of the VM.
>>     [Linda] It depends. When a VM moves to a different zone, the
>>     storage/file can become inaccessible.
>>     #. The draft omits some of the security, transport and management
>>     aspects of VM mobility. See detailed comments.
>>     [Linda] Can you provide some text?
>>     #. The draft reads as if different sections have been written by
>>     different authors and no-one has edited the whole to give it a
>>     coherent structure, or to ensure consistency (both technical and
>>     editorial) between the parts. See detailed comments.
>>     [Linda] we can improve.
>>     #. The quality of the English grammar does not allow a reviewer
>>     to concentrate on the technical aspects rather than the English.
>>     It would have been useful if one of the English-speaking
>>     co-authors had improved the English before submission for review.
>>     See detailed comments.
>>     [Linda] can you help?  Becoming a co-author to improve?
>>     ==Detailed Comments==
>>     ===#. Normative statements===
>>     In the body of the document, there is just one occurrence of
>>     normative text (actually two "MUST"s, but both state a common
>>     requirement - just written separately for IPv4 and IPv6). This
>>     merely serves to imply that everything else the document says is
>>     less important or optional, which was probably not the intention.
>>     [Linda] The goal is to indicate that any solution for moving the
>>     VM “MUST” follow this rule. They make sense, don’t they?
>>     At the start there is a requirements section, which states what a
>>     VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is
>>     intended as a set of goals for the rest of the document. If so,
>>     these "SHOULDs" are not intended to apply to implementations, so
>>     they ought not to be capitalized.
>>     [Linda] okay, will change.
>>     The first requirement, "Data center network SHOULD support
>>     virtual machine mobility in IPv6", is written as a requirement on
>>     all DC networks, not on implementations. I assume this was
>>     intended to read as "Data center network virtual machine mobility
>>     protocols SHOULD support IPv6". Even then, it doesn't really add
>>     anything to say VM mobility should support v6 and it should
>>     support v4. A L2 solution won't. While undoubtedly, a L3 solution
>>     will at least support one of them.
>>     [Linda] Agree. Will change it to “Data center that support IPv6
>>     address should …”
>>     I'm not sure that 'protocol' is the right word anyway; I think
>>     'VM Mobility procedure' would be a better phrase, because it
>>     includes steps such as suspending the VM, which is more than a
>>     protocol.
>>     [Linda] yes. Will change to “Procedure”.
>>     The requirement "Virtual machine mobility protocol MAY support
>>     host routes to accomplish virtualization", is not followed up at
>>     all in the rest of the draft.
>>     Even if this requirement stays, the last 3 words should be deleted.
>>     [Linda] will change to “Host Route can be used to support the
>>     Virtual Machine Mobility Procedure.”
>>     By the end of the draft, the solution falls far short of the most
>>     relevant "Requirements" anyway, so one assumes the title of the
>>     section ought to have been "Goals". Specifically, even in the
>>     simpler case of L2 VM mobility, S.4.1 says that triangular
>>     routing and tunnelling persist "until a neighbour cache entry
>>     times out". A cache timeout is about 10 orders of magnitude
>>     longer than the requirement to only persist "while handling
>>     packets in flight", which would be a few milliseconds at most
>>     (the time for packets to clear the network that were already
>>     launched into flight when the old VM stopped).
>>     Whatever, it would be preferable for the draft to give rationale
>>     for these requirements, rather than just assert them. This would
>>     help to shed light on the merits of the different trade offs that
>>     solutions choose.
>>     [Linda] Agree, will add.
>>     ===#. Mobility vs. Redundancy===
>>     Redundancy and mobility have a lot of similarities, but they have
>>     different goals. With mobility, it is necessary to know the exact
>>     instant when one set of state is identical to the other so it can
>>     hand over. With redundancy, the aim is to keep two (or more) sets
>>     of state evolving through the same sequence of changes, but there
>>     is no need to know the point at which one is the same as the
>>     other was at a certain point.
>>     [Linda] Agree with what you said. There is only one usage of
>>     “redundancy” in the entire document, used under the context of
>>     “Hot standby option”, indicating  the “redundancy” of  “the VMs
>>     in both primary and secondary domains have identical information
>>     and can provide services simultaneously as in load-share mode of
>>     operation” being expensive.
>>     The draft slips from mobility to resilience in the following places:
>>     * S.2. Terminology: Warm VM Mobility is defined without any
>>     ending, as if it is permanent replication.
>>     * S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is
>>     actually all about redundancy, and doesn't address mobility
>>     explicitly.
>>     [Linda] Will add definitions of “Hot Migration”, “cold
>>     migration”, and “warm migration”.
>>     ===#. Terminology===
>>     Packets run from the source at A to the destination at B via
>>     NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source
>>     NVE and the destination NVE.
>>     In future, no-one will thank you for the apparent contradictions
>>     when they continually stumble over phrases like this one in
>>     S.4.1: "...send their packets to the source NVE".
>>     The term "packets in flight" is used incorrectly to refer to all
>>     the packets sent to the old NVE after the VM has moved, even if
>>     they were launched into flight long after the old VM stopped
>>     receiving packets.
>>     [Linda] thanks for the comments. Will change.
>>     BTW, I think s/before/after/ in: "that have old ARP or neighbor
>>     cache entry before VM or task migration".
>>     I think: s/IP-based VM mobility/L3 VM mobility/ throughout,
>>     because "based" sounds (to me) like the mobility control protocol
>>     is over (i.e. based on) IP.
>>     ===#. Applicability===
>>     In section 4.2 it says that the protocol mostly used as the IP
>>     based task migration protocol is ILA. This implies that all hosts
>>     corresponding with the mobile VMs are either part of the same
>>     controlled environment, or they are proxied via nodes that are
>>     part of the same controlled environment (I only have passing
>>     knowledge of ILA, but I understand that it depends on ILA routers
>>     on the path). If I am correct, this aspect of scope needs to be
>>     made clear from the start.
>>     Also under the heading of applicability, the sentence "Since
>>     migrations should be relatively rare events" appears very late in
>>     the document (S.4.2.1). The assumed level of churn ought to be
>>     stated nearer the start.
>>     [Linda] yes, under the same controlled environment.
>>     ===#. L3 Mobility===
>>     L2 VM mobility is independent of the application, because
>>     resolution of L2 mappings is delegated to the stack. In contrast,
>>     L3 VM mobility is only feasible under certain conditions, because
>>     an application needs an IP address to open a socket (resolution
>>     of DNS names is not delegated to the stack, and apps can use IP
>>     addresses directly anyway).
>>     Examples of the 'certain conditions':
>>     a) /All/ applications used in the whole DC load balancing scheme
>>     contain IP address migration logic for /all/ their connections;
>>     b) VMs running solely applications that support IP address
>>     migration register this fact with the NVA, and it only selects
>>     such VMs for mobility.
>>     c) An abstraction is layered over /all/ the IP addresses exposed
>>     to applications (at both ends) so that the IP addresses that
>>     applications use are solely identifiers (e.g. ILA, LISP, HIP),
>>     not also locators.
>>     The introduction says the draft is about VM mobility in a
>>     multi-tenant DC, so the DC admin will not know the range of
>>     applications being used. This excludes condition (a) above. When
>>     the draft says "...if all applications running are known to
>>     handle this gracefully...", it doesn't quantify just how
>>     restrictive this condition is, and it gives no explanation of how
>>     this knowledge might be 'known' or which function within the
>>     system 'knows' it.
>>     S.4.2.1 contains what seems like plenty of arm-waving.
>>     * "TCP connections could be automatically closed in the network
>>     stack during a migration event."
>>             o There is no TCP connection state in the network stack.
>>             o Even if the network starts to drop every packet, the TCP
>>             connection state persists in the end-points for a duration
>>             of the order of 30-90 minutes (OS-dependent) before TCP
>>             deems the connection is broken.
>>             o Other transport protocols have similar designs (including
>>             the app-layer of protocols over UDP).
>>     * "More involved approach to connection migration":
>>             o pausing the connection [does this refer to an actual
>>             feature of any L4 protocol?]
>>             o packaging connection state and sending to target [does
>>             this assume logic written into the application, or is this
>>             assuming the stack handles this and the app is restricted
>>             to using some form of separate identifier/locator
>>             addresses?]
>>             o instantiating connection state in the peer stack [ditto?].
>>     There's some arm-waving in S.7 too:
>>       "Cold Virtual Machine mobility is facilitated by the VM initially
>>        sending an ARP or Neighbor Discovery message at the destination
>>        NVE but the source NVE not receiving any packets inflight."
>>        [How is it arranged for the source NVE not to receive any
>>        packets in flight?]
>>     And in S.7:
>>       "In hot
>>        standby option, regarding TCP connections, one option is to start
>>        with and maintain TCP connections to two different VMs at the same
>>        time."
>>        [This sounds like resilience logic has been written into the
>>        application, which would be a special case but not something VM
>>        mobility infrastructure could depend on.]
>>     [Linda] will add.
>>     ===#. Gaps===
>>     #. Security Considerations: repeats issues in other drafts that
>>     are not specific to mobility, but it does not mention any
>>     security issues specifically due to VM mobility. It says that
>>     address spoofing may arise in a DC (sort-of implying it is worse
>>     than in non-DC environments, but not saying why). The handshake
>>     at the start of a connection (e.g. TCP, SCTP, QUIC) checks for
>>     source address spoofing. So L3 VM mobility would be more
>>     vulnerable to source address spoofing in cases where the mobile
>>     VM was the connection initiator and there was not a new handshake
>>     after the move. However, this draft does not contain any detailed
>>     mobility protocols, so it is not possible to identify any
>>     specific security flaws.
>>     #. Transport Issues: Effect of delay on the transport: Cold
>>     mobility introduces significant delay, and other forms less, but
>>     still some delay. It should be pointed out that some applications
>>     (e.g. real-time) will therefore not be useful if subjected to VM
>>     mobility. Similarly, even a short period of delay will drive most
>>     congestion controls to severely reduce throughput. These points
>>     might be self-evident, but perhaps they should be stated explicitly.
>>     BTW, in the L3 VM mobility case, the draft often refers to TCP
>>     connections, but the address bindings of any transport protocols
>>     would have to be migrated due to VM mobility (e.g. SCTP;
>>     sequences of datagrams over UDP; streams over UDP such as with
>>     RTP, QUIC).
>>     #. Management Issues: perhaps the draft ought to recommend
>>     statistics gathering (e.g. time taken, amount of duplicate data)
>>     to aid a DC's future decisions on the cost-benefit of moving a
>>     VM. The OPSDIR review says a BCP does not /have/ to describe
>>     management issues, but this document seems to describe a whole
>>     system procedure, not just a protocol, which then surely includes
>>     the management plane.
>>     [Linda] can you become a co-author and add those in?
>>     ===#. Incoherent Structure===
>>     S.4.1. happens to talk about VMs moving, while S.4.2. happens to
>>     talk about tasks moving, but this is not the distinguishing
>>     aspect of these two sections (anyway, S.2. says "the draft uses
>>     task and VM interchangeably"):
>>     * "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be
>>     the section heading,
>>     * "4.2 Task Migration" is about "L3 VM Mobility" so this ought to
>>     be the section heading.
>>     It would also help not to switch from VM to task across these
>>     sections - it's just a distraction.
>>     S.4.1 needs better signposting of where each sub-case ends
>>     (Subsections might be useful to solve this):
>>     * IPv4
>>     * end-user client
>>     * 2 paras starting "All NVEs communicating with this virtual
>>     machine..." [Not clear that the end-user case has ended and we
>>     have returned to the general IPv4 case?]
>>     * IPv6 [Strictly, it still hasn't said whether the end-user client
>>     case has ended.] [Also, it doesn't explain why there is no need
>>     for an end-user client case under IPv6?]
>>     Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
>>     Sections 7 & 8 seem to be restricted to L2.
>>     The draft vacillates over what to do with packets arriving at the
>>     old NVE in the L3 case (see also L3 mobility above):
>>     * S.4.2 first says packets are dropped, possibly with an ICMP error
>>     message;
>>       o then later it says they are silently dropped;
>>       o then in the very next sentence it says either silently drop
>>       them or forward them to the new location
>>     * S.5 says they should not be lost, but instead delivered to the
>>     destination hypervisor
>>       o then it describes how they are tunnelled (which is not the
>>       same as "forwarding").
>>     The order in which all the stages of mobility are given is jumbled
>>     up across sections that also appear in arbitrary order:
>>     * S.5 prepares, establishes, uses then stops a tunnel, but it
>>     doesn't say where the other stages fit between these steps
>>             o When tunneling packets, it talks about the *migrating* VM
>>             not the *migrated* VM, which implies tunnelling has started
>>             before the new VM is running. Does this imply there is a
>>             huge buffer?
>>             o It says "Stop Tunneling Packets - When source NVE stops
>>             receiving packets destined to..." but it is never clear
>>             when a source has stopped sending packets to a destination,
>>             unless it explicitly closes the connection (e.g. with a FIN
>>             in the case of TCP). Often there are long gaps between
>>             packets, because many flows are 'thin' (meaning the
>>             application frequently has nothing to send). These gaps can
>>             last for milliseconds, hours or even days without any
>>             implication that the connection has ended.
>>     * Then S.6. describes moving state, but doesn't say that this is
>>     not after the previous tunnelling steps (or where it fits within
>>     those steps).
>>     * Then S.7 describes hot, warm and cold mobility, but doesn't lay
>>     out the tunnelling or steps to move state in each case.
>>     * Then S.8 says it's about VM life-cycle, but just gives the very
>>     first 3 steps for allocation of resources to a VM, then abruptly
>>     ends, without even starting the VM, let alone getting to move it.
>>     S.5 exhibits another inconsistency by talking about the
>>     hypervisor, not the NVE.
>>     ==#. Nits==
>>     Nits with the English are too numerous to mention them all. Below
>>     are pointers to general problems as well as some individual
>>     instances.
>>     S.4
>>       "Layer 2 and Layer 3 protocols are described next.  In the
>>        following sections, we examine more advanced features."
>>             s/following/subsequent/
>>     S.4.1
>>     Expand WSC, MSC and NVA on first use.
>>     s/the VM moves in the same link/the VM moves in the same subnet/
>>     "i.e. end-user clients ask for the same MAC address upon
>>     migration. [...] to ensure that the same IPv4 address is assigned
>>     to the VM." I think s/IPv4/MAC/ was intended?
>>     "  All NVEs communicating with this virtual machine uses the old ARP
>>        entry.  If any VM in those NVEs need to talk to the new VM in the
>>        destination NVE, it uses the old ARP entry."
>>     Repetition: these 2 sentences say the same. (The mistake is also
>>     repeated when these 2 sentences are repeated for IPv6).
>>     S.4.2.1
>>     s/Push the new mapping to hosts./Push the new mapping to
>>     communicating hosts./
>>     S.5.
>>     The IPv4/IPv6 pairs of paras for "tunnel establishment" and
>>     "tunneling packets" only differ in the words "IPv4"/"IPv6". So in
>>     each case a single para could be given for IP (irrespective of
>>     whether v4 or v6).
>>     Thank you very much.
>>     Linda Dunbar
>>
>>
>>     _______________________________________________
>>     Tsv-art mailing list
>>     Tsv-art@ietf.org <mailto:Tsv-art@ietf.org>
>>     https://www.ietf.org/mailman/listinfo/tsv-art
>>     <https://www.ietf.org/mailman/listinfo/tsv-art>
>
>     -- 
>     ________________________________________________________________
>     Bob Briscoe                               http://bobbriscoe.net/
>
>
>

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/