Re: [nvo3] [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04
Bob Briscoe <ietf@bobbriscoe.net> Fri, 21 September 2018 09:34 UTC
To: sarikaya@ieee.org
Cc: tsv-art@ietf.org, NVO3 <nvo3@ietf.org>, IETF <ietf@ietf.org>, draft-ietf-nvo3-vmm.all@ietf.org
References: <153602909285.13281.13763046029400746910@ietfa.amsl.com> <4A95BA014132FF49AE685FAB4B9F17F66B139743@sjceml521-mbs.china.huawei.com> <7f3ceaff-db16-8eb9-a72c-aca219c7d90c@bobbriscoe.net> <CAC8QAcfVywTMOs=+B5UH5JwpsPPkiYZnb4YQzcqKMzedQsiMdw@mail.gmail.com> <c513d041-0c65-111d-9fd4-4474c52fa491@bobbriscoe.net> <CAC8QAcc-fr_-g8bPe812=udZVQk3d2E00mkJBwDbX3ZkdDceUw@mail.gmail.com>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <9722da33-469b-38f6-b629-99a277fc864e@bobbriscoe.net>
Date: Fri, 21 Sep 2018 10:33:49 +0100
Archived-At: <https://mailarchive.ietf.org/arch/msg/nvo3/6Fh3ssXpVksaVrsSZXA4az4Y7tE>
Subject: Re: [nvo3] [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Behcet,

Linda made a load of responses to my review, some of which I disagree with, so I would like to respond to them. I need responses to those two questions first though, 'cos everything else depends on those.

Bob

On 20/09/18 15:30, Behcet Sarikaya wrote:
> Dear Bob,
>
> On Wed, Sep 19, 2018 at 9:53 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>
> Behcet,
>
> I would like to make significant responses to many of Linda's responses, but until we get answers to the two pre-requisite questions I've given, I can't be sure how to respond.
>
> So rather than promising a new version with no prior discussion, I believe it would be much more fruitful to engage in this conversation. I'm trying to help.
>
>
> You already made a detailed review.
> Your two points are clarifications from your detailed review.
> When I said we will revise I meant we will revise based on your detailed review.
> After we post our revision you can do whatever you wish.
>
> Sincerely,
> Behcet
>
> Cheers
>
>
> Bob
>
> On 19/09/18 15:46, Behcet Sarikaya wrote:
>> Hi Bob,
>>
>> Thank you for your comments.
>> The authors are currently discussing your points and we will come up with a revision soon after the discussions are over.
>>
>> Regards,
>> Behcet
>> On Tue, Sep 18, 2018 at 6:03 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>>
>> Linda,
>>
>> Until we can all understand the answers to the following two questions, I don't think we can discuss what track this draft ought to be on, let alone move on to your responses to all my other points.
>>
>> 1/ Applicability
>>
>> You say this draft solely applies to connections with both ends within the controlled DC environment. But the draft says it's about multi-tenant DCs. Are there any multi-tenant DCs that restrict all VMs to only communicate with other VMs within the same controlled DC environment?
>>
>> 2/ Purpose of publishing as an RFC
>>
>> When I said:
>>> #. The introduction does not say what the purpose of publishing this draft is.
>> you responded:
>>> [Linda] The first paragraph on Page 3 has the description why VM Mobility is needed.
>>
>> Whether VM Mobility is needed was not my question. My question was what is the purpose of the IETF publishing an RFC about VM Mobility? And particularly, what is /this/ RFC intended to achieve?
>>
>> Are the authors trying to argue for a particular approach vs. others? Are you trying to write a tutorial? Are you trying to give the pros and cons of different approaches? Are you trying to give advice on good practice (with the implication that alternative practices are less good)? Are you trying to clarify ideas by writing them down? Are you trying to outline the implications of VM Mobility for other protocols being developed within the NVO WG?
>>
>>
>> Bob
>>
>> On 10/09/18 19:16, Linda Dunbar wrote:
>>> Bob,
>>> Thank you very much for reviewing the draft and providing in-depth comments. I am very sorry for the delayed response due to traveling.
>>> Replies to your comments are inserted below marked by [Linda]:
>>> -----Original Message-----
>>> From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
>>> Sent: Monday, September 03, 2018 9:45 PM
>>> To: tsv-art@ietf.org
>>> Cc: nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org
>>> Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>>> Reviewer: Bob Briscoe
>>> Review result: Not Ready
>>> I have been selected as the Transport Directorate reviewer for this draft. The Transport Directorate seeks to review all transport or transport-related drafts as they pass through IETF last call and IESG review, and sometimes on special request.
>>> The purpose of the review is to provide assistance to the Transport ADs. For more information about the Transport Directorate Reviews and the Transport Area Review Team, please see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>>> In this case, very very few of the review comments relate to transport issues, although the greatest issue concerns a desire that the network could pause or stop connections during L3 VM Mobility, which is certainly a transport issue.
>>> [Linda] There is “Hot Migration”, with the transport service continuing, and there is “Cold Migration”, which is a common practice in many data centers, which stops the task running in the old place and moves it to the new place before restart, as described in the Task Migration.
>>> Is it helpful to add this description to the draft?
>>> ==Summary==
>>> The technical aspects of the draft concerning L2 VM mobility (within a subnet) seem sound. However, this is only part of the draft, which has the following issues:
>>> #. The introduction does not say what the purpose of publishing this draft is.
>>> It seems that, rather than describing a specific protocol or protocols, it intends to describe the overall system procedure that would typically be used in DCs for VM mobility. It is tagged as a BCP, but it does not say who needs this BCP, why it is useful for the IETF to publish this BCP, how wide the authors' knowledge is of current practice (given DCs are private), or why this is a BCP rather than a protocol spec.
>>> [Linda] The first paragraph on Page 3 has the description why VM Mobility is needed. Is it helpful to move this paragraph to the beginning of the Introduction Section?
>>> “Virtualization which is being used in almost all of today’s data centers enables many virtual machines to run on a single physical computer or compute server. Virtual machines (VM) need hypervisor running on the physical compute server to provide them shared processor/memory/storage. Network connectivity is provided by the network virtualization edge (NVE) [RFC8014]. Being able to move VMs dynamically, or live migration, from one server to another allows for dynamic load balancing or work distribution and thus it is a highly desirable feature [RFC7364].”
>>> The draft starts out (S.3) as if it intends to say what a good VM Mobility protocol should or shouldn't do, but the rest of the document doesn't give any reasoning for these recommendations, it just asserts what appears to be one view of how a whole VM Mobility system works, sometimes referring to one example protocol RFC for a component part, but more often with no references or details.
>>> [Linda] Is it helpful to move the paragraph above to the beginning of the Introduction Section? So that audience is aware of why VM Mobility is needed. And then follow up with what a good VM Mobility protocol should or shouldn't do?
>>> #. It does not seem as if the NVO WG has discussed the purpose of using normative text in this draft. See detailed comments.
>>> [Linda] The “Intended status” of the draft is “Best Current Practice”. So all the text are not “normative”. Is it Okay?
>>> #. The draft silently slips back and forth between VM mobility and VM redundancy, without recognizing the differences. See detailed comments.
>>> [Linda] There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.
>>> #. Please adopt different terminology than "source NVE" and "destination NVE", which are really poor choices of terms for an intermediate node. See detailed comments. Why not use "old NVE" and "new NVE", which is what you mean?
>>> [Linda] Thanks for the suggestion. We will change to “old NVE” and “new NVE”.
>>> #. Applicability is fairly clearly outlined, but it is not clear whether hosts corresponding with the mobile VMs are part of the same controlled environment or on the uncontrolled public Internet. See detailed comments.
>>> [Linda] “Hosts” are the Apps running on the VM. They are under the same controlled environment. Not on the uncontrolled public internet.
>>> #. Section 4.2.1 on L3 VM mobility reads like some potential half-thought-through ideas on how to solve L3 mobility, rather than current practice, let alone best current practice. Either current practice should be described instead, or the scope of the draft should be narrowed solely to L2 VM mobility. See detailed comments.
>>> [Linda] This is referring to “Cold Migration”, which is a common practice in many data centers.
>>> #. The VM's file system is described as state that moves with the VM (S.6), but VM mobility solutions often move the VM but stitch it back to its (unmoved) storage. Conversely, the storage can also move independent of the VM.
>>> [Linda] It depends. When a VM moves to a different zone, the storage/file can become inaccessible.
>>> #. The draft omits some of the security, transport and management aspects of VM mobility. See detailed comments.
>>> [Linda] Can you provide some text?
>>> #. The draft reads as if different sections have been written by different authors and no-one has edited the whole to give it a coherent structure, or to ensure consistency (both technical and editorial) between the parts. See detailed comments.
>>> [Linda] we can improve.
>>> #. The quality of the English grammar does not allow a reviewer to concentrate on the technical aspects rather than the English. It would have been useful if one of the English-speaking co-authors had improved the English before submission for review. See detailed comments.
>>> [Linda] can you help? Becoming a co-author to improve?
>>> ==Detailed Comments==
>>> ===#. Normative statements===
>>> In the body of the document, there is just one occurrence of normative text (actually two "MUST"s, but both state a common requirement - just written separately for IPv4 and IPv6). This merely serves to imply that everything else the document says is less important or optional, which was probably not the intention.
>>> [Linda] The goal is to indicate any solution in moving the VM “MUST” follow this rule. They make sense, don’t they?
>>> At the start there is a requirements section, which states what a VM Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as a set of goals for the rest of the document. If so, these "SHOULDs" are not intended to apply to implementations, so they ought not to be capitalized.
>>> [Linda] okay, will change.
>>> The first requirement, "Data center network SHOULD support virtual machine mobility in IPv6", is written as a requirement on all DC networks, not on implementations. I assume this was intended to read as "Data center network virtual machine mobility protocols SHOULD support IPv6". Even then, it doesn't really add anything to say VM mobility should support v6 and it should support v4. A L2 solution won't.
>>> While undoubtedly, a L3 solution will at least support one of them.
>>> [Linda] Agree. Will change it to “Data center that support IPv6 address should …”
>>> I'm not sure that 'protocol' is the right word anyway; I think 'VM Mobility procedure' would be a better phrase, because it includes steps such as suspending the VM, which is more than a protocol.
>>> [Linda] yes. Will change to “Procedure”.
>>> The requirement "Virtual machine mobility protocol MAY support host routes to accomplish virtualization", is not followed up at all in the rest of the draft.
>>> Even if this requirement stays, the last 3 words should be deleted.
>>> [Linda] will change to “Host Route can be used to support the Virtual Machine Mobility Procedure.”
>>> By the end of the draft, the solution falls far short of the most relevant "Requirements" anyway, so one assumes the title of the section ought to have been "Goals".
>>> Specifically, even in the simpler case of L2 VM mobility, S.4.1 says that triangular routing and tunnelling persist "until a neighbour cache entry times out". A cache timeout is about 10 orders of magnitude longer than the requirement to only persist "while handling packets in flight", which would be a few milliseconds at most (the time for packets to clear the network that were already launched into flight when the old VM stopped).
>>> Whatever, it would be preferable for the draft to give rationale for these requirements, rather than just assert them. This would help to shed light on the merits of the different trade offs that solutions choose.
>>> [Linda] Agree, will add.
>>> ===#. Mobility vs. Redundancy===
>>> Redundancy and mobility have a lot of similarities, but they have different goals. With mobility, it is necessary to know the exact instant when one set of state is identical to the other so it can hand over.
>>> With redundancy, the aim is to keep two (or more) sets of state evolving through the same sequence of changes, but there is no need to know the point at which one is the same as the other was at a certain point.
>>> [Linda] Agree with what you said. There is only one usage of “redundancy” in the entire document, used under the context of “Hot standby option”, indicating the “redundancy” of “the VMs in both primary and secondary domains have identical information and can provide services simultaneously as in load-share mode of operation” being expensive.
>>> The draft slips from mobility to resilience in the following places:
>>> * S.2. Terminology: Warm VM Mobility is defined without any ending, as if it is permanent replication.
>>> * S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is actually all about redundancy, and doesn't address mobility explicitly.
>>> [Linda] Will add the definition “Hot Migration”, “cold migration”, and “warm migration”.
>>> ===#. Terminology===
>>> Packets run from the source at A to the destination at B via NVE1, then via NVE2. Please don't call NVE1 and NVE2 the source NVE and the destination NVE.
>>> In future, no-one will thank you for the apparent contradictions when they continually stumble over phrases like this one in S.4.1: "...send their packets to the source NVE".
>>> The term "packets in flight" is used incorrectly to refer to all the packets sent to the old NVE after the VM has moved, even if they were launched into flight long after the old VM stopped receiving packets.
>>> [Linda] thanks for the comments. Will change.
>>> BTW, I think s/before/after/ in: "that have old ARP or neighbor cache entry before VM or task migration".
>>> I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because "based" sounds (to me) like the mobility control protocol is over (i.e. based on) IP.
>>> ===#. Applicability===
>>> In section 4.2 it says that the protocol mostly used as the IP based task migration protocol is ILA. This implies that all hosts corresponding with the mobile VMs are either part of the same controlled environment, or they are proxied via nodes that are part of the same controlled environment (I only have passing knowledge of ILA, but I understand that it depends on ILA routers on the path). If I am correct, this aspect of scope needs to be made clear from the start.
>>> Also under the heading of applicability, the sentence "Since migrations should be relatively rare events" appears very late in the document (S.4.2.1). The assumed level of churn ought to be stated nearer the start.
>>> [Linda] yes, under the same controlled environment.
>>> ===#. L3 Mobility===
>>> L2 VM mobility is independent of the application, because resolution of L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only feasible under certain conditions, because an application needs an IP address to open a socket (resolution of DNS names is not delegated to the stack, and apps can use IP addresses directly anyway).
>>> Examples of the 'certain conditions':
>>> a) /All/ applications used in the whole DC load balancing scheme contain IP address migration logic for /all/ their connections;
>>> b) VMs running solely applications that support IP address migration register this fact with the NVA, and it only selects such VMs for mobility.
>>> c) An abstraction is layered over /all/ the IP addresses exposed to applications (at both ends) so that the IP addresses that applications use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
>>> The introduction says the draft is about VM mobility in a multi-tenant DC, so the DC admin will not know the range of applications being used. This excludes condition (a) above.
>>> When the draft says "...if all applications running are known to handle this gracefully...", it doesn't quantify just how restrictive this condition is, and it gives no explanation of how this knowledge might be 'known' or which function within the system 'knows' it.
>>> S.4.2.1 contains what seems like plenty of arm-waving.
>>> * "TCP connections could be automatically closed in the network stack during a migration event."
>>>     o There is no TCP connection state in the network stack.
>>>     o Even if the network starts to drop every packet, the TCP connection state persists in the end-points for a duration of the order of 30-90 minutes (OS-dependent) before TCP deems the connection is broken.
>>>     o Other transport protocols have similar designs (including the app-layer of protocols over UDP).
>>> * "More involved approach to connection migration":
>>>     o pausing the connection [does this refer to an actual feature of any L4 protocol?]
>>>     o packaging connection state and sending to target [does this assume logic written into the application, or is this assuming the stack handles this and the app is restricted to using some form of separate identifier/locator addresses?]
>>>     o instantiating connection state in the peer stack [ditto?].
>>> There's some arm-waving in S.7 too:
>>> "Cold Virtual Machine mobility is facilitated by the VM initially sending an ARP or Neighbor Discovery message at the destination NVE but the source NVE not receiving any packets inflight."
>>> [How is it arranged for the source NVE not to receive any packets in flight?]
>>> And in S.7:
>>> "In hot standby option, regarding TCP connections, one option is to start with and maintain TCP connections to two different VMs at the same time."
>>> [This sounds like resilience logic has been written into the application, which would be a special case but not something VM mobility infrastructure could depend on.]
>>> [Linda] will add.
>>> ===#. Gaps===
>>> #. Security Considerations: repeats issues in other drafts that are not specific to mobility, but it does not mention any security issues specifically due to VM mobility. It says that address spoofing may arise in a DC (sort-of implying it is worse than in non-DC environments, but not saying why). The handshake at the start of a connection (e.g. TCP, SCTP, QUIC) checks for source address spoofing. So L3 VM mobility would be more vulnerable to source address spoofing in cases where the mobile VM was the connection initiator and there was not a new handshake after the move. However, this draft does not contain any detailed mobility protocols, so it is not possible to identify any specific security flaws.
>>> #. Transport Issues: Effect of delay on the transport: Cold mobility introduces significant delay, and other forms less, but still some delay. It should be pointed out that some applications (e.g. real-time) will therefore not be useful if subjected to VM mobility. Similarly, even a short period of delay will drive most congestion controls to severely reduce throughput. These points might be self-evident, but perhaps they should be stated explicitly.
>>> BTW, in the L3 VM mobility case, the draft often refers to TCP connections, but the address bindings of any transport protocols would have to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over UDP; streams over UDP such as with RTP, QUIC).
>>> #. Management Issues: perhaps the draft ought to recommend statistics gathering (e.g. time taken, amount of duplicate data) to aid a DC's future decisions on the cost-benefit of moving a VM.
>>> The OPSDIR review says a BCP does not /have/ to describe management issues, but this document seems to describe a whole system procedure, not just a protocol, which then surely includes the management plane.
>>> [Linda] can you become a co-author and add those in?
>>> ===#. Incoherent Structure===
>>> S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk about tasks moving, but this is not the distinguishing aspect of these two sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
>>> * "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the section heading,
>>> * "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the section heading.
>>> It would also help not to switch from VM to task across these sections - it's just a distraction.
>>> S.4.1 needs better signposting of where each sub-case ends (Subsections might be useful to solve this):
>>> * IPv4
>>> * end-user client
>>> * 2 paras starting "All NVEs communicating with this virtual machine..." [Not clear that the end-user case has ended and we have returned to the general IPv4 case?]
>>> * IPv6 [Strictly, it still hasn't said whether the end-user client case has ended.] [Also, it doesn't explain why there is no need for an end-user client case under IPv6?]
>>> Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas Sections 7 & 8 seem to be restricted to L2.
>>> The draft vacillates over what to do with packets arriving at the old NVE in the L3 case (see also L3 mobility above):
>>> * S4.2 first says packets are dropped, possibly with an ICMP error message;
>>>     o then later it says they are silently dropped;
>>>     o then in the very next sentence it says either silently drop them or forward them to the new location
>>> * S.5 says they should not be lost, but instead delivered to the destination hypervisor
>>>     o then it describes how they are tunnelled (which is not the same as "forwarding").
>>> The order in which all the stages of mobility are given is jumbled up across sections that also appear in arbitrary order:
>>> * S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say where the other stages fit between these steps
>>>     o When tunneling packets, it talks about the *migrating* VM not the *migrated* VM, which implies tunnelling has started before the new VM is running. Does this imply there is a huge buffer?
>>>     o It says "Stop Tunneling Packets - When source NVE stops receiving packets destined to..." but it is never clear when a source has stopped sending packets to a destination, unless it explicitly closes the connection (e.g. with a FIN in the case of TCP). Often there are long gaps between packets, because many flows are 'thin' (meaning the application frequently has nothing to send). These gaps can last for milliseconds, hours or even days without any implication that the connection has ended.
>>> * Then S.6. describes moving state, but doesn't say that this is not after the previous tunnelling steps (or where it fits within those steps).
>>> * Then S.7 describes hot, warm and cold mobility, but doesn't lay out the tunnelling or steps to move state in each case.
>>> * Then S.8 says it's about VM life-cycle, but just gives the very first 3 steps for allocation of resources to a VM, then abruptly ends, without even starting the VM, let alone getting to move it.
>>> S.5 exhibits another inconsistency by talking about the hypervisor, not the NVE.
>>> ==#. Nits==
>>> Nits with the English are too numerous to mention them all. Below are pointers to general problems as well as some individual instances.
>>> S.4
>>> "Layer 2 and Layer 3 protocols are described next. In the following sections, we examine more advanced features."
>>> s/following/subsequent/
>>> S.4.1
>>> Expand WSC, MSC and NVA on first use.
>>> s/the VM moves in the same link/the VM moves in the same subnet/
>>> "i.e. end-user clients ask for the same MAC address upon migration. [...] to ensure that the same IPv4 address is assigned to the VM." I think s/IPv4/MAC/ was intended?
>>> "All NVEs communicating with this virtual machine uses the old ARP entry. If any VM in those NVEs need to talk to the new VM in the destination NVE, it uses the old ARP entry."
>>> Repetition: these 2 sentences say the same. (The mistake is also repeated when these 2 sentences are repeated for IPv6).
>>> S.4.2.1
>>> s/Push the new mapping to hosts./Push the new mapping to communicating hosts./
>>> S.5.
>>> The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling packets" only differ in the words "IPv4"/"IPv6". So in each case a single para could be given for IP (irrespective of whether v4 or v6).
>>> Thank you very much.
>>> Linda Dunbar
>>>
>>> _______________________________________________
>>> Tsv-art mailing list
>>> Tsv-art@ietf.org
>>> https://www.ietf.org/mailman/listinfo/tsv-art
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe http://bobbriscoe.net/
>
> --
> ________________________________________________________________
> Bob Briscoe http://bobbriscoe.net/
>
> _______________________________________________
> nvo3 mailing list
> nvo3@ietf.org
> https://www.ietf.org/mailman/listinfo/nvo3

--
________________________________________________________________
Bob Briscoe http://bobbriscoe.net/