Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Behcet Sarikaya <sarikaya2012@gmail.com> Thu, 20 September 2018 14:30 UTC

References: <153602909285.13281.13763046029400746910@ietfa.amsl.com> <4A95BA014132FF49AE685FAB4B9F17F66B139743@sjceml521-mbs.china.huawei.com> <7f3ceaff-db16-8eb9-a72c-aca219c7d90c@bobbriscoe.net> <CAC8QAcfVywTMOs=+B5UH5JwpsPPkiYZnb4YQzcqKMzedQsiMdw@mail.gmail.com> <c513d041-0c65-111d-9fd4-4474c52fa491@bobbriscoe.net>
In-Reply-To: <c513d041-0c65-111d-9fd4-4474c52fa491@bobbriscoe.net>
Reply-To: sarikaya@ieee.org
From: Behcet Sarikaya <sarikaya2012@gmail.com>
Date: Thu, 20 Sep 2018 09:30:35 -0500
Message-ID: <CAC8QAcc-fr_-g8bPe812=udZVQk3d2E00mkJBwDbX3ZkdDceUw@mail.gmail.com>
To: ietf@bobbriscoe.net
Cc: sarikaya@ieee.org, tsv-art@ietf.org, NVO3 <nvo3@ietf.org>, IETF <ietf@ietf.org>, draft-ietf-nvo3-vmm.all@ietf.org
Content-Type: multipart/alternative; boundary="0000000000005fb84205764e5ead"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/hhVSQ69hsR441QvRQmKBsF79uCs>
Subject: Re: [Tsv-art] Tsvart last call review of draft-ietf-nvo3-vmm-04

Dear Bob,
On Wed, Sep 19, 2018 at 9:53 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:

> Behcet,
>
> I would like to make significant responses to many of Linda's responses,
> but until we get answers to the two pre-requisite questions I've given, I
> can't be sure how to respond.
>
> So rather than promising a new version with no prior discussion, I believe
> it would be much more fruitful to engage in this conversation. I'm trying
> to help.
>
>
You already made a detailed review.
Your two points are clarifications from your detailed review.
When I said we will revise, I meant we will revise based on your detailed
review.
After we post our revision you can do whatever you wish.

Sincerely,
Behcet

> Cheers
>
>
> Bob
>
> On 19/09/18 15:46, Behcet Sarikaya wrote:
>
> Hi Bob,
>
> Thank you for your comments.
> The authors are currently discussing your points and we will come up with
> a revision soon after the discussions are over.
>
> Regards,
> Behcet
> On Tue, Sep 18, 2018 at 6:03 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>
>> Linda,
>>
>> Until we can all understand the answers to the following two questions, I
>> don't think we can discuss what track this draft ought to be on, let alone
>> move on to your responses to all my other points.
>>
>> 1/ Applicability
>>
>> You say this draft solely applies to connections with both ends within
>> the controlled DC environment. But the draft says it's about multi-tenant
>> DCs. Are there any multi-tenant DCs that restrict all VMs to only
>> communicate with other VMs within the same controlled DC environment?
>>
>> 2/ Purpose of publishing as an RFC
>>
>> When I said:
>>
>> #. The introduction does not say what the purpose of publishing this
>> draft is.
>>
>> you responded:
>>
>> [Linda] The first paragraph on Page 3 has the description why VM Mobility
>> is needed.
>>
>>
>> Whether VM Mobility is needed was not my question. My question was what
>> is the purpose of the IETF publishing an RFC about VM Mobility? And
>> particularly, what is /this/ RFC intended to achieve?
>>
>> Are the authors trying to argue for a particular approach vs. others? Are
>> you trying to write a tutorial? Are you trying to give the pros and cons of
>> different approaches? Are you trying to give advice on good practice (with
>> the implication that alternative practices are less good)? Are you trying
>> to clarify ideas by writing them down? Are you trying to outline the
>> implications of VM Mobility for other protocols being developed within the
>> NVO WG?
>>
>>
>>
>>
>> Bob
>>
>> On 10/09/18 19:16, Linda Dunbar wrote:
>>
>> Bob,
>>
>> Thank you very much for reviewing the draft and providing in-depth
>> comments. I am very sorry for the delayed response due to traveling.
>>
>> Replies to your comments are inserted below marked by [Linda]:
>>
>>
>> -----Original Message-----
>> From: Bob Briscoe [mailto:ietf@bobbriscoe.net]
>> Sent: Monday, September 03, 2018 9:45 PM
>> To: tsv-art@ietf.org
>> Cc: nvo3@ietf.org; ietf@ietf.org; draft-ietf-nvo3-vmm.all@ietf.org
>> Subject: Tsvart last call review of draft-ietf-nvo3-vmm-04
>>
>> Reviewer: Bob Briscoe
>> Review result: Not Ready
>>
>> I have been selected as the Transport Directorate reviewer for this
>> draft. The Transport Directorate seeks to review all transport or
>> transport-related drafts as they pass through IETF last call and IESG
>> review, and sometimes on special request. The purpose of the review is to
>> provide assistance to the Transport ADs. For more information about the
>> Transport Directorate Reviews and the Transport Area Review Team, please
>> see https://trac.ietf.org/trac/tsv/wiki/TSV-Directorate-Reviews
>>
>> In this case, very very few of the review comments relate to transport
>> issues, although the greatest issue concerns a desire that the network
>> could pause or stop connections during L3 VM Mobility, which is certainly a
>> transport issue.
>>
>> [Linda] There is “Hot Migration”, in which the transport service
>> continues, and there is “Cold Migration”, a common practice in many data
>> centers, which stops the task running at the old place and moves it to the
>> new place before restarting it, as described under Task Migration.
>> Would it be helpful to add this description to the draft?
>>
>>
>> ==Summary==
>>
>> The technical aspects of the draft concerning L2 VM mobility (within a
>> subnet) seem sound. However, this is only part of the draft, which has the
>> following issues:
>>
>> #. The introduction does not say what the purpose of publishing this
>> draft is.
>> It seems that, rather than describing a specific protocol or protocols,
>> it intends to describe the overall system procedure that would typically be
>> used in DCs for VM mobility. It is tagged as a BCP, but it does not say who
>> needs this BCP, why it is useful for the IETF to publish this BCP, how wide
>> the authors' knowledge is of current practice (given DCs are private), or
>> why this is a BCP rather than a protocol spec.
>>
>> [Linda] The first paragraph on Page 3 has the description why VM Mobility
>> is needed. Is it helpful to move this paragraph to the beginning of the
>> Introduction Section?
>> “Virtualization which is being used in almost all of today’s data
>> centers enables many virtual machines to run on a single physical
>> computer or compute server. Virtual machines (VM) need hypervisor
>> running on the physical compute server to provide them shared
>> processor/memory/storage. Network connectivity is provided by the
>> network virtualization edge (NVE) [RFC8014]. Being able to move VMs
>> dynamically, or live migration, from one server to another allows for
>> dynamic load balancing or work distribution and thus it is a highly
>> desirable feature [RFC7364].”
>>
>>
>> The draft starts out (S.3) as if it intends to say what a good VM
>> Mobility protocol should or shouldn't do, but the rest of the document
>> doesn't give any reasoning for these recommendations, it just asserts what
>> appears to be one view of how a whole VM Mobility system works, sometimes
>> referring to one example protocol RFC for a component part, but more often
>> with no references or details.
>>
>> [Linda] Would it be helpful to move the paragraph above to the beginning
>> of the Introduction Section, so that the audience is aware of why VM
>> Mobility is needed, and then follow up with what a good VM Mobility
>> protocol should or shouldn't do?
>>
>> #. It does not seem as if the NVO WG has discussed the purpose of using
>> normative text in this draft. See detailed comments.
>>
>> [Linda] The “Intended status” of the draft is “Best Current Practice”. So
>> none of the text is “normative”. Is that okay?
>>
>> #. The draft silently slips back and forth between VM mobility and VM
>> redundancy, without recognizing the differences. See detailed comments.
>>
>> [Linda] There is only one usage of “redundancy” in the entire document,
>> used under the context of “Hot standby option”, indicating the
>> “redundancy” of “the VMs in both primary and secondary domains have
>> identical information and can provide services simultaneously as in
>> load-share mode of operation” being expensive.
>>
>> #. Please adopt different terminology than "source NVE" and "destination
>> NVE", which are really poor choices of terms for an intermediate node. See
>> detailed comments. Why not use "old NVE" and "new NVE", which is what you
>> mean?
>> [Linda] Thanks for the suggestion. We will change to “old NVE” and “new
>> NVE”.
>>
>> #. Applicability is fairly clearly outlined, but it is not clear whether
>> hosts corresponding with the mobile VMs are part of the same controlled
>> environment or on the uncontrolled public Internet. See detailed comments.
>> [Linda] “Hosts” are the apps running on the VMs. They are under the same
>> controlled environment, not on the uncontrolled public Internet.
>>
>>
>> #. Section 4.2.1 on L3 VM mobility reads like some potential
>> half-thought-through ideas on how to solve L3 mobility, rather than current
>> practice, let alone best current practice. Either current practice should
>> be described instead, or the scope of the draft should be narrowed solely
>> to L2 VM mobility. See detailed comments.
>> [Linda] This is referring to “Cold Migration”, which is a common
>> practice in many data centers.
>>
>> # The VM's file system is described as state that moves with the VM
>> (S.6), but VM mobility solutions often move the VM but stitch it back to
>> its (unmoved) storage. Conversely, the storage can also move independent of
>> the VM.
>> [Linda] It depends. When a VM moves to a different zone, the storage/file
>> can become inaccessible.
>>
>> #. The draft omits some of the security, transport and management aspects
>> of VM mobility. See detailed comments.
>> [Linda] Can you provide some text?
>>
>> #. The draft reads as if different sections have been written by
>> different authors and no-one has edited the whole to give it a coherent
>> structure, or to ensure consistency (both technical and editorial) between
>> the parts. See detailed comments.
>>
>> [Linda] we can improve.
>>
>>
>> #. The quality of the English grammar does not allow a reviewer to
>> concentrate on the technical aspects rather than the English. It would have
>> been useful if one of the English-speaking co-authors had improved the
>> English before submission for review. See detailed comments.
>> [Linda] Can you help? Would you become a co-author to improve it?
>>
>> ==Detailed Comments==
>>
>> ===#. Normative statements===
>>
>> In the body of the document, there is just one occurrence of normative
>> text (actually two "MUST"s, but both state a common requirement - just
>> written separately for IPv4 and IPv6). This merely serves to imply that
>> everything else the document says is less important or optional, which was
>> probably not the intention.
>> [Linda] The goal is to indicate that any solution for moving the VM “MUST”
>> follow this rule. These make sense, don’t they?
>>
>> At the start there is a requirements section, which states what a VM
>> Mobility protocol "SHOULD" or "SHOULD NOT" do. I think this is intended as
>> a set of goals for the rest of the document. If so, these "SHOULDs" are not
>> intended to apply to implementations, so they ought not to be capitalized.
>>
>> [Linda] okay, will change.
>>
>>
>> The first requirement, "Data center network SHOULD support virtual
>> machine mobility in IPv6", is written as a requirement on all DC networks,
>> not on implementations. I assume this was intended to read as "Data center
>> network virtual machine mobility protocols SHOULD support IPv6". Even then,
>> it doesn't really add anything to say VM mobility should support v6 and it
>> should support v4. A L2 solution won't. While undoubtedly, a L3 solution
>> will at least support one of them.
>> [Linda] Agree. Will change it to “Data centers that support IPv6 addresses
>> should …”
>>
>> I'm not sure that 'protocol' is the right word anyway; I think 'VM
>> Mobility procedure' would be a better phrase, because it includes steps
>> such as suspending the VM, which is more than a protocol.
>> [Linda] yes. Will change to “Procedure”.
>>
>> The requirement "Virtual machine mobility protocol MAY support host
>> routes to accomplish virtualization", is not followed up at all in the rest
>> of the draft.
>> Even if this requirement stays, the last 3 words should be deleted.
>>
>> [Linda] will change to “Host Route can be used to support the Virtual
>> Machine Mobility Procedure.”
>>
>> By the end of the draft, the solution falls far short of the most
>> relevant "Requirements" anyway, so one assumes the title of the section
>> ought to have been "Goals". Specifically, even in the simpler case of L2 VM
>> mobility, S.4.1 says that triangular routing and tunnelling persist "until
>> a neighbour cache entry times out". A cache timeout is about 10 orders of
>> magnitude longer than the requirement to only persist "while handling
>> packets in flight", which would be a few milliseconds at most (the time for
>> packets to clear the network that were already launched into flight when
>> the old VM stopped).
>>
>> Whatever, it would be preferable for the draft to give rationale for
>> these requirements, rather than just assert them. This would help to shed
>> light on the merits of the different trade offs that solutions choose.
>>
>> [Linda] Agree, will add.
>>
>> ===#. Mobility vs. Redundancy===
>>
>> Redundancy and mobility have a lot of similarities, but they have
>> different goals. With mobility, it is necessary to know the exact instant
>> when one set of state is identical to the other so it can hand over. With
>> redundancy, the aim is to keep two (or more) sets of state evolving through
>> the same sequence of changes, but there is no need to know the point at
>> which one is the same as the other was at a certain point.
>> [Linda] Agree with what you said. There is only one usage of “redundancy”
>> in the entire document, used under the context of “Hot standby option”,
>> indicating the “redundancy” of “the VMs in both primary and secondary
>> domains have identical information and can provide services simultaneously
>> as in load-share mode of operation” being expensive.
>>
>> The draft slips from mobility to resilience in the following places:
>> * S.2. Terminology: Warm VM Mobility is defined without any ending, as if
>>   it is permanent replication.
>> * S.7. "Handling of Hot, Warm and Cold Virtual Machine Mobility" is
>>   actually all about redundancy, and doesn't address mobility explicitly.
>>
>> [Linda] Will add definitions of “hot migration”, “cold migration”, and
>> “warm migration”.
>>
>> ===#. Terminology===
>>
>> Packets run from the source at A to the destination at B via NVE1, then
>> via NVE2. Please don't call NVE1 and NVE2 the source NVE and the
>> destination NVE.
>> In future, no-one will thank you for the apparent contradictions when
>> they continually stumble over phrases like this one in S.4.1: "...send
>> their packets to the source NVE".
>>
>> The term "packets in flight" is used incorrectly to refer to all the
>> packets sent to the old NVE after the VM has moved, even if they were
>> launched into flight long after the old VM stopped receiving packets.
>>
>> [Linda] Thanks for the comments. Will change.
>>
>> BTW, I think s/before/after/ in: "that have old ARP or neighbor cache
>> entry before VM or task migration".
>>
>> I think: s/IP-based VM mobility/L3 VM mobility/ throughout, because
>> "based" sounds (to me) like the mobility control protocol is over (i.e.
>> based on) IP.
>>
>> ===#. Applicability===
>>
>> In section 4.2 it says that the protocol mostly used as the IP based task
>> migration protocol is ILA. This implies that all hosts corresponding with
>> the mobile VMs are either part of the same controlled environment, or they
>> are proxied via nodes that are part of the same controlled environment (I
>> only have passing knowledge of ILA, but I understand that it depends on ILA
>> routers on the path). If I am correct, this aspect of scope needs to be
>> made clear from the start.
>>
>> Also under the heading of applicability, the sentence "Since migrations
>> should be relatively rare events" appears very late in the document
>> (S.4.2.1). The assumed level of churn ought to be stated nearer the start.
>>
>> [Linda] yes, under the same controlled environment.
>>
>> ===#. L3 Mobility===
>> L2 VM mobility is independent of the application, because resolution of
>> L2 mappings is delegated to the stack. In contrast, L3 VM mobility is only
>> feasible under certain conditions, because an application needs an IP
>> address to open a socket (resolution of DNS names is not delegated to the
>> stack, and apps can use IP addresses directly anyway).
>>
>> Examples of the 'certain conditions':
>> a) /All/ applications used in the whole DC load balancing scheme contain
>>    IP address migration logic for /all/ their connections;
>> b) VMs running solely applications that support IP address migration
>>    register this fact with the NVA, and it only selects such VMs for
>>    mobility.
>> c) An abstraction is layered over /all/ the IP addresses exposed to
>>    applications (at both ends) so that the IP addresses that applications
>>    use are solely identifiers (e.g. ILA, LISP, HIP), not also locators.
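
To illustrate condition (c) above: a minimal sketch, assuming a hypothetical
in-memory mapping table, of how the identifier that applications bind to can
stay fixed while only its locator changes on migration. The class, its method
names and the example addresses are invented for illustration; this is not
the ILA, LISP or HIP API, and a real system would also have to distribute the
new mapping to communicating hosts.

```python
class LocatorMap:
    """Hypothetical identifier-to-locator table (illustration only)."""

    def __init__(self):
        self._locator_by_id = {}

    def register(self, identifier, locator):
        """Record where the VM owning this identifier currently lives."""
        self._locator_by_id[identifier] = locator

    def resolve(self, identifier):
        """Return the current locator; raises KeyError if unknown."""
        return self._locator_by_id[identifier]

    def migrate(self, identifier, new_locator):
        """Point the identifier at a new locator and return the old one."""
        old_locator = self._locator_by_id[identifier]
        self._locator_by_id[identifier] = new_locator
        return old_locator


# Example values are made up; 2001:db8::/32 is the IPv6 documentation prefix.
mapping = LocatorMap()
mapping.register("2001:db8:1d::42", "2001:db8:a::1")  # VM behind the old NVE
peer_view = "2001:db8:1d::42"                          # what the application holds
old = mapping.migrate(peer_view, "2001:db8:b::1")      # VM moves to the new NVE
```

The application-visible value (`peer_view`) never changes; only the result of
`resolve()` does, which is the property condition (c) relies on.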
>>
>> The introduction says the draft is about VM mobility in a multi-tenant
>> DC, so the DC admin will not know the range of applications being used.
>> This excludes condition (a) above. When the draft says "...if all
>> applications running are known to handle this gracefully...", it doesn't
>> quantify just how restrictive this condition is, and it gives no
>> explanation of how this knowledge might be 'known' or which function within
>> the system 'knows' it.
>>
>> S.4.2.1 contains what seems like plenty of arm-waving.
>> * "TCP connections could be automatically closed in the network stack
>>   during a migration event."
>>   o There is no TCP connection state in the network stack.
>>   o Even if the network starts to drop every packet, the TCP connection
>>     state persists in the end-points for a duration of the order of 30-90
>>     minutes (OS-dependent) before TCP deems the connection is broken.
>>   o Other transport protocols have similar designs (including the
>>     app-layer of protocols over UDP).
>> * "More involved approach to connection migration":
>>   o pausing the connection [does this refer to an actual feature of any
>>     L4 protocol?]
>>   o packaging connection state and sending to target [does this assume
>>     logic written into the application, or is this assuming the stack
>>     handles this and the app is restricted to using some form of separate
>>     identifier/locator addresses?]
>>   o instantiating connection state in the peer stack [ditto?].
>>
>> There's some arm-waving in S.7 too:
>>   "Cold Virtual Machine mobility is facilitated by the VM initially
>>    sending an ARP or Neighbor Discovery message at the destination NVE
>>    but the source NVE not receiving any packets inflight."
>>    [How is it arranged for the source NVE not to receive any packets in
>>    flight?]
>>
>> And in S.7:
>>   "In hot
>>    standby option, regarding TCP connections, one option is to start
>>    with and maintain TCP connections to two different VMs at the same
>>    time."
>>    [This sounds like resilience logic has been written into the
>>    application, which would be a special case but not something VM
>>    mobility infrastructure could depend on.]
>>
>> [Linda] will add.
>>
>> ===#. Gaps===
>> #. Security Considerations: repeats issues in other drafts that are not
>> specific to mobility, but it does not mention any security issues
>> specifically due to VM mobility. It says that address spoofing may arise in
>> a DC (sort-of implying it is worse than in non-DC environments, but not
>> saying why). The handshake at the start of a connection (e.g. TCP, SCTP,
>> QUIC) checks for source address spoofing. So L3 VM mobility would be more
>> vulnerable to source address spoofing in cases where the mobile VM was the
>> connection initiator and there was not a new handshake after the move.
>> However, this draft does not contain any detailed mobility protocols, so it
>> is not possible to identify any specific security flaws.
>>
>> #. Transport Issues: Effect of delay on the transport: Cold mobility
>> introduces significant delay, and other forms less, but still some delay.
>> It should be pointed out that some applications (e.g. real-time) will
>> therefore not be useful if subjected to VM mobility. Similarly, even a
>> short period of delay will drive most congestion controls to severely
>> reduce throughput. These points might be self-evident, but perhaps they
>> should be stated explicitly.
>>
>> BTW, in the L3 VM mobility case, the draft often refers to TCP
>> connections, but the address bindings of any transport protocols would have
>> to be migrated due to VM mobility (e.g. SCTP; sequences of datagrams over
>> UDP; streams over UDP such as with RTP, QUIC).
>>
>> #. Management Issues: perhaps the draft ought to recommend statistics
>> gathering (e.g. time taken, amount of duplicate data) to aid a DC's future
>> decisions on the cost-benefit of moving a VM. The OPSDIR review says a BCP
>> does not /have/ to describe management issues, but this document seems to
>> describe a whole system procedure, not just a protocol, which then surely
>> includes the management plane.
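
The kind of statistics suggested above could be recorded along these lines.
This is only a sketch: the record fields, the `summarize` helper and the
sample values are assumptions made for illustration and do not come from the
draft.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class MigrationRecord:
    """One completed VM move (hypothetical fields)."""
    vm_id: str
    started_s: float       # time the move began, in seconds
    finished_s: float      # time the VM was running at the new location
    bytes_copied: int      # total state transferred
    bytes_duplicated: int  # state sent more than once (e.g. re-sent pages)

    @property
    def duration_s(self):
        return self.finished_s - self.started_s

    @property
    def duplicate_ratio(self):
        if self.bytes_copied == 0:
            return 0.0
        return self.bytes_duplicated / self.bytes_copied


def summarize(records):
    """Aggregate per-move records into figures for a cost-benefit decision."""
    if not records:
        return {"count": 0, "mean_duration_s": 0.0, "mean_duplicate_ratio": 0.0}
    return {
        "count": len(records),
        "mean_duration_s": mean(r.duration_s for r in records),
        "mean_duplicate_ratio": mean(r.duplicate_ratio for r in records),
    }


# Sample values are invented.
records = [
    MigrationRecord("vm-a", 10.0, 12.0, 1000, 250),
    MigrationRecord("vm-b", 20.0, 24.0, 1000, 750),
]
summary = summarize(records)
```

Keeping the raw per-move records, rather than only running averages, would
let an operator later compare moves by VM size or time of day.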
>>
>> [Linda] can you become a co-author and add those in?
>>
>> ===#. Incoherent Structure===
>>
>> S.4.1. happens to talk about VMs moving, while S.4.2. happens to talk
>> about tasks moving, but this is not the distinguishing aspect of these two
>> sections (anyway, S.2. says "the draft uses task and VM interchangeably"):
>> * "4.1 VM Migration" is about "L2 VM Mobility" so this ought to be the
>>   section heading,
>> * "4.2 Task Migration" is about "L3 VM Mobility" so this ought to be the
>>   section heading.
>> It would also help not to switch from VM to task across these sections
>> - it's just a distraction.
>>
>> S.4.1 needs better signposting of where each sub-case ends (Subsections
>> might be useful to solve this):
>> * IPv4
>> * end-user client
>> * 2 paras starting "All NVEs communicating with this virtual machine..."
>>   [Not clear that the end-user case has ended and we have returned to the
>>   general IPv4 case?]
>> * IPv6 [Strictly, it still hasn't said whether the end-user client case
>>   has ended.] [Also, it doesn't explain why there is no need for an
>>   end-user client case under IPv6?]
>>
>> Sections 5 & 6 seem to be about either L2 or L3 mobility, whereas
>> Sections 7 & 8 seem to be restricted to L2.
>>
>> The draft vacillates over what to do with packets arriving at the old NVE
>> in the L3 case (see also L3 mobility above):
>> * S.4.2 first says packets are dropped, possibly with an ICMP error
>>   message;
>>   o then later it says they are silently dropped;
>>   o then in the very next sentence it says either silently drop them or
>>     forward them to the new location
>> * S.5 says they should not be lost, but instead delivered to the
>>   destination hypervisor
>>   o then it describes how they are tunnelled (which is not the same as
>>     "forwarding").
>>
>> The order in which all the stages of mobility are given is jumbled up
>> across sections that also appear in arbitrary order:
>> * S.5 prepares, establishes, uses then stops a tunnel, but it doesn't say
>>   where the other stages fit between these steps
>>   o When tunneling packets, it talks about the *migrating* VM not the
>>     *migrated* VM, which implies tunnelling has started before the new VM
>>     is running. Does this imply there is a huge buffer?
>>   o It says "Stop Tunneling Packets - When source NVE stops receiving
>>     packets destined to..." but it is never clear when a source has
>>     stopped sending packets to a destination, unless it explicitly closes
>>     the connection (e.g. with a FIN in the case of TCP). Often there are
>>     long gaps between packets, because many flows are 'thin' (meaning the
>>     application frequently has nothing to send). These gaps can last for
>>     milliseconds, hours or even days without any implication that the
>>     connection has ended.
>> * Then S.6. describes moving state, but doesn't say that this is not
>>   after the previous tunnelling steps (or where it fits within those
>>   steps).
>> * Then S.7 describes hot, warm and cold mobility, but doesn't lay out the
>>   tunnelling or steps to move state in each case.
>> * Then S.8 says it's about VM life-cycle, but just gives the very first 3
>>   steps for allocation of resources to a VM, then abruptly ends, without
>>   even starting the VM, let alone getting to move it.
>>
>> S.5 exhibits another inconsistency by talking about the hypervisor, not
>> the NVE.
>>
>> ==#. Nits==
>>
>> Nits with the English are too numerous to mention them all. Below are
>> pointers to general problems as well as some individual instances.
>>
>> S.4
>>   "Layer 2 and Layer 3 protocols are described next.  In the following
>>    sections, we examine more advanced features."
>>         s/following/subsequent/
>>
>> S.4.1
>> Expand WSC, MSC and NVA on first use.
>>
>> s/the VM moves in the same link/the VM moves in the same subnet/
>>
>> "i.e. end-user clients ask for the same MAC address upon migration. [...]
>> to ensure that the same IPv4 address is assigned to the VM." I think
>> s/IPv4/MAC/ was intended?
>>
>> "  All NVEs communicating with this virtual machine uses the old ARP
>>    entry.  If any VM in those NVEs need to talk to the new VM in the
>>    destination NVE, it uses the old ARP entry."
>> Repetition: these 2 sentences say the same. (The mistake is also repeated
>> when these 2 sentences are repeated for IPv6).
>>
>> S.4.2.1
>> s/Push the new mapping to hosts./Push the new mapping to communicating
>> hosts./
>>
>> S.5.
>> The IPv4/IPv6 pairs of paras for "tunnel establishment" and "tunneling
>> packets" only differ in the words "IPv4"/"IPv6". So in each case a single
>> para could be given for IP (irrespective of whether v4 or v6).
>>
>> Thank you very much.
>>
>> Linda Dunbar
>>
>>
>>
>>
>>
>> _______________________________________________
>> Tsv-art mailing list
>> Tsv-art@ietf.org
>> https://www.ietf.org/mailman/listinfo/tsv-art
>>
>>
>> --
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>
>>
>