[nvo3] Boston interim meeting minutes

"Bocci, Matthew (Matthew)" <matthew.bocci@alcatel-lucent.com> Tue, 09 October 2012 15:44 UTC


Please find the draft minutes from the Boston interim below. Many thanks to Siamack for these.

Please send any comments to the list.


Best regards

Matthew


NVO3 Working Group - 20-Sep-2012 Interim Meeting - Minutes
Network Virtualization Overlays (NVO3)
Location:
Juniper Networks, Westford, MA, US

Time:
20-Sep-2012, 1000-1700 EDT

Chairs:
Benson Schliesser (bensons@queuefull.net) & Matthew Bocci
(matthew.bocci@alcatel-lucent.com)

URL:
http://tools.ietf.org/wg/nvo3/

Agenda & Slides:
http://www.ietf.org/proceedings/interim/2012/09/20/nvo3/agenda/agenda-interim-2012-nvo3-1

Version 1, 20-Sep-2012, by Siamack Ayandeh (note taker comments in italic)
________________________________________
Meeting Objectives: (in order of priority)
1. Prepare Problem Statement and Framework for WGLC.
2. Prepare Requirements for WG Adoption.
3. Discuss Use-cases and their impact on Requirements.
4. Discuss solutions' Applicability and Gap Analysis.
5. Extra Credit: Help the chairs determine disposition of various drafts
(relative to milestones).
________________________________________
A. [10:00] Welcome and Meeting Administrative Information
- Meeting arrangements for the remote site
- Reviewed the “Note Well” IETF rules and procedures
- Meeting priorities and objectives noted above were reviewed
- Will try to fit the extra drafts to stimulate discussion & seek feedback re where to
    fit the material
- Meeting broken up into sections: problem statement, requirements, etc.; agenda posted
    online along with a link to the slides
- Remote audio/video issues resolved; meeting technical track started by 10:20 a.m.

B. [10:20] Problem Statement
o draft-ietf-nvo3-overlay-problem-statement (Thomas Narten - 15 min)
- Summary of where we are after IETF84
- Tom and Eric Gray are official doc editors plus a larger group of contributors
- 3 or 4 topical areas need further feedback:
- 1st is the data plane; text is lacking as some of it was pulled out and put in the
    framework document; since no new encap is introduced, it is not discussed (there is
    also a data plane req draft)
- 2nd Trombone Routing needs to be defined in the context of NVO3; may exist for inter
    VN routing; this may be a policy discussion as gateways are involved; to what extent
    should inter-VN communications be considered and optimized; does NVO3 offer any new
    opportunity for a solution in this space; (discussion pursued: D. Black two use cases
    are floating re this issue, does not seem need for optimization)
- 3rd issue is ingress/egress path optimization for gateways; is there a need to
    optimize in the solution? Middleboxes may pin the route; would the default route move
    with the VM move? Does this problem exist in today’s IP networks?
- 4th L2 “Problems” review by Janos Farkas pointed out inaccuracies in L2 limitations;
    one approach is to downplay the L2 problems; there may be L2 mechanisms which have
    not yet been widely deployed; L2 limitations are felt today hence the L3 overlay
    approach
- NVO3 control plane work areas may be expanded or better focused; e.g. re “Oracle”
    definition; was further elaborated to clarify the perspective of the authors re use
    of a generic term/concept. Question (Stewart, AD): this is a trade name and may cause
    issues.
- Three work areas broadly are: server side to/from NVE; Oracle itself; and NVE-Oracle
    interface
- Oracle may be implemented as centralized or distributed; use routing or a directory
    approach
- NVE-Oracle interaction items: Pull/Push information re tunnel end points and VM name/
    addressing; NVE may be part of the Oracle; or defined a generic interface based on a
    standard language
- Server-NVE interaction: NVE if moved in front of the vSwitch, information exchange is
    required to tag frames between the NVE and the vSwitch e.g. (floor: other
    possibilities are to involve the Oracle? Different architecture approach; Sumesh: May
    be better to draw slide “3 potential NVO3 work areas” without the hypervisor; Pat:
    Oracle may be the orchestration system talking to the Hypervisor which then talks to
    the NVE so this is not the only way to represent this; other comments were made re
    many options can be instantiated from these components; need to discuss as a group
    if solution space is limited by the problem definition, as we may have to standardize
    different protocols depending on the model; Thomas Morin: Other architecture where NVE
    is on the server interfacing the Cloud OS is possible; Floor: there may be multiple
    Oracles, server, network Oracles which interface and communicate; Floor: perhaps the
    problem statement should be more generic in identifying work areas, rather than
    involving architecture details.)
- Further discussion: Eric: IP transport should be made explicit as a distinguisher of
    this working group; seconded by Pat: however future deployments should not be left
    out in the specifications; push back is, if some existing solution has not been popular,
    to what extent should we spend time on it. Anoop: Supported the IP transport; we
    need the models prior to any protocol discussion; Tom agreed that protocols should
    probably go to the framework draft. Tom: Work areas need to be here to give the group
    its problem areas for further work. Floor: Discussion of “L2 cannot do overlays” should
     NOT be introduced. Operator view: many versions of L2 while there is one flavor of
     IP; Floor: How about having an architecture document? Sumesh: Is the Oracle the Cloud
     orchestration system? Is this the right model? If not why not etc. let’s have a
     discussion. Tom asked for any other issues that need to be addressed; we seem to be a
     couple of iterations from a final call for this draft. Larry: Should Trombone routing
     be in the problem statement; let’s discuss. Opinions on both sides were expressed.

o draft-khasnabish-vmmi-problems (Bhumip Khasnabish - 15 min)
Skipped as the speaker arrived late.

o Also of Interest: draft-rekhter-nvo3-vm-mobility-issues

C. [11:15] Framework
o draft-ietf-nvo3-framework (Florin Balus - 15 min)
o Also of Interest: draft-wei-nvo3-security-framework
- Order of the slides is update, terminology refresh, model and technology, finally
    input needed from the WG, and next steps.
- So far minor text cleanup; seems ready for WG adoption
- Terminology pictorially described; TES is a VM or a server attaching via a VN
    attachment Point (VAP), NVE contains the tenant instances (identified by a VNID) and
    mapping of tenant addresses to tunnel end points (overlay module); DC underlay is a
    L3 core network; Anoop commented how the document is confusing re these terms; Floor:
    questioned the NVE= VNIs + “overlay module” model; there is no connection to the
    Oracle hence the control is missing; need for better alignment with the problem
    statement, at least add a picture; David: not comfortable with the word tenant,
    change to IEEE terminology; Nabil: Need to include the control plane view in the
    framework; Lucy: How do we account for a local LAN between the TES and NVE? Model lacking.
    Response was that the text calls it out. It may also be an IP network! Perhaps a
    diagram is required. Tom: Please let’s get clarity on terminology as it will help us
    move forward.
    David: this technology will be deployed where they are out of VLANs and there may be
    no tenants in the picture! Floor: NVE gateway is ambiguous; a GW should be clarified
    as demark between two systems, also gateway may be part of an NVE for inter-NV
    communications; response: need to improve on this and add more details on gateways.
    Benson: is the term host confusing the issue given virtualization has changed the
    definition;
    Tom: IETF lacks terminology for this space e.g. a host may be bare metal; or a VM; or
    other nesting/iterations. Good discussion followed on what is the end system; Pat: end
    system depends on which layer you are at (it is an end of what).  Capture the role and
    what of the entity before settling on terminology. What view do we take the
    network/layered view or the server/hypervisor view? Let’s discuss on mailing list.
    Would the server-network Oracle interface be covered? Remote site comment: let’s
    capture the differences between a gateway and a typical NVE (pls send text
    suggestions).
    David: let’s have a design team terminology discussion as this is in critical path.
    Nabil: TES may have to be changed to a Tenant System.
    Erik: Careful re terminology.
    Remote: also better map to VPN terminology. Dino: Use original Internet terms host,
    router, etc. Himanshu defined an end system as a source/sink of traffic.
    Benson: Perhaps we can name the interfaces rather than the attachments.
    Matthew: Suggested sharing an industry wide terminology; floor: identify the customer
    or user as a tenant i.e. a user of this system vs. a provider in this system.
    Floor: suggested “tenant host” for TES.
- NVE GW Function was elaborated as interface to VPNs, etc.
- Open issues: multi-homing discussion is either not a standard issue or can be
    captured as part of the tunnel discussions; request to add text; Request to add an OAM
     section; and beef up the security section.
- Next steps: 4 items will be expanded on: Terminology, control plane, mappings, and
    gateways. Anoop: How about models? Is VNI a table or a whole network?
     Floor: are we discussing terminology now or later? Himanshu: Have you covered query
     /response or inter Oracle communications? Some discussions exist in the control plane
      section. Need both push model and query/response and auto discovery.
      Lou: NVE to NVE data plane un-reachability needs to be covered.
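
As an illustration of the terminology discussed above, the following is a minimal Python sketch of an NVE that holds tenant instances keyed by VNID, each mapping tenant addresses to tunnel end points. The class and method names (VirtualNetworkInstance, NVE, learn, lookup) and the example addresses are hypothetical and are not taken from the framework draft; the control plane that would populate the mappings is out of scope here.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class VirtualNetworkInstance:
    """One tenant instance on an NVE, identified by its VNID (illustrative only)."""
    vnid: int
    # Overlay mapping: tenant (inner) address -> underlay tunnel end point address.
    mappings: Dict[str, str] = field(default_factory=dict)


@dataclass
class NVE:
    """Hypothetical NVE holding one instance per VNID."""
    underlay_address: str
    instances: Dict[int, VirtualNetworkInstance] = field(default_factory=dict)

    def learn(self, vnid: int, tenant_address: str, tunnel_endpoint: str) -> None:
        # Create the tenant instance on first use, then record the mapping.
        vni = self.instances.setdefault(vnid, VirtualNetworkInstance(vnid))
        vni.mappings[tenant_address] = tunnel_endpoint

    def lookup(self, vnid: int, tenant_address: str) -> Optional[str]:
        # Lookups are scoped by VNID, so tenant addresses may overlap across VNs.
        vni = self.instances.get(vnid)
        return None if vni is None else vni.mappings.get(tenant_address)


nve = NVE(underlay_address="192.0.2.1")
nve.learn(vnid=5001, tenant_address="10.0.0.5", tunnel_endpoint="192.0.2.7")
nve.learn(vnid=5002, tenant_address="10.0.0.5", tunnel_endpoint="192.0.2.9")
```

The same tenant address appears in two VNs and resolves to different tunnel end points, which is the isolation property the VNID is meant to provide.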


D. [12:00] Adjourned for Lunch
Reconvened at 13:00 hr.:
Chair: Show of hands re who is comfortable with term TES? Tom: please set some context
because other terms may need tweaking; floor: let’s get more time. In the context of
discussion today who believes TES is adequate? Approx 10[Y]/20[N] vote.
Florin: Please take the terminology into some other document.
Tom volunteered to help with a small team and come back in a week or two.

E. [13:05] Use Cases and Use-case (UC) Requirements
o draft-mity-nvo3-use-case (Aldrin Isaac - 15 min)
- Taking an operator perspective to make it more useful and focus on general use cases.
- 4 focus areas: basic NVO, Interworking, …
- Assumptions: no gateway in an NVO; end systems do not interact directly with the transport
    underlay; NVO may be L2 or L3; L2 NVO is used for non IP protocols e.g. VRRP, FW HA,
    etc. L3 may be used also between gateways. Gateway is defined as interconnection
    between NVO instances; logical GWs may coexist;
- Basic NVO: any NVO can be on any NVE in the NVO3 autonomous system (AS); AS seems to
    be limited to the scope of a VM movement in this use case. UC optimizes physical
    resource utilization. VM move may also be due to: maintenance window requirements;
    datacenter migration/consolidation; load balancing between Data Centers; suggesting
    direct underlay tunnel between datacenters with no gateways in between. So a DC
    boundary does not map to an Oracle Domain boundary. David/Tom: pls be careful re the
    use of AS (watch for baggage) perhaps use admin domain. Aldrin: means one Oracle (per
    AS).  So it is an Oracle Domain or NVO3 domain. Sumesh: what is the use of an OD
    spanning multiple datacenters? There may be negatives to doing this.
    Response: it depends on the Oracle and how distributed it is (i.e. control or
    directory based). David: One can argue for both scenarios of one OD per datacenter
    and otherwise. Aldrin: Based on past experience extending one OD makes my life easier.
    I want a direct tunnel between the datacenters between NVEs; rather than some other
    entity. Floor: asked for proof point! Operator experience. Please, no gateways should
    be required in bridging datacenters. The latter is a use case. Lucy: This is doable
    as tunnels can be built on top of the inter data center WAN connectivity if the
    underlay is multi-datacenter.
    Sumesh: Now I understand better why you are asking for
    this. This may be doable in other ways without an OD crossing datacenters.
    Response: OK, show me what you can do and I will consider it.
    Floor: what does the Oracle control in your mind? Response: It depends on how you
    federate and resolve the name space issue.
- Interworking of NVEs: one form of NVE communicating with another form of NVE based on
    placement. NVEs from multiple vendors, whether Hypervisor based or ToR based, should
    interwork. Please no hidden vendor locks, nor should customers need to sort it out. Fit with
    existing TOR functions; no specialized new device.
- Interworking NVO instances: within an NVO3 domain using gateways; within an OD.
    Support B2B inter tenant inter NVO instance communications; access via Internet etc.
     chair cut off due to lack of time …
- Next steps: we need real use cases from the real world. Soliciting input.
   Chair: no milestone for use cases currently; thoughts on mailing list how to absorb
   the use cases.


F. [13:50] Data Plane Requirements
o draft-bl-nvo3-dataplane-requirements (Nabil Bitar - 15 min)
- Purpose: define the data plane for NV over layer-3. VAP definition is followed by the
    L2/3 VN Instance services and TTL handling (integrated routing/bridging is optional).
    IPv4/6 outer encapsulation is a MUST and MPLS is a MAY. Entropy is a requirement for
    load balancing; Diffserv and ECN marking is covered and BUM handling; Somesh: The U
    may not be an issue if routing is used for the control plane. Point of disagreement.
    Floor: Please track the action items resulting from the discussion.
    Floor: is entropy a MUST? No it is not as long as there is enough information for that
    purpose. ECN is not an RFC yet and work in progress; disagreement on references.
    Floor: should this be specified as part of the standard or just references as what
    might be required. Floor: Framework needs to be settled before this document can
    complete.
- VNI Types: L2 and L3 are both supported. As well as integrated routing and bridging
    on NVE.
- Next Steps: Make entropy language more general. Expand on L3 multicast groups; and
    solicit comments. David: suggestion be careful about venturing much beyond the
     requirements as there are at least five data planes out there. Avoid data plane wars.
     Somesh: How do existing deployments fit the L2 model? Not clear. Back to BUM, does it
      need to be a MUST? Tom: Extent of L2 emulation needs to be captured concisely at
      some point; agreed.
- Further discussion on the mailing list.
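
To make the entropy point above concrete, here is a minimal Python sketch that derives a per-flow entropy value from inner flow fields, so that underlay load balancing can spread different flows while keeping each flow on one path. The function name flow_entropy, the choice of fields, the CRC32 hash and the 16-bit width are all assumptions for illustration; the minutes do not say how or where such a value would be carried.

```python
import zlib


def flow_entropy(src: str, dst: str, proto: int, sport: int, dport: int,
                 bits: int = 16) -> int:
    """Hash inner flow fields into a value of the given bit width (illustrative only)."""
    key = f"{src}|{dst}|{proto}|{sport}|{dport}".encode("ascii")
    # CRC32 is deterministic, so every packet of a flow gets the same value.
    return zlib.crc32(key) & ((1 << bits) - 1)


flow_a = flow_entropy("10.0.0.5", "10.0.0.9", 6, 49152, 443)
flow_a_again = flow_entropy("10.0.0.5", "10.0.0.9", 6, 49152, 443)
```

Any field set with enough variation between flows would serve the same purpose, which matches the floor comment that entropy need not be a MUST as long as enough information is available.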

G. [14:15] Control Plane Requirements
o draft-kreeger-nvo3-overlay-cp (Larry Kreeger - 15 min)
- Want to focus more on taxonomy
- A couple of diagrams to set the stage were presented to level set the terminology
    based on the framework draft.
- Dynamic state required by an NVE to do its function was listed: forwarding using
    inner/outer header mappings; multicast per VN context or a baseline broadcast;
- Three control plane interfaces to standardize: North bound Oracle to orchestration
    system; Oracle to NVE and Server to NVE. Florin: How about the inter Oracle interface?
- Four categories of control protocols were covered.
- NVE—Oracle Interface: Oracle may be monolithic or distributed and push or pull data
    from NVEs.
- Characteristics of the control plane were covered; are listed in the draft: light
    weight, extensible e.g. IPv4/v6 support, be reactive to change with quick convergence.
- Summary: Listed the required interfaces. Floor: How do we pick a control protocol
    from available options? Response: let’s get the framework and identify the
    requirements of each interface and then focus on solution space.
    David: Intra Oracle may be better left out to cluster experts and then perhaps find
    a way to federate based on Aldrin’s requirements. Florin strongly disagreed. Supported
    by other commenters. Tom: North bound may not be required as Oracle may already be
    integrated in the orchestration system; agree to leave inter Oracle alone; and likes
    the federation idea. Maria: why leave the inter Oracle IF out if scaling it is an issue?
    Somesh: leave inter Oracle out as it has been solved before, in favor of inter domain
    work. David reiterated Somesh’s point.
    Dino: customer requirement is that the Oracle needs to be mobile between datacenters.
    Florin: Need inter operable Oracles even if we leave out how to scale the Oracle (re
    inter Oracle interface). Back and forth opinions were expressed on which interface is
    more important.
    Nabil: How would things change if we called the Oracle a controller?
    Despite all the confusion it will cause? Ben: We can decouple the protocol from
    distributed/centralized question. E.g. even a directory can be centralized or
    distributed. Tom: A two tier model seems to be more useful i.e. NVE—Oracle which
    interact. Discussion shifted to centralized vs. distributed and all flavors in between.
    Does a route reflector make BGP a centralized approach? And what will reside in an
    Oracle database? Are ACL’s in the Oracle database? Ben: how is the change in data and
    caches on NVE communicated? Aldrin: explained the procedure for the VM movement and
    how the cache is updated today. Somesh: It all starts from a centralized orchestrator
    anyway; so the role of Oracle may be more limited than we realize. There will be a
    single automation point in the large datacenters, so question is how to bridge the
    virtual and physical world and their management and configuration.
    Dino: explained LISP re VM movement. Pull model of DNS scales to billions of end
    points vs. millions of end points for routing. Flushing cache entries for every
    conversation is not scalable. Himanshu: why not solve the problem using the
    virtualization Oracle which has knowledge of all the movements vs. the network
    trying to keep up with it. Nabil: Why not create a VM to NVE interface? Not backward
    compatible.
- Proposals: Separate the two interface definitions i.e. NVE—Oracle and Server – NVE.
    Have separate requirement drafts; straw poll: 12[yes]/1 [no].
    Can we call the Oracle a “mapping system”? Change “Server” to “End Device”

o Also of Interest: draft-kompella-nvo3-server2nve
o Also of Interest: draft-gu-nvo3-tes-nve-mechanism
o Also of Interest: draft-gu-nvo3-overlay-cp-arch

H. [15:35] Operational Requirements
o Also of Interest: draft-ashwood-nvo3-operational-requirement
- No presenter present.
- Chair asked for feedback on the mailing list.
- Chair: what does OAM requirement mean to you?
    Himanshu: pls no OAM, there is enough OAM in the network already in the underlay.
    Aldrin: per domain do better hop by hop quality check; network can check its own
    links as opposed to thousands of end points checking out the network. Provide the
    info to the overlay. Often packets drop through the cracks between service providers
    and applications are the first to discover this (this is not good …).
    David: Yes we need it upfront.
    Chris: Yes we need some, e.g. black holed traffic etc. Nabil: Yes we need it upfront.
    John: OAM is important to the user; not sexy but needed. Tom: want to hear from
    operators to make sure we have enough OAM.

I. [15:50] Solution Applicability and Gap Analysis
o draft-drake-nvo3-evpn-control-plane (Aldrin Isaac - 15 min)
- Use E-VPN as generic/common control plane for any encapsulation
- Was designed for bridging different edge technologies amongst other things
- Learning using BGP; most customers already have BGP in their data center edge
- VPN and virtual LAN auto-discovery
- ARP flood optimization
- Scale using route reflectors and block MAC addr withdrawals
- BC and MC traffic over multicast trees or ingress replication
- Multi-homing
- Supports local sig context ID
- 4 Billion tenant VPNs, 4 Billion virtual LAN per tenant VPN
- Operator defined networks
- Distributes MAC and IP address and LAN segment to PE & MPLS label binding
- MPLS labels have local significance (unless desired otherwise)
- See RFC5512
- Can support multiple encap types
- Ingress PE uses the encap advertised by the egress PE
- So multiple encap can coexist when supported by a tunnel end point
- Separate multicast trees per encap type
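
The encap selection rule above (the ingress PE uses an encap advertised by the egress PE) can be sketched as below. This is a minimal Python illustration; the function name select_encap, the placeholder encap labels, and the use of the egress advertisement order as a preference order are assumptions, not behavior stated in the draft or in RFC5512.

```python
from typing import Iterable, Optional, Set


def select_encap(egress_advertised: Iterable[str],
                 ingress_supported: Set[str]) -> Optional[str]:
    """Return the first encap advertised by the egress PE that the ingress PE supports."""
    for encap in egress_advertised:
        if encap in ingress_supported:
            return encap
    # No common encap: the ingress PE cannot send to this egress PE.
    return None


# Placeholder labels; real encap types would come from the control plane.
chosen = select_encap(["encap-b", "encap-a"], {"encap-a", "encap-c"})
```

Because the choice is made per egress PE, different encaps can coexist in one network as long as each pair of tunnel end points shares at least one.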

o draft-kj-nvo3-pion-architecture (Bhumip Khasnabish - 15 min)
-  Protocol layers and their requirements
- One data plane for both L2 and L3
- Encapsulation layer requirements were covered
- Floor: No need for fragmentation and sequencing in NVO3
- TNI layer requirements were covered
- PSN layer requirements were covered
- Next steps, comments, questions?

o draft-bitar-nvo3-vpn-applicability (Nabil Bitar - 15 min)
- Discussion started in Tippah as to how VPNs can be used in NVO3
- Identify the gaps
- Requirements addressed are large multi tenant intra and inter datacenters;
    cloud networks etc.
- Cloud networking architecture was reviewed
- VPN using MPLS/BGP IP VPN and PBB and TRILL are tools to be used
- Reviewed the applicability of BGP/MPLS approach
- Applicability of PBB and use of ISID
- Applicability of VPLS and mapping to NVO3 components
- E-VPN mapping covered already; was reviewed briefly
- Other work in progress on VM mobility, ARP suppression, ARMD drafts
- Gaps: may need auto discovery; NVE location i.e. server based vs appliance based;
    NVI size; scope of identifiers; traffic path optimization; interworking of VxLAN
    with WAN technologies.
- Next steps: merge various material; address comments; new co-authors e.g. John Drake

o draft-hy-nvo3-vpn-protocol-gap-analysis (Lucy Yong - 15 min)
- Background originated from IP/MPLS gap analysis
- Try to have a neutral approach to reuse vs. new work items
- Document organization was described
- Described the IP/MPLS VPN model
- What NVO3 is (asking) or trying to do high level
- Quick comparison vs. VPN approach was listed identifying any gaps
- Most significant gaps were identified as: VM mobility, VM placement, multi data
    plane interworking are missing; also operational models of the service provider and
    datacenter are different hence this is a gap
- Gateway between NVO3 and VPN PE may be required to bring in enterprise traffic to
    the datacenter
- Send questions to mailing list

o Also of Interest: draft-maino-nvo3-lisp-cp
- Matched the terminology of the NVO3 and LISP
- LISP control plane was described as supporting multiple data planes
- The Oracle or so called mapping system is scalable e.g. using DDT. ALT is another
    mapping system, …
- Benefits were covered
- Flood and learn is replaced by a pull from the database; however the database
    needs to be populated and maintained … there is no free lunch
- What’s next; integrate with the WG progress
- Covered remote questions and comments.
- Floor open

J. [17:00] Adjourn
K. – Floor remained open for further discussion and questions semi-official
 
________________________________________