Re: [nvo3] Comments on draft-ietf-nvo3-dataplane-requirements-01

"LASSERRE, MARC (MARC)" <marc.lasserre@alcatel-lucent.com> Tue, 15 October 2013 07:42 UTC
From: "LASSERRE, MARC (MARC)" <marc.lasserre@alcatel-lucent.com>
To: Eric Gray <eric.gray@ericsson.com>
Thread-Topic: Comments on draft-ietf-nvo3-dataplane-requirements-01
Date: Tue, 15 Oct 2013 07:41:30 +0000
Cc: "draft-ietf-nvo3-dataplane-requirements@tools.ietf.org" <draft-ietf-nvo3-dataplane-requirements@tools.ietf.org>, "nvo3@ietf.org" <nvo3@ietf.org>
Subject: Re: [nvo3] Comments on draft-ietf-nvo3-dataplane-requirements-01
Precedence: list
Eric,

I suggest addressing your concerns privately to avoid long email threads on the list.

Marc

________________________________
From: Eric Gray [mailto:eric.gray@ericsson.com]
Sent: Friday, October 11, 2013 12:47 AM
To: LASSERRE, MARC (MARC)
Cc: draft-ietf-nvo3-dataplane-requirements@tools.ietf.org; nvo3@ietf.org
Subject: RE: Comments on draft-ietf-nvo3-dataplane-requirements-01

Marc,

                See responses below (I have omitted text where we seem to have
agreement - at least in some cases; I am not certain that the remaining stuff
will make much sense in the absence of some of the included discussion)...

--
Eric

From: LASSERRE, MARC (MARC) [mailto:marc.lasserre@alcatel-lucent.com]
Sent: Friday, October 04, 2013 3:50 AM
To: Eric Gray
Cc: draft-ietf-nvo3-dataplane-requirements@tools.ietf.org; nvo3@ietf.org
Subject: RE: Comments on draft-ietf-nvo3-dataplane-requirements-01
Importance: High

Hi Eric,

Thanks for your thorough review.

Please see my comments below.

Marc


________________________________
From: Eric Gray [mailto:eric.gray@ericsson.com]
Sent: Tuesday, October 01, 2013 11:31 PM
To: LASSERRE, MARC (MARC)
Cc: draft-ietf-nvo3-dataplane-requirements@tools.ietf.org; nvo3@ietf.org
Subject: Comments on draft-ietf-nvo3-dataplane-requirements-01

Marc, et al,



I have some comments on this draft that are specific to problems we have with

determining what the actual requirements are in this draft, as we try to include

those requirements in the Gap Analysis draft.



These comments are pretty critical; remember that I am only asking for the sort

of clarifications that are needed to do the next step.



The general issues are with "hard requirements" that need to be better stated,

and with "soft requirements" that are ambiguously stated.

I'm a bit puzzled by this statement. The hard reqs are indicated by a MUST and the soft reqs by a MAY.

Is there anything specific you'd like to see?



I gave numerous examples.



--- [SNIP] ---

A tertiary general issue is with requirements that seem to be aimed more directly

at implementations, rather than solutions to the NVO3 problem space.



The intent is to include nvo3 dataplane requirements.

As such, when designing dataplane encapsulation solutions, it seems appropriate to consider encapsulation header properties (such as field alignment on specific word boundaries).



Again, this is intended as a general comment.  One or more specific examples were

included.



A final issue is that some of these requirements seem to be oriented around well

known and frequently desirable optimizations.  Are we certain that we've picked

the right requirement level for these?

Isn't this the WG's role to say so? I would hope that after over 18 months, we have...



You may be right (I too hope so), but it is not clear to me that every reader will

have been able to pick out the exact requirements.  However, if the draft is

made even slightly more readable, and nobody objects to including specific

optimizations as a requirement of every solution, I am happy to go with the

flow.



Remember that the 1st formal testing of these requirements is when we try to

apply them to screen/analyze potential solutions.  I would not be inclined to

just believe that these requirements have been reviewed with exactly the

same view point as we require in writing the Gap Analysis draft.



Specific examples

---------------------



Section 3.2 - text before the heading for section 3.2.1: does this text contain any

requirements?  It seems to describe what a VAP and VNI do, or can do.

Yes, it is introductory text before getting into corresponding reqs.



Thanks.





Section 3.2.1 - the hard requirements are apparently as follows:



1) L2 VNI MUST provide a virtual bridge emulation using NVO3 Tunnels.

2) Loop avoidance capability MUST be provided.

3) In the absence of a management or control plane, data plane learning MUST be

     used to populate forwarding tables.

4) When flooding is required (for BUM traffic), the NVE MUST support ingress

     replication or multicast.

5) If multicast support is provided, the NVE MUST have one or more multicast

     trees that can be used by the local VNI for flooding to NVEs belonging to the

     same VN.



For the same section - the soft requirements appear to be as follows:



A) The emulated bridge (in requirement 1 above) MAY be 802.1Q enabled

      (allowing use of VLAN tags as a VAP).

B) Forwarding tables MAY be populated via a control, management or data

      plane.

C) "For each VNI, there is one flooding tree, and a multicast tree may be dedicated

      per VNI or shared across VNIs. In such cases, multiple VNIs MAY share the same

      default flooding tree."

D) Multicast trees MAY be established automatically via routing and signaling or

      pre-provisioned.

E) When tenant multicast is supported, it SHOULD also be possible to select whether

      the NVE provides optimized multicast trees inside the VNI for individual tenant

      multicast groups or a default VNI flooding tree is used.



Issues with the requirements in section 3.2.1:



-              For 1, in order to avoid ambiguity, the term "bridge" should be replaced by

                "L2 forwarder" (or an equivalent term or phrase such as "L2 switch").  Also,

                while it is pretty clear what we mean by NVO3 Tunnels, this is not a defined

                term/phrase; we might want to use L3 overlay encapsulation instead.



It is also pretty clear what a L2 bridge is...

nvo3 tunnels are described in the framework draft.



The requirement describes a need to provide a limited virtual bridge emulation.

A virtual bridge emulation would need to support a minimal set of capabilities

in order to be compatible with real bridges.  I am reasonably certain that the

V-Bridge emulation we're talking about is not actually intended to be a fully

capable (and compatible) "Bridge" in the sense that IEEE 802.1Q defines.



I also believe we need to be very careful in making any statement about the

degree to which "it is clear" (in this context) "what a L2 bridge is..."



One thing I am pretty sure about is that many of the readers for this draft

will have a number of holes in their understanding of what an L2 bridge is.

Since this is the IETF, and not IEEE 802, that is only to be expected.  A lot of

the discussions I've seen on the list do not give me the same confidence you

express about bridging knowledge shared by all WG participants.



A better understanding of what a bridge is, is likely among implementers, but

even that is not necessarily a safe assumption.



About the safest assumption we can make is that there is good knowledge

of what an L2 bridge is from the perspective of routing implementers and

users.



-              For A, we need to be more specific about what we mean by 802.1Q enabled

                or the solutions claiming to meet this soft requirement may not correctly

                interoperate.

Ok.

-              For 2, MUST be provided by whom?  Is this a requirement of solutions, or

                implementations?  I suggest rewording as "solutions MUST provide ..."

Ok.

-              For B and 3, there is considerable redundancy between the soft requirement

                and hard requirement; the real requirement is that solutions MUST provide

                at least one common or default mechanism for populating forwarding tables.

                We can clarify this with descriptive text that says that methods for doing

                this (i.e. - populating forwarding tables) include via a control or management

                plane or via learning from the data plane.

Ok, will streamline the text.

-              For 3, from the above, this hard requirement becomes a tautology, as it is

                a reasonable assumption that every solution will include at least a data plane.

                Hence, 3 should be replaced with text along the lines of the underlined text

                immediately above and B should be restated as descriptive narrative.  Note

                that the most important part of the requirement is that there needs to be

                a common or default mechanism specified by a solution or the solution does

                not ensure interoperability.

Obviously, this is a dataplane requirements draft and hence the assumption is that there is a dataplane...

This requirement describes the need to support dataplane learning in the absence of a control plane.



More fundamental than the context of this draft as a dataplane requirements draft,

any solution that does not provide a dataplane is clearly not a useable solution.



So, in general, it is perfectly okay to assume that there will be a dataplane to learn

from, provided that a solution supports this capability.



My point is that - strictly speaking - a solution might explicitly mandate either

management or control plane mechanism(s) and be incapable of acquiring the

information needed for learning the information from the dataplane.  As I

understand it, such a solution should satisfy the requirement as I phrased it

(and possibly as it was phrased in the draft).



However, the wording of the draft seems to assert that every solution MUST include

dataplane learning (implicitly as a default mechanism) for populating the FDB.



I believe the only real requirement is that solutions MUST define a common

default mechanism for populating the FDB.



I am not certain that anything more than this would be a correct assertion, and

I recall no discussion along these lines on the mailing list that would explicitly

support - in particular - that every solution MUST include the capability to

learn from the dataplane.
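
A minimal sketch of the distinction being drawn here, assuming a per-VNI FDB that maps tenant MAC addresses to remote NVE addresses. The names `Fdb`, `install`, `learn` and `dataplane_learning` are hypothetical and do not come from the draft. The sketch only shows that dataplane learning is one population mechanism among several, and that a solution could rely on control or management plane population alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Fdb:
    """Per-VNI forwarding table: tenant MAC -> remote NVE address (hypothetical model)."""
    entries: Dict[str, str] = field(default_factory=dict)
    dataplane_learning: bool = True  # a solution might leave this disabled

    def install(self, mac: str, nve: str) -> None:
        """Entry pushed by a control or management plane."""
        self.entries[mac] = nve

    def learn(self, src_mac: str, ingress_nve: str) -> None:
        """Entry learned from the source address of a received frame."""
        if self.dataplane_learning:
            self.entries[src_mac] = ingress_nve

    def lookup(self, dst_mac: str) -> Optional[str]:
        """Return the egress NVE, or None for an unknown unicast destination."""
        return self.entries.get(dst_mac)
```

With `dataplane_learning=False`, the table is still usable as long as some other plane calls `install`, which is the case the "common default mechanism" wording would permit.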




-              For 4, this requirement needs to be further explained.  What is meant by

                ingress replication?  How can this approach be used exclusively to ensure

                delivery to an unknown unicast address?  In addition, the requirement is

                nearly a tautology, as phrased (if X is required, X MUST be supported).  I

                strongly suspect that ingress replication means separately encapsulating a

                distinct copy of a frame to be forwarded to each egress in a VN, but I cannot

                find this explicitly stated either in the draft or in the framework draft (the

                draft shouldn't rely on implicit understandings based on - for instance - a

                discussion on the mailing list or in meetings).  I believe the requirement is

                that solutions MUST provide mechanisms for ensuring that a copy of any

                broadcast, unknown unicast or multicast (BUM) frame is delivered to at least

                all egress NVE in the same VN as the ingress NVE. In descriptive text, we can

                add that ingress replication or multicast trees are examples of this kind of

                mechanism.

Ingress replication is a term that has been used for over a decade and is pretty well known. It is described in VPLS RFCs  for instance.

I'm fine changing the text as you suggested in bold if it is clearer.



If the term is defined explicitly somewhere that is already referenced, I am fine

with leaving it as is.  I know what the phrase means, but I am asking for it to

be defined, if we're going to use it.  We are writing requirements that may

need to make sense in another decade, when folks may or may not already

know what the phrase means.



My replacement text shows that we don't need to explicitly limit the choices

as we have done in the current draft, and that we can state the requirement

without either mentioning ingress replication, or restricting the choices more

than necessary.
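
For reference, ingress replication as understood in this thread (a distinct, separately encapsulated copy of the frame for each egress NVE in the VN) can be sketched as follows. This is an illustration under stated assumptions, not a definition from the draft: the membership table, the function name `ingress_replicate`, and the 4-byte placeholder encapsulation are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical membership table: VN identifier -> set of NVE underlay addresses.
VnMembers = Dict[int, Set[str]]


def ingress_replicate(frame: bytes, vn_id: int, local_nve: str,
                      members: VnMembers) -> List[Tuple[str, bytes]]:
    """Return one (egress NVE, encapsulated copy) pair per remote NVE in the VN."""
    copies = []
    for nve in sorted(members.get(vn_id, set())):
        if nve == local_nve:
            continue  # never send a copy back to the ingress NVE itself
        # Placeholder encapsulation: a 4-byte VN identifier prepended to the frame.
        copies.append((nve, vn_id.to_bytes(4, "big") + frame))
    return copies
```

The same delivery requirement (a copy of every BUM frame reaches all egress NVEs in the VN) could equally be met by a multicast tree in the underlay, which is why the requirement can be stated without naming either mechanism.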



-              For C, this seems to be a two-part soft requirement (because of the lower

                case use of "may") and the exact requirement is ambiguous. Which cases

                does "in such cases" refer to?  The use of phrasing "for each VNI, there is

                one flooding tree" seems to imply that there is only one such tree per VNI;

                this conflicts with the use of "default flooding tree" later in the same text

                (see quoted text above) that implies that there may be other such trees

                (otherwise, why specify a "default flooding tree"). Also, the text is not

                consistent in naming the tree under discussion - referring to flooding tree

                in one part and multicast tree in the next.

                The requirement would be clearer if stated along the lines of -

                " For each VNI, there is at least one flooding tree used for Broadcast, Unknown

                Unicast and Multicast forwarding.  This tree MAY be shared across VNIs."

ok



                Note that it is not necessary to state the alternative as it is both implicit and

                obvious (i.e. - in saying that they MAY be shared, it follows that they might

                not be shared).

-              For D, as near as I can tell, this is not so much a soft requirement as a scope

                statement; as far as this draft is concerned, what we are saying is that we

                don't care how the multicast trees are established, as long as any solutions

                provide a means to establish them.  This implies a missing hard requirement

                that solutions MUST provide mechanisms for establishing multicast trees

                if multicast trees are used in the solution for delivery of BUM traffic.

No, the intent was to say that a dynamic way of establishing mcast trees MAY be supported.

Otherwise, they need to be manually configured.



Even in this phrasing, it seems that we're eliminating a number of potential choices.

For example, this wording eliminates ingress replication (since M-Cast trees are not

used in this case).



Also, we should explicitly indicate that "via routing and signaling" is an example.

A number of proposals have been made where a great deal of automatic actions

are not driven by routing and/or signaling in the conventional sense.



In addition, if you look at what D (above) says, this use of MAY means that all of

these methods for establishing multicast trees effectively MUST be supported by

all solutions (this is counter-intuitive, but is often why we use MAY in requirements

language).



This is the case because - by saying that any of these methods MAY be used - we

open the door to solutions/implementations/deployments that might choose to

use any of them.  Hence, to ensure compatibility, solutions/implementations will

need to support all choices.



This may be in fact what we want to say.  My reading was that what we wanted to

say is that a solution could choose to use either routing/signaling (control-plane)

or configuration (management-plane).  Could use, not MUST provide.



If my reading is not correct, then I am happy to withdraw that aspect of my comment.



-              For 5, the text I provide is an extraction that I am reasonably sure is correct,

                but the requirement is not explicitly stated this way.  I suggest making the

                requirement more explicit.

Not really. The penultimate paragraph of 3.2.1 is more explicit.



The penultimate paragraph is not nearly as clear as you think.  :)



The 1st sentence talks about 2 potential approaches, while the entire remainder of

the paragraph (and possibly the section) is written assuming the 2nd approach has

already been chosen.



I would break the paragraph up into parts.  The current 1st sentence - if it is kept at

all - should be in its own paragraph.



The remainder of the paragraph should be broken up along the following lines:



    "If multicast is used:



    "-  the NVE MUST have one or more multicast trees that can be used by local

          VNIs for flooding to NVEs belonging to the same VN.



          "Note - for each VNI, there is (at least) one flooding tree, and a multicast tree

            can be either dedicated to a single VNI, or shared across multiple VNI.



    "-  if a multicast tree is shared across multiple VNI, the multiple VNI MAY

         share the same default flooding tree.



        "Note - a flooding tree is equivalent to a multicast (*, G) construct where each

          NVE at which a corresponding VNI is instantiated is a member of G.



     "- multicast trees MAY be established automatically via routing and signaling,

         or pre-provisioned."



Some further notes on this text:

1) I've put what I believe to be descriptive text  that was embedded within the

     requirements text in "Notes" - if you would prefer to separate it, and put it

     before or after the requirements text, that is fine with me.

2) Alternatively, we could replace

                "If multicast is used:"

                                with

                "Throughout the remainder of this section, we assume multicast is used."

3) in one of the notes, I replaced "may" (lower case) with "can" because I think

     we are not stating this as a requirement, but simply making an observation

     (there are a significant number of IETF folks who do not believe that it is

     necessary to capitalize "requirements language" to give it the RFC 2119 spin,

      and a number of RFCs are indeed written without using capitalization).

4) the requirement about sharing the same flooding tree needs to include a

     bit more about the conditions under which this is permissible (otherwise,

     doing so may introduce a number of issues).

5) in suggesting this text, I am not explicitly withdrawing my objection to the

    last (soft) requirement.  I am merely showing how this text would be made

    clearer assuming it is kept otherwise as is.



-              For E, neither the phrasing nor the current narrative structure makes

                it clear this requirement applies if flooding is required.  I suggest using the

                phrase "... when tenant multicast is used to support flooding" instead of

                "when tenant multicast is supported"

No, the last req of 3.2.1 is not about flooding but about the use of specific multicast trees instead of the default flooding tree.



I am guilty of not making my point clear.  I understand that the point we are

trying to make is that we would like it to be possible to choose whether or

not a solution might use "optimized" (read "pruned") multicast trees for

multicast instead of the default flooding tree.



There are a couple of problems with how this is worded, however.



1) as with the above paragraph, this paragraph assumes multicast is used

     (the fact that tenant multicast is supported does not mean that ingress

     replication is not used instead of multicast trees); this assumption needs

     to be made explicit in some way possibly similar to the above paragraph.

2) it is not clear who we want the choice to be made by: are we saying that

     any specific solution may choose, that implementations of any solution may

     choose (implying solutions MUST support), or that users may choose (thus

     implying both solutions and implementations MUST support)?



I can see that my suggested replacement text does not address the issues

(I probably was confused about the meaning of "tenant multicast", as I

personally get confused about the distinction between "tenant", "resident",

and "currently present").



I believe it would be nearly sufficient if it is clear that - with this text, as with

a lot of the rest of the text in this section - we are assuming multicast trees

will be used.



You may want to note that this is an example of what I think of as an

optimization, and the question I have about whose choice it is intended

to be is a reflection of why I think we might not be getting the requirement

level right.





Section 3.2.2:

I believe that both the hard and soft requirements in this section are fairly easy to

determine, given a thorough read.  However the secondary general issue described

at the beginning of this mail message applies.



One thing that is not clear, in either section 3.2.1 or 3.2.2, is - where we talk about

using a "default" VNI multicast tree across multiple VNI - are we assuming that a

solution MAY allow for sending multicast traffic to a superset of the egress NVEs

that are expected to deliver it (requiring the egress NVEs that are not part of the

same VN as the ingress NVE to filter/discard this traffic), or are the shared default

trees required to be completely congruent (i.e. - VNIs may share the same tree if

and only if the set of NVEs for each VN is the same)?
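
The two readings in this question can be made concrete with a small sketch. The membership table and both function names are hypothetical. Under the strict reading, two VNs may share a tree only if their NVE sets are identical; under the superset reading, an egress NVE outside the frame's VN has to discard what it receives.

```python
from typing import Dict, Set


def trees_congruent(vn_members: Dict[int, Set[str]], vn_a: int, vn_b: int) -> bool:
    """True if both VNs have exactly the same set of NVEs (strict reading)."""
    return vn_members[vn_a] == vn_members[vn_b]


def must_discard(vn_members: Dict[int, Set[str]], vn_id: int, egress_nve: str) -> bool:
    """Under the superset reading, an egress NVE that is not in the frame's VN
    has to filter the frame instead of delivering it."""
    return egress_nve not in vn_members[vn_id]
```

If `trees_congruent` is required to hold before sharing, `must_discard` never returns true for traffic on the shared tree; if sharing is allowed more loosely, every egress NVE needs the filtering step.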

You seem to be mixing 3.2.1 and 3.2.2.

There is no default flooding tree in 3.2.2.



The lines of separation are not as clear as you think here, either.  :)



First thing, I did not mention flooding in this comment at all.  My comment is

with respect to using multicast trees.  I do realize (having re-read the text and

looked again at my comment) that neither section 3.2.1 nor 3.2.2 talks about

sharing multicast trees across multiple VNI.  Hence I can understand how you

read "flooding" into this comment.



That said, some types of L3 traffic are one-hop flooded.  Technically they use

a multicast address, but scoping them to one hop and delivering them to all

local L3 entities (to be ignored by those that don't care) is common.



Really, the point I was trying to make is more about sharing of trees, which I

thought might be occurring in both sections because I thought that the L3

multicast might be flooded (sub-optimally) and end up being distributed

over a shared default tree.



As an example of some blurring between the sections, 3.2.2 talks about mixing

L2 and L3 - potentially using the same multicast trees (IP multicast is usually

distributed between L3 entities using multicast MAC addresses - which might

be flooded over L2).



But the "blending" I was most concerned about is in the last paragraph of

section 3.2.2.  Again we have very similar wording to the corresponding

paragraph in section 3.2.1 - specifically:



   "When multicast is supported, it SHOULD be possible to select whether

     the NVE provides optimized multicast trees inside the VNI for individual

     tenant multicast groups or whether a default VNI multicasting tree,

     where all the NVEs of the corresponding VNI are members, is used."


If you note that we allowed (in section 3.2.1) for using a shared "flooding"

default tree to be used for all BUM traffic (which includes multicast), how

can we assume that this same tree might not be used (incorrectly) for the

default VNI multicasting tree?



I assume that is the intent - i.e. - we don't want the L3 VNI multicast tree

to be shared across multiple VNI.  This is implied by the use of "inside the

VNI" (though "inside" is not exactly the most precise term to use).



This should be relatively easy to fix.  I would suggest rewording the quoted

sentence above along the lines of:



   "When multicast is supported, it SHOULD be possible to select whether

     the NVE provides optimized multicast trees scoped by specific VNI for

     individual tenant multicast groups or whether a default VNI multicast

     tree,  where all NVE of the corresponding VNI (and only those NVE) are

     members, is used."



Note that the same observation about requirements levels applies, not

only because this is an optimization, but also because it depends (a lot)

on whose choice it is to "select whether ..."





I suspect the latter, but don't know if this is explicitly stated anywhere.  This draft

would be a good place to do so.



Section 3.3 - text before the heading for section 3.3.1: does not contain requirements,

as far as I can tell.  It seems entirely descriptive.

Yes. Isn't it useful to have some introductory text before going through specific reqs?



Of course it is.



It just makes the requirements easier to pick out, review, comment on

and implement if they are not buried among descriptive text.



It would help a lot if the requirements were bulleted in each section,

possibly with a leader that says something like "the requirements in

this section are: ..."



It would be even more helpful if the requirements were numbered,

as they have been in other (possibly contentious) IETF RFCs, but I

am not going to try to insist on that.



The reason why I brought this up in this case is that - since you define

(in an abstract sense) the overlay header in this text - some part of this

text should be identified with a requirement.  At some point, (either

here, or in section 3.3.1) the text should say something to the effect that

an overlay header containing "..." MUST be included.





Section 3.3.1 -



Again, the secondary general issue applies.

What do you mean?

It is a hard req to have an overlay header.



Yeah.  But the text starting with "Note that ..." is not a part of the

requirement and the requirement would be easier to pick out if this

text were in a separate paragraph.



It is probably sufficient to just break this into 2 paragraphs.





Section 3.3.1.1 -



In the descriptive text around the first requirement in this section (context field), the

example of a "local context identifier" needs to be made clearer.  I doubt that it is a

strictly local identifier as it is presumably received on the wire.  As this is an example,

it doesn't hurt to be specific.

It is defined in the framework draft.

On one hand, you are questioning introductory/descriptive text in some section and asking for more in others. This is not consistent.



I believe it is consistent.  I have no objection to descriptive text, only an

objection to burying requirements in description, or embedding description

in requirements to the extent that it may not be clear where one starts and

the other ends.



I believe the usual term for my concern is what folks call "crispness."





Not sure where the 32-bit boundary alignment soft requirement comes from.  It is a

SHOULD, which means that some justification is required if it is not done.  However,

it has been observed many times before that boundary alignment is a somewhat

artificial requirement for information received serially over a link - and this is even

more true when the number of bits in a preceding header may or may not be aligned

similarly.

Justification is given. What else would you like to see?



It is not a big deal, but it is strange to assume that bits, that are essentially

not received in 32-bit chunks over a wire or piece of glass, are somehow

assumed to be naturally aligned on a 32-bit boundary.  If you're talking

about hardware in your reference to the data-path, whether 32-bit

alignment is actually useful depends rather heavily on the hardware.



Also, 32-bit alignment in the data-path makes not much of a difference, if

the headers that come with the data are not also 32-bit aligned.



Even in software, I would point out that very little of the work I've seen in

protocols has shown much concern for 32-bit alignment.  IPv4 addresses are

known in a large number of cases to have been defined in message and

other formats to wrap 32-bit boundaries (even though they are 32 bits).



See - for example - the IPv4 subobject format defined on page 26 of

RFC 3209 (RSVP-TE: Extensions to RSVP for LSP Tunnels).



I think this is just a bit too much of a detailed requirement at this point,

and is not particularly useful in trying to analyze potential solutions.
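
The RFC 3209 example can be checked with a short sketch. It packs the IPv4 prefix subobject as laid out in that RFC (L bit plus 7-bit type, length, 4-byte address, prefix length, reserved byte) and reports where the address starts. The helper names are hypothetical, and the alignment conclusion assumes the subobject itself begins on a 32-bit boundary.

```python
import socket
import struct

# Layout per RFC 3209: L bit + 7-bit type, length, IPv4 address,
# prefix length, reserved.  Network byte order, no padding, 8 bytes total.
IPV4_SUBOBJECT = struct.Struct("!BB4sBB")


def pack_ipv4_subobject(addr: str, prefix_len: int = 32, loose: bool = False) -> bytes:
    """Build an IPv4 prefix subobject (type 1) for the given dotted-quad address."""
    type_byte = (0x80 if loose else 0x00) | 0x01
    return IPV4_SUBOBJECT.pack(type_byte, IPV4_SUBOBJECT.size,
                               socket.inet_aton(addr), prefix_len, 0)


def address_offset() -> int:
    """Byte offset of the IPv4 address within the subobject."""
    return struct.calcsize("!BB")
```

The address begins two bytes into the subobject, so the 32-bit address straddles a 32-bit word boundary even though the subobject as a whole is 8 bytes long.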



The note stating that there is "no such constraint when using a local identifier" seems

to be strange for a requirements draft.

No, it clarifies the point that scaling the ID space does not apply when using local IDs (e.g. MPLS labels)



See below...





For one thing, it seems likely that the scale requirements for local identifiers can be

derived from the complexity of the network and are therefore not non-existent.

With a sufficiently simple topology, the scaling issue is of the same order - especially

considering that a "global identifier" is context limited to a data center deployment.

You missed the general idea.



Not really.  I just did the analysis.



It is relatively straight-forward to show that the scale required for local identifiers

can be derived from network and local information such as the number of interfaces

a device has, the topological radius of a network, the number of entities that are

required to be globally recognizable as distinct, etc.



As a simple proof, consider the extreme case of a single large box connected to all

of the entities in the network via a single physical interface.  The scaling requirement

for local identifiers in this case is the same as for a corresponding "global identifier."



Increase the number of interfaces (including through the use of virtual interfaces)

and increase the radius of the network, and you reduce the scaling requirement for

local identifiers relative to global identifiers, but you don't actually eliminate the

relationship between local and global identifier scale.



Note that the use of network virtualization can be demonstrated to be related to

(as in part of) the scaling concern associated with local identifiers (i.e. - using a

virtual context to scope a local identifier decreases the scaling requirements for

that local identifier but this is because - arguably - it has now become part of

that local identifier).



The relationship is non-linear and - as a result - the effect of scaling requirements

on local identifiers is quite flat, given any network of sufficient complexity.



I just question the value of including a statement comparing local to global IDs

in the requirements draft.



Maybe it is useful.



Keep it if you want.



But separate it from the requirement it obscures.  :)







For another thing, this looks as if it was added to justify a particular solution choice.

No, an MPLS label is a valid option.



Yeah, that was obvious.  Too obvious, in fact.  :)





Section 3.3.1.2 -



In this section, normative terms appear to be used in descriptive text, making it -

for instance - unclear if a specific phrase means to indicate a soft requirement, or

describe observations about the operating environment.



For example, much of the text appears to be making the observation that it is often

possible to define a mapping from QoS, PCP, or whatever bits are used to indicate how a

frame or packet will be handled at one layer in a hierarchy of network overlays to the

corresponding bits at another layer.



This is an observation of reality, not a requirement (soft or hard) of a solution.  Maybe

the intention is that - where it is possible to define such a mapping - a solution MAY

(or probably SHOULD) describe mechanisms that may be used to cause the mapping

to occur?



This would be a more useful requirement, as either a MAY or a SHOULD.
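
As a rough illustration of the kind of mapping being discussed, here is a minimal sketch in
Python.  It is not taken from the draft: the function names are hypothetical, and the policy
(tenant PCP n maps to Class Selector code point CSn in the outer header) is only an assumed
example of one possible table.

```python
# Hypothetical sketch: carry a tenant frame's 802.1p PCP marking into the
# outer (underlay) IP header. The PCP-to-DSCP policy below is an assumption.


def pcp_to_dscp(pcp: int) -> int:
    """Map a tenant PCP value (0-7) to an underlay DSCP value.

    Assumed policy: PCP n maps to Class Selector code point CSn, whose
    DSCP value is n * 8. A real deployment could use any other table.
    """
    if not 0 <= pcp <= 7:
        raise ValueError(f"PCP out of range: {pcp}")
    return pcp * 8


def outer_traffic_class(pcp: int, ecn: int = 0) -> int:
    """Build the outer IPv4 ToS / IPv6 Traffic Class byte.

    The byte holds the 6-bit DSCP in its upper bits and the 2-bit ECN
    field in its lower bits.
    """
    if not 0 <= ecn <= 3:
        raise ValueError(f"ECN out of range: {ecn}")
    return (pcp_to_dscp(pcp) << 2) | ecn


if __name__ == "__main__":
    # Print the full assumed table: PCP, DSCP, outer traffic-class byte.
    for pcp in range(8):
        print(pcp, pcp_to_dscp(pcp), hex(outer_traffic_class(pcp)))
```

Whether a solution MAY or SHOULD describe how such a table gets applied is the open
question; the sketch only shows that the mapping itself is easy to define.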

Not sure I get your point.

The existing text explains that this is a MAY.



The section has a few examples.  I had hoped not to have to spell them each

out.  Some of them have to do with using "may" in one sentence and "MAY"

in the next (e.g. - "... applications may span ..."; "support for multiple ...

MAY be required.").



If the "may" in the first sentence were capitalized, the "MAY be required"

in the next sentence would need to be "MUST be supported."



I assume the non-capitalization in the 1st case was intentional (so "might"

would be a better word).  So, I question the meaning of "MAY be required."



What does that mean?  Does it mean that a solution MAY require it?  Does

it mean a user MAY require it?



In the next paragraph, the draft talks about a capability that an NVE MAY

have (i.e. - it MAY be able to map CoS markings between network layers).



What kind of requirement is this?  It might also have an anodized blue

aluminum front panel.  So what?  :)



The next one is especially problematic.  Tenant CoS policies MAY be defined

by Tenant administrators?  Does this mean that solutions have to support

the ability for Tenant administrators to do this?  In that case, this becomes

a hard (MUST) requirement for both solutions and implementations.



What I believe you are saying is that solutions - or implementations - might

include the ability for Tenant administrators to define CoS policies.  In fact,

this not only seems likely, but is a fact for many implementations in general.

But is it a requirement?  If so, then there are a bunch of hard requirements

that implicitly follow from it.



Under "NVE CoS", the first bullet seems to be stating a soft requirement, but is

not (IMO) doing so; it is instead making observations about capabilities any

specific implementation might have.



The 2nd bullet under "NVE CoS" looks as if it is intended to define required

behavior but does not use any key requirements level terminology.  As a

result, I cannot determine what the text is supposed to do, or how I am

supposed to read it.



Under "Underlay CoS", there are good soft requirements, properly and

clearly identified - but it would be good if the draft said something about

the implications of these requirements.  The Underlay/Core network MAY

very well do these things (use a different CoS set, or cause the CoS value

to change from one domain to another).



The implication is that implementations SHOULD support mechanisms

that allow them to adapt to this possibility.  Maybe this doesn't need to

be said.



But what are the implications for solutions, if any?







Section 3.3.2 -



The only requirement in the text before 3.3.2.1 is that IPv4 or IPv6 MUST be supported.

Is this meant to be an "or", or is the real requirement that solutions MUST support both?

It says either one MUST be supported but both SHOULD be.



So, what happens if a solution allows implementations to support either

without specifying a common minimum support requirement (either, not

really important)?



I suspect that we mean that a solution MUST (consistently) support one

or the other, and SHOULD support both.  I suspect implementations will

support what they have to support in order to sell.





Section 3.3.2.1 -



There are a number of issues with the requirements in this section.



Support for LAG/ECMP is a soft (SHOULD) requirement, yet support for header info is

stated as a hard requirement.  From an implementation perspective, this makes some

sense.  When looking to evaluate potential solutions, it does not.



Either a solution supports ECMP/LAG, or it does not.  If it does not, it hardly matters if

the solution provides header information that implementations of the solution cannot

use.

Regardless of LAG/ECMP support, an overlay is still required.



Not my point.  The relevant hard requirement is that "encapsulation MUST

result in sufficient entropy to exercise all paths through several LAG/ECMP

hops."



This comes after the requirement that "multipath over LAG and ECMP SHOULD

be supported."



Based on context, "encapsulation" in the hard requirement refers to that part

of the total encapsulation that LAG/ECMP participating nodes have access to.



The "MUST" requirement in this case seems to be a requirement that only would

apply if the SHOULD requirement is applicable.  If that is the case, they would

both need to be at the same requirement level, would they not?



Also, the requirements in the 2nd paragraph are quite dense and appear to be

fairly entangled.  I am not sure anyone reading them (other than yourself) can

be entirely certain what they are.  The last sentence in this paragraph would

be much easier to figure out if it were broken out and broken down.



Reading the last sentence in the 2nd paragraph leaves me with the confused

feeling that I am not sure what the relationship is between "overlay header"

and "overlay network", and "underlay header" and "underlay network."







In the descriptive text, the focus appears to be on providing sufficient entropy to allow

for fine-grained load balancing to exercise all paths.  I read this as the entropy must be

sufficient to allow reasonable hashing algorithms (or other load-spreading mechanisms)

to ensure that no path remains unused.
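
A minimal Python sketch of that reading follows.  Everything specific in it is an assumption
made for illustration and is not stated in the draft: the entropy is carried in a 16-bit
outer field (a UDP source port is used as the example), the ingress derives it from the
inner flow's 5-tuple, and each underlay hop hashes only that outer field.  The function
names and the use of CRC-32 are likewise hypothetical.

```python
import zlib
from collections import Counter

# Assumption: entropy is carried in a 16-bit outer field (for example a UDP
# source port), restricted here to the dynamic port range 49152-65535.
ENTROPY_BASE = 49152
ENTROPY_SPAN = 16384


def flow_entropy(src_ip: str, dst_ip: str, proto: int,
                 src_port: int, dst_port: int) -> int:
    """Derive a per-flow entropy value from the inner 5-tuple.

    The same flow always yields the same value, so packets of one flow
    stay on one path and are not reordered.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode("ascii")
    return ENTROPY_BASE + zlib.crc32(key) % ENTROPY_SPAN


def pick_path(entropy: int, n_paths: int) -> int:
    """Stand-in for one LAG/ECMP hop: hash the outer field, pick a member."""
    if n_paths < 1:
        raise ValueError("n_paths must be at least 1")
    return zlib.crc32(entropy.to_bytes(2, "big")) % n_paths


if __name__ == "__main__":
    # Count how 1000 synthetic flows spread over 4 equal-cost paths.
    usage = Counter(
        pick_path(flow_entropy("10.0.0.1", "10.0.0.2", 6, sport, 80), 4)
        for sport in range(1024, 2024)
    )
    print(sorted(usage.items()))
```

The hop's hash in `pick_path` is exactly the part a solution cannot see: a different hop
may hash differently, or combine other fields, so the spread this produces says little
about any other implementation.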



For most deployment scenarios, this is not an awfully large amount of entropy.  In fact, I

doubt there are many implementations that cannot be at least tuned to meet this very

low bar.



But this is not sufficient by itself to ensure that any particular load-spreading approach

will be efficient.



No amount of entropy in header information can guarantee reasonably efficient load

spreading across arbitrarily many load-spreading implementations - at least not without

fairly intimate knowledge of how each implementation works.



Since the algorithm used to achieve load spreading is implementation specific far more

than it is solution specific, it is hard to determine how to apply such an un-quantified

soft requirement to evaluation or selection of any solution.



Hence it is not at all clear what this section is attempting to define as requirements

of solutions.  The requirements are reasonable for implementations, but not so much for

solutions.

What distinction do you draw between implementation and solution?



It is somewhat difficult to determine what we can expect a solution to specify about

unspecified and proprietary LAG and ECMP implementations other than to say what

the implementations MUST be prepared to deal with in supporting the solution.





Section 3.3.2.2 -



I have some difficulty in figuring out why this section is not either after section 3.3.1.2, or

a subsection of section 3.3.1.2.



With the exception of the text relating to ECN marking, the text says much the same as

section 3.3.1.2.



I can see moving the DiffServ marking part to section 3.3.1.2, and leaving ECN marking

stuff here.

Fine



Section 3.3.2.3 -



The first requirement in this section starts with:



"User-definable knobs MUST be provided ..."



I strongly suspect that this is not what we want to say here.  Rather than "definable",

I believe we want to say either "adjustable" or "configurable."  These knobs would

need to be "solution-defined" or they would be un-implementable.

Fine



Section 3.3.2.3 -



For the requirement:



"When ingress replication is used, NVEs MUST track for each VNI the related tunnel

  endpoints to which it needs to replicate the frame."



I am very much uncertain as to how this is a data plane requirement.  The "tracking"

of related endpoints is not actually likely to be a data plane function.

"Track" is the wrong word. How about "maintain" instead.

Knowing which endpoints to replicate the frame to is indeed a dataplane function.



Fine, I believe.  I think what we're hunting for is that the dataplane needs

this information to be maintained by something.  That it needs this info is - I

completely agree - a dataplane requirement.





Perhaps it could be re-phrased as follows:



"When ingress replication is to be used, solutions must define mechanisms that allow

  an NVE to know over which tunnels it needs to replicate the frame."



Even with this formulation, this becomes largely a push for management or control

plane requirements to support the required mechanism.

This is not what is meant.



It is (of course) possible (possibly even likely) that I am not getting what we

mean with this text.



I am looking at this as analogous to FDB information that relates instead to

the use of ingress-replication and tunneling.  If that is the case, then I would

expect the requirements should be similar (though learning from dataplane

forwarding could be different).





What is needed by the data plane is a reasonably accurate and up-to-date mapping

of VNI to tunnels.  How this mapping is maintained is pretty much out of scope for

data plane requirements.
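
To make the split concrete, here is a minimal Python sketch of such a mapping.  The class
and method names are hypothetical and nothing here comes from the draft.  The `add` and
`remove` methods stand in for whatever control or management plane keeps the table current
(assumed, and out of scope); only `replicate` is what the data plane itself does.

```python
from collections import defaultdict


class ReplicationTable:
    """Per-VNI set of remote tunnel endpoints used for ingress replication."""

    def __init__(self) -> None:
        self._endpoints = defaultdict(set)  # VNI -> set of endpoint addresses

    def add(self, vni: int, endpoint: str) -> None:
        """Record that a tunnel endpoint participates in a VNI."""
        self._endpoints[vni].add(endpoint)

    def remove(self, vni: int, endpoint: str) -> None:
        """Forget an endpoint; drop the VNI entry once it is empty."""
        self._endpoints[vni].discard(endpoint)
        if not self._endpoints[vni]:
            del self._endpoints[vni]

    def replicate(self, vni: int, frame: bytes) -> list:
        """Return one (endpoint, frame) pair per tunnel endpoint in the VNI.

        An unknown VNI yields an empty list, i.e. nothing is sent.
        """
        return [(ep, frame) for ep in sorted(self._endpoints.get(vni, ()))]


if __name__ == "__main__":
    table = ReplicationTable()
    table.add(5000, "192.0.2.1")
    table.add(5000, "192.0.2.2")
    for endpoint, frame in table.replicate(5000, b"broadcast-frame"):
        print("send copy to", endpoint, len(frame), "bytes")
```

Seen this way, the data plane requirement is only that the lookup in `replicate` be
possible; how the entries arrive is a question for another document.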



I am pretty sure that my suggestion is far from perfect and perhaps we can work out

better words for what the real data plane requirements are.



I'm open to other suggestions.



Section 3.4.2 -



--- [SNIP] ---

For the following requirement:



"The granularity of such mappings, in both active-backup and active-active, MUST be

   unique to each tenant."



I think the word we want to use here is "distinct" (or "distinguishable") rather than

unique.

How about "specific"?



Fine with me.





Section 3.4.2.1 -



For the following requirement:



"Procedures for gateway selection to avoid triangular routing issues SHOULD be

  provided. The details of such procedures are, most likely, part of the NVO3

  Management and/or Control Plane requirements and, thus, out of scope of this

  document. However, a key requirement on the data plane of any NVO3 solution

  to avoid triangular routing is stated above, in Section 3.4.2, with respect to active-

  active load-balancing."



Initially, I read this as starting to define a data-plane requirement, realizing it was

really a control or management-plane requirement, and then finally pointing to

requirements already defined (Section 3.4.2).



After reading this again, it seemed to me that this requirement might actually be

only (or mostly?) aimed at setting up the following requirement.  Or it could have

been intended as an oblique reference to the text in 3.4.2 that deals with failure

of the active NVE GW.



In short, the actual requirement is completely unclear.  I am not even certain that

this is a data-plane requirement distinct from the following data-plane requirement.



For the soft requirement:



"an NVO3 solution SHOULD support forwarding tables that can simultaneously map a

  single egress NVE to more than one NVO3 tunnels."



This seems to be a repeat of requirements in the ECMP/LAG section where the main

difference is based on an expectation about control or management plane behavior.

Is this then a distinct data-plane requirement?
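
For reference, the data-plane part of that soft requirement could be sketched in Python as
below.  This is only an illustration under stated assumptions: the class, its method names,
the up/down flag per tunnel, and the modulo selection are all hypothetical and are not taken
from the draft.

```python
class TunnelTable:
    """Forwarding table that maps one egress NVE to several tunnels."""

    def __init__(self) -> None:
        self._tunnels = {}  # egress NVE -> {tunnel id: is_up}

    def add_tunnel(self, egress_nve: str, tunnel_id: str) -> None:
        """Install a tunnel toward an egress NVE; it starts in the up state."""
        self._tunnels.setdefault(egress_nve, {})[tunnel_id] = True

    def set_state(self, egress_nve: str, tunnel_id: str, is_up: bool) -> None:
        """Mark an installed tunnel as usable or unusable."""
        self._tunnels[egress_nve][tunnel_id] = is_up

    def select(self, egress_nve: str, flow_hash: int):
        """Pick one usable tunnel for a flow, or None if there is none.

        Selection is a simple modulo over the usable tunnels (assumed
        policy), so one flow keeps one tunnel while the set is stable.
        """
        active = sorted(
            tid for tid, up in self._tunnels.get(egress_nve, {}).items() if up
        )
        if not active:
            return None
        return active[flow_hash % len(active)]


if __name__ == "__main__":
    table = TunnelTable()
    table.add_tunnel("nve-b", "t1")
    table.add_tunnel("nve-b", "t2")
    print(table.select("nve-b", 4), table.select("nve-b", 5))
```

The table and its lookup are the data-plane piece; deciding which tunnels are installed or
marked usable is where the control or management plane expectation comes in.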

I agree that this is somewhat confusing since it does not specify a new requirement beyond what has already been defined in 3.4.2.

We will look at how to streamline this text.



Fine with me.





Section 3.6 -



Is there a real requirement (soft or hard) in this text?  The one case where "MAY" is

used, it could be replaced with "might" or "could" with no apparent effect on the

meaning of the text.

Fine

Section 3.8 -



Assuming (as I believe we must) that OAM traffic is indistinguishable from data traffic,

this is exclusively an operational requirement.



The ability to originate and terminate OAM messaging - which typically imposes a set

of requirements on data-plane implementations - does not impose any requirements

on data-plane standards (at least not any that are distinct from being a requirement

to be able to originate and terminate messaging for any other device-level function).



Same question as above: What distinction do you draw between DP standards and implementation?



Different answer.  :)



In this case, making sure we can demux OAM traffic in general seems to be much

more about building a robust implementation, than about specifying a standard

solution.



However, in this case, each solution probably needs to be looked at to make sure it

supports the full gamut of OAM messages and interactions.  Some OAM messages

are not end-to-end, and even those that are may not be actually addressed to the

MEP that needs to deal with them.



It depends.



So, I withdraw this comment.



Section 3.9 -



Are any actual requirements defined in this section and its subsections that apply to

the process of evaluating solutions?

No, these are recommendations to consider when designing DP solutions.



Ah.  Next step after the next step.  Fine, and thanks!





How does a requirement for "encapsulation choices" to "consider ... limitations of ...

implementations" translate to a quantifiable requirement of a solution?



Isn't a requirement that NVO3 encap/decap processing in software-based NVEs

make use of hardware assist provided by NICs to speed up packet processing an

implementation requirement?



Aren't the criteria listed in section 3.9.2 as much implementation issues as solution

issues?  Also, again, how do we translate "SHOULD be considered" into a quantifiable

requirement that we can use to evaluate solutions?



Thanks!

--

Eric Gray