Re: [trill] AD review of draft-ietf-trill-cmt-06

Alia Atlas <akatlas@gmail.com> Tue, 18 August 2015 17:51 UTC

In-Reply-To: <CAF4+nEEa1B30QRo08om+RkTFiuLmaEkzuh+gRdbmQ-zCCHOp+g@mail.gmail.com>
References: <CAG4d1rce6spmBWq3ONVRStQnsJptwCABePjzyrLi3g5siFgKWA@mail.gmail.com> <CAF4+nEEaC-Ws5ps_RZZFe_GR6pQa9VP70s1tFsDBeEERMcKdZQ@mail.gmail.com> <CAG4d1rewzPvLwPJcsWyKe_MnzhnnSX9MZTT5ZnJmVF_YQeV7pA@mail.gmail.com> <CAF4+nEEa1B30QRo08om+RkTFiuLmaEkzuh+gRdbmQ-zCCHOp+g@mail.gmail.com>
Date: Tue, 18 Aug 2015 13:51:19 -0400
Message-ID: <CAG4d1rcDWC_C2n0DmUEzTt8-WMsdtabVneNhkQmYQA-CceS51A@mail.gmail.com>
From: Alia Atlas <akatlas@gmail.com>
To: Donald Eastlake <d3e3e3@gmail.com>
Content-Type: multipart/alternative; boundary="001a113ce89ec56087051d9990c3"
Archived-At: <http://mailarchive.ietf.org/arch/msg/trill/rXFpZNhdnug_64pxxsN3bfZNydo>
Cc: Tissa Senevirathne <tsenevir@gmail.com>, draft-ietf-trill-cmt@ietf.org, "trill@ietf.org" <trill@ietf.org>
Subject: Re: [trill] AD review of draft-ietf-trill-cmt-06

Donald,

Those changes look good.

Are there any implications for implementations?  Does this need another
WGLC due to the changes around the AFFINITY sub-TLV?

Alia

On Tue, Aug 18, 2015 at 1:43 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:

> Hi Alia,
>
> My apologies for the delay. I believe the authors and I have good
> resolutions for those of your comments that were not fully resolved
> previously. Please see below (deleting some older text in this thread).
>
> Thanks,
> Donald
>
> PS: Although I will be somewhat in contact I'm on vacation at the
> WorldCon (www.sasquan.org) from tomorrow through next Monday.
> =============================
>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>  155 Beaver Street, Milford, MA 01757 USA
>  d3e3e3@gmail.com
>
>
> On Fri, Jul 31, 2015 at 1:24 PM, Alia Atlas <akatlas@gmail.com> wrote:
> > Hi Donald,
> >...
> > On Fri, Jul 31, 2015 at 1:01 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:
> >> Hi Alia,
> >> On Fri, Jul 31, 2015 at 11:22 AM, Alia Atlas <akatlas@gmail.com> wrote:
> >> > ...
> >>...
> >>
> >> > Major Issues:
> >> >
> >> > 1) In Sec 5.3, it says "If an RBridge RB1 advertises an Affinity sub-TLV
> >> > with an AFFINITY RECORD that's ask for nickname RBn to be its child in
> >> > any tree and RB1 is not adjacent to a real or virtual RBridge RBn,
> >> > that AFFINITY RECORD is in conflict with the campus topology and MUST
> >> > be ignored."  How does an RBridge determine the connectivity of a
> >> > virtual RBridge RBn?  I can see that a Designated RBridge announces
> >> > the pseudo-nickname for the vDRB to the other RBridges in the LAALPs
> >> > (as described in draft-ietf-trill-pseudonode-nickname) but I don't see
> >> > any specific way that the connectivity of a virtual RBridge is
> >> > described and known by the other RBridges.  What am I missing?
> >>
> >> Well, a virtual RBridge represents an edge group of RBridges that all
> >> have ports to links that are part of an active-active group of links.
> >> The RBridges in the edge group advertise in IS-IS an ID for that edge
> >> group which is unique in that TRILL network (campus) -- usually pretty
> >> easy as they just use the MC-LAG (or DRNI) ID. So all the members of
> >> the edge group can see each other in the link state and the highest
> >> priority member obtains and advertises in IS-IS the nickname for the
> >> virtual RBridge representing that group. All RBridges in the edge group
> >> are assumed to be "connected" to that virtual RBridge.
> >
> > Right - so I agree that an RBridge can announce the tuple of
> > (pseudo-rbridge id, LAALP ID) to indicate that the RBridge is
> > attached to that virtual RBridge for the particular LAALP ID.
> > Probably, it isn't even necessary to include the LAALP ID.
> >
> > However, I don't see any text in
> > draft-ietf-trill-pseudonode-nickname-04 that causes this
> > announcement to happen.  For instance, in Sec 9.2 where the handling
> > of a PN-RBv APPsub-TLV is discussed, at the end all that is done by
> > the receiver is:
> >
> > "On receipt of such a sub-TLV, if RBn is not an LAALP related edge
> > RBridge, it ignores the sub-TLV. Otherwise, if RBn is also a member
> > RBridge of the RBv identified by the list of LAALPs, it associates
> > the pseudo-nickname with the ports of these LAALPs and downloads the
> > association to data plane fast path logic."
> >
> > So it sounds like what you are saying is that
> >
> > a) RBridges announce the LAALP IDs that they are part of
> > (PN-LAALP-Membership APPsub-TLV) and the Designated RBridge
> > announces the nickname for the associated virtual RBridge.
> >
> > and then something unspecified is done that is described in Sec 9.2
> > and contradicts the claim that the sub-TLV is ignored.
> >
> > I'm not saying the technology as implemented doesn't work - just that there
> > are clearly some missing details here that need to be written down.
>
> So, the problem here is that the text in draft-ietf-trill-cmt is based
> on the older mindset that the "virtual RBridge" representing the
> active-active edge group would be a pseudonode in the IS-IS sense;
> that is, it would have a seven byte IS-IS System ID, have link state,
> appear in the topology as a separate node adjacent to all the real
> edge RBridges in the group, etc. (This original view is still reflected
> in the file name of draft-ietf-trill-pseudonode-nickname.)
>
> However, for a variety of reasons, this is no longer true. In the
> scheme in draft-ietf-trill-pseudonode-nickname, the "virtual RBridge"
> representing the edge group has a nickname but each of the real
> RBridges in the edge group advertises that nickname as one of their
> own nicknames, as well as advertising their "real" nickname.  (The base
> TRILL protocol standard RFC 6325 provides for RBridges to hold
> multiple nicknames. The original motivation for this was that, if an
> RBridge was the source and/or sink for a lot of multi-destination
> traffic, you might want multiple different least cost trees to be
> rooted at that RBridge to spread the load and, since trees are
> represented by the nickname of their root, that RBridge would need to
> be identified by multiple nicknames.)
>
> There is no need for any RBridge that is not part of the edge group to
> know what nickname is the semi-permanent "real" nickname for an
> RBridge and what nickname identifies the virtual RBridge. These
> nicknames are all advertised in the same way. Adjacencies only exist
> between real RBridges that are advertising link state notwithstanding
> that such real RBridges may sometimes advertise that one of their
> nicknames is a nickname that happens to identify a virtual RBridge.
>
> When AFFINITY is used to associate a virtual RBridge RBv with tree t
> at edge group RBridge RB1, RB1 will be advertising RBv as one of its
> nicknames so the AFFINITY advertised by RB1 will be referring to
> itself.
>
> Two changes appear to be required to bring draft-ietf-trill-cmt up to
> date with draft-ietf-trill-pseudonode-nickname:
>
> OLD
>    Each RBridge that desires to be the parent RBridge for child Rbridge
>    RBy in a multi-destination distribution tree x announces the desired
>    association using an Affinity sub-TLV. The child RBridge RBy is
>    specified by its nickname (or one of its nicknames if it holds more
>    than one).
>
> NEW
>    Each RBridge that desires to be the parent RBridge for child
>    RBridge RBy in a multi-destination distribution tree x announces
>    the desired association using an Affinity sub-TLV. The child is
>    specified by its nickname. If an RBridge RB1 advertises an AFFINITY
>    sub-TLV designating one of its own nicknames N1 as its “child” in some
>    distribution tree, the effect is that that nickname N1 is ignored
>    when constructing other distribution trees. Thus the RPF check will
>    enforce that only RB1 can use nickname N1 to do ingress/egress on
>    tree x. (This has no effect on least cost path calculations for
>    unicast traffic.)
>
> OLD
>    If an RBridge RB1 advertises an Affinity sub-TLV with an AFFINITY
>    RECORD that's ask for nickname RBn to be its child in any tree and
>    RB1 is not adjacent to a real or virtual RBridge RBn, that AFFINITY
>    RECORD is in conflict with the campus topology and MUST be ignored.
>
> NEW
>    If an RBridge RB1 advertises an Affinity sub-TLV with an AFFINITY
>    RECORD that asks for nickname RBn to be its child in any tree and
>    RB1 is neither adjacent to RBn nor does nickname RBn identify RB1
>    itself, that AFFINITY RECORD is in conflict with the campus
>    topology and MUST be ignored.
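
To check my reading of the NEW Sec 5.3 text, here is a minimal Python sketch of the receiver-side test. The class and function names, the nickname values, and the dictionary representation of the campus are all assumptions made for illustration; none of them come from the draft. The only rule it encodes is the one in the NEW text: the record is kept if RBn is one of the advertiser's own nicknames or a nickname of an adjacent RBridge, and ignored otherwise.

```python
from dataclasses import dataclass, field


@dataclass
class RBridge:
    """Hypothetical link-state view of one real RBridge."""
    system_id: str
    nicknames: set = field(default_factory=set)   # every nickname it advertises
    neighbors: set = field(default_factory=set)   # system IDs of adjacent RBridges


def affinity_record_is_valid(advertiser, child_nickname, campus):
    """Return True if the AFFINITY RECORD should be honored.

    campus maps system ID -> RBridge for every real RBridge in the link state.
    """
    # Case 1: the nickname identifies the advertiser itself (for example a
    # pseudo-nickname that it advertises alongside its "real" nickname).
    if child_nickname in advertiser.nicknames:
        return True
    # Case 2: the nickname belongs to an RBridge adjacent to the advertiser.
    for neighbor_id in advertiser.neighbors:
        neighbor = campus.get(neighbor_id)
        if neighbor is not None and child_nickname in neighbor.nicknames:
            return True
    # Otherwise the record conflicts with the campus topology.
    return False


# Illustrative campus: RB1 and RB2 share the pseudo-nickname 0x2000.
rb1 = RBridge("rb1", {0x1001, 0x2000}, {"rb2"})
rb2 = RBridge("rb2", {0x1002, 0x2000}, {"rb1"})
rb3 = RBridge("rb3", {0x1003}, set())
campus = {rb.system_id: rb for rb in (rb1, rb2, rb3)}
```

With this campus, a record from RB1 naming 0x2000 or 0x1002 would be kept, and one naming 0x1003 would be ignored because RB3 is not adjacent to RB1.
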
>
> >> > Minor Issues:
> >> > ...
> >
> >> > 3) In Sec 5.1, could you clarify which RBridges are doing the
> >> > Distribution Tree provisioning?  I'm sure it's my lack of deep
> >> > familiarity, but until I got to Sec 5.2, it wasn't at all clear to me.
> >>
> >> I'm not sure that "Distribution Tree Provisioning" is the best name
> >> for that section. In TRILL, every RBridge in the campus independently
> >> computes the same set of distribution trees for the campus and each
> >> tree reaches every RBridge (right now, this will change with
> >> multi-topology, etc.). This section is about the assignment of edge
> >> group RBridges to trees. They all know about all the trees due to how
> >> tree calculation works in TRILL and they all know about all the
> >> members of the edge group. So each member of the edge group does the
> >> calculations described in 5.1 and they will all come up with the same
> >> assignment of edge group RBridges to trees.
> >
> > Right - but this section doesn't specify that the intended behavior
> > is just for the edge group RBridges.  IMHO, that would be useful for
> > clarity.
>
> See change below that consists of adding a reference on how tree
> numbers are determined and adding "edge group" as a qualifier before
> "RBridge" in two places.
>
> OLD
>    If n >= k
>
>      Let's assume edge RBridges are sorted in numerically ascending
>      order by IS-IS SystemID such that RB1 < RB2 < RBk. Each Rbridge in
>      the numerically sorted list is assigned a monotonically increasing
>      number j such that; RB1=0, RB2=1, RBi=j and RBi+1=j+1.
>
>      Assign each tree to RBi such that tree number { (tree_number) %
>      k}+1 } is assigned to RBridge i for tree_number from 1 to n. where
>      n is the number of trees, k is the number of RBridges considered
>      for tree allocation, and ''%'' is the integer division remainder
>      operation.
>
>    If n < k
>
>      Distribution trees are assigned to RBridges RB1 to RBn, using the
>      same algorithm as n >= k case. RBridges RBn+1 to RBk do not
>      participate in active-active forwarding process on behalf of RBv.
>
> NEW
>    If n >= k
>
>      Let's assume edge RBridges are sorted in numerically ascending
>      order by IS-IS SystemID such that RB1 < RB2 < RBk. Each RBridge
>      in the numerically sorted list is assigned a monotonically
>      increasing number j such that; RB1=0, RB2=1, RBi=j and
>      RBi+1=j+1. (See Section 4.5 of [RFC6325] as modified by Section
>      3.4 of [RFC7180bis] for how tree numbers are determined.)
>
>      Assign each tree to RBi such that tree number { (tree_number) %
>      k}+1 } is assigned to edge group RBridge i for tree_number from 1
>                            ^^^^^^^^^^
>      to n. where n is the number of trees, k is the number of edge
>                                                               ^^^^
>      group RBridges considered for tree allocation, and ''%'' is the
>      ^^^^^
>      integer division remainder operation.
>
>    If n < k
>
>      Distribution trees are assigned to edge group RBridges RB1 to
>                                         ^^^^^^^^^^
>      RBn, using the same algorithm as n >= k case. RBridges RBn+1 to
>      RBk do not participate in active-active forwarding process on
>      behalf of RBv.
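
The bracketed expression in this text can be read more than one way, so here is a minimal Python sketch of one reading, to make the ambiguity concrete. The assumption is that tree number t (from 1 to n) goes to the edge group RBridge at 1-based position (t % k) + 1 in the list sorted by System ID, and that in the n < k case only the first n RBridges are candidates. The function name and the string form of the System IDs are hypothetical.

```python
def assign_trees(system_ids, n_trees):
    """Map each edge group RBridge (by System ID) to its list of tree numbers.

    Every member computes this independently from the same link state, so
    all members should arrive at the same assignment.
    """
    ordered = sorted(system_ids)              # RB1 < RB2 < ... < RBk
    assignment = {sid: [] for sid in ordered}
    # If n < k, only RB1..RBn take part; the rest get no trees.
    k = min(len(ordered), n_trees)
    if k == 0:
        return assignment
    candidates = ordered[:k]
    for tree_number in range(1, n_trees + 1):
        position = (tree_number % k) + 1      # 1-based index into the list
        assignment[candidates[position - 1]].append(tree_number)
    return assignment


members = ["0000.0000.0003", "0000.0000.0001", "0000.0000.0002"]
many_trees = assign_trees(members, 5)   # n >= k
few_trees = assign_trees(members, 2)    # n < k
```

Under this reading, with three members and five trees, RB1 takes tree 3, RB2 takes trees 1 and 4, and RB3 takes trees 2 and 5. With only two trees, RB3 gets none and so would not forward on behalf of RBv.
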
>
> >> > 4) In Sec 5.6, it says "Timer T_j SHOULD be at least < T_i/2" Do
> >> > you mean that timer T_j should be no more than T_i/2 or that
> >> > timer T_j should be no less than T_i/2.  The "<" makes this
> >> > unclear to me because the "at least" contradicts it; is it T_j <
> >> > T_i/2 or T_i/2 < T_j.
> >>
> >> I am less familiar with that provision so I'm not sure what the
> >> correct interpretation is. It should probably be clarified.
> >
> > Oh dear - if neither you nor I are certain, it definitely needs to
> > be clarified.
>
> The purpose of these timers is to minimize multi-destination packet
> loss and/or duplication. Proposed OLD and NEW text is as follows:
>
> OLD
>    RBi upon start-up, starts advertising its presence through IS-IS
>    LSPs and starts a timer T_i. Member RBridges detecting the presence
>    of RBi start a timer T_j. Timer T_j SHOULD be at least < T_i/2.
>    (Please see note below)
>
>    Upon expiry of timer T_j, member RBridges recalculate the multi-
>    destination tree assignment and advertised the related trees using
>    Affinity sub-TLV.
>
>    Upon expiry of timer T_i, RBi recalculate the multi-destination tree
>    assignment and advertises the related trees using Affinity TLV.
>
>    Note: Timers T_i and T_j are designed so as to minimize traffic down
>    time and avoid multi-destination packet duplication.
>
> NEW
>    RBi, upon start-up, advertises its presence through IS-IS LSPs and
>    starts a timer T_i. Other member RBridges of the edge group,
>    detecting the presence of RBi, start a timer T_j.
>
>    Upon expiry of timer T_j, other member RBridges recalculate the
>    multi-destination tree assignment and advertise the related trees
>    using the Affinity sub-TLV. Upon expiry of timer T_i, RBi
>    recalculates the multi-destination tree assignment and advertises
>    the related trees using the Affinity sub-TLV.
>
>    If the new RBridge in the edge group calculates trees and starts to
>    use one or more before the existing RBridges in the edge group
>    recalculate, there could be duplication of packets (for example
>    more than one edge group RBridge could decapsulate and forward a
>    multi-destination frame on links into the active-active group) or
>    loss of packets (for example due to the Reverse Path Forwarding
>    Check in the rest of the campus: if two edge group RBridges are
>    trying to forward on the same tree, those from one will be
>    discarded).  Alternatively, if the new RBridge in the edge group
>    calculates trees and starts to use one or more after the existing
>    RBridges recalculate, there could be loss of data due to frames
>    arriving at the new RBridge being black holed. Timers T_i and T_j
>    should be initialized to values designed to minimize these problems
>    keeping in mind that, in general, duplicating is a more serious
>    problem than dropping. It is RECOMMENDED that T_j be less than T_i
>    and a reasonable default is 1/2 of T_i.
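
The timer relationship in the NEW text is simple enough to state as code, which may help confirm that "T_j less than T_i, default T_i/2" is the intended reading. This is only a sketch: the function names and the use of seconds as the unit are assumptions, and the draft text quoted here gives no concrete value for T_i.

```python
def default_member_timer(t_i):
    """Default T_j for existing edge group members: half of the new RBridge's T_i."""
    if t_i <= 0:
        raise ValueError("T_i must be positive")
    return t_i / 2


def timers_follow_recommendation(t_i, t_j):
    """True if T_j is positive and strictly less than T_i, as RECOMMENDED."""
    return 0 < t_j < t_i


# Hypothetical value, in seconds, chosen only to exercise the functions.
example_t_i = 10.0
example_t_j = default_member_timer(example_t_i)
```

Keeping T_j below T_i means the existing members recalculate and re-advertise before the new RBridge starts using its trees, which favors a brief drop over duplication.
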
>
>
> I hope the above resolutions of your comments are satisfactory.
>
> > Thanks,
> > Alia
> >
> >> Thanks,
> >> Donald
> >> =============================
> >>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
> >>  155 Beaver Street, Milford, MA 01757 USA
> >>  d3e3e3@gmail.com
> >>
> >> > Thanks again,
> >> > Alia
>