Re: [trill] AD review of draft-ietf-trill-cmt-06

Donald Eastlake <d3e3e3@gmail.com> Tue, 18 August 2015 18:30 UTC

In-Reply-To: <CAG4d1rcDWC_C2n0DmUEzTt8-WMsdtabVneNhkQmYQA-CceS51A@mail.gmail.com>
References: <CAG4d1rce6spmBWq3ONVRStQnsJptwCABePjzyrLi3g5siFgKWA@mail.gmail.com> <CAF4+nEEaC-Ws5ps_RZZFe_GR6pQa9VP70s1tFsDBeEERMcKdZQ@mail.gmail.com> <CAG4d1rewzPvLwPJcsWyKe_MnzhnnSX9MZTT5ZnJmVF_YQeV7pA@mail.gmail.com> <CAF4+nEEa1B30QRo08om+RkTFiuLmaEkzuh+gRdbmQ-zCCHOp+g@mail.gmail.com> <CAG4d1rcDWC_C2n0DmUEzTt8-WMsdtabVneNhkQmYQA-CceS51A@mail.gmail.com>
From: Donald Eastlake <d3e3e3@gmail.com>
Date: Tue, 18 Aug 2015 14:30:10 -0400
Message-ID: <CAF4+nEGnHMO3jKRU95FuFJZ=8JpJG+pJQWPf00+dpH=xS5hu0g@mail.gmail.com>
To: Alia Atlas <akatlas@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <http://mailarchive.ietf.org/arch/msg/trill/IqST6xflBLflRoB9GP5v9QsTLYw>
Cc: Tissa Senevirathne <tsenevir@gmail.com>, draft-ietf-trill-cmt@ietf.org, "trill@ietf.org" <trill@ietf.org>
Subject: Re: [trill] AD review of draft-ietf-trill-cmt-06

Hi Alia,

On Tue, Aug 18, 2015 at 1:51 PM, Alia Atlas <akatlas@gmail.com> wrote:
> Donald,
>
> Those changes look good.

Thanks.

> Are there any implications for implementations?  Does this need another WGLC
> due to the changes around the AFFINITY sub-TLV?

This specifies what the AFFINITY sub-TLV means when it is
self-referential, that is, it points at a nickname of the advertising
RBridge. This was not previously specified so I think of it more as
filling in a blank area than a change. It is up to the chairs if they
feel a brief WGLC is needed. Our earlier discussions of this have been
on the WG mailing list.

It is unlikely that an implementation based just on the current
trill-cmt text would handle this correctly.

Thanks,
Donald
=============================
 Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
 155 Beaver Street, Milford, MA 01757 USA
 d3e3e3@gmail.com

> Alia
>
> On Tue, Aug 18, 2015 at 1:43 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:
>>
>> Hi Alia,
>>
>> My apologies for the delay. I believe the authors and I have good
>> resolutions for those of your comments that were not fully resolved
>> previously. Please see below (deleting some older text in this thread).
>>
>> Thanks,
>> Donald
>>
>> PS: Although I will be somewhat in contact I'm on vacation at the
>> WorldCon (www.sasquan.org) from tomorrow through next Monday.
>> =============================
>>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>>  155 Beaver Street, Milford, MA 01757 USA
>>  d3e3e3@gmail.com
>>
>>
>> On Fri, Jul 31, 2015 at 1:24 PM, Alia Atlas <akatlas@gmail.com> wrote:
>> > Hi Donald,
>> >...
>> > On Fri, Jul 31, 2015 at 1:01 PM, Donald Eastlake <d3e3e3@gmail.com>
>> > wrote:
>> >> Hi Alia,
>> >> On Fri, Jul 31, 2015 at 11:22 AM, Alia Atlas <akatlas@gmail.com> wrote:
>> >> > ...
>> >>...
>> >>
>> >> > Major Issues:
>> >> >
>> >> > 1) In Sec 5.3, it says "If an RBridge RB1 advertises an Affinity
>> >> > sub-TLV
>> >> > with an AFFINITY RECORD that's ask for nickname RBn to be its child
>> >> > in
>> >> > any tree and RB1 is not adjacent to a real or virtual RBridge RBn,
>> >> > that AFFINITY RECORD is in conflict with the campus topology and MUST
>> >> > be ignored."  How does an RBridge determine the connectivity of a
>> >> > virtual RBridge RBn?  I can see that a Designated RBridge announces
>> >> > the pseudo-nickname for the vDRB to the other RBridges in the LAALPs
>> >> > (as described in draft-ietf-trill-pseudonode-nickname) but I don't
>> >> > see
>> >> > any specific way that the connectivity of a virtual RBridge is
>> >> > described and known by the other RBridges.  What am I missing?
>> >>
>> >> Well, a virtual RBridge represents an edge group of RBridges that all
>> >> have ports to links that are part of an active-active group of links.
>> >> The RBridges in the edge group advertise in IS-IS an ID for that edge
>> >> group which is unique in that TRILL network (campus) -- usually pretty
>> >> easy as they just use the MC-LAG (or DRNI) ID. So all the members of
>> >> the edge group can see each other in the link state and the highest
>> >> priority member obtains and advertises in IS-IS the nickname for the
>> >> virtual RBridge representing that group. All RBridges in the edge group
>> >> are assumed to be "connected" to that virtual RBridge.
>> >
>> > Right - so I agree that an RBridge can announce the tuple of
>> > (pseudo-rbridge id, LAALP ID) to indicate that the RBridge is
>> > attached to that virtual RBridge for the particular LAALP ID.
>> > Probably, it isn't even necessary to include the LAALP ID.
>> >
>> > However, I don't see any text in
>> > draft-ietf-trill-pseudonode-nickname-04 that causes this
>> > announcement to happen.  For instance, in Sec 9.2 where the handling
>> > of a PN-RBv APPsub-TLV is discussed, at the end all that is done by
>> > the receiver is:
>> >
>> > "On receipt of such a sub-TLV, if RBn is not an LAALP related edge
>> > RBridge, it ignores the sub-TLV. Otherwise, if RBn is also a member
>> > RBridge of the RBv identified by the list of LAALPs, it associates
>> > the pseudo-nickname with the ports of these LAALPs and downloads the
>> > association to data plane fast path logic."
>> >
>> > So it sounds like what you are saying is that
>> >
>> > a) RBridges announce the LAALP IDs that they are part of
>> > (PN-LAALP-Membership APPsub-TLV) and the Designated RBridge
>> > announces the nickname for the associated virtual RBridge.
>> >
>> > and then something unspecified is done that is described in Sec 9.2
>> > and contradicts the claim that the sub-TLV is ignored.
>> >
>> > I'm not saying the technology as implemented doesn't work - just that
>> > there
>> > are clearly some missing details here that need to be written down.
>>
>> So, the problem here is that the text in draft-ietf-trill-cmt is based
>> on the older mindset that the "virtual RBridge" representing the
>> active-active edge group would be a pseudonode in the IS-IS sense;
>> that is, it would have a seven byte IS-IS System ID, have link state,
>> appear in the topology as a separate node adjacent to all the real
>> edge RBridges in the group, etc. (This original view is still reflected
>> in the file name of draft-ietf-trill-pseudonode-nickname.)
>>
>> However, for a variety of reasons, this is no longer true. In the
>> scheme in draft-ietf-trill-pseudonode-nickname, the "virtual RBridge"
>> representing the edge group has a nickname but each of the real
>> RBridges in the edge group advertises that nickname as one of their
>> own nicknames, as well as advertising their "real" nickname.  (The base
>> TRILL protocol standard RFC 6325 provides for RBridges to hold
>> multiple nicknames. The original motivation for this was that, if an
>> RBridge was the source and/or sink for a lot of multi-destination
>> traffic, you might want multiple different least cost trees to be
>> rooted at that RBridge to spread the load and, since trees are
>> represented by the nickname of their root, that RBridge would need to
>> be identified by multiple nicknames.)
>>
>> There is no need for any RBridge that is not part of the edge group to
>> know what nickname is the semi-permanent "real" nickname for an
>> RBridge and what nickname identifies the virtual RBridge. These
>> nicknames are all advertised in the same way. Adjacencies only exist
>> between real RBridges that are advertising link state notwithstanding
>> that such real RBridges may sometimes advertise that one of their
>> nicknames is a nickname that happens to identify a virtual RBridge.
>>
>> When AFFINITY is used to associate a virtual RBridge RBv with tree t
>> at edge group RBridge RB1, RB1 will be advertising RBv as one of its
>> nicknames so the AFFINITY advertised by RB1 will be referring to
>> itself.
>>
>> Two changes appear to be required to bring draft-ietf-trill-cmt up to
>> date with draft-ietf-trill-pseudonode-nickname:
>>
>> OLD
>>    Each RBridge that desires to be the parent RBridge for child Rbridge
>>    RBy in a multi-destination distribution tree x announces the desired
>>    association using an Affinity sub-TLV. The child RBridge RBy is
>>    specified by its nickname (or one of its nicknames if it holds more
>>    than one).
>>
>> NEW
>>    Each RBridge that desires to be the parent RBridge for child
>>    RBridge RBy in a multi-destination distribution tree x announces
>>    the desired association using an Affinity sub-TLV. The child is
>>    specified by its nickname. If an RBridge RB1 advertises an AFFINITY
>>    sub-TLV designating one of its own nicknames N1 as its “child” in some
>>    distribution tree, the effect is that that nickname N1 is ignored
>>    when constructing other distribution trees. Thus the RPF check will
>>    enforce that only RB1 can use nickname N1 to do ingress/egress on
>>    tree x. (This has no effect on least cost path calculations for
>>    unicast traffic.)
>>
>> OLD
>>    If an RBridge RB1 advertises an Affinity sub-TLV with an AFFINITY
>>    RECORD that's ask for nickname RBn to be its child in any tree and
>>    RB1 is not adjacent to a real or virtual RBridge RBn, that AFFINITY
>>    RECORD is in conflict with the campus topology and MUST be ignored.
>>
>> NEW
>>    If an RBridge RB1 advertises an Affinity sub-TLV with an AFFINITY
>>    RECORD that asks for nickname RBn to be its child in any tree and
>>    RB1 is neither adjacent to RBn nor does nickname RBn identify RB1
>>    itself, that AFFINITY RECORD is in conflict with the campus
>>    topology and MUST be ignored.
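As an illustration only (this is not text from either draft), the NEW
validity rule above could be sketched in Python roughly as follows. The
RBridge class, its fields, and the function name are hypothetical; the
sketch assumes each RBridge is known by the set of nicknames it
advertises and the set of real RBridges it is adjacent to in the link
state.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RBridge:
    """Hypothetical link-state view of one real RBridge."""
    system_id: str
    nicknames: Set[int] = field(default_factory=set)   # all nicknames it advertises
    neighbors: Set[str] = field(default_factory=set)   # system IDs of adjacent RBridges


def affinity_record_is_valid(advertiser: RBridge,
                             child_nickname: int,
                             campus: Dict[str, RBridge]) -> bool:
    """Return False when the AFFINITY RECORD conflicts with the topology.

    The record is kept if the child nickname identifies the advertiser
    itself (the self-referential case) or is held by an RBridge adjacent
    to the advertiser; otherwise it would be ignored.
    """
    # Self-referential case: the nickname is one of the advertiser's own.
    if child_nickname in advertiser.nicknames:
        return True
    # Ordinary case: the nickname is held by an adjacent real RBridge.
    for neighbor_id in advertiser.neighbors:
        neighbor = campus.get(neighbor_id)
        if neighbor is not None and child_nickname in neighbor.nicknames:
            return True
    return False
```

Note that a virtual RBridge nickname held by RB1 passes the first test
without any adjacency to a separate node, which is the case the OLD
text did not cover.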
>>
>> >> > Minor Issues:
>> >> > ...
>> >
>> >> > 3) In Sec 5.1, could you clarify which RBridges are doing the
>> >> > Distribution Tree provisioning?  I'm sure it's my lack of deep
>> >> > familiarity, but until I got to Sec 5.2, it wasn't at all clear to
>> >> > me.
>> >>
>> >> I'm not sure that "Distribution Tree Provisioning" is the best name
>> >> for that section. In TRILL, every RBridge in the campus independently
>> >> computes the same set of distribution trees for the campus and each
>> >> tree reaches every RBridge (right now; this will change with
>> >> multi-topology, etc.). This section is about the assignment of edge
>> >> group RBridges to trees. They all know about all the trees due to how
>> >> tree calculation works in TRILL and they all know about all the
>> >> members of the edge group. So each member of the edge group does the
>> >> calculations described in 5.1 and they will all come up with the same
>> >> assignment of edge group RBridges to trees.
>> >
>> > Right - but this section doesn't specify that the intended behavior
>> > is just for the edge group RBridges.  IMHO, that would be useful for
>> > clarity.
>>
>> See change below that consists of adding a reference on how tree
>> numbers are determined and adding "edge group" as a qualifier before
>> "RBridge" in two places.
>>
>> OLD
>>    If n >= k
>>
>>      Let's assume edge RBridges are sorted in numerically ascending
>>      order by IS-IS SystemID such that RB1 < RB2 < RBk. Each Rbridge in
>>      the numerically sorted list is assigned a monotonically increasing
>>      number j such that; RB1=0, RB2=1, RBi=j and RBi+1=j+1.
>>
>>      Assign each tree to RBi such that tree number { (tree_number) %
>>      k}+1 } is assigned to RBridge i for tree_number from 1 to n. where
>>      n is the number of trees, k is the number of RBridges considered
>>      for tree allocation, and ''%'' is the integer division remainder
>>      operation.
>>
>>    If n < k
>>
>>      Distribution trees are assigned to RBridges RB1 to RBn, using the
>>      same algorithm as n >= k case. RBridges RBn+1 to RBk do not
>>      participate in active-active forwarding process on behalf of RBv.
>>
>> NEW
>>    If n >= k
>>
>>      Let's assume edge RBridges are sorted in numerically ascending
>>      order by IS-IS SystemID such that RB1 < RB2 < RBk. Each RBridge
>>      in the numerically sorted list is assigned a monotonically
>>      increasing number j such that; RB1=0, RB2=1, RBi=j and
>>      RBi+1=j+1. (See Section 4.5 of [RFC6325] as modified by Section
>>      3.4 of [RFC7180bis] for how tree numbers are determined.)
>>
>>      Assign each tree to RBi such that tree number ( ((tree_number) %
>>      k)+1 ) is assigned to edge group RBridge i for tree_number from 1
>>                            ^^^^^^^^^^
>>      to n, where n is the number of trees, k is the number of edge
>>                                                               ^^^^
>>      group RBridges considered for tree allocation, and ''%'' is the
>>      ^^^^^
>>      integer division remainder operation.
>>
>>    If n < k
>>
>>      Distribution trees are assigned to edge group RBridges RB1 to
>>                                         ^^^^^^^^^^
>>      RBn, using the same algorithm as n >= k case. RBridges RBn+1 to
>>      RBk do not participate in active-active forwarding process on
>>      behalf of RBv.
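A rough Python sketch of one reading of the assignment rule above, for
illustration only. It assumes System IDs can be compared as integers,
that tree tree_number goes to the RBridge at position ((tree_number %
k) + 1) in the sorted list, and that in the n < k case only the first n
RBridges are considered (so k is replaced by n). The function name and
the return format are hypothetical.

```python
from typing import Dict, Iterable, List


def assign_trees(system_ids: Iterable[int], n_trees: int) -> Dict[int, List[int]]:
    """Map each edge group RBridge (by System ID) to its tree numbers."""
    # Sort edge group RBridges in numerically ascending System ID order:
    # ordered[0] is RB1, ordered[1] is RB2, and so on.
    ordered = sorted(system_ids)
    assignment: Dict[int, List[int]] = {sid: [] for sid in ordered}

    # With fewer trees than RBridges, only RB1..RBn take part.
    k = min(len(ordered), n_trees)
    if k == 0:
        return assignment

    for tree_number in range(1, n_trees + 1):
        i = (tree_number % k) + 1          # 1-based position, RB1..RBk
        assignment[ordered[i - 1]].append(tree_number)
    return assignment
```

Because every edge group member sees the same link state, each one
running this calculation independently would arrive at the same
mapping. RBridges left with an empty list correspond to RBn+1 to RBk,
which do not forward on behalf of RBv.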
>>
>> >> > 4) In Sec 5.6, it says "Timer T_j SHOULD be at least < T_i/2". Do
>> >> > you mean that timer T_j should be no more than T_i/2 or that
>> >> > timer T_j should be no less than T_i/2?  The "<" makes this
>> >> > unclear to me because the "at least" contradicts it; is it T_j <
>> >> > T_i/2 or T_i/2 < T_j?
>> >>
>> >> I am less familiar with that provision so I'm not sure what the
>> >> correct interpretation is. It should probably be clarified.
>> >
>> > Oh dear - if neither you nor I are certain, it definitely needs to
>> > be clarified.
>>
>> The purpose of these timers is to minimize multi-destination packet
>> loss and/or duplication. Proposed OLD and NEW text is as follows:
>>
>> OLD
>>    RBi upon start-up, starts advertising its presence through IS-IS
>>    LSPs and starts a timer T_i. Member RBridges detecting the presence
>>    of RBi start a timer T_j. Timer T_j SHOULD be at least < T_i/2.
>>    (Please see note below)
>>
>>    Upon expiry of timer T_j, member RBridges recalculate the multi-
>>    destination tree assignment and advertised the related trees using
>>    Affinity sub-TLV.
>>
>>    Upon expiry of timer T_i, RBi recalculate the multi-destination tree
>>    assignment and advertises the related trees using Affinity TLV.
>>
>>    Note: Timers T_i and T_j are designed so as to minimize traffic down
>>    time and avoid multi-destination packet duplication.
>>
>> NEW
>>    RBi, upon start-up, advertises its presence through IS-IS LSPs and
>>    starts a timer T_i. Other member RBridges of the edge group,
>>    detecting the presence of RBi, start a timer T_j.
>>
>>    Upon expiry of timer T_j, other member RBridges recalculate the
>>    multi-destination tree assignment and advertise the related trees
>>    using Affinity sub-TLV. Upon expiry of timer T_i, RBi recalculates
>>    the multi-destination tree assignment and advertises the related
>>    trees using Affinity sub-TLV.
>>
>>    If the new RBridge in the edge group calculates trees and starts to
>>    use one or more before the existing RBridges in the edge group
>>    recalculate, there could be duplication of packets (for example,
>>    more than one edge group RBridge could decapsulate and forward a
>>    multi-destination frame on links into the active-active group) or
>>    loss of packets (for example, due to the Reverse Path Forwarding
>>    Check in the rest of the campus, if two edge group RBridges are
>>    trying to forward on the same tree, those from one will be
>>    discarded).  Alternatively, if the new RBridge in the edge group
>>    calculates trees and starts to use one or more after the existing
>>    RBridges recalculate, there could be loss of data due to frames
>>    arriving at the new RBridge being black holed. Timers T_i and T_j
>>    should be initialized to values designed to minimize these problems
>>    keeping in mind that, in general, duplicating is a more serious
>>    problem than dropping. It is RECOMMENDED that T_j be less than T_i
>>    and a reasonable default is 1/2 of T_i.
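To make the timer relationship concrete, here is a small Python sketch,
offered as an illustration rather than as part of the proposed text.
The function names, the use of seconds, and the event labels are
hypothetical; the only facts taken from the NEW text are that T_j is
RECOMMENDED to be less than T_i and that T_i/2 is a reasonable default.

```python
from typing import List, Optional, Tuple


def follows_recommendation(t_i: float, t_j: float) -> bool:
    """True when T_j is positive and less than T_i, as RECOMMENDED."""
    return 0 < t_j < t_i


def recalculation_schedule(t_i: float,
                           t_j: Optional[float] = None) -> List[Tuple[float, str]]:
    """Return (time, event) pairs in the order the timers expire.

    Times are measured from the moment RBi starts advertising its
    presence; the propagation delay of the LSPs is ignored here.
    """
    if t_i <= 0:
        raise ValueError("T_i must be positive")
    if t_j is None:
        t_j = t_i / 2              # default suggested in the NEW text
    events = [
        (t_j, "existing members recalculate and advertise"),
        (t_i, "new member RBi recalculates and advertises"),
    ]
    return sorted(events)
```

With T_j < T_i the existing members give up the affected trees before
RBi starts using them, which trades a short window of possible loss
for avoiding duplication.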
>>
>>
>> I hope the above resolutions of your comments are satisfactory.
>>
>> > Thanks,
>> > Alia
>> >
>> >> Thanks,
>> >> Donald
>> >> =============================
>> >>  Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
>> >>  155 Beaver Street, Milford, MA 01757 USA
>> >>  d3e3e3@gmail.com
>> >>
>> >> > Thanks again,
>> >> > Alia
>
>