[trill] Shepherd review of draft-ietf-trill-resilient-trees-05

Donald Eastlake <d3e3e3@gmail.com> Thu, 14 July 2016 02:26 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/trill/xJeD3LZFwR1aCUxRy7lJAyH7mTk>

I've reviewed this draft and I have one big comment and a number of
smaller comments:

Major Comment:
--------------

Backwards compatibility: The biggest problem that I see has to do with
backward compatibility. What happens if one or more RBridges in the
campus do not implement resilient trees? Or even do not implement the
Affinity sub-TLV? It seems like you need to be able to detect these
conditions and either adapt to them or refrain from using the
resilient-tree alternative distribution trees.  This is related to the
incremental deployment question. There is an Affinity sub-TLV support
bit in the TRILL-VER sub-TLV but this does not seem to be mentioned in
the document.  Note that at the beginning of 3.2.1.1 it says "it is
desirable that every RBridge independently computes Affinity Links for
a backup DT across the whole campus" which would require every RBridge
in the campus to implement resilient trees. So, while maybe some sort
of local repair can be done when the local RBridge supports it, I think
the global repair requires support throughout the campus.

It seems to me that the document needs to be changed to either
(1) make support mandatory or (2) state that you don't try any of the
resiliency features (or at least the global ones) unless the
resilience is supported by all the RBridges (or at least all the
RBridges other than singly connected leaves). In either case, there
needs to be a capability bit (probably an Extended Capability bit) to
indicate resilient trees support. And you really have to provide for
"(2)" above because it will always be possible that you have a legacy
RBridge in a campus. (Of course, as a practical matter, it will
commonly be the case that all the RBridges in a campus support exactly
the same capabilities.)

I recommend that we NOT make resilient tree support mandatory. I think
that would be a substantial change and require a new WG Last Call.
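
To make option (2) concrete, here is a rough Python sketch of the
check I have in mind. It is only an illustration: the
supports_resilient_trees field stands in for whatever capability bit
is eventually defined, and the class and function names are my own,
not anything in the draft.

```python
from dataclasses import dataclass


@dataclass
class RBridge:
    nickname: int
    supports_resilient_trees: bool  # hypothetical capability bit
    adjacency_count: int            # number of adjacent RBridges


def may_use_global_resilience(campus, exempt_single_leaves=True):
    """Return True only if every RBridge that matters advertises support.

    Singly connected leaves can optionally be exempted, since they
    never forward multi-destination traffic between other RBridges.
    """
    for rb in campus:
        if rb.supports_resilient_trees:
            continue
        if exempt_single_leaves and rb.adjacency_count <= 1:
            continue
        return False
    return True


campus = [
    RBridge(0x0001, True, 3),
    RBridge(0x0002, True, 2),
    RBridge(0x0003, False, 1),  # legacy, singly connected leaf
]
print(may_use_global_resilience(campus))
print(may_use_global_resilience(campus, exempt_single_leaves=False))
```

With the leaf exemption the legacy RBridge above does not block the
feature; without the exemption it does.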

Medium Comments:
---------------

Section 3.2 and subsections:

If you are going to provide for manually configured alternative trees
and automatically computed alternative trees and allow the root for an
alternative tree to be different from the root for the primary tree,
it seems to me that some RBridge (presumably the highest priority tree
root) needs to announce something like {primary tree nickname,
alternative tree nickname, flags} where flags could indicate manually
configured or automatically configured and have some spare bits for
future uses.
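
As an illustration only, such an announcement could be as small as
five bytes. The layout below (two 16-bit nicknames followed by one
flags byte, with the top bit meaning "manually configured") is purely
my assumption for the sake of the example; the draft defines no such
format and the names are hypothetical.

```python
import struct

FLAG_MANUAL = 0x80  # hypothetical: set when the alternative tree is manually configured

# Assumed layout: primary nickname (16 bits), alternative nickname
# (16 bits), flags (8 bits), all in network byte order.
_FORMAT = "!HHB"


def encode_alt_tree(primary, alternative, manual):
    """Pack one {primary, alternative, flags} record."""
    flags = FLAG_MANUAL if manual else 0
    return struct.pack(_FORMAT, primary, alternative, flags)


def decode_alt_tree(data):
    """Unpack a record into (primary, alternative, manual)."""
    primary, alternative, flags = struct.unpack(_FORMAT, data)
    return primary, alternative, bool(flags & FLAG_MANUAL)


record = encode_alt_tree(0x1234, 0xABCD, manual=True)
print(record.hex())
print(decode_alt_tree(record))
```

The seven unused flag bits are the "spare bits for future uses".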

I don't think it makes sense to have two alternative algorithms, as
specified in 3.2.1 and 3.2.2, for automatically computing a maximally
disjoint alternative tree when the two algorithms are so similar.
Maybe it would make sense to have two radically different algorithms
if you could point to some circumstances under which each was superior to the
other. But if you are going to have multiple algorithms, it seems like
you need a way for some designated RBridge (the highest priority tree
root?) to announce what algorithm is in use and you would need
algorithm identifiers and a registry and ...  I do not think it would
be worth it to have all that mechanism. Of the algorithms in 3.2.1 and
3.2.2, I think I favor the one in 3.2.1 a little. (The one in 3.2.2 at
least would need a maximum cost to avoid link cost overflowing and
wrapping around.) I would suggest that the increment to link cost in
3.2.1 of the sum of the cost of all links in the campus is too
conservative. I think it would be adequate for the increment to be 1
plus the highest leaf-to-leaf path cost in the primary tree.
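
For what it is worth, that smaller increment is cheap to compute. The
Python sketch below is my own illustration, not text from the draft:
it represents the primary tree as an adjacency map with link costs
(the map and the function names are hypothetical) and finds the
costliest path in the tree with the usual two-pass search.

```python
def farthest(tree, start):
    """Return (node, cost) for the node farthest from start in a weighted tree."""
    best = (start, 0)
    stack = [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if dist > best[1]:
            best = (node, dist)
        for neighbor, cost in tree[node]:
            if neighbor != parent:
                stack.append((neighbor, node, dist + cost))
    return best


def cost_increment(tree):
    """1 plus the highest leaf-to-leaf path cost in the tree."""
    any_node = next(iter(tree))
    end, _ = farthest(tree, any_node)    # pass 1: find one end of the costliest path
    _, highest = farthest(tree, end)     # pass 2: measure the path from that end
    return highest + 1


# Primary tree: A-B cost 10, B-C cost 5, B-D cost 20.
primary_tree = {
    "A": [("B", 10)],
    "B": [("A", 10), ("C", 5), ("D", 20)],
    "C": [("B", 5)],
    "D": [("B", 20)],
}
print(cost_increment(primary_tree))  # 31: the A-to-D path costs 30
```

In this small example the sum of the tree's link costs alone is 35,
and the sum over all campus links would be larger still once non-tree
links are counted.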

Section 4.1 on pruning: Maybe it is just me, but I don't quite
understand why pruning for an alternative tree should be based on the
pruned primary tree. What exactly happens when a TRILL Data packet is
sent on an alternative tree? Presumably, the "egress nickname" field
is changed from the nickname designating the primary tree to the
nickname designating the alternative tree. So, generally, the packet
will be forwarded down the alternative tree and it seems to me that it
should be sent down any branch where there are RBridges interested in
the packet VLAN/FGL further down that branch. What does the primary
tree have to do with it except in the local repair case?
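
To show the behavior I would have expected, here is a small Python
sketch of pruning done on the alternative tree alone. It is my own
simplification, not the draft's procedure: the tree is treated as
rooted and only the downward direction from the root is considered,
and the names are hypothetical.

```python
def branches_to_forward(children, interested, root):
    """Map each RBridge to the child branches that lead to an interested RBridge.

    children:   alternative tree as {rbridge: [child, ...]}
    interested: RBridges with receivers for the packet's VLAN/FGL
    """
    forward = {}

    def has_interest(node):
        # Keep a child branch only if something at or below it is interested.
        keep = [c for c in children.get(node, []) if has_interest(c)]
        forward[node] = keep
        return bool(keep) or node in interested

    has_interest(root)
    return forward


# Alternative tree rooted at R; only D has receivers in this VLAN/FGL.
alt_tree = {"R": ["A", "B"], "A": ["C", "D"], "B": ["E"]}
pruned = branches_to_forward(alt_tree, interested={"D"}, root="R")
print(pruned["R"], pruned["A"], pruned["B"])
```

Nothing in this computation refers to the primary tree, which is why I
do not see what role the pruned primary tree plays outside the local
repair case.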

Minor Comments:
---------------

Title page: "Updates: RFC 6325" You are just supposed to list the RFC
number without "RFC" before it in Updates and Obsoletes items on the
front page. I guess the change required to RFC 6325 is in Section
5.3.1 about changing the egress nickname of a TRILL Data packet in
flight. This should be mentioned in the Introduction (probably at the
end of the Introduction).

Section 1, page 5, at the top: The text sounds a bit as if the failed
link itself is being repaired. It's more like routing is being
repaired by bypassing the failed link.

Section 2.1, top of page 6: Should clarify that the Affinity Sub-TLV
is distributed in an LSP.

Section 2.1, at the end: I'm not really comfortable with saying you
"MUST NOT" declare an Affinity Link between non-adjacent RBridges.
Since such a declaration would just be ignored, it usually just means
some wasted LSP space, so I think it should be "SHOULD NOT".

Section 2.2: I think it might avoid confusion when you talk about
"removing incoming links" to say that outgoing links/adjacencies are
not removed. (The figure is clear.)

Section 5.2.1: Although it is fairly obvious, I think you should point
out that item 1 requires a very reliable and steady data packet stream.

Section 5.3: The last two sentences should be deleted. They say pretty
much the same thing as the first paragraph of Section 5.

Section 5.3.3: This walkthrough starts talking about RBridges but does
not indicate which figure the reader should look at to follow what is
being said.

Section 5.4: The precision and the limits for the configuration of Ts
should be given.

Minor updates:
    The CMT draft has been published as RFC 7783.
    RFC 7180 has been obsoleted by RFC 7780.

Thanks,
Donald
===============================
 Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
 155 Beaver Street, Milford, MA 01757 USA
 d3e3e3@gmail.com