Re: [Rift] Routing directorate early review of draft-ietf-rift-rift

Jonathan Hardwick <Jonathan.Hardwick@metaswitch.com> Mon, 04 November 2019 12:40 UTC

From: Jonathan Hardwick <Jonathan.Hardwick@metaswitch.com>
To: Tony Przygienda <tonysietf@gmail.com>
CC: "rift-chairs@ietf.org" <rift-chairs@ietf.org>, "draft-ietf-rift-rift.all@ietf.org" <draft-ietf-rift-rift.all@ietf.org>, "rtg-dir@ietf.org" <rtg-dir@ietf.org>, =?utf-8?B?THVjIEFuZHLDqSBCdXJkZXQ=?= <laburdet.ietf@gmail.com>, Min Ye <amy.yemin@huawei.com>, "rift@ietf.org" <rift@ietf.org>
Date: Mon, 4 Nov 2019 12:40:12 +0000
Message-ID: <BL0PR02MB48684435784A92180AEE2F87847F0@BL0PR02MB4868.namprd02.prod.outlook.com>
References: <BL0PR02MB48689FA2D6B7C255DF11045D84630@BL0PR02MB4868.namprd02.prod.outlook.com> <CA+wi2hO=rZ2mbX3ZJVgn9cSvfbot29W+MNnunysPhPv+3Mxykw@mail.gmail.com>
In-Reply-To: <CA+wi2hO=rZ2mbX3ZJVgn9cSvfbot29W+MNnunysPhPv+3Mxykw@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/WVGd3tzSSugfw_8mcXZlT-8eB8k>
Subject: Re: [Rift] Routing directorate early review of draft-ietf-rift-rift

Tony, many thanks for your reply – please see [JEH] below.
Jon

From: Tony Przygienda <tonysietf@gmail.com>
Sent: 04 November 2019 01:01
To: Jonathan Hardwick <Jonathan.Hardwick@metaswitch.com>
Cc: rift-wg-chairs@ietf.org; draft-ietf-rift-rift.all@ietf.org; rtg-dir@ietf.org; Luc André Burdet <laburdet.ietf@gmail.com>; Min Ye <amy.yemin@huawei.com>; rift@ietf.org
Subject: Re: [Rift] Routing directorate early review of draft-ietf-rift-rift

Jonathan, thanks for your review, responses inline

On Thu, Oct 31, 2019 at 11:01 AM Jonathan Hardwick <Jonathan.Hardwick=40metaswitch.com@dmarc.ietf.org> wrote:
Hello

I have been selected to do a Routing Directorate “early review” of this draft:
https://datatracker.ietf.org/doc/draft-ietf-rift-rift/

The routing directorate will, on request from the working group chair, perform an “early” review of a draft before it is submitted for publication to the IESG. The early review can be performed at any time during the draft’s lifetime as a working group document. The purpose of the early review depends on the stage that the document has reached.  As this document has advanced to working group last call, my focus for the review was to determine whether the document is ready to be published. Please consider my comments along with the other working group last call comments.

For more information about the Routing Directorate, please see http://trac.tools.ietf.org/area/rtg/trac/wiki/RtgDir

Document: draft-ietf-rift-rift
Reviewer: Jon Hardwick
Review Date: 31 Oct 2019
Intended Status: Standards Track

Summary
Thanks for writing this document.  It is a very interesting approach and I really enjoyed getting to grips with the ideas presented in the draft!

thanks, quite a lot of work

Unfortunately, I have some concerns about the document and think it needs more work before being submitted to the IESG.  The problem is that I found the document hard to read, for several reasons.

  *   It is very light in its use of normative RFC-2119 style language.  An implementer would have to fill in quite a few gaps and/or make assumptions about various passages.
I will address in specifics the sections you raised inline.

Otherwise, in meta terms, as to the question of "is this specification being precise enough?" I can quote only what I wrote to Robert Sparks already:

"we have two interoperable implementations since a bit, one completely open source which has been produced based on the spec. It was in fact open source work that helped to refine the document content to make sure we can have an implementation produced based on the text without further "guessing things". As example the LIE FSM has been implemented initially in open source without consulting authors of the spec and interoperat'ed without a single defect (but discovered a protocol underspecification in case of misconfiguration that was subsequently added). Please refer to IETF proceedings for the according presentations if necessary. We have a third implemenation progressing now where all questions the implementor asked so far could be answered by pointing directly @  the specification as written. This seems to answer to me the "suspicion of specification maybe not being good enough to implement" as an objective measuring stick as far I can imagine one.
"

We are in IETF here where "rough consensus and running code" was the recipe for success vs. much heavier handed organizations like OSI and I think in this philosophy the spec, if anything, is possibly overspecified already ;-) The core pieces that bear no slips like flooding and adjacency formation are very precisely written including FSMs.

[JEH] Sure.  My comments were intended to help improve the use of normative language and the delineation between normative passages and informative ones.



  *   The definition of the protocol and some of the normative behaviour is deferred to the appendices, whereas I would expect to encounter it early on in the text, with an in-line discussion of the purposes of the messages and fields.

Ok, seems like the second directorate reviewer prefers the appendices to be pulled into the document. Let me do that then.

[JEH] My apologies – it is unfortunate when two different reviewers give contradictory opinions!  You should of course weigh my opinion with everyone else’s.


  *   It sometimes refers to concepts or terms that are either not defined or have not yet been introduced to the reader, suggesting an ordering issue within the text.

I think that the document needs to be refactored somewhat to solve the ordering issues, use more normative language, eliminate any text that is not actually relevant to the implementation and deployment of the protocol, and pull together the normative definition of the protocol into a contiguous block early on in the document.

further inline


The other issue is that, because the document is large and I found it rather hard going, I did not have time to do a thorough review beyond section 5.3.  I’d therefore have to recommend another directorate review once we have concluded on the issues I’m raising below.

ok, obviously as much is written as we expect is necessary to "clearly" spec out the protocol. The document is more than simply a dry prescriptive normative text though, since very early in the workgroup sessions the input of many people was that they would prefer some more "narrative" explanation of "what" and "why" to be inserted instead of purely the algorithms. We tried to find a balance but obviously opinions will always vary between "this is too chatty and should be just a dry normative" and "this does not explain WHY that would work and WHY it has been designed that way". Based on Robert Sparks' review I will try to simplify the language and cut out some superfluous text he pointed out or that I find. We'll see where we end.

[JEH] Thanks.  As it happens I prefer documents to have informative passages to help me understand the normative ones, provided they give me enough context to understand them and they are sufficiently relevant.  My comments were targeted to help improve the context & relevance.  I suggest a subsequent RtgDir review only because I was not able to apply as much diligence to the later sections of the document as I would have liked.  I will leave it to the WG if they want to action this.


Details
Here are comments on the sections that I was able to review in detail before I ran out of time.

Abstract
Is it possible to reformat this as a list of items on multiple lines? It would read more clearly.

yes


Section 2
"an optimal approach does not seem however": this appears to be a value judgment rather than consensus opinion, appearing as it does without citation, and may be perceived as treading on the toes of other standardization efforts currently in progress at the IETF. I suggest you simply state the facts: "RIFT approaches this problem using a mixture of..."

done, sure.


Section 2.1
The form of words in the Requirements Language boilerplate has changed recently - see RFC 8174.

thanks, corrected


Section 3.1
ZTP - expand acronym on first use.

yes, ZTP added. glossary will be rearranged based on other reviewers' input.

There is potential for confusion between N-TIE and Node TIE! I'd prefer "North TIE" for the former.
An example of confusion: is the "South Node TIE" referred to in the definition of "South Reflection" the same as the S-TIE referred to in the definition of "TIE"?
"The document sometimes calls them flood leaders as well." But it would be better if you just used one term.

OK, I expand N- and S- to North- and South- everywhere in the document


Section 4
Personally I could live without this section
Merge PEND1 with NONREQx (or explain the distinction)

thanks, there were multiple discussions pro/cons on mike/list about this section and suggestions along the lines to split it out into a different document (but standardizing requirement drafts went out of fashion recently ;-) or drop it. I'm dropping it based on your input and others desiring to shorten the document.


Section 5.1.3 - 5.1.5
This discussion is not possible to follow properly until you have been introduced to positive & negative disaggregation and southern reflection.  As such I wonder if it really belongs in a section called "overview".

Jonathan, well, section 5.1.3 "fallen leaf" (4 now given requirements is removed) _is_ the overview section. Southern reflection is defined in the glossary already and the "negative disaggregation" is a mechanism introduced to address the "fallen leaf problem" later and obviously the problem itself has to be explained & introduced first. Negative disaggregation is arguably (beside flooding scopes) the most complex part of the spec and we spent lots of time and effort (especially Pascal) with multiple rewrites to give the narrative describing the CLOS inherent problem. Moreover we didn't want to mix it up with RIFT specific mechanisms since the "fallen leaf" problem exists in multi-plane CLOS independent of any protocol and BTW, I never saw it explained as clearly as Pascal did in the multi-plane introduction section. Also, we clearly state in the section that if someone builds a single plane CLOS the section can be disregarded to simplify the reading of the spec for many people.

[JEH] Thanks. Firstly, Pascal is to be congratulated on the text describing multi-plane topologies. I had no problem getting to grips with them with the help of his text and some Lego models that it inspired me to build :-)  I have re-read these sections just now and I do now find them easier to follow – having already read the relevant parts of the later spec.  On the first read-through I think I was troubled by too many questions: What do they mean by “positive” and “negative” in the context of disaggregation?  What do they mean by “transitive”?  I have been told what southern reflection is, but what relevant information does it provide and how is it useful?  In hindsight these were all guessable but I found these concepts a barrier to my understanding.  If you have the stomach for another iteration of these sections, I would request some additional explanation to be included.



Section 5.2.2

   A node configured with "undefined" PoD membership MUST, after
   building first northbound three way adjacencies to a node being in a
   defined PoD, advertise that PoD as part of its LIEs.  In case that
   adjacency is lost, from all available northbound three way
   adjacencies the node with the highest System ID and defined PoD is
   chosen.

It seems odd that the choice of advertised pod is at first non-deterministic (race to the first adjacency) and then, only if this initial adjacency is lost, the choice of pod becomes deterministic. Why not make it deterministic the whole time?

The first adjacency is simply used to speed things up, since otherwise how long do you wait until you have all northbound adjacencies?  Observe that level ZTP will possibly drop adjacencies while it's converging so the consequent set will refine the PoD as well, i.e. the ZTP is guaranteed to get the node to the maximum available level @ which point in time the northbound available adjacencies will determine the PoD. Obviously the adjacencies can disagree about the PoD and such a scenario can be used by an implementation to report miscablings. We talk quickly about miscabling detection in the spec since it's such a desirable property _of an implementation_ but it's not necessary for correct protocol operation so we don't make anything normative except disallowing adjacency forming across PoDs if defined. Since configuring and converging PoDs is optional we allow even to disregard this rule on adjacency formation.

[JEH] Thanks – makes sense. I had missed that ZTP can drop adjacencies when I wrote this comment.
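
For readers following the thread, the deterministic half of the PoD rule quoted from section 5.2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Adjacency` record, the function name and the use of 0 for an "undefined" PoD are hypothetical and are not taken from the draft's schema.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

UNDEFINED_POD = 0  # assumption: 0 stands in for an "undefined" PoD


@dataclass(frozen=True)
class Adjacency:
    """Hypothetical, simplified view of one neighbor adjacency."""
    system_id: int
    pod: int            # UNDEFINED_POD if the neighbor has no defined PoD
    northbound: bool    # True if the neighbor is at a higher level
    three_way: bool     # True once the adjacency reached three-way state


def pod_after_adjacency_loss(adjacencies: Iterable[Adjacency]) -> Optional[int]:
    """Pick the PoD of the northbound three-way neighbor with the highest
    System ID among those that have a defined PoD; None if there is none."""
    candidates = [
        a for a in adjacencies
        if a.northbound and a.three_way and a.pod != UNDEFINED_POD
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda a: a.system_id).pod
```

The initial "first adjacency wins" step discussed above is deliberately left out, since it depends on arrival order rather than on the adjacency set.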


Section 5.2.3.2

In the example TIEs, "Spine21" should be "ToF 21" to agree with the nomenclature of figure 2.  Ditto in table 4 (section 5.2.3.4)
In Spine 111's Node-S-TIE, I am not sure that the links(...) should be given for each neighbor.

corrected the ToF 21/22 everywhere.  Yes, on careful reading one wonders WHY node south tie should include _all_ links. This is necessary for both flood reduction as well as bandwidth balancing since both happen from south going up and the node computing needs the northbound neighbors of the level up. That's one of the reasons the example is given. I'll add a clarifying sentence.

[JEH] Thanks. Does that mean the links(…) should be added to Spine121’s Node S-TIE in the same example?

Section 5.2.3.5
"It should only set it in the southbound direction."  - SHOULD?

corrected


Section 5.2.3.8
Define N-SPF on first use

OK, N-SPF and S-SPF added to glossary.


Section 5.2.4
"A node has three sources" - I see only two listed.
"We use simple, familiar SPF algorithms here..." - is the use of those algorithms supposed to be normative? Or are you just giving an example and leaving me to choose my own algorithm?  If SPF is normative then you need to specify it using normative language or include a normative reference to it.

I tried to clarify that better in the existing text by expanding to


<t>A node has three possible sources of relevant information for reachability computation.
    A node knows
    the full topology south of it from the received North Node TIEs or alternately
    north of it from the South Node TIEs.  A node has the
    set of prefixes with their associated distances and bandwidths from
    corresponding prefix TIEs.</t>

<t>To compute prefix reachability, a node runs conceptually a northbound
    and a southbound
    SPF.
    We call that N-SPF and S-SPF denoting the direction in which the computation
    front is progressing.
</t>

<t>Since neither computation can "loop", it is
    possible to compute non-equal-cost or even
    <xref target="EPPSTEIN">k-shortest paths</xref>
    and "saturate" the fabric
    to the extent desired but we use simple, familiar SPF algorithms and
    concepts here as example due to their prevalence in today's routing.
</t>

So the algorithms given are NOT normative but I improved what _is_ normative in the N-SPF and S-SPF section

<section anchor="nspf" title="Northbound SPF">

    <t> N-SPF MUST use ONLY northbound and East-West adjacencies in the computing
        node's node North TIEs (since if the node is a leaf it may not have
        generated a node South TIE)
        when starting SPF. ...

<t>Once progressing, we are using the next higher level's node South TIEs to
    find according adjacencies to verify backlink connectivity.
    Just as in case of IS-IS or OSPF, two unidirectional links MUST be
    associated
    together to confirm bidirectional connectivity. ...

<section anchor="sspf" title="Southbound SPF">

    <t> S-SPF MUST use ONLY the
        southbound adjacencies in the node South TIEs,
        i.e. progresses towards nodes at lower levels. Observe that
        E-W adjacencies are NEVER used in the computation. This enforces the
        requirement that a packet traversing in a southbound direction must
        never change its direction.</t>
    <t>S-SPF MUST use northbound adjacencies in node North TIEs to verify backlink
        connectivity by checking for presence of the link beside correct SystemID and
        level. </t>


This is about all that needs to be said here in terms of normative language beside the one already present.

[JEH] OK, thanks.
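
The two S-SPF rules quoted above (walk southbound adjacencies only, and accept a link only when the lower node lists the matching northbound adjacency) can be illustrated with a small Python sketch. It computes reachability only and ignores metrics, so it is not a full SPF; the dictionary-based link-state view and all names are hypothetical simplifications, with node names standing in for System IDs.

```python
from collections import deque
from typing import Dict, Set

# Hypothetical flattened link-state view:
#   south_adj[n] = neighbors n lists as southbound (from n's node South TIE)
#   north_adj[n] = neighbors n lists as northbound (from n's node North TIE)
AdjMap = Dict[str, Set[str]]


def backlink_ok(upper: str, lower: str, north_adj: AdjMap,
                levels: Dict[str, int]) -> bool:
    """Accept the upper->lower link only if lower reports upper as a
    northbound neighbor and lower really sits at a lower level."""
    if upper not in levels or lower not in levels:
        return False
    return upper in north_adj.get(lower, set()) and levels[lower] < levels[upper]


def southbound_reachable(root: str, south_adj: AdjMap, north_adj: AdjMap,
                         levels: Dict[str, int]) -> Set[str]:
    """Walk strictly southbound from root; East-West links never appear in
    south_adj, so the walk can never change direction."""
    reached: Set[str] = set()
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in sorted(south_adj.get(node, set())):
            if nbr in reached or not backlink_ok(node, nbr, north_adj, levels):
                continue
            reached.add(nbr)
            queue.append(nbr)
    return reached
```

A link that is advertised in only one direction is silently skipped, which is the behaviour the backlink check is meant to enforce.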
Section 5.2.4.1
Please define the terms "south prefix" and "north prefix"
"Supersuming" is not a word I recognise.  Use "or a non-default prefix which contains this south prefix"
"the node does not..." -> "the computing node does not..."

Section 5.2.4.2
"S-SPF uses northbound adjacencies in node N-TIEs to verify backlink connectivity" - this statement needs to be recast into normative language using RFC 2119 terms.  "A node MUST verify backlink connectivity ... Else it MUST NOT include the link.... Etc."
Same comment applies in many places throughout the document.

re-read and applied more normative language to the specific section as indicated above.  Re-read the document and normalized more language where necessary.


Section 5.2.4.3
What is a `"ring protection" scheme`?

Ring based protection scheme just like BLSR. I replace with "ring-based protection" which is fairly well understood term in networking.

Moved the ring-based protection of a level to the applicability draft, which multiple authors work on and where it seems to belong rather than in the spec. Left only this clarification:


<t>Using south prefixes over horizontal links MAY occur
 if the N-SPF is East-West adjacencies in computation.
    It can
    protect against pathological fabric partitioning cases that
    leave only paths to destinations that would necessitate multiple
    changes of forwarding direction between north and south.
    </t>

[JEH] Suggest you change “if the N-SPF is East-West adjacencies” to “if the N-SPF includes East-West adjacencies”
Are E-W links permitted between planes?
Not sure what this is telling me: "Using south prefixes over horizontal links is optional..." - is that OPTIONAL as in RFC 2119?  Do you mean that my implementation can ignore them? Or not advertise them? Or that the network operator does not have to cable them?

Clarified as per section above. If the N-SPF is using horizontal adjacencies it will pick up those prefixes.

[JEH] Looks OK.

Section 5.2.4.4
"Even though a ToF node could
   be tempted to use those links during southbound SPF this MUST NOT be
   attempted since it may lead in, e.g. anycast cases to routing loops."

This is too verbose and obtuse.  I cannot see how anycast cases lead to routing loops and I don't know if I need to understand why or not.  Suggest


"A ToF node MUST NOT include east-west links in its south-SPF calculation."

This is already said in the S-SPF section very explicitly as


<t> S-SPF MUST use ONLY the
    southbound adjacencies in the node South TIEs,
    i.e. progresses towards nodes at lower levels. Observe that
    E-W adjacencies are NEVER used in the computation. This enforces the
    requirement that a packet traversing in a southbound direction must
    never change its direction.</t>



This section gives the impression that E-W links at the ToF will never be used for forwarding data - is that true?  They are used for control plane only?

Yes, it is described in text but I clarified the section on horizontal links in ToF further


<t>E-W ToF links behave in terms of flooding scopes defined in
    <xref target="tiescopes"/> like northbound links and MUST be used for control plane
    information flooding ONLY. Even though a ToF node could be tempted
    to use those links during southbound SPF and carry traffic over them this
    MUST NOT be attempted since it may lead in, e.g. anycast cases to routing loops.
    An implementation MAY try to resolve the looping problem by following on the ring strictly
    tie-broken
    shortest-paths only but the details are outside this specification. And even then,
    the problem of proper capacity provisioning of such links when they become traffic-bearing in
    case of failures is vexing.</t>

[JEH] OK, this is clearer.

"An implementation could try ... but the details are outside this specification" - so why mention it?

Because the question kept coming up in meetings, mails and so on. Instead of negative disaggregation people were tempted to "forward through the horizontal links on top" when a fallen leaf starts forwarding in the wrong plane (i.e. the one where it's fallen). This section points out that this should not be attempted due to looping problems, i.e. a ToF node that has no reachability to an anycast address (since a fallen leaf forwarded to an anycast destination that is also fallen) could try to use horizontal links to forward traffic but it may have multiple planes that can reach the destination. Obviously when it forwards e.g. left on the ring & the traffic arrives on the ToF that seems to be able to reach that anycast, the ToF may choose to forward it back on the ring to "another ToF" that can reach the anycast. Observe that RIFT is loop-free, i.e. one can forward on any path as long as it reaches the destination, but since horizontal is considered equivalent to northbound forwarding and metric can be disregarded (RIFT is not bound by shortest path) the traffic may just end up looping in the ring. This is hard to describe and would need lots of figures, hence the spec simply says "don't do it", and if one is tempted to, one will find out why it's a bad idea once one has implemented it. And then the said implementer will probably try to fix it by the "shortest path" computation @ ToF level, which is the next layer of the onion; the document mentions it and then explains again that this may work, but the spec stops there.

The "ring" between planes necessary is visualized in figure 13 and described in section


4.2.5.2.1.  Cabling of Multiple Top-of-Fabric Planes

again in an example. I don't think that needs further clarification.

[JEH] Understood. I would suggest moving “An implementation could…” to a footnote – if only one could have footnotes in an RFC.

Section 5.2.5.1
"A DAG computation" - expand DAG.

already expanded in entrance to terminology section but added a more specific definition
[JEH] Ah – sorry. Missed it.


"Neither
       is it necessary for the receiving node to reflect the
       disaggregated prefixes back over its adjacencies to nodes at the
       level from which it was received."

Please restate this using RFC 2119 language.

done. It's actually not necessary for this language here to be normative since the normative part is Table 3 and when it is implemented all the algorithm behavior and resulting flooding follows straight out of that. I emphasized that the flooding scopes table is normative.

[JEH] OK


How can we guarantee that a same-level node does not have a next hop to a given prefix that is unknown to the node doing the computation?  If X reaches P via N1 and N2, Y (at the same level as X) can reach P via N3 but X does not know this and assumes Y cannot reach P because Y is not adjacent to N1 and N2, then X unnecessarily disaggregates P positively.  For instance if X's link to N3 has failed and Y's links to N1 and N2 have failed.

that cannot be guaranteed. If X can reach the prefix via N1 which Y doesn't have and Y via N3 that X doesn't have but they only see each other via a nexthop N0 (through which the prefix cannot be reached) then both will disaggregate since anything else would be assuming necessity of "harmonica routing" which RIFT doesn't do since harmonica is opposite to valley free routing which RIFT does to guarantee loop free behavior.  That is actually a good example why RIFT positive disaggregation guarantees sufficient disaggregation to prevent blackholes, loops and bow-ties but possibly more than necessary (which is never claimed in the document).

[JEH] Understood. So there may be redundant disaggregation but it keeps the forwarding plane valley free.  I think that’s OK.


"Each entry is a list of south neighbor of X and a list of nodes
       of X.level that can't reach that neighbor"

Think this should say

"Each entry in the set is a south neighbor of X and a list of nodes
       of X.level that can't reach that neighbor"

yes, thanks.
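
The corrected wording (each entry is a south neighbor of X plus the nodes of X.level that can't reach it) maps naturally onto a dictionary. The Python sketch below is a hedged illustration only: the function name and the input shape (each same-level peer's set of south neighbors, as X would learn it through south reflection) are assumptions made for this example, not structures from the draft.

```python
from typing import Dict, Set


def partial_neighbors(x_south: Set[str],
                      peers_south: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """For each south neighbor of X, collect the same-level peers that do
    not list that neighbor. Neighbors every peer can reach are omitted, so
    an empty result means X sees nothing to disaggregate."""
    result: Dict[str, Set[str]] = {}
    for nbr in sorted(x_south):
        missing = {peer for peer, adj in peers_south.items() if nbr not in adj}
        if missing:
            result[nbr] = missing
    return result
```

With X adjacent to N1 and N2 and a peer Y adjacent only to N3, the result is `{"N1": {"Y"}, "N2": {"Y"}}`, which corresponds to the redundant but safe disaggregation discussed earlier in this section.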


"X does not to disaggregate any prefixes" -> ""X does not disaggregate any prefixes..""

yes


"The PoD containing the prefix will prefer southbound anyway." - I didn't understand the point. Is it necessary for me to understand it? Please expand or delete the sentence if it's not necessary.

clarified:


<t>all the lower level nodes are flooded the same disaggregated
    prefixes since we don't want to build a South TIE per node and
    complicate things unnecessarily. The lower level node
    that can compute a southbound route to the prefix
    will prefer it to the disaggregated route anyway based on
    route preference rules.</t>

[JEH] That’s better, thanks.

Section 5.2.6
"such as mobility per section 5.3.3 necessary" - delete "necessary".

yes

"ties are broken based upon type first and then distance and further attributes" - I don't see mention of further attributes in the proposed algorithm.

corrected to

PrefixAttributes

which are contained in the schema. Mobility tie-breaking is described in its own section.

The document does not standardize further tie-breaking since, e.g., tie-breaking on tags is possible but can be completely implementation dependent given RIFT is loop-free. Neither do I think any kind of "standardizable agreement" could be possible here.


"The nexthop
   adjacencies for a negative prefix are inherited from the longest
   prefix that aggregates it" - suggest changing to "longest positive prefix"

ok
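
The "longest positive prefix" inheritance can be sketched with Python's standard `ipaddress` module. This is a minimal illustration under assumptions: the RIB is modelled as a plain dictionary from positive prefix to nexthop names, and the final step of removing the neighbors that advertised the negative prefix is not stated in this thread and is included here only as an assumption about how the inherited set is then used.

```python
import ipaddress
from typing import Dict, Optional, Set

Network = ipaddress.IPv4Network


def longest_positive_parent(negative: Network,
                            positive_rib: Dict[Network, Set[str]]
                            ) -> Optional[Network]:
    """Return the longest positive prefix that covers the negative prefix."""
    parents = [p for p in positive_rib if negative.subnet_of(p)]
    return max(parents, key=lambda p: p.prefixlen, default=None)


def negative_prefix_nexthops(negative: Network,
                             negative_advertisers: Set[str],
                             positive_rib: Dict[Network, Set[str]]) -> Set[str]:
    """Inherit nexthops from the covering positive prefix, then drop the
    neighbors that advertised the negative prefix (assumed step)."""
    parent = longest_positive_parent(negative, positive_rib)
    if parent is None:
        return set()
    return positive_rib[parent] - negative_advertisers
```

For example, with a default route via A, B and C and 10.0.0.0/8 via A and B, a negative 10.1.0.0/16 advertised by B inherits from 10.0.0.0/8 and is left with A only.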


"all entries of the father" -> "all entries of the parent"

ok

Section 5.2.7.3
"we have to decide whether node Y is at the same level as I, J or at
   the same level as Y and consequently, X is south of it."

I could not parse this.  I think you might mean this:

"we have to decide whether node Y is at the same level as I, J
  (and consequently X is south of it) or at the same level as X."

yes, correct, somehow it got garbled, corrected to


<t>First, we must anchor the "top" of the cabling and that's what
    the TOP_OF_FABRIC flag at node A is for. Then things look smooth until
    we have to decide whether node Y is at the same level as I, J
    (and as consequence, X is south of it) or at
    the same level as X. This is
    unresolvable here until we
    "nail down the bottom" of the topology. To achieve that we choose to
    use in this
    example the leaf flags in X and Y. In case where Y would not have a leaf
    flag it will try to elect highest level offered and end up being
    in same level as I and J.
    </t>

[JEH] Looks good.

Section 5.2.7.4
How does a ToF node know what value to advertise in its LEVEL_VALUE?

This constant is provided in appendix D.1

I'm working on the other directorate reviews and will try to cut a new version with all those changes before deadline

[JEH] Thanks again for considering all my comments.