Re: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)

Alvaro Retana <aretana.ietf@gmail.com> Thu, 01 April 2021 20:08 UTC

Return-Path: <aretana.ietf@gmail.com>
X-Original-To: rift@ietfa.amsl.com
Delivered-To: rift@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 94E223A218D; Thu, 1 Apr 2021 13:08:20 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.098
X-Spam-Level:
X-Spam-Status: No, score=-2.098 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2AioYrIZFDsZ; Thu, 1 Apr 2021 13:08:17 -0700 (PDT)
Received: from mail-ed1-x529.google.com (mail-ed1-x529.google.com [IPv6:2a00:1450:4864:20::529]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 9DF463A2189; Thu, 1 Apr 2021 13:08:16 -0700 (PDT)
Received: by mail-ed1-x529.google.com with SMTP id dm8so3280057edb.2; Thu, 01 Apr 2021 13:08:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:in-reply-to:references:mime-version:date:message-id:subject:to :cc:content-transfer-encoding; bh=1mKTg9cmgERQaxw5PTcicK1q96t2T2TxVpImpG4Dcos=; b=YTkXh7bDX8I5NMXizCujFDmQ/eG3dth54L8mKK7fyclychRZ12PDF5m9vBFMCuOXM6 ozup6dOJ2zmqL7nCEojddHCmwq2lk6ZL7GMfIVLb9I96Vy8HDrhrh61GHMsrsgSA2WDN rTG7w0CqbnNRRynQCZ+jVsq89m2Au3T5xJA89TMxc4qB1cbPJD9jXUbhbwxyEwjFvRwu im8M7XRlHf7dmaCHscPX169cr/a23hvBPAWpmSIPe/IxtAX3EbsKVh6lZ15FvLH4mAAE qstAEQhWjkwz00JF4nHivn73hlePyq0/FNA58hoP3B4KG23Vm8AFpHlpayH2I0yqQkCr UX9A==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:in-reply-to:references:mime-version:date :message-id:subject:to:cc:content-transfer-encoding; bh=1mKTg9cmgERQaxw5PTcicK1q96t2T2TxVpImpG4Dcos=; b=Ubln08VfeeAQsroyoKd+TGIr7XeX4keX3YLswCL8s973uq4nryrv0wxEPQJkfFTCdX qFm8SoZG3WFLAU94Fm/D/gj4HXV+Ra8NHh/VB7840JkhSGe9oFpUf/kwstB93jU1Ixmx dn90nEmrCzK1SONW6Mh743lSCcDy+Hv97Js1Jem8InC6dTHr1kmyvxrV5F18AtEkHKtG 40h5TZNoxlbB9pVX9bVG61CnoGolylPIhldOUb/ROf+pCEYTGPfi16HEpXoPvpcVfjg0 p5tjZxvM4mktBZLnDrB1WpyOaayYed49FQVIyNch5CLlTNqPB4A24rfIztS8QZgPMn+z VPkA==
X-Gm-Message-State: AOAM531FSfyw6xnSKOkyN5kLt+geX2SuLJZ5UNSrQCmcUQ77glLFP+Si AR7P6RZWJGFmHbsFDXWQh+If1WeJ/V2PPbk/3/x6Kal3
X-Google-Smtp-Source: ABdhPJzG4UbjNVhmb9aRQERvzKU3To+/qAOPuclKiklqNgRVLQvAB/R+f5aUEhCP1erXVGuR2xDW5yeadSFFHzi7lVM=
X-Received: by 2002:a05:6402:1545:: with SMTP id p5mr11705876edx.155.1617307693049; Thu, 01 Apr 2021 13:08:13 -0700 (PDT)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Thu, 1 Apr 2021 13:08:12 -0700
From: Alvaro Retana <aretana.ietf@gmail.com>
In-Reply-To: <CA+wi2hNX2aFU2gYZp3JDGeSFdvWZSvJnJO2Gx3Ki-2ifOCvNeA@mail.gmail.com>
References: <CAMMESsxBr0+UriSaTDVZMrFzU6DSiuC3-wO4+7HgX4nX7SLHmg@mail.gmail.com> <CA+wi2hNX2aFU2gYZp3JDGeSFdvWZSvJnJO2Gx3Ki-2ifOCvNeA@mail.gmail.com>
MIME-Version: 1.0
Date: Thu, 01 Apr 2021 13:08:12 -0700
Message-ID: <CAMMESswX=ENvZki2nuf6BF-2OxWS=NaoyAMrq5_jKSgS7cKuEw@mail.gmail.com>
To: Tony Przygienda <tonysietf@gmail.com>
Cc: "draft-ietf-rift-rift@ietf.org" <draft-ietf-rift-rift@ietf.org>, "rift@ietf.org" <rift@ietf.org>, "rift-chairs@ietf.org" <rift-chairs@ietf.org>, "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/irGGc5d1vHOjuLfaerSZ71k-T2g>
Subject: Re: [Rift] AD Review of draft-ietf-rift-rift-12 (Part 2a)
X-BeenThere: rift@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Discussion of Routing in Fat Trees <rift.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rift>, <mailto:rift-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/rift/>
List-Post: <mailto:rift@ietf.org>
List-Help: <mailto:rift-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rift>, <mailto:rift-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 01 Apr 2021 20:08:21 -0000

[+ authors, WG, Shepherd, Chairs]


On March 7, 2021 at 3:56:09 PM, Tony Przygienda wrote:

Tony:

Hi!

I have some replies to your answers (inline).

Some of your answers point me to the schema -- which is ok, we had
agreed on it being Normative.  But it makes me wonder, what are we
standardizing?  I know we had talked about having the schema in the
document -- but I don't think we're on the same page as to what that
means.  I'm starting a separate thread with the WG (not everyone plays
attention to the review threads), so let's please talk about it there.

Thanks!

Alvaro.



...
> > 1341 4.2. Specification
...
> > 1358 Any attempt to transition from a state towards another on reception
> > 1359 of an event where no action is specified MUST be considered an
> > 1360 unrecoverable error.
...
> > [major] "MUST be considered an unrecoverable error" What is the result of
> > that? Should the adjacencies be reset, the rift process restarted, the
> > interfaces shut down, etc.?? There's no other place in the document that
> > mentions "unrecoverable".
>
> implementation dependent. It's pretty much the same as the protocol throwing
> a "core" or memory corruption. From that point behavior is unspecified but we
> want it to be considered something that should not attempt a recovery. In
> most stark terms don't even start RIFT again but we can't mandate any such
> thing. It's implementation guidelines.

Why can't you mandate that?  This is *the* spec! :-)

If you need this behavior for interoperability (by using MUST), then
you need to specify what the behavior is.  Yes, there are probably
internals that you can't specify, but the external behavior is fair
game: "all adjacencies MUST be reset and not be restarted" (or
something like that).


...
> > 1371 4.2.1. Transport
> >
> > 1373 All packet formats are defined in Thrift [thrift] models in
> > 1374 Appendix B.
> >
> > [] Can we get an index/TOC or some type of guide in the appendix to make
> > it easier to find specific parts? For example, finding where a LIE is
> > defined is not straight forward -- I happened to stumble on it in B.2.
> > Another option would be to point in §4.2.2 to look for LIEPacket in B.2.
>
> yes, as we said there will be a more extensive intro and a reader's guide.
> And I'll refer to model & according structure.

An idea just occurred to me -- not sure if it would work: would a
summary similar to the overview of a YANG module work?  It might not
end up looking like a tree, and it would include packets/fields (not
ro/rw information)...but it might show a summary of the schema without
showing the full schema upfront.

Just thinking out loud.



...
> > 1376 The serialized model is carried in an envelope within a UDP frame
> > 1377 that provides security and allows validation/modification of several
> > 1378 important fields without de-serialization for performance and
> > 1379 security reasons.
> >
> > [minor] This is a very short section -- please put references to other
> > places where these topics are explained more, if any.
>
> aha, ok. I probably expound the section a bit as in "this MD5 protects this
> and then this MD5 on top protects this" so e'thing is completely convered.

MD5 won't pass the Security scrutiny.   I just did a quick scan and
found the Security Envelope, but no details on what is MTI in terms of
algorithms, etc.




...
> > 1381 4.2.2. Link (Neighbor) Discovery (LIE Exchange)
...
> > [major] "An implementation MAY listen and send LIEs on IPv4 and/or IPv6
> > multicast addresses. A node MUST NOT originate LIEs on an address family
> > if it does not process received LIEs on that family."
...
> So, generally I prefer this flexiblity the spec has now since practically,
> all kind of combinations arise (add v4ov6 to the whole thing) & that's also
> why we have AFs carried on the link information as well to be able to figure
> out during SPF/from topology what AFs are on and detect discontinuity if so
> desired or even allow to fwd. v4 normal then jump a v4ov6 and get there.
...

To be clear, I like the flexibility.  The comment is about the
specification: as is, it is easy to deadlock.  You can either try to
"fix" the specification (make it so that somehow there's no deadlock),
or you can add some "disclaimers" about the need for the operator to
configure things right -- and declare that outside the scope of this
document.

[I have a suggestion later on.]



...
> > 1418 A three-way adjacency over any address family implies support for
> > 1419 IPv4 forwarding if the `v4_forwarding_capable` flag is set to true
> > 1420 and a node can use [RFC5549] type of forwarding in such a situation.
> > 1421 It is expected that the whole fabric supports the same type of
> > 1422 forwarding of address families on all the links. Operation of a
> > 1423 fabric where only some of the links are supporting forwarding on an
> > 1424 address family and others do not is outside the scope of this
> > 1425 specification.
...
> > [major] "`v4_forwarding_capable` flag" (This comment is not specific to
> > this flag.)
> >
> > The specific LIE packet format has not been presented, introduced or even
> > referenced up to now. Pointing to a flag, which happens to be optional in
> > an optional part of the LIEPacket, comes out of the blue and feels
> > completely out of place without prior explanation.
>
> I can try to describe all the different scenarios of v4/v6/v4ov6/mix but fact
> is, it will read more and more confused so I try to keep it fairly dry and
> short. It's obvious when one deploys/implements.

I'm sure it's obvious to you! ;-)


> ```
> A three-way adjacency over any address family implies support for IPv4
> forwarding if the `v4_forwarding_capable` flag is set to true and a node can
> use type of forwarding in such a situation. It is expected that the whole
> fabric supports the same type of forwarding of address families on all the
> links. Operation of a fabric where only some of the links are supporting
> forwarding on an address family and others do not is outside the scope of
> this specification.
> ```
>
> reads completely clear to me with ref to 5549 so what am I supposed to add
> that will clarify it further without muddying things.

You are not the audience.


...
> > [major] "a node can use [RFC5549] type of forwarding"
> >
> > First of all, rfc5549 is a BGP-specific RFC. How do the BGP mechanisms
> > defined there (and in rfc8950) map to RIFT? IOW, please specify how RIFT
> > works.
> >
> > Because you listed rfc5549 as a Normative reference, I assume you want
> > this behavior to be normative. s/can use/MUST/SHOULD/MAY ??
>
> Sigh, Alvaro, is it really the job of RIFT to start describe forwarding path
> all major chip vendors implemented? I do NOT think it's a wise idea. It
> refers to RFC5549 because the forwarding behavior is ultimately precisely
> the same but I don't want to write models of forwarding paths, ARP tables
> etc.
>
> The protocol spec describes all the control plane necessary to interop
> properly, it does not explain LPM or v4ov6 forwarding or any such thing.
>
> I would expect some more guidance here but again, this is not an
> implementation primer for a DC switch including fwd path, it's a control
> protocol spec.

Some comments:

- rfc2328/rfc4271 don't include references to other protocols when
describing how the protocol works -- I expect the same from RIFT.
This is exactly the same point as referencing OSPF/ISIS...

- No, you don't need to describe chip implementations -- if you do,
then the document is focusing on the wrong parts.  The specification
should (as you mention above) focus on describing the control plane.

- If the forwarding plane is out of scope, then say that.  If
possible, recommend (don't require) behavior.  It is better to leave
details (that can change based on internal implementation details) out
of scope than trying to address every corner case.


Let's go back to the paragraph above:

1418	   A three-way adjacency over any address family implies support for
1419	   IPv4 forwarding if the `v4_forwarding_capable` flag is set to true
1420	   and a node can use [RFC5549] type of forwarding in such a situation.
1421	   It is expected that the whole fabric supports the same type of
1422	   forwarding of address families on all the links.  Operation of a
1423	   fabric where only some of the links are supporting forwarding on an
1424	   address family and others do not is outside the scope of this
1425	   specification.

rfc5549/rfc8959 don't explicitly talk about forwarding -- they talk
about the extensions to BGP to carry IPv6 NEXT_HOPs when advertising
an IPv4 route.  The use of that information is not specified; instead,
pointers to where it could be applied (for example, softwire
applications) is all that is provided.

Given that, what does it mean to to support "[RFC5549] type of forwarding"?

In this document, you haven't yet talked about how the routes are
calculated.  But I'm assuming that when that happens you will clearly
specify where the next hop comes from --- and the fact that if the
adjacency is over an IPv6 link and v4_forwarding_capable is set, then
the IPv6 next hop can be used to forward IPv4 traffic.  To explain
that you don't need to go into details of how it could be done, just
that it can.

BTW, if the adjacency is over IPv4 then the "[RFC5549] type of
forwarding" is not needed.  Or is it?  Below you mention support for
"v4 over v6 only", but not IPv6 over IPv4 only -- was that just an
omission, or is that not supported?  I'm assuming it is.  The point
here is that the text talks about support for IPv4, but it says
nothing about IPv6 -- that is the reason for the rest of the questions
below.


> > [major] "It is expected that the whole fabric supports the same type of
> > forwarding of address families on all the links. Operation of a fabric
> > where only some of the links are supporting forwarding on an address
> > family and others do not is outside the scope of this specification."
> >
> > Please remind me, is IPv4 optional or required? I'm assuming that IPv6 is
> > required and IPv4 optional, but I can't find anything definitive in the
> > document.
>
> neither is required. RIFT will work with any combination (v4 only, v6 only,
> v4 and v6, v4 over v6 only). If neither is implemented then there is no
> forwarding obviously. LIEs basically exchange enough info over both AF and
> the 'v4 capable' flag to negotiate _any_ viable combination. So what do you
> want me to say? People deploy all of the above variants & are religious about
> it. Why would I say "v6 is required" if protocol works just fine with v4
> only. And why have "v4 required" if it works fine with v6 only as well.


Let's take a step back.  At this point (§4.2.2) we're talking about
LIE Exchange.

There's no explicit signaling for IPv6 support (ala
v4_forwarding_capable).  So the only way to know if a neighbor
supports IPv6 forwarding is to assume that it does because the LIEs
were sent form an IPv6 address.  OR  Once I get to the section about
TIEs it'll probably explain that both IPv4/IPv6 routes can be
advertised regardless of which AF the adjacency was formed on.   These
are just guesses (because the text so far doesn't say neither).  We
may be getting ahead of ourselves with all this talk about forwarding.


...
> You cannot figure that out BTW, a fabric may partition along AFs
> supported/capabilities and unless you run reachabiliuty per AF the fabric
> will blackhole likely. Running computona per AF is a very bad idea,
> partitioned (partially connected) networks are operationally a nightmare. I
> learned that the hard way in MT years ago.
>
> Applicability draft may want to loose a word or two about it.

I'm asking here because this document sets an expectation.  If the
protocol expects something, then it would be nice to know (at least)
what happens if the expectation is not met -- or, even better, if the
protocol has a mechanism to get out of trouble.

I am ok with some of these considerations (I mention operational
considerations in other places too) to be in the applicability draft.
This document should then reference the applicability one.  Also, I
will progress them together.

One more thing; I will probably continue to make comments about
operational considerations with the objective to not forget to later
check the applicability draft for guidance.


...
> > 1436 Unless ZTP as described in Section 4.2.7 is used, each node is
> > 1437 provisioned with the level at which it is operating. It MAY be also
> > 1438 provisioned with its PoD. If any of those values is undefined, then
> > 1439 accordingly a default level and/or an "undefined" PoD are assumed.
> > 1440 This means that leaves do not need to be configured at all if initial
> > 1441 configuration values are all left at "undefined" value. Nodes above
> > 1442 ToP MUST remain at "any" PoD value which has the same value as
> > 1443 "undefined" PoD. This information is propagated in the LIEs
> > 1444 exchanged.
...
> > [minor] "It MAY be also provisioned with its PoD." What does this mean?
> > Does it mean that all the nodes in the PoD have the same level? ...
>
> I don't understand how PoD and level can be confused. The glossary is super
> explicit about those things. There is never anywhere anything that somehow
> implies PoD numbering has anything to do with level.

That's one of the reasons I don't understand what this sentence means.
The full text is:

   Unless ZTP as described in Section 4.2.7 is used, each node is
   provisioned with the level at which it is operating.  It MAY be also
   provisioned with its PoD.  If any of those values is undefined, then
   accordingly a default level and/or an "undefined" PoD are assumed.

It looks like the "It" refers to "level".  Does it?  Or are you
referring to the node (maybe that's it)?

If the latter, then s/MAY/may because you seem to be stating a fact.



...
> > [major] BTW, this reminds me, what are the manageability considerations
> > related to RIFT? Among other things, it would be nice to have a summary of
> > parameters that are expected to be configured and their defaults. Please
> > take a look at rfc5706.
>
> Alvaro, since when are manageability considerations part of the base protocol
> spec? we have yang model spec, we have applicability spec, we can have a
> manageablity spec, fine with me. Base spec is here to say enough to build an
> interoperable implementation. That has been proven by a clean room open
> source that inter'oped day one.

See rfc5706.


...
> > 1449 A node tries to form a three-way adjacency if and only if
...
> > [major] "if and only if" Can this be translated into a Normative statement?
> > Are all these required (MUST) before the formation of a 3-way adjacency
> > started?
>
> I do NOT want to touch the LIE rules, those have been ironed over 2
> implementations super carefully and held. mathematicians use IIF religiously,
> good enough for me. Beside me @ least 2 or 3 people implemented those rules
> correctly (based on interop) already so the description clearly works.

I'm not asking you to change the rules.  I'm asking you to express the
rules in the "IETF's language" (rfc2119).

The fact that some people understand the meaning -- and that these are
probably the same people who have participated in writing the text
doesn't make it clear to others.  The specification is not for your
consumption.



...
> > 1454 2. the neighboring node is running the same MAJOR schema version AND
...
> > [minor] Are there any requirements related to the minor version?
>
> the schema versioning does describe that in excruciating details @ beginning
> of appendix B. I'll fwd reference

Sorry, I meant: for this step, does the minor version matter?  Or is
it ok if just the major version is the same?



> > 1456 3. the neighbor is not member of some PoD while the node has a
> > 1457 northbound adjacency already joining another PoD AND
...
> > [major] "not [a] member of [the] some PoD" #1 says that the "node is in
> > the same PoD" -- it looks like there's some context missing -- I guess
> > about the ability to join other PoDs. Is this only a northbound
> > characteristic?
>
> no, there is no such thing. You can basically configure PoD ooptionally and
> that's it if you choose. That gives protection against cross-PoD miswiring,
> there is nothing more here.

Maybe I didn't explain myself correctly, but I'm still confused:

1449	   A node tries to form a three-way adjacency if and only if

1451	   1.  the node is in the same PoD or either the node or the neighbor
1452	       advertises "undefined/any" PoD membership (PoD# = 0) AND
...
1456	   3.  the neighbor is not member of some PoD while the node has a
1457	       northbound adjacency already joining another PoD AND

#1 says "the node is in the same PoD", while #3 says that "the
neighbor is not member of some PoD", and these are joined by an "AND".
I don't understand how the neighbor can be in the same PoD, *and* not
be there at the same time.

If the case here is specifically related to the second part of #1
("or...advertises "undefined/any" PoD membership"), then that
condition should be clarified in #3.



...
> > 1464 6. the advertised MTUs match on both sides AND
...
> > [major]
> > struct LIEPacket {
> > ...
> > /** Layer 3 MTU, used to discover to mismatch. */
> > 4: optional common.MTUSizeType link_mtu_size =
> > common.default_mtu_size;
> >
> > If MTUSizeType is not included, where is it specified that the
> > default_mtu_size must be considered? Without that specification it is not
> > possible to satisfy this condition.
>
> all in schemas
> /** @note MUST be interpreted in implementation as unsigned */
> typedef i32 MTUSizeType
> /** default MTU link size to use */
> const MTUSizeType default_mtu_size = 1400
> encoding model includes common model

Well, that just tells me what the default is, not that if my neighbor
doesn't advertise anything I should consider the default.   The
language used in #6 refers to the "advertised MTU"...




> > 1466 7. both nodes advertise defined level values AND
> >
> > [major] The advertisement of LevelType in PacketHeader is optional. That
> > means that even if there is a default value a node may not "*advertise*
> > defined level values". Also, there are only 2 level values defined:
> >
> > const LevelType top_of_fabric_level = 24
> > const LevelType leaf_level = 0
> >
> > What am I missing?
>
> its own section explains ZTP algorithm in detail, advertisement etc
>
> It's impossible to describe in LIE processing section ZTP already, it's all a
> big muddle then. Level is optional because a ZTP node coming up has no level
> (except ToF) so it cannot show anything. Then ZTP starts to propagate, people
> get level offers, tie-break them, start to pick up levels & see whether that
> converges and so on.

Part of the problem that I'm having in this section is that this list
of conditions is preceded by: "A node tries to form a three-way
adjacency if and only if".  This sounds as if the conditions have to
be true even before the node tries to do anything.

Also, §4.2.7 says that ZTP is optional...and that "Configured nodes
and nodes operating in ZTP can be mixed".  Your answer implies that
ZTP takes care of figuring out what the level (in this case) is, and
that is great...if ZTP is used.  But if ZTP is not used, then what?
It seems to me that indicating the requirement that a configured node
always advertises its (configured) level would address the problem
with optional fields and defaults.



...
> > 1478 iii) both nodes are at level 0 AND both indicate support for
> > 1479 Section 4.3.8 OR
> >
> > [minor] "support for Section 4.3.8" Is there a name for that? How is it
> > indicated? §4.3.8 says that a leaf can "advertise the LEAF_2_LEAF flag in
> > its node capabilities"; there is nothing called "LEAF_2_LEAF" in the
> > schema, but "leaf_only_and_leaf_2_leaf_procedures" does show up. Please be
> > consistent in the terminology. [Also, there are places where "leaf-2-leaf"
> > is used.]
>
> yes, leaf-2-leaf procedures but it's a mouthful and it's not precise either
> since it's also leaf-only
>
> leaf-2-leaf is basically a common term for two flags in `
> HierarchyIndications
> `
> called
>
> leaf_only_and_leaf_2_leaf_procedures
>
> leaf_only
>
> whereas first includes second (it's a complex relationship we modelled like
> this)
>
> hence section 4.3.8 is bascially " anything around leaf only or leaf-2-leaf or
> both"
>
> So, I think refering to the section is still least confusing thing.

How can it be if the relationship above is not explained?  §4.3.8
talks about the non-existent LEAF_2_LEAF flag, and the relationship
with the two real flags is not explained anywhere.


> Scheam uses leaf_2_leaf because of thrift id rules but people prefer
> leaf-2-leaf in the document so far.

Please be consistent!




> > 1486 The rules checking PoD numbering MAY be optionally disregarded by a
> > 1487 node if PoD detection is undesirable or has to be ignored. This will
> > 1488 not affect the correctness of the protocol except preventing
> > 1489 detection of certain miscabling cases.
...
> > [major] "MAY be optionally disregarded by a node if PoD detection is
> > undesirable or has to be ignored."
> >
> > This is the only place where "PoD detection" is used. What is it? Where is
> > it specified? Why would be it undesirable? When would it have to be
> > ignored? Are there operational considerations for making that decision?
>
> in the LIE processing. you check whethe the other side matches your PoD or
> either one is ANY_POD=0 and that's it. No further magic.

Going back to the list, which starts with "if and only if" (which I
interpret as a requirement) followed by a bunch of ANDs.  Why would it
be undesirable to look at the PoD number?  When would it have to be
ignored?

Besides wanting to know the answers, which may be in the applicability
document, the fact that a piece of a required set of conditions is
optional is a contradiction.  Maybe the list is not required, just
recommended...??



> > [major] "...except preventing detection of certain miscabling cases."
> > Where is miscabling detection described? Which cases may not be covered?
> > §4.2.7.6 says that "internal ruleset flags a possible miscabling", but
> > that is the closest that I came to finding a description (and rulesets are
> > not mentioned anywhere else).
>
> miscabling computation is not part of the protocol spec. We allow to indicate
> "miscabling" but what is declared miscabling is impossible to specify.
> Peoiple do ALL kind of things in fabric, e.g. when your link starts to loose
> 5% some folks declare link "faulty" or "miscabled". There are no rules. An
> implementation can declare miscabling and do whatever it wants with the
> adjacency (probably hold down) and then it can put it into
>
> /** If any local links are miscabled, the indication is flooded. */
> 10: optional set miscabled_links;
>
> so you can look @ the ToF LSDB and see your whole fabric with links with
> problems. It's pretty well liked.

...but undefined...

The reason I asked about "miscabling" as a "major" comment is that it
is associated with Normative language (MAY), and the fact that by
disregarding that rule some functionality seems to be lost.  If
miscabling is not part of the spec, then it should not be part of the
spec.

There is no guidance anywhere about when the miscabled_links
indication can/should be set.  While the functionality may be
useful/liked, it could indicate anything -- in general, "there's an
issue".  I can wait for the applicability document to tell us about
any operational considerations, but it would be nice for the spec to
be clear about what the indication means.

Also, note that the first sentence of §4.2.2 says that "LIE
exchange...discovers miscablings", but in reality it doesn't.



...
> > 1507 4.2.2.1. LIE FSM
...
> > 1515 Event `MultipleNeighbors` occurs normally when more than two nodes
> > 1516 see each other on the same link or a remote node is quickly
> > 1517 reconfigured or rebooted without regressing to `OneWay` first. Each
> > 1518 occurrence of the event SHOULD generate a clear, according
> > 1519 notification to help operational deployments.
...
> > [minor] "occurs normally when..." Are there other conditions (which are
> > not considered "normal")?
>
> yes, weird ones like e.g. a node reboots & changes system ID and all of a
> sudden it looks like 3 nodes on a link (which is hinted). This is all too
> far for a protocol spec, those are day one/applicability things if one wants
> to go into detailed details of all the sceanrios.

I don't think we need to go into details -- the wording caught my attention.



...
> > [major] "SHOULD generate a clear, according notification to help
> > operational deployments." Normatively, what is "clear"? Is there
> > information that you have in mind that should be included to make it
> > "clear"? I assume that a notification is a message logged on the node, is
> > that true?
> >
> > Suggestion>
> > ...SHOULD log a message including the System IDs of each node.
> >
> > (Anything else?)
>
> logging is the old days. today people want telemetry, yang, open config,
> splunk channel, dancing bears, hence the "notification"

Sure -- the main point is really that the characteristics of a "clear,
according notification" are not, well, clear.  What should be included
in the notification?



...
> > 2069 4.2.3.3. Flooding
> > ...
> > 2082 As described before, TIEs themselves are transported over UDP with
> > 2083 the ports indicated in the LIE exchanges and using the destination
> > 2084 address on which the LIE adjacency has been formed. For unnumbered
> > 2085 IPv4 interfaces same considerations apply as in equivalent OSPF case.
> >
> > [major] (Related to a comment in 4.2.2.) "using the destination address on
> > which the LIE adjacency has been formed" If multiple addresses are used
> > (from different AFs), which one is selected? Is there a "primary" address
> > that is considered when the adjacency is formed?
>
> very good catch.
>
> No, the spec does not give the rules. An implementation will pick what it
> likes and the other side has to follow because it sees it as LIE source
> address.
>
> Trying to start to build "prefrences" is a futile exercise. everybody has a
> different opinion what is "best" address on an interface and then the
> "unnumbered" wars start ;-)

Ok.

Having not fully read this section yet...  Can there be "race
conditions" where one side picks the IPv6 address and the other picks
IPv4 (at the same time) -- they both send, the other side receives --
is the operation like that ok?   I ask because you mention that "the
other side has to follow".



...
> > 6362 B.1. common.thrift
> > ...
> > 6730 /** Link capabilities. */
> > 6731 struct LinkCapabilities {
> > 6732 /** Indicates that the link is supporting BFD. */
> > 6733 1: optional bool bfd =
> > 6734 common.bfd_default;
> > 6735 /** Indicates whether the interface will support v4 forwarding.
> >
> > 6737 @note: This MUST be set to true when LIEs from a v4 address are
> > 6738 sent and MAY be set to true in LIEs on v6 address. If v4
> > 6739 and v6 LIEs indicate contradicting information the
> > 6740 behavior is unspecified. */
> >
> > 6742 2: optional bool v4_forwarding_capable =
> > 6743 true;
> > 6744 }
> >
> > [major] Not knowing thrift, and assuming that this structure is some sort
> > of enum/list, what should a receiver do if the undefined/unknown value 3
> > is received?
>
> the packet won't even be decoded. you'll get an encoder/decoder error, it's
> all handled internally by thrift.

I guess that in general this means that the packet is ignored.


> That's the beauty of modelling. 80%+ of problems/worries we bean count
> manually in protocols just vanish magically.

Yes, but...what are we standardizing?  This is one of the reasons for
my note at the top.  Is the objective for the model to be the standard
(similar to what we do with YANG models), or to specify something that
allows other languages/models to be used?  I think it's the latter.


> Remember, we had _no_ bugs in first interop of LIEs with a blind 2nd
> implementation.

If I remember correctly, was that in Python?

No bugs doesn't necessarily mean that the edge cases were tested.




> > [major] "contradicting information the behavior is unspecified." How can
> > the behavior be unspecified if this *is* the specification? More to the
> > point: if the indication is required in IPv4 LIEs, but only optional in
> > IPv6 LIEs, why wouldn't the information from the IPv4 LIEs be considered
> > as the truth?
>
> because _sooo_ many things can go wrong. LIEs on different AFs can be
> buffered, arrive out of sequence and do a myriad crazy things. We were
> ruminating for a long time whether we should have one LIE on a single AF but
> then like I told you there is no "required AF" in RIFT based on customer
> input, e'one runs what they want on their fabric. And customers prefer AF
> fate that is not shared. So de facto v4 and v6 run as ships in the night
> (except the v4 capable v4ov6) and we end up advertising the bfd/v4 capable
> twice (because again, we don't know wehther we'll have either AF or both on
> an adjacency). Works fine in practice, an implementation seeing persistent
> mismatch should probably just declare link mismatch and shut down adjacency
> but that's really implementation specific.

As specified ("MUST be set to true when LIEs from a v4 address are
sent and MAY be set to true in LIEs on v6 address"), there is already
significant room for issues because, regardless of the actual ability
to support IPv4, the setting on an IPv6 LIE is optional.  IOW, the
v4_forwarding_capable flag may never be set on an IPv6 LIE because it
is optional -- the the implementation would comply with this
specification: it is not explicitly mentioned that it is required for
the flag to be set, and it is assumed (correctly, I think) that an
IPv4 LIE means that IPv4 forwarding is supported.

I really don't like the last part about letting the implementation
decide what to do: some may decide to shut down the adjacency while
others may keep trying -- neither will result in an established
adjacency (not sure if that's the correct name), but the external
behavior is inconsistent.  I would prefer it if something was
recommended.


Suggestion:

OLD>
6735	     /** Indicates whether the interface will support v4 forwarding.

6737	         @note: This MUST be set to true when LIEs from a v4 address are
6738	                sent and MAY be set to true in LIEs on v6 address. If v4
6739	                and v6 LIEs indicate contradicting information the
6740	                behavior is unspecified. */

NEW>

   /** Indicates whether the interface supports IPv4 forwarding.

      @note: v4_forwarding_capable MUST be set to TRUE if IPv4 forwarding is
      supported on the interface, regardless of the source IP address of the
      LIEs.  If the IPv4 and IPv6 LIEs indicate contradicting information then
      the adjacency SHOULD be shut down.


[End]