[Rift] AD Review draft-ietf-rift-rift-16 (§4.2.3 thru §4.2.3.2)

Alvaro Retana <aretana.ietf@gmail.com> Mon, 13 February 2023 13:00 UTC

Return-Path: <aretana.ietf@gmail.com>
X-Original-To: rift@ietfa.amsl.com
Delivered-To: rift@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 82C86C16951E; Mon, 13 Feb 2023 05:00:07 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.097
X-Spam-Level:
X-Spam-Status: No, score=-2.097 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_BLOCKED=0.001, RCVD_IN_ZEN_BLOCKED_OPENDNS=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([50.223.129.194]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Mh4ZogfIWYrj; Mon, 13 Feb 2023 05:00:06 -0800 (PST)
Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 88FA6C16951B; Mon, 13 Feb 2023 05:00:06 -0800 (PST)
Received: by mail-pl1-x62f.google.com with SMTP id i18so4978902pli.3; Mon, 13 Feb 2023 05:00:06 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:from:from:to:cc:subject:date:message-id:reply-to; bh=RAyXW+ea8bJTdAJAjodpyaBokJL4bFMaanShGyJGRWo=; b=ZhmcQB6R0db9cZHXfHXZ4V3cIoR77KjTMVLrmyt36pf5I0bwimt/ESUCjj5tggCTA/ EtDfrgEukufn9Ftgk7LbFNSNL1GLT0g+ZFK5tuzWCotjnKJ5KWaWghse8kAljp9Iqf1a z9Kf3u6XcEZWcSYPtT3MaocaW4ynvBHYFxB2Y0Ep1pqsQZ/Vy75U8Z+mO0sGKBF9EEE7 t8Zxr/ad95JqUsQZ92vG/eS5SQdBlCxqosjxouc7I2+5DyrAwD/e0+/+sKWm4Z7RdYEz 9mifSCW5/J1VdyTRFQUqcwhjNodjBVB1yCDRQW2rZUOtYLULfCyTyaYTI0glu0NY0AxP A41w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:cc:to:subject:message-id:date :mime-version:from:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RAyXW+ea8bJTdAJAjodpyaBokJL4bFMaanShGyJGRWo=; b=xU2plIIUG9R1bZL3T+iuYsMCttYXNocUcx8EODlHnSXueorWv+EGYB0TIuN61YJcBI blDhVRHLe+eRqMAx4hxbS6L3YqN0NbBxjdzrYgn5w+ZBwAn/4uCRuhxHpyb2PdfguDgs G5WzFWc9R87+3rvnigwWhxwDZobfpV/OsgjyGUtbmYXfWtppG+W5yHF1jl8oUZcfWu3D o1a1M7DcElaiXzCExE8I8qgz3Fnje/SY3c0t+dVRuV4SYg90GK//fZL8gYfEJyImq7Cy 868vd4nmqJxe+iAEf3yKr0Ji9OoXWmYwRdAlbRnb29krKFWGjC2sx6UP+6POxzKqU/7V lMjQ==
X-Gm-Message-State: AO0yUKUMlHGTTziMaeYWi1/QajRpLCmpo/EtO5hFmNLrb2reY6YMRxNa /66IQ4wgUZdlAJNLXpYNEOusIBlrZmTgTBmYze/f/98R
X-Google-Smtp-Source: AK7set/dRql3JYdZh+OfSk7fgWHfdgSPNWd4zgUB4vMS2ApG6shYSFePG+RpEXYU0LpX9NcGBQiJS9PteSVvmPfGwyw=
X-Received: by 2002:a17:90a:1d5:b0:233:b56f:a6c1 with SMTP id 21-20020a17090a01d500b00233b56fa6c1mr2066053pjd.62.1676293205773; Mon, 13 Feb 2023 05:00:05 -0800 (PST)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Mon, 13 Feb 2023 07:00:04 -0600
From: Alvaro Retana <aretana.ietf@gmail.com>
MIME-Version: 1.0
Date: Mon, 13 Feb 2023 07:00:04 -0600
Message-ID: <CAMMESsyNYGGz10G=u6jwdM+1hmJMmnpd4W_75iyN4ZwNLQRPiw@mail.gmail.com>
To: Jordan Head <jhead@juniper.net>
Cc: "draft-ietf-rift-rift@ietf.org" <draft-ietf-rift-rift@ietf.org>, "EXT-zhang.zheng@zte.com.cn" <zhang.zheng@zte.com.cn>, rift-chairs@ietf.org, "rift@ietf.org" <rift@ietf.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://mailarchive.ietf.org/arch/msg/rift/ZG8yIWPSFML9kbdguV4abn_eNOQ>
Subject: [Rift] AD Review draft-ietf-rift-rift-16 (§4.2.3 thru §4.2.3.2)
X-BeenThere: rift@ietf.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: Discussion of Routing in Fat Trees <rift.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/rift>, <mailto:rift-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/rift/>
List-Post: <mailto:rift@ietf.org>
List-Help: <mailto:rift-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/rift>, <mailto:rift-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Feb 2023 13:00:07 -0000

Hi!

This message covers §4.2.3 to the end of §4.2.3.2.

Alvaro.



2207	4.2.3.  Topology Exchange (TIE Exchange)

2209	4.2.3.1.  Topology Information Elements
...
2214	   The TIE exchange mechanism uses the port indicated by each node in
2215	   the LIE exchange as `flood_port` in `LIEPacket` and the interface on
2216	   which the adjacency has been formed as destination.  TIEs MUST be
2217	   sent with an IPv4 Time to Live (TTL) or an IPv6 Hop Limit (HL) of
2218	   either 1 or 255 and also MUST be ignored if received with values
2219	   different than 1 or 255.  This prevents RIFT information from
2220	   reaching beyond a single L3 next-hop in the topology.  TIEs SHOULD be
2221	   sent with network control precedence unless an implementation is
2222	   prevented from doing so [RFC2474].

[major] "This prevents RIFT information from reaching beyond a single
L3 next-hop in the topology."

This is not completely true for TTL = 1.  It is true that the TIE
cannot be forwarded beyond the local subnet.  But it is not true that
a TIE cannot be forwarded *into* the local subnet.  This is one of the
attack vectors that are left open -- these are the same concerns as
with the LIEs.



2224	   TIEs contain sequence numbers, lifetimes and a type.  Each type has
2225	   ample identifying number space and information is spread across
2226	   possibly many TIEs of a certain type by the means of a hash function
2227	   that an implementation can individually determine.  One extreme
2228	   design choice is a prefix per TIE which leads to more BGP-like
2229	   behavior where small increments are only advertised on route changes
2230	   vs. deploying with dense prefix packing into few TIEs leading to more
2231	   traditional IGP trade-off with fewer TIEs.  An implementation may
2232	   even rehash prefix to TIE mapping at any time at the cost of
2233	   significant amount of re-advertisements of TIEs.

[major] "TIEs contain sequence numbers, lifetimes and a type."

There is one TIE, but multiple types of TIEElements, right?  And each
TIE carries only one TIEElement at a time, right?

If so, then "information is spread across possibly many TIEs of a
certain type" should say "across multiple TIEElements", or maybe
"across multiple TIEs with the same TIEElement type", or something
like that.


[major] "...information is spread across possibly many TIEs of a
certain type by the means of a hash function that an implementation
can individually determine."

Does the receiver need to know the hash function?  I'm hoping the answer is No.

The rest of this paragraph only talks about prefixes.  Even if
possibly less common, I assume that the ability to spread information
applies to all TIEElement Types.  Is that true?  If so, please be
explicit -- and maybe keep the text about prefixes as an example.


[major] Please remove comparisons or references to other protocols:
"BGP-like behavior...traditional IGP".



...
2238	4.2.3.2.  Southbound and Northbound TIE Representation
...
2248	   The North TIEs hold all of the node's adjacencies and local prefixes
2249	   while the South TIEs hold only all of the node's adjacencies, the
2250	   default prefix with necessary disaggregated prefixes and local
2251	   prefixes.  Section 4.2.5 explains further details.

[minor] I know I should go look at §4.2.5, but the description of
doesn't seem correct:

"North TIEs hold all of the node's adjacencies and local prefixes"

"South TIEs hold only" -- "only" seems to indicate less information --
"all of the node's adjacencies, the default prefix with necessary
disaggregated prefixes and local prefixes".   So the South TIEs have
more information.

Maybe it's a matter of eliminating "only".



2253	   The TIE types are mostly symmetric in both directions and Table 3
2254	   provides a quick reference to main TIE types including direction and
2255	   their function.  The direction itself is carried in `direction` of
2256	   `TIEID` schema element.

[major] The TIE types (the table says "TIE-Types", please be
consistent) in the table are not the same as the types defined in the
schema:

/** Type of TIE.
*/
enum TIETypeType {
    Illegal                                     = 0,
    TIETypeMinValue                             = 1,
    /** first legal value */
    NodeTIEType                                 = 2,
    PrefixTIEType                               = 3,
    PositiveDisaggregationPrefixTIEType         = 4,
    NegativeDisaggregationPrefixTIEType         = 5,
    PGPrefixTIEType                             = 6,
    KeyValueTIEType                             = 7,
    ExternalPrefixTIEType                       = 8,
    PositiveExternalDisaggregationPrefixTIEType = 9,
    TIETypeMaxValue                             = 10,
}

See my comment above about differentiating between the types of TIEs
(shown in the table) and the types of TIEElements (defined in the
schema).  As is, the terminology is confusing because TIE types seem
to have two different meanings. (The last sentence in this section
also uses the same confusing terminology.)

A simplistic interpretation indicates that a TIE type is the same as
the corresponding TIEElement type, but with a direction.



2258	       +=========================+=================================+
2259	       | TIE-Type                | Content                         |
2260	       +=========================+=================================+
2261	       | Node North TIE          | node properties and adjacencies |
2262	       +-------------------------+---------------------------------+
2263	       | Node South TIE          | same content as node North TIE  |
2264	       +-------------------------+---------------------------------+
2265	       | Prefix North TIE        | contains nodes' directly        |
2266	       |                         | reachable prefixes              |
2267	       +-------------------------+---------------------------------+

[nit] s/nodes'/node's


2268	       | Prefix South TIE        | contains originated defaults    |
2269	       |                         | and directly reachable prefixes |
2270	       +-------------------------+---------------------------------+

[minor] "directly reachable prefixes"

The text above, and a definition in the schema, use "local prefixes"
instead.  Please be consistent.


2271	       | Positive Disaggregation | contains disaggregated prefixes |
2272	       | South TIE               |                                 |
2273	       +-------------------------+---------------------------------+
2274	       | Negative Disaggregation | contains special, negatively    |
2275	       | South TIE               | disaggregated prefixes to       |
2276	       |                         | support multi-plane designs     |
2277	       +-------------------------+---------------------------------+
2278	       | External Prefix North   | contains external prefixes      |
2279	       | TIE                     |                                 |
2280	       +-------------------------+---------------------------------+
2281	       | Key-Value North TIE     | contains nodes northbound KVs   |
2282	       +-------------------------+---------------------------------+
2283	       | Key-Value South TIE     | contains nodes southbound KVs   |
2284	       +-------------------------+---------------------------------+

2286	                             Table 3: TIE Types

[major] The table doesn't include all the TIEElement types + direction.



2288	   As an example illustrating a databases holding both representations,
2289	   the topology in Figure 2 with the optional link between spine 111 and
2290	   spine 112 (so that the flooding on an East-West link can be shown) is
2291	   considered.  Unnumbered interfaces are implicitly assumed and for
2292	   simplicity, the key value elements which may be included in their
2293	   South TIEs or North TIEs are not shown.  First, in Figure 15 are the
2294	   TIEs generated by some nodes.

2296	          ToF 21 South TIEs:
2297	          Node South TIE:
2298	            NodeElement(level=2, neighbors((Spine 111, level 1, cost 1),
2299	            (Spine 112, level 1, cost 1), (Spine 121, level 1, cost 1),
2300	            (Spine 122, level 1, cost 1)))
2301	          Prefix South TIE:
2302	            SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1))

[major] s/NodeElement/NodeTIEElement/g
In order to match the schema.


[major] s/SouthPrefixesElement/PrefixTIEElement/g



[major] For the prefixes, "cost" is used.  However, the schema uses
"metric" in the PrefixAttributes structure:

/** Distance of the prefix. */
2: required common.MetricType            metric
        = common.default_distance;

Note that NodeNeighborsTIEElement uses different language to refer to
the same thing:

/**  Cost to neighbor. Ignore anything larger than `infinite_distance`
and `invalid_distance` */
3: optional common.MetricType               cost
            = common.default_distance;


The terminology section defines cost, but not metric.  Please be consistent.

The definitions are:

   Cost:  The term signifies the weighted distance between two neighbors.

   Distance:  Sum of costs (bound by infinite distance) between two nodes.


There is clearly some type of circular definition between the cost
("weighted distance") and the distance ("sum of costs").  ??

Also, note that "weighted" is only used in this definition -- except
when referring to weighted ECMP or UCMP.



2304	          Spine 111 South TIEs:
2305	          Node South TIE:

2307	            NodeElement(level=1, neighbors((ToF 21, level 2, cost 1,
2308	                        links(...)),
2309	            (ToF 22, level 2, cost 1, links(...)),
2310	            (Spine 112, level 1, cost 1, links(...)),
2311	            (Leaf111, level 0, cost 1, links(...)),
2312	            (Leaf112, level 0, cost 1, links(...))))

[minor] Why aren't the links included in the ToF 21 TIEs?  I know that
they are optional -- it just looks inconsistent.


2313	          Prefix South TIE:
2314	            SouthPrefixesElement(prefixes(0/0, cost 1), (::/0, cost 1))

2316	          Spine 111 North TIEs:
2317	          Node North TIE:
2318	            NodeElement(level=1,
2319	            neighbors((ToF 21, level 2, cost 1, links(...)),
2320	            (ToF 22, level 2, cost 1, links(...)),
2321	            (Spine 112, level 1, cost 1, links(...)),
2322	            (Leaf111, level 0, cost 1, links(...)),
2323	            (Leaf112, level 0, cost 1, links(...))))
2324	          Prefix North TIE:
2325	            NorthPrefixesElement(prefixes(Spine 111.loopback)

[major] s/NorthPrefixesElement/PrefixTIEElement/g

2327	          Spine 121 South TIEs:
2328	          Node South TIE:
2329	            NodeElement(level=1, neighbors((ToF 21,level 2,cost 1),
2330	            (ToF 22, level 2, cost 1), (Leaf121, level 0, cost 1),
2331	            (Leaf122, level 0, cost 1)))

[minor] The links are not shown here...



...
2362	   For node TIEs to carry more adjacencies than fit into an MTU, the
2363	   element `neighbors` may contain different set of neighbors in each
2364	   TIE.  Those disjoint sets of neighbors MUST be joined during
2365	   corresponding computation.  Nevertheless, in case across multiple
2366	   node TIEs

[nit] "node TIEs"

Some places capitalize "Node TIEs" -- please be consistent.


[nit] s/fit into an MTU/fit into an MTU-sized packet


[nit] s/contain different set/contain a different set


[major] "MUST be joined during corresponding computation"

How?  Given that this is a Normative requirement, you should indicate
how.  I get that this is a simple operation -- nonetheless, it should
be specified.  See more below.

Which "corresponding computations"?  I'm assuming that joining
(considering multiple sets of neighbors as one) should be done for
all.  IOW, are there specific cases where it is (not) needed?


2368	   1.  `capabilities` do not match *or*

2370	   2.  `flags` values do not match *or*

2372	   3.  same neighbor repeats in multiple TIEs with different values

2374	   the behavior is undefined and a warning SHOULD be generated after a
2375	   period of time.

[major] "behavior is undefined and a warning SHOULD be generated"

So, when joining (required above), what should a node do if a specific
"neighbor repeats in multiple TIEs with different values".  I know
you're saying that the behavior is undefined, but what happens to that
neighbor?  Is it still considered "during corresponding computation"?
Which version?

It seems to me that generating a warning implies that the node keeps
working -- I just don't understand which state it uses going forward.

The same concerns apply to `capabilities` and `flags`.


[major] "a warning SHOULD be generated after a period of time"

How long is that period?  Is there a timer somewhere?



2377	   The element `miscabled_links` SHOULD be repeated in every node TIE,
2378	   otherwise the behavior is undefined.

[major] "SHOULD be repeated"

When is it ok to not repeat it?  It seems like the answer is "never",
otherwise the behavior is undefined.  Why is this behavior recommended
and not required?

I see, from the schema, that `miscabled_links` is optional, which
means that it would be used only if any links are miscabled.  I can't
find a place that talks about what is considered a miscabled link (I
seem to remember having this discussion before) -- the only other
mention of "miscabled" is in §4.2.7.3, so (even if there's no explicit
mention) it looks like it (or perhaps the whole ZTP section) might be
a good enough reference.

Suggestion>
   If the fabric is miscabled (Secion xxx), the element `miscabled_links`
   MUST be repeated in every Node TIE.  If it isn't, the behavior is
   undefined.

BTW, which behavior is undefined?  I can't find a place that talks
about the effect of miscabling.

Also, no warning is needed in this case?



2380	   A top of fabric node MUST include in the node TIEs in
2381	   `same_plane_tofs` element all the other ToFs it is aware of through
2382	   reflection.  To prevent MTU overrun problems, multiple node TIEs can
2383	   carry disjoint sets of ToFs which can be joined to form a single set.
2384	   This element allows nodes in other planes that are on the multi-plane
2385	   ring with this node to have information describing the complete plane
2386	   and with that all ToFs in a multi-plane fabric are aware of all other
2387	   ToFs which can be used further to form input to complex multi-plane
2388	   elections.

[major] "A top of fabric node MUST include...in `same_plane_tofs`
element all the other ToFs..."

The requirement seems to indicate that "all the other ToFs" have to be
included in the element, but that is not strictly true because the set
can be divided into subsets...

Suggestion>
   A ToF none MUST include all the other ToFs it is aware of
   through reflection.  The `same_plane_tofs` element is used
   to carry this information.


[major] "can be joined to form a single set"

This text is not a requirement (as was the text above about joining
neighbor sets).  Should it be?  Why would it be different?


[nit] "...can be used further to form input to complex multi-plane elections."

It would be nice to reference where that is specified.



2390	   Different TIE types are carried in `TIEElement`.  Schema enum
2391	   `common.TIETypeType` in `TIEID` indicates which elements MUST be
2392	   present in the `TIEElement`. In case of mismatch the unexpected
2393	   elements MUST be ignored.  In case of lack of expected element in the
2394	   TIE an error MUST be reported and the TIE MUST be ignored.  The
2395	   element `positive_disaggregation_prefixes` and
2396	   `positive_external_disaggregation_prefixes` MUST be advertised
2397	   southbound only and ignored in North TIEs.  The element
2398	   `negative_disaggregation_prefixes` MUST be aggregated and propagated
2399	   according to Section 4.2.5.2 southwards towards lower levels to heal
2400	   pathological upper level partitioning, otherwise traffic loss may
2401	   occur in multiplane fabrics.  It MUST NOT be advertised within a
2402	   North TIE and ignored otherwise.

[minor] "Different TIE types are carried in `TIEElement`."

See my comment earlier about the types of TIEs and how the terminology
is confusing.  The last paragraph in the next section also mentions
"TIE types".


[major] "Schema enum `common.TIETypeType` in `TIEID` indicates which
elements MUST be present in the `TIEElement`."

No -- TIETypeType is just an enum, it doesn't indicate any required elements.


[major] "In case of mismatch...In case of lack of expected element..."

The schema says that all the parts of TIEElement are optional:

/** Single element in a TIE. */
union TIEElement {
    /** Used in case of enum common.TIETypeType.NodeTIEType. */
    1: optional NodeTIEElement     node;
    /** Used in case of enum common.TIETypeType.PrefixTIEType. */
    2: optional PrefixTIEElement          prefixes;
    /** Positive prefixes (always southbound). */
    3: optional PrefixTIEElement   positive_disaggregation_prefixes;
    /** Transitive, negative prefixes (always southbound) */
    5: optional PrefixTIEElement   negative_disaggregation_prefixes;
    /** Externally reimported prefixes. */
    6: optional PrefixTIEElement          external_prefixes;
    /** Positive external disaggregated prefixes (always southbound). */
    7: optional PrefixTIEElement
            positive_external_disaggregation_prefixes;
    /** Key-Value store elements. */
    9: optional KeyValueTIEElement keyvalues;
} (python.immutable = "")


I see that each of the parts *TIEElement has required elements in it.
The text above talks about required elements in TIEElement which is
not correct.  They are required elements in the optional components
(*TIEElement) of TIEElement.  Please be precise.


[minor] "unexpected elements MUST be ignored"

Is it just unexpected or also unknown?  I guess unknown are
unexpected, but this sentence talks about a mismatch, indicating that
there are specific yes/no elements (not unknown).


[major] "The element `positive_disaggregation_prefixes` and
`positive_external_disaggregation_prefixes` MUST be advertised
southbound only and ignored in North TIEs."

These elements are optional, so the "MUST" doesn't work.

Suggestion>
   The `positive_disaggregation_prefixes` and
   `positive_external_disaggregation_prefixes` elements MAY be
   advertised in South TIEs and MUST be ignored if advertised
   in North TIEs.


[major] "The element `negative_disaggregation_prefixes` MUST be
aggregated and propagated according to Section 4.2.5.2 southwards
towards lower levels to heal pathological upper level partitioning,
otherwise traffic loss may occur in multiplane fabrics. It MUST NOT be
advertised within a North TIE and ignored otherwise."

- §4.2.5.2 doesn't mention this element (or PrefixTIEElement)
explicitly.  The specification seems to be in §4.2.5.2.2.

- s/aggregated/disaggregated

- "MUST be aggregated and propagated according to Section 4.2.5.2"

§4.2.5.2.2 doesn't require the behavior (MUST), it recommends it
(SHOULD).  Instead of making both sentences use the same word, please
specify the behavior only once.  §4.2.5.2.2 seems like the right
place.

- "...to heal pathological upper level partitioning, otherwise traffic
loss may occur in multiplane fabrics."

If the specification is somewhere else, then this explanation should
be there too.


Suggestion>
   The `negative_disaggregation_prefixes` elements MAY be
   advertised in South TIEs and MUST be ignored if advertised
   in North TIEs.

[EoR - Feb/13]