Re: [trill] RtgDir review of draft-ietf-trill-directory-assist-mechanisms-07.txt

Donald Eastlake <> Fri, 15 April 2016 15:23 UTC

From: Donald Eastlake <>
Date: Fri, 15 Apr 2016 11:23:17 -0400
To: "Joel M. Halpern" <>

Hi Joel

Thanks for your thorough review and comments. See below.

On Wed, Apr 13, 2016 at 4:47 PM, Joel M. Halpern wrote:


> Hello,


> I have been selected as the Routing Directorate reviewer for this

> draft. The Routing Directorate seeks to review all routing or

> routing-related drafts as they pass through IETF last call and IESG

> review, and sometimes on special request. The purpose of the review

> is to provide assistance to the Routing ADs. For more information

> about the Routing Directorate, please see ​



> Although these comments are primarily for the use

> of the Routing ADs, it would be helpful if you could consider them

> along with any other IETF Last Call comments that you receive, and

> strive to resolve them through discussion or by updating the draft.


> Document: draft-ietf-trill-directory-assist-mechanisms-07.txt

> Reviewer: Joel Halpern

> Review Date: 13-April-2016

> IETF LC End Date: N/A

> Intended Status: Proposed Standard


> Summary: I have significant concerns about this document and

> recommend that the Routing ADs discuss these issues further with the

> authors.


>     I do believe that the major issues are easily resolvable.  I

>     have tried to provide my best guess as to text how to resolve

>     each of them.


>     I would like to see the minor issues discussed and preferably

>     addressed.


> Major Issues:

>     In the state machine transitions in section 2.3.3

> for push servers, it appears that if the event indicating that the

> server is being shut down occurs while the server is already Going

> Stand-By or Uncompleting, the transitions indicate that this "going

> down" event will be lost.  A strict reading of this would seem to

> mean that the "go Down" event would need to recur after the timeout

> condition.  This would seem to be best addressed by a new state

> "Going-Down" whose timeout behavior is to move to down state.

I understand your point but "going down" and the like are called

"events or conditions" in this draft, not just events.

The problem with adding a single "Going-Down" state is that transition

to that state would lose the information as to whether or not the Push

Directory had been advertising that it was pushing complete

information or not. The reason to remember this is that you would want

to behave differently if the "going down" condition was revoked

before it completed.  This information could be preserved in a Boolean

pseudo variable but the current style of state machine in this draft

avoids such pseudo variables and encodes all of the relevant push

directory's state into the state machine state. Thus, I can see three

possible responses to your comment:

1) Change wording to emphasize that these "events or conditions" can

be conditions that cause a state transition some substantial time

after they become true.

2) Add two new states: (1) going down - was complete; (2) going down -

was incomplete.

3) Change the style of state machine to admit pseudo variables which

can be set and tested as part of the state machinery.

Option 1 is just some minor wording changes but adopting either

option 2 or 3 involves more extensive changes, so I would prefer to

avoid them.
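
To make option 2 concrete, here is a minimal sketch of how two separate
going-down states keep the "was complete" information in the state itself,
with no pseudo variable. The state and event names are hypothetical and
are not taken from the draft, and the assumption that a revoked "going
down" condition returns the server to the state it left is mine.

```python
from enum import Enum, auto


class PushState(Enum):
    # Hypothetical names; only the split into two going-down states
    # reflects option 2 above.
    DOWN = auto()
    ACTIVE_INCOMPLETE = auto()
    ACTIVE_COMPLETE = auto()
    GOING_DOWN_WAS_INCOMPLETE = auto()
    GOING_DOWN_WAS_COMPLETE = auto()


def on_going_down(state):
    """The 'going down' condition becomes true."""
    if state is PushState.ACTIVE_COMPLETE:
        return PushState.GOING_DOWN_WAS_COMPLETE
    if state is PushState.ACTIVE_INCOMPLETE:
        return PushState.GOING_DOWN_WAS_INCOMPLETE
    return state


def on_going_down_revoked(state):
    """The 'going down' condition is revoked before it completes.

    Assumption: the server resumes the state it was in before.
    """
    if state is PushState.GOING_DOWN_WAS_COMPLETE:
        return PushState.ACTIVE_COMPLETE
    if state is PushState.GOING_DOWN_WAS_INCOMPLETE:
        return PushState.ACTIVE_INCOMPLETE
    return state


def on_timeout(state):
    """Timeout in either going-down state moves the server to Down."""
    if state in (PushState.GOING_DOWN_WAS_COMPLETE,
                 PushState.GOING_DOWN_WAS_INCOMPLETE):
        return PushState.DOWN
    return state
```

With a single "Going-Down" state, on_going_down_revoked could not tell
which of the two active states to return to.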

> In section 2.3.2, the descriptions for event 3 and 5 are identical.

> I believe from the state transitions that condition 3 is supposed to

> reflect the server NOT having complete data when the Activate

> condition is met.

You are correct. Thanks for spotting this somewhat glaring error in

the event descriptions.

> In section 3.2.1 there is provision for using a received frame as a

> Query.  There are type indications as to what the type of the frame

> is.  I believe that the intent is that the query always contains the

> full received Ethernet Frame, no matter what the type is.  But it

> does not say that.  So one could also conclude that for ARP, what I

> should send is the ARP message, and for ND, the ND message, etc.  I

> believe the text needs to be clarified.  If my guess is correct that

> the full Ethernet Frame is to be sent in all cases, then explanatory

> text as to why the various type codes exist would seem helpful,

> since the received frame contains enough information to support

> decoding.

Good point that this needs to say that the full Ethernet Frame (less

the FCS) is to be included. QTYPEs 2, 3, and 4 for ARP, ND, and RARP

could be combined. QTYPE 5 for an unknown unicast destination MAC

address is really a different service.

> Minor Issues:

>     In section 2.3.3 describing the state transitions for push

> servers, there is an event (event 1) described as "the server was

> Down but is now Up."  The state transition diagram describes this as

> being a valid event that does not change the server's state if the

> server is in any state other than "Down." In one sense, this is

> reasonable, saying that such an event is harmless.  I would however

> expect some sort of logging or administrative notification, as

> something in the system is quite confused.

Again, I see your point but it seems to me to be a matter of state

machine style. Note that the "event" is described as a condition, so

from that point of view, it is true anytime the state is other than

Down. On the other hand, if you view it as strictly an event, you are

left with the question of what to put at the intersection of a state

and event in the table when it is impossible for that event to occur

in that state. Some people note this with an "N/A" (not applicable)

entry. In fact, previous TRILL state diagrams such as in RFC 7177 use

"N/A" so it would probably be simplest to change to that for


>     Should section 2.4 include a note that indicates that reliance

> on information completeness does mean that there are windows when

> new entities join the space represented by particular TRILL data

> label during which packets for that destination may be dropped, due

> to clients not yet having received the updated information?  I

> believe this window is small, and it is quite reasonable to also

> note that in such text.

Yes, something like that would be a good addition to Section 2.4. It

depends a bit on how the Push Directory is being managed. It may be

pushing data provided by orchestration. In any case, there are always

finite delays and for a particular ingress TRILL switch and an end

station being connected to some other TRILL switch, the ingress could

learn reachability for the end station either a bit before or after it

is actually reachable so traffic intended for the end station could,

for example, be dropped during a brief window of time when it should

be forwarded.

>     Text in section on lifetimes and the information

> maintenance in section 3.3 imply that the clients and servers must

> maintain a connection.  Presumably, this is required already by the

> RBridge Channel protocol, and I understand that we should not repeat

> the entire protocol here.  It would seem to make the reader's life MUCH

> simpler if the text noted that the RBridge Channel protocol requires

> that there be a maintained connection between the client and the

> server, and that these mechanisms leverage the presence of that

> connection.

The basic RBridge Channel protocol [RFC7178] is a datagram protocol

rather than a connection protocol. So there is no guaranteed

continuity of connection between RBridges that have previously

exchanged RBridge Channel messages. But connection would only be lost

if the network partitions since RBridge Channel messages look like

data packets to any transit RBridges and will get forwarded as long as

there is any route.  Network partition is immediately visible in the

link state database to the RBridges at both ends of an RBridge Channel

exchange.  Section 3.7 provides that if a Pull Directory is no longer

reachable (i.e., RBridge Channel protocol packets would no longer get

through), then all pull responses from that Pull Directory MUST be

discarded since cache consistency update messages can't get through.

Perhaps a reference to Section 3.7 should be added to Section 3.3.
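
As a rough illustration of the Section 3.7 behavior described above, a
client might key its cached pull responses by the directory that supplied
them and drop a directory's entries once the link state database shows
that directory as unreachable. The class and method names below are
hypothetical, and representing reachability as a plain set is an
assumption made for the sketch.

```python
class PullResponseCache:
    """Client-side cache of pull responses, grouped by source directory."""

    def __init__(self):
        # directory identifier -> {query key: cached response}
        self._by_directory = {}

    def store(self, directory, key, response):
        self._by_directory.setdefault(directory, {})[key] = response

    def lookup(self, key):
        for responses in self._by_directory.values():
            if key in responses:
                return responses[key]
        return None

    def on_link_state_update(self, reachable_directories):
        # Discard every response that came from a Pull Directory that is
        # no longer reachable, since its cache consistency update
        # messages could not get through either.
        for directory in list(self._by_directory):
            if directory not in reachable_directories:
                del self._by_directory[directory]
```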

>     In section on Pull directory forwarding, I expect to see

> text about and to whom the Pull server will flood the received

> request.  Instead, the text appears to say that it is the response

> that will be flooded.  More importantly, the descriptive text talks

> about sending the response, which would normally be a description of

> sending the response to the requestor, not sending it to someone

> else.

If an ingress RBridge receives one of these broadcast/multicast

requests (ARP, etc.) and wants it flooded, they just do the normal

encapsulation as a TRILL data packet and it will be flooded (within

the VLAN or FGL). It is only if the ingress RBridge wants the directory

to answer the request that it would package it into an RBridge Channel

message and unicast it to a directory.

As to whom a response synthesized by the directory should be sent, it

seems like the safest thing is to send it as the response would

normally be sent by an end station responder. So, if an ARP response

would be broadcast (within a VLAN or the like), then I don't see what

is wrong about the directory flooding it.

>     In a related confusion, it seems very strange that a "flood"

> request will result in sending an underlying packet unicast to the

> destination.  This may be just terminology, but it seems likely to

> confuse implementers.  Maybe the flag should be called the Forward

> flag, with a note in the definition that it normally causes the

> response to be sent to multiple parties, but in the case of a raw

> MAC frame, results in the packet being forwarded to the destination

> or flooded, as the server can manage?

I don't have any particular problem with changing the FL flood flag to

an FR forward flag or whatever, but I'm not sure what difference it

would make to behavior. If, when the directory is unable to answer the

query, you want the query (ARP request or whatever) to be sent pretty

much as if it had never been packaged into

an RBridge Channel message and sent to the directory, then you set

this flag. When the flag is set and the directory can't answer the

query, it sends it as a TRILL Data packet. If the original frame was

broadcast or multicast, as is usually the case, then the directory

server "floods" it (within the frame's VLAN or FGL). The same is true if

the request was of the "unknown destination unicast MAC" type -- if

the directory does not know the edge RBridge from which that MAC is

reachable and the FL flag is set, it floods it even though in this

case the Inner.MacDA is a unicast address.

In the unknown destination MAC case where the directory does know the

reachability of that MAC, then the frame is decapsulated from the

RBridge Channel and then encapsulated as a TRILL Data packet and sent

to the edge RBridge from which that MAC is reachable. In this case,

the FL flag has no effect since it only comes into play if the

directory does not have the required information.
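
The cases above could be summarized roughly as follows. This is only a
sketch of my reading of the behavior, not text from the draft: the
function name, parameters, and returned strings are all hypothetical, and
the "no forwarding" result for a clear FL flag is an assumption, since
that case is not spelled out above.

```python
def directory_action(unknown_unicast, directory_has_info, fl_flag,
                     edge_rbridge=None):
    """Return what the directory does with a query carrying a frame.

    unknown_unicast    -- True for the "unknown destination unicast MAC"
                          query type, False for ARP/ND/RARP style queries
    directory_has_info -- True if the directory holds the needed data
    fl_flag            -- value of the FL (flood) flag in the query
    edge_rbridge       -- edge RBridge from which the MAC is reachable
    """
    if directory_has_info:
        if unknown_unicast:
            # Decapsulate from the RBridge Channel message, re-encapsulate
            # as a TRILL Data packet, and unicast toward the edge RBridge.
            # The FL flag has no effect here.
            return "unicast TRILL Data to " + str(edge_rbridge)
        return "answer the query"
    if fl_flag:
        # Flooded within the frame's VLAN or FGL, even when the
        # Inner.MacDA is a unicast address.
        return "flood as TRILL Data"
    # Assumption: with FL clear, the frame is not forwarded.
    return "no forwarding"
```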

I’m fine with adding some clarifying text on all this.

>     In the description in section 3.3 of Cache management, in the

> text on method one in which the servers keep minimal state, it would

> seem that a large health warning is needed, as this method will

> cause all clients to discard all positive data whenever any positive

> data at the server changes (even if no client is using the modified

> data.)  This makes a flapping end station an attack on the cache of

> all clients!

That would be true if the directory data comes from data plane

learning but not so much if it is from an orchestration system in a

Data Center. Adding some additional efficiency warning(s) is


>     It strikes me that the working group could help get robust

> deployment by making method 3 (tracking what you told clients) a

> SHOULD.  (I grant that it is not a MUST, as the other choices do

> work.)

That sounds reasonable. We can check with the WG.




 Donald E. Eastlake 3rd   +1-508-333-2270 (cell)

 155 Beaver Street, Milford, MA 01757 USA