Re: [trill] Tsvart early review of draft-ietf-trill-over-ip-10

Joe Touch <> Mon, 26 June 2017 17:16 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id 15D4C12EB0C; Mon, 26 Jun 2017 10:16:14 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -6.901
X-Spam-Status: No, score=-6.901 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, RP_MATCHES_RCVD=-0.001] autolearn=ham autolearn_force=no
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id 7L-oEZXtGeLI; Mon, 26 Jun 2017 10:16:12 -0700 (PDT)
Received: from ( []) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by (Postfix) with ESMTPS id 0BDB1129BC4; Mon, 26 Jun 2017 10:16:12 -0700 (PDT)
Received: from [] ( []) (authenticated bits=0) by (8.13.8/8.13.8) with ESMTP id v5QHFGGQ023397 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Mon, 26 Jun 2017 10:15:18 -0700 (PDT)
To: Donald Eastlake <>, Magnus Westerlund <>
Cc:,, IETF Discussion <>, "" <>
References: <> <>
From: Joe Touch <>
Message-ID: <>
Date: Mon, 26 Jun 2017 10:15:15 -0700
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0
MIME-Version: 1.0
In-Reply-To: <>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Content-Language: en-US
X-ISI-4-43-8-MailScanner: Found to be clean
Archived-At: <>
Subject: Re: [trill] Tsvart early review of draft-ietf-trill-over-ip-10
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: "Developing a hybrid router/bridge." <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Mon, 26 Jun 2017 17:16:14 -0000

Hi, Donald,

On 6/25/2017 5:07 PM, Donald Eastlake wrote:
> Hi Magnus,
> Thanks for the extensive review. See my responses below.
> On Thu, Jun 15, 2017 at 1:32 PM, Magnus Westerlund
> <> wrote:
>> Reviewer: Magnus Westerlund
>> Review result: Not Ready
>> Early review of draft-ietf-trill-over-ip-10
>> Reviewer: Magnus Westerlund
>> Review result: Not Ready
>> TSV-ART review comments:
>> I have set this to not ready as there are several issues, some significant that
>> could affect the protocol realization significantly. Some may be me missing
>> things in TRILL, I was not that familiar with it before this review and I have
>> only tried looking up things, not reading the whole earlier specifications. So
>> don't hesitate to push back and provide pointers to things that can resolve
>> issues. The authors and the WG clearly have thought about a lot of issues and
>> dealt with much already.
> OK. Hopefully we can resolve these one way or the other.
> ...
>> TCP Encapsulation issue
>> -----------------------
>> Section 5.6:
>> The TCP encapsulation appear to be missing an delimiter format allowing each
>> individual TRILL packet/payload to be read out of the TCP's byte stream. In
>> other words, a normal implementation has no way of ensuring that the TCP
>> payload starts with the start of a new TRILL payload. Multiple small TRILL
>> payloads may be included in the same TCP payload, and also only parts as TCP is
>> one way of dealing with TRILL packets that are larger than the IP+Encapsulation
>> MTU that actually will work.
>> This comment is based on that there appear to be no length fields included in
>> the TRILL header. The most straight forward delimiter is a 2-byte length field
>> for the TRILL payload to be encapsulated.
> Right. It might also be useful to include some sort of check field, as
> is done in BGP, to detect if you are out of sync in parsing the TCP
> stream.

There is nothing in BGP that ever assumes that TCP write boundaries are
preserved. BGP uses markers and length fields to create message
boundaries in TCP's bytestream. The same is needed here.

Note that BGP also never claims to craft TCP packets by 'encapsulating'
a BGP message in a TCP segment. That part of this document needs to be
removed - it not how TCP is ever used.

> Another point is that, while with UDP it seems fine to send packets
> with assorted QoS, you don't want to encourage re-ordering of TCP
> packets in a stream. So if TCP encapsulation is being used,
Again - please, NO. NEVER use this term.

> you want
> to use the same DSCP value for the packets in a particular TCP stream.
Again, this is nonsensical. TCP would set a DSCP for the connection,
never in different ways for individual segments of a connection.

> So, generally, you need to have a TCP connection per priority handling
> category. Mapping the 8 priority levels into a smaller number of
> handling categories is a normal thing to do so you certainly don't
> necessarily need 8 TCP connections. Adding material on this should not
> be too hard.
Perhaps, but please - again, please - omit any mention or implication
that this occurs via encapsulation.

If you want to use TCP, please use it properly.

>> Section 5.6:
>> TCP endpoint requirements. I do wonder if an application like TRILL actual
>> would need to discuss performance impacting implementation choices or
>> limitations. For example use of NAGLE, the requirements on buffer sizes in
>> relation to Bandwidth delay products, as buffer memory in a RBridge will impact
>> performance.
> Well, I'm not sure how deeply this document should get into such
> performance issues. What about just saying something about
> consideration being given to tuning TCP for performance and pointing
> to one or a few other RFCs that talk about this?
Because your use of TCP (even if changed to describe it correctly) isn't
listed in those TCP RFCs.

And it's not so simple - NAGLE helps performance for interactive systems
that use single-byte messages (e.g., telnet) and reduces the number of
outstanding "less than full" segments. When used for encapsulation,
turning NAGLE off is the right thing for multibyte messages (e.g., TRILL
messages) and can avoid the "gathering" delay (200 ms stalls when there
isn't enough source data - i.e., incoming TRILL packets - to keep up
with the outgoing segments), but could also generate a large number of
small segments (which can interfere with segment-based congestion
control, vs. ABC).

Unless you want a very poorly performing result, *THIS* is what you need
to drill down into.