Re: [Tsv-art] Tsvart early review of draft-ietf-lsvr-l3dl-03

Randy Bush <> Tue, 05 May 2020 22:45 UTC

Date: Tue, 05 May 2020 15:45:09 -0700
From: Randy Bush <>
To: Joerg Ott via Datatracker <>

wow!  thanks!  great review.

unfortunately the wg has gone dormant, so we have let the l3dl drafts
expire.  should it ever wake up, i will happily merge in your excellent

again, thank you!


> Reviewer: Joerg Ott
> Review result: Ready with Issues
> The draft describes a peer/neighbour discovery mechanism for large-scale L2/L3
> topologies in data centres. The aim is to provide a protocol by means of which
> the involved nodes can learn about other nodes connected to their (broadcast or
> point-to-point) L2 links and about their respective supported encapsulation
> schemes, identifiers, L2/L3 addresses, etc. This information is then provided
> to a higher layer for further processing.
> The document is well written and fairly easy to follow, but could benefit from
> a bit of extra context and target application domain in the introduction. E.g.,
> explaining explicitly who would talk L3DL to whom.
> From a transport perspective, I see three potential issues that deserve
> clarification or reconsideration:
> 1. Section 10 spells out a default HELLO interval of 60 seconds. With a large
> broadcast domain, this may create quite a bit of traffic. While this may not be
> an issue in well-provisioned data center networks,  a remark about sensible
> value ranges and the implications may be worthwhile. Just to provide some
> guidelines to implementers (who want to offer choices) and operators (who pick
> them).
> 2. Section 10 also suggests that in response to HELLO messages nodes will issue
> OPEN PDUs to newly discovered peers. This appears to bear the clear risk of an
> OPEN implosion when many systems come up at the same time. Shouldn't guidance be
> given to avoid repeated traffic surges and possible losses and thus unnecessary
> delays? (I noted that other places foresee exponential backoff when
> retransmitting OPEN and other ACKed PDUs).
> 3. When the protocol applies fragmentation, should there be a note on
> preventing bursts?
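
The arithmetic behind item 1 and the kind of spreading item 2 asks about can be sketched as follows. This is a minimal illustration under stated assumptions, not anything the draft specifies: the function names, the 1 second base, the 64 second cap and the "full jitter" rule are all hypothetical choices. Only the 60 second default HELLO interval comes from the review text above.

```python
import random
from typing import Optional


def hello_rate(nodes: int, interval_s: float = 60.0) -> float:
    """Aggregate HELLO PDUs per second seen on one broadcast domain,
    assuming every node sends exactly one HELLO per interval."""
    return nodes / interval_s


def open_delay(attempt: int,
               base_s: float = 1.0,      # assumed base, not from the draft
               cap_s: float = 64.0,      # assumed cap, not from the draft
               rng: Optional[random.Random] = None) -> float:
    """Randomised wait before sending the attempt-th OPEN (0 = first).

    Exponential backoff with full jitter: the delay is drawn uniformly
    from [0, min(cap_s, base_s * 2**attempt)], so nodes that boot at
    the same moment do not all send their OPENs in the same instant.
    """
    rng = rng or random.Random()
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


if __name__ == "__main__":
    # 6000 nodes at the default 60 s interval: 100 HELLOs per second.
    print(hello_rate(6000))
    r = random.Random(1)
    print([round(open_delay(a, rng=r), 2) for a in range(5)])
```

Jittering even the first OPEN, rather than only retransmissions, is the design choice that addresses the simultaneous start-up case; the trade-off is a slightly longer time to the first session.
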
> Other notes:
> Section 7 on the checksum needs more detail. It also talks about a "suggested"
> algorithm, but this should either be clearly mandated or a way to choose one by
> means of configuration for a complete data centre would need to be made
> explicit. I also assume that the pseudo code on p.11 would benefit from a
> leading '0' in 0xffffffff -> 0x0ffffffff, otherwise expansion to 64 bits might
> fill the high order bits with '1's, which is clearly not intended.
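
The concern about the mask constant can be illustrated as follows: if a 32-bit all-ones value were widened to 64 bits with sign extension, the mask would become all ones and would no longer clear the high-order bits. The sketch below simulates that by hand in Python. The helper names and the sample accumulator value are hypothetical, and this is not the draft's pseudo code; whether such widening actually happens depends on the implementation language and on the type it gives the constant.

```python
MASK32 = 0xFFFFFFFF            # intended mask: low 32 bits only
MASK64 = 0xFFFFFFFFFFFFFFFF


def sign_extend_32_to_64(value: int) -> int:
    """Widen a 32-bit pattern to 64 bits the way a signed type would:
    copy bit 31 into bits 32..63."""
    value &= MASK32
    if value & 0x80000000:
        return value | 0xFFFFFFFF00000000
    return value


def fold_low32(acc: int, mask: int) -> int:
    """Apply a mask to a 64-bit accumulator (hypothetical helper)."""
    return (acc & MASK64) & mask


acc = 0x123456789ABCDEF0       # sample 64-bit accumulator, made up

# With a zero-extended mask the high half is cleared as intended.
good = fold_low32(acc, MASK32)

# With a sign-extended mask nothing is cleared at all.
bad = fold_low32(acc, sign_extend_32_to_64(0xFFFFFFFF))

print(hex(good), hex(bad))
```
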
> Section 11, p.17, second to last para ("If a properly authenticated...").  From
> the text, it is unclear what is meant by an "OPEN with the Serial Number of the
> last data received".
> I am curious about the error code, providing 16 bits for additional
> explanation. Why not a text field? Also wondering if repeated retries (due to
> failure, not lost packets) could yield fast repeated transmissions.
> Section 15, should the KEEPALIVE interval have suggested (lower) bounds?
> At the top of p.26, it says "One per second is the default", the previous page
> at the bottom refers to the inter-KEEPALIVE interval of ten seconds. Not sure
> if the two are the same, I suppose so. If they are, the numbers should match.
> If they are not, we'll need some extra text to explain the difference.
> Nits:
> There are two spellings of "Encapsulation", capitalised and lower case. Use one
> consistently. p10, first para: comprise -> comprising