Re: [Lsvr] Tsvart early review of draft-ietf-lsvr-l3dl-03

Randy Bush <randy@psg.com> Tue, 05 May 2020 22:45 UTC

Date: Tue, 05 May 2020 15:45:09 -0700
Message-ID: <m2lfm6m26i.wl-randy@psg.com>
From: Randy Bush <randy@psg.com>
To: Joerg Ott via Datatracker <noreply@ietf.org>
Cc: tsv-art@ietf.org, draft-ietf-lsvr-l3dl.all@ietf.org, lsvr@ietf.org
In-Reply-To: <158870511665.7532.2079643708622987385@ietfa.amsl.com>
References: <158870511665.7532.2079643708622987385@ietfa.amsl.com>
User-Agent: Wanderlust/2.15.9 (Almost Unreal) Emacs/26.3 Mule/6.0 (HANACHIRUSATO)
MIME-Version: 1.0 (generated by SEMI-EPG 1.14.7 - "Harue")
Content-Type: text/plain; charset="US-ASCII"
Archived-At: <https://mailarchive.ietf.org/arch/msg/lsvr/O-IVFe9DjKpB1_uN6lMx4mN7p3Y>
Subject: Re: [Lsvr] Tsvart early review of draft-ietf-lsvr-l3dl-03
List-Id: Link State Vector Routing <lsvr.ietf.org>

wow!  thanks!  great review.

unfortunately the wg has gone dormant, so we have let the l3dl drafts
expire.  should it ever wake up, i will happily merge in your excellent
suggestions.

again, thank you!

randy

> Reviewer: Joerg Ott
> Review result: Ready with Issues
> 
> The draft describes a peer/neighbour discovery mechanism for large-scale L2/L3
> topologies in data centres. The aim is to provide a protocol by means of which the
> involved nodes can learn about other nodes connected to their (broadcast or
> point-to-point) L2 links and about their respective supported encapsulation
> schemes, identifiers, L2/L3 addresses, etc. This information is then provided
> to a higher layer for further processing.
> 
> The document is well written and fairly easy to follow, but could benefit from
> a bit of extra context and target application domain in the introduction. E.g.,
> explaining explicitly who would talk L3DL to whom.
> 
> From a transport perspective, I see three potential issues that deserve
> clarification or reconsideration:
> 
> 1. Section 10 spells out a default HELLO interval of 60 seconds. With a large
> broadcast domain, this may create quite a bit of traffic. While this may not be
> an issue in well-provisioned data center networks, a remark about sensible
> value ranges and the implications may be worthwhile, just to provide some
> guidelines to implementers (who want to offer choices) and operators (who pick
> them).
> 
> 2. Section 10 also suggests that in response to HELLO messages nodes will issue
> OPEN PDUs to newly discovered peers. This appears to bear the clear risk of an
> OPEN implosion when many systems come up at the same time. Shouldn't guidance be
> given to avoid repeated traffic surges and possible losses and thus unnecessary
> delays? (I noted that other places foresee exponential backoff when
> retransmitting OPEN and other ACKed PDUs).
> 
> 3. When the protocol applies fragmentation, should there be a note on
> preventing bursts?
> 
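To make the concern in item 2 concrete, here is a minimal sketch of one common way to spread OPEN transmissions out in time: exponential backoff with full random jitter. It is not taken from the draft. The function names, the 1 second base, and the 60 second cap are all assumptions picked for illustration only.

```python
import random


def open_delay(attempt, base=1.0, cap=60.0, rng=random):
    """Return a randomized delay in seconds before OPEN transmission number `attempt`.

    attempt 0 is the first OPEN sent after hearing a HELLO, so even the
    first transmission is spread over [0, base] seconds.
    `base` and `cap` are hypothetical values, not from the draft.
    """
    # The upper bound doubles with each attempt and is clamped to `cap`.
    ceiling = min(cap, base * (2 ** attempt))
    # "Full jitter": pick uniformly between 0 and the current ceiling.
    return rng.uniform(0.0, ceiling)


def open_schedule(max_attempts=5, base=1.0, cap=60.0, seed=None):
    """Return the list of delays one node would use for successive OPENs."""
    rng = random.Random(seed)
    return [open_delay(n, base, cap, rng) for n in range(max_attempts)]


if __name__ == "__main__":
    for n, delay in enumerate(open_schedule(seed=42)):
        print(f"attempt {n}: wait {delay:.2f}s")
```

Because each node draws its own random delay, nodes that come up at the same instant do not all answer the same HELLO at the same moment, and repeated retries slow down rather than recur at a fixed fast rate.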
> Other notes:
> Section 7 on the checksum needs more detail. It also talks about a "suggested"
> algorithm but this should be clearly mandated or a way to choose one by means of
> configuration for a complete data centre would need to be made explicit. I also
> assume that the pseudo code on p.11 would benefit from a leading '0' in
> 0xffffffff -> 0x0ffffffff, otherwise expansion to 64 bits might fill the high
> order bits with '1's, which is clearly not intended.
> 
> Section 11, p.17, second to last para ("If a properly authenticated...").  From
> the text, it is unclear what is meant by an "OPEN with the Serial Number of the
> last data received".
> 
> I am curious about the error code, providing 16 bits for additional
> explanation. Why not a text field? Also wondering if repeated retries (due to
> failure, not lost packets) could yield fast repeated transmissions.
> 
> Section 15, should the KEEPALIVE interval have suggested (lower) bounds?
> At the top of p.26, it says "One per second is the default", the previous page
> at the bottom refers to the inter-KEEPALIVE interval of ten seconds. Not sure
> if the two are the same, I suppose so. If they are, the numbers should match.
> If they are not, we'll need some extra text to explain the difference.
> 
> Nits:
> There are two spellings of "Encapsulation", capitalised and lower case. Use one
> consistently.
> p10, first para: comprise -> comprising
> 
> 
>