Re: [Tsv-art] Tsvart early review of draft-ietf-lsvr-l3dl-03
Joerg Ott <ott@in.tum.de> Mon, 11 May 2020 20:40 UTC
Return-Path: <ott@in.tum.de>
X-Original-To: tsv-art@ietfa.amsl.com
Delivered-To: tsv-art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 761643A0B6E; Mon, 11 May 2020 13:40:37 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.897
X-Spam-Level:
X-Spam-Status: No, score=-1.897 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2O7LySk89wwn; Mon, 11 May 2020 13:40:34 -0700 (PDT)
Received: from mail-out2.informatik.tu-muenchen.de (mail-out2.in.tum.de [131.159.0.36]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 6F92E3A0B86; Mon, 11 May 2020 13:40:32 -0700 (PDT)
Received: by mail.in.tum.de (Postfix, from userid 107) id CAA871C0851; Mon, 11 May 2020 22:40:30 +0200 (CEST)
Received: (Authenticated sender: ott) by mail.in.tum.de (Postfix) with ESMTPSA id F26971C084A; Mon, 11 May 2020 22:40:27 +0200 (CEST) (Extended-Queue-bit tech_mewbi@fff.in.tum.de)
To: Randy Bush <randy@psg.com>
Cc: tsv-art@ietf.org, lsvr@ietf.org, draft-ietf-lsvr-l3dl.all@ietf.org
References: <158870511665.7532.2079643708622987385@ietfa.amsl.com> <m2sggclma3.wl-randy@psg.com>
From: Joerg Ott <ott@in.tum.de>
Message-ID: <c1712c72-fcf1-f2d5-e5ab-f8f4eb3f911d@in.tum.de>
Date: Mon, 11 May 2020 22:40:27 +0200
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.8.0
MIME-Version: 1.0
In-Reply-To: <m2sggclma3.wl-randy@psg.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-art/XSwzcdljy7M7iMkNjBiN4CVlu8Q>
Subject: Re: [Tsv-art] Tsvart early review of draft-ietf-lsvr-l3dl-03
X-BeenThere: tsv-art@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Transport Area Review Team <tsv-art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/tsv-art>, <mailto:tsv-art-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/tsv-art/>
List-Post: <mailto:tsv-art@ietf.org>
List-Help: <mailto:tsv-art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/tsv-art>, <mailto:tsv-art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 11 May 2020 20:40:37 -0000
Randy, thanks much, inline. > again, thanks for taking the time for a strong review. it is much > appreciated. > >> could benefit from a bit of extra context and target application >> domain in the introduction. E.g., explaining explicitly who would talk >> L3DL to whom. > > how about the last para of intro getting a few more words along the line > of > > In this document, the use case for L3DL is for point to point links > in a datacenter Clos in order to exchange the data needed for BGP-SPF > [I-D.ietf-lsvr-bgp-spf] bootstrap and continuity. Once layer two > connectivity has been leveraged to get layer three addressability and > forwarding capabilities, normal layer three forwarding and routing > can take over. > > L3DL might be found to be more widely applicable to a range of > routing and similar protocols which need layer three discovery and > characterisation. Sounds (well: reads) good. >> 1. Section 10 spells out a default HELLO interval of 60 seconds. With >> a large broadcast domain, this may create quite a bit of >> traffic. While this may not be an issue in well-provisioned data >> center networks, a remark about sensible value ranges and the >> implications may be worthwhile. Just to provide some guidelines to >> implementers (who want to offer choices) and operators (who pick >> them). > > If the configured destination address is one that is propagated by > switches, the HELLO SHOULD be repeated at a configured interval, with > a default of 60 seconds. This allows discovery by new devices which > come up on the layer-2 mesh. In this multi-link scenario, the > operator should be aware of the trade-off between timer tuning and > network noise and adjust the inter-HELLO timer accordingly. Ack. >> 2. Section 10 also suggest that in response to HELLO messages nodes >> will issue OPEN PDUs to newly discovered peers. This appears to bear >> the clear risk of an OPEN implosion when many system come up at the >> same time. Shouldn't guidance be given to avoid repeated traffic >> surges and possible losses and thus unnecessary delays? (I noted that >> other places foresee exponential backoff when retransmitting OPEN and >> other ACKed PDUs). > > added > > To ameliorate possible load spikes during bootstrap or event > recovery, there SHOULD be a jittered delay between receipt of a HELLO > and issue of the OPEN. The default delay range SHOULD BE zero to > five seconds, and MUST be configurable. Ok. >> 3. When the protocol applies fragmentation, should there be a note on >> preventing bursts? > > likely part of this is our fault, as we did not mean 'fragmentation' in > the classic "oops! we found a hop with a small mtu." I didn't take it to mean classic fragmentation but rather ALF-style operation. Still, this could generate bursts depending on how much information there is to 'fragment'. > does augmenting sec 6 as follow help? > > This is not classic 'fragmentation', but rather decomposition at the > origin to allow PDU payloads larger than the frame allows. There are > no intermediate devices capable of further fragmentation or > reassembly. > > L3DL is carrying relatively small amounts of data on relatively high > bandwidth links, and at a time when the link is not active with other > data as it does not yet have layer three connectivity. So congestion > is not considered a sufficiently significant risk to warrent > additional complexity. Fine by me. I assume that the limited amount of information you carry in the first place would not yield more than 2 or 3 packets, so that the above sounds good. >> Section 7 on the checksum needs more detail. > > could you be more specific re in what area? Let me go back: The code is fine (and certainly a good idea). I was kind of expecting the code to be described in text, but given the code is there, the concise description is fine. Except that I got confused by "repeat until zero". What should be zero? Looking at the code, surely not the result. >> It also talks about a "suggested" algorithm but this should be clearly >> mandated or way to choose one by means of configuration for a complete >> data centre would need to be made explicit. > > The following code describes a suggested algorithm. This > specification avoids mandatory to implement, algorithm agility, etc. > What matters is that the same algorithm is used consistently in any > deployment. Ok. >> I also assume that the pseudo code on p.11 would benefit from a leader >> '0' in 0xffffffff -> 0x0ffffffff, otherwise expansion to 64 bits might >> fill the high order bits with '1's, which is clearly not intended. > > ok, but what kinky compiler are you using? :) > > result = (result >> 32) + (result & 0x0FFFFFFFF); > result = (result >> 32) + (result & 0x0FFFFFFFF); We had this and I am happy to undo this comment. Hasn't occurred lately but surely earlier. Probably not having the leading zero would ensure that people don't assume that it's just 7 x F instead of 8. >> Section 11, p.17, second to last para ("If a properly >> authenticated..."). From the text, it is unclear what is meant by an >> "OPEN with the Serial Number of the last data received". > > If a properly authenticated OPEN arrives with a new Nonce from an > LLEI with which the receiving logical link endpoint believes it > already has an L3DL session (OPENs have already been exchanged), and > the Serial Number in the OPEN PDU is non-zero, the receiver SHOULD > establish a new session by sending an OPEN with the Serial Number > being the same as that of the last sent and ACKed PDU. Each party > MUST resume sending encapsulations etc. subsequent to the other > party's Sequence Number. And each MUST retain all previously > discovered encapsulation and other data. Clear now. >> I am curious about the error code, providing 16 bits for additional >> explanation. Why not a text field? > > parsing by the receiver of error codes then becomes an larger unbounded > chase. but this bit of syntactic sugar may be more pain than it is > worth. free form or under-defined fields are too fertile a ground for > interoperational incompatibility. opinions solicited. fine by me either way, was just curious. >> Also wondering if repeated retries (due to failure, not lost packets) >> could yield fast repeated transmissions. > > could you be specific about what you think could be improved in I was trying to decipher my handwriting on the tablet, and it appears to say "wait before retry to prevent fast loops", scribbled on the bottom of p.22. I had to go back tot he text and I believe this refers back to p.18, first para: "The Key is specific to the operational environment. A failure to authenticate is a failure to start the L3DL session, an ERROR PDU MUST BE sent (Error Code 3), and HELLOs MUST be restarted." If you have a systematic auth failure, this would cause HELLOs to be restarted. Without any jittering (now you introduced some above), you could, in principle, get a loop: HELLO -> OPEN -> (Auth) ERROR -> HELLO -> OPEN -> (Auth) ERROR -> ... So misconfig could be bad unless there is some delay built in. > 12.1. Retransmission > > If a PDU sender expects an ACK, e.g. for an OPEN, an Encapsulation, a > VENDOR PDU, etc., and does not receive the ACK for a configurable > time (default one second), and the interface is live at layer 2, the > sender resends the PDU using exponential back-off, see [RFC1122]. > This cycle MAY be repeated a configurable number of times (default > three) before it is considered a failure. The session MAY BE > considered closed in case of this ACK failure. > >> Section 15, should the KEEPALIVE interval have suggested (lower) >> bounds? At the top of p.26, it says "One per second is the default", >> the previous page at the bottom refers to the inter-KEEPALIVE interval >> of ten seconds. Not sure if the two are the same, I suppose so. If >> they are, the numbers should match. If they are not, we'll need some >> extra text to explain the difference. > > whoops. how about ten seconds for both? I have no specific opinion as long as it is consistent. 10s for both is fine. >> There are two spellings of "Encapsulation", capitalised and lower >> case. Use one consistently. > > mind if we stall on this one? we were attempting to differentiate > between the proper noun of a PDU type and normal use of the world. but > looking at the text, it's a bit messy. full employment for rfc editors! > :) yay! :-) >> p10, first para: comprise -> comprising > > s/compare/comparing/ in > > [RFC1982] on DNS Serial Number Arithmetic for too much detail on > comparing and incrementing a wrapping sequence number. > > pre-published text and xml are at https://git.rg.net/randy/draft-l3dl > Thanks much for the super-quick responses! Cheers, Jörg
- [Tsv-art] Tsvart early review of draft-ietf-lsvr-… Joerg Ott via Datatracker
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Joerg Ott
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Van De Velde, Gunter (Nokia - BE/Antwerp)
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Rob Austein
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Joerg Ott
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Joerg Ott
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Joerg Ott
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Joerg Ott
- Re: [Tsv-art] Tsvart early review of draft-ietf-l… Randy Bush