[bess] Re: Genart last call review of draft-ietf-bess-evpn-fast-df-recovery-09
Luc André Burdet <laburdet.ietf@gmail.com> Mon, 19 August 2024 22:36 UTC
From: Luc André Burdet <laburdet.ietf@gmail.com>
To: Elwyn Davies <elwynd@dial.pipex.com>, "gen-art@ietf.org" <gen-art@ietf.org>
Thread-Topic: [bess] Genart last call review of draft-ietf-bess-evpn-fast-df-recovery-09
Date: Mon, 19 Aug 2024 22:36:01 +0000
Message-ID: <CH0PR14MB4962D5766E45F86B5613E7C2AF8C2@CH0PR14MB4962.namprd14.prod.outlook.com>
References: <172349970974.759372.14440969269979201118@dt-datatracker-6df4c9dcf5-t2x2k>
In-Reply-To: <172349970974.759372.14440969269979201118@dt-datatracker-6df4c9dcf5-t2x2k>
CC: "bess@ietf.org" <bess@ietf.org>, "draft-ietf-bess-evpn-fast-df-recovery.all@ietf.org" <draft-ietf-bess-evpn-fast-df-recovery.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>
Subject: [bess] Re: Genart last call review of draft-ietf-bess-evpn-fast-df-recovery-09
List-Id: BGP-Enabled ServiceS working group discussion list <bess.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/EtCgZ1XNPVjFAakvKrP2LB_69sA>

Hi Elwyn,

Thanks for the review. I am updating inline for all comments in -10.

Regarding the ‘major’ below, I am trying to word it better. In fact, what happens is that both PEs perform “DF Election”, which will inevitably result in some VLANs going DF->NDF and some going NDF->DF; this occurs at both PEs.

The idea is that a recovering PE is already in NDF state (it used to be failed/down/NDF), so the DF->NDF transition is somewhat implied, and both NDF->DF and DF->NDF happen at t=103 at the newly-inserted PE. Because we want to eliminate duplicate traffic, the old PE will first transition the VLANs which are meant to be NDF, thus ensuring an overlap of both sides being NDF for some time. Let me see if I can reword this to be more consistent and to capture it better.

The Era concern and wording are being rewritten: the intent here is really for that to be out of scope, since it is really a 3-second problem in 2036. The new text simply assumes that the Era is the “current” one, the same as the local Era.

For the rest of the comments below, I am just updating the document inline for -10.

Thanks for the review!

Regards,
Luc André

Luc André Burdet | Cisco | laburdet.ietf@gmail.com | Tel: +1 613 254 4814

From: Elwyn Davies via Datatracker <noreply@ietf.org>
Date: Monday, August 12, 2024 at 17:56
To: gen-art@ietf.org <gen-art@ietf.org>
Cc: bess@ietf.org <bess@ietf.org>, draft-ietf-bess-evpn-fast-df-recovery.all@ietf.org <draft-ietf-bess-evpn-fast-df-recovery.all@ietf.org>, last-call@ietf.org <last-call@ietf.org>
Subject: [bess] Genart last call review of draft-ietf-bess-evpn-fast-df-recovery-09

Reviewer: Elwyn Davies
Review result: Not Ready

I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments.

For more information, please see the FAQ at <https://wiki.ietf.org/en/group/gen/GenArtFAQ>.

Document: draft-ietf-bess-evpn-fast-df-recovery-09
Reviewer: Elwyn Davies
Review Date: 2024-08-12
IETF LC End Date: 2024-07-31
IESG Telechat date: 2024-08-22

Summary:

I apologise for the rather late delivery of this review. This was partly due to domestic duties (it was our family birthday/anniversary period which diverted me from reviewing) and also due to my taking some time to come to grips with the document. I am not an expert in the EVPN technology, although in essence this is not complex, but it took me some time to get a handle on the technology which this document is trying to improve.

Be that as it may, there appear to be some areas where the document is not internally consistent (see the Major Issue) and I think the reliance on what is an extended example (s3) to explain the operation of the technique has led to a less than robust explanation of the generic system, particularly the conflation of the values of the SCT offset from the time when local recovery is complete and the time delay used to await the arrival of additional RT-4 messages from other PEs. These should be separate parameters in my opinion, and conflating them could lead to the skew offset resulting in operations being scheduled before the end of the time delay period.

Major Issues:

s2 vs s3: If I read the text correctly, the last two paragraphs of s2 appear to imply that the newly inserted PE performs its service carving (just some transition to DF state) at the advertised SCT whereas the partner PEs make all their transitions DF->NDF and NDF->DF at SCT+skew (where skew is negative). However the latter part of s3 appears to imply that the partner PEs only make their DF->NDF transitions at SCT+skew and both the inserted and partner PEs make their NDF->DF transitions at SCT. This seems to be inconsistent.

Minor issues:

s2 and s3: Appropriate choice of SCT: The SCT is an absolute time.
It is passed to the other PEs which then have to calculate another absolute time which is 'skew' earlier than the SCT value, at which time the other PEs are intended to take action. Thus at the very worst the SCT needs to be 'skew' in the future at the time it is transmitted to the other PEs so that this time of action is not in the past. I think there needs to be a discussion of the calculation of the SCT to avoid the other PEs being requested to take action at a time which has now passed or before they might have received all RT-4s. The discussion in s3 conflates the offset of the SCT with the Timer period for awaiting other RT-4 receptions. I think this means that SCT+skew is before the expiry of the Timer.

s2.1, para 3: Improving NTP Era handling: The need to worry about the NTP Era seems unfortunate. If it was assumed that the current NTP Era applied to all SCT values, only values of SCT less than the value of 'skew' would cause issues as the time value is used here. Constraining SCT to be greater than 'skew' is not an enormous computational burden, and postponing the restart of a PE device by one 'skew', if it was lucky enough to need to restart within one 'skew' of the era changeover, is unlikely to be problematic.

Nits/editorial comments:

Global: s/i.e./i.e.,/ (2 instances)

Global: s/BGP Extended Community/BGP EVPN Extended Community/

Abstract, para 1: Provide a note of RFC 7432 as the basic RFC for the EVPN solution and flag RFC 8584 when HRW is first mentioned. Also s/[RFC8584]/(RFC 8584)/ as references are not allowed in the Abstract.

Abstract, para 1: s/Highest Random/the Highest Random/, s/of the failed link/of a failed link/

Abstract and s1, para 2: These paras mention 'signalling between the recovered node' but the previous words refer to recovered node or link. If it is a link that is recovered, which node is involved or how else is the recovery improved?

Abstract and s1, para 1: The terms 'becoming pervasive' and 'next generation' are not future proof. Suggest s/becoming pervasive/extensively used/ and omit 'next generation'.

s1, para 2: s/Frowarder/Forwarder/

s1.3, para 1: The term 'Layer2 duplicate' is used. Since we are dealing with an Ethernet infrastructure by definition, presumably this means a duplicated Ethernet packet. Can this term be used? Otherwise this needs some explanation.

s1.3, para 2: The term 'redundancy group' appears in bullet point 3 of Section 8.5 of RFC7432 without precise definition. According to the Cisco EVPN features for the IOS XR Release 7.6 (https://www.cisco.com/c/en/us/td/docs/iosxr/ncs5500/vpn/76x/b-l2vpn-cg-ncs5500-76x/evpn-features.html), Redundancy Group membership is configured during startup. I think this term might merit some more specific explanation in this document (or an erratum registered for RFC7432).

s1.3, para 2, 2nd sentence and para 3:
   Under certain conditions, this may cause Layer2 duplicates and potential loops if there is a momentary overlap in forwarding roles between two or more PE devices, consequently leading to broadcast storms.
Where can one see evidence for this statement and identification of the conditions that lead to these problems? I think this may be covered by the initial part of s3, in which case a pointer to this would be helpful. I am not sure if s1.3, para 3 refers to another difficulty or is a duplication. Please clarify and again provide evidence and identification of the conditions. Also the last segment of s1.3 repeats a description of the nature of the problem described in para 2. I think the section needs tightening up to give a single description of the symptoms and possibly give a pointer to where the problem has been identified and quantified.

s1.2: Additional terms need to be defined: NDF, SCT (usefully included in the terminology section).

s1.3, para 5: s/HRW also cannot help/HRW cannot help either/

s1.4, para 1: s/presents multiples advantages/offers multiple advantages/

s1.4, bullet 2: I cannot parse: by ensuring that PEs any unrecognized new BGP Extended Community.

s1.4, bullet 4: suggest
OLD:
(Route Type 4)
NEW:
(Route Type 4; See [RFC7432] Sections 7 and 7.4)
END

s1.4, bullet 5: "....and normalizes to NTP for EVPN signalling only." I don't think 'normalizes' is the right term here. Do you mean defaults? Maybe I will see when I read further on.

s2, para 3:
OLD:
A new BGP Extended Community, the Service Carving Timestamp
NEW:
IANA has allocated a new sub-type for the BGP EVPN Extended Community (type 0x06) [RFC7153], defining a community of PEs that utilize the time synchronization recovery mechanism. The "Service Carving Timestamp" with sub-type value 0x0F (see Section 6) is used in communicating the Service Carving Time (SCT) for each Ethernet Segment route (RT-4) to other partners to ensure an orderly start up or transfer of forwarding duties.
END

s2, para 3: It may be obvious but I think it needs to be emphasised that the skew value must be consistent across all the PEs. I assume that the intention is that the skew value should be administratively configurable in PEs supporting RT-4. Should there be some advice on the range of sensible values?

s2, para 3: The term RT-4 needs to be expanded on first use (or better, RT-4 and SCT should be expanded in the terminology section).

s2.1, paras 1 and 2: These paragraphs largely duplicate the definition of the Service Carving Timestamp in s2. I suggest they are replaced with:
The BGP advertisement of each Ethernet Segment route (RT-4) where this scheme is to be used contains an EVPN Extended Community (type 0x06) with Service Carving Timestamp sub-type (Type 0x0F).
The expected Service Carving Time is encoded as an 8-octet value as follows:

s3.1, para 3: s/the 64-bit NTP Timestamp Format/ an adapted form of the 64-bit NTP Timestamp Format/

s2.1, para 7:
OLD:
The use of a 16-bit fractional seconds yields adequate precision of 15 microseconds (2^-16 s).
NEW:
The use of a 16-bit fractional seconds value yields adequate precision of approximately 15 microseconds (2^-16 s).

s2.1, para 8: Note that the short naming of the flags as 'A' and 'T' is purely local to this document. The IANA registry does not register this naming although 'A' is used in the same way in RFC 8584. I suggest
OLD:
This document introduces a new flag called "T" (for Time Synchronization) to the bitmap field of the DF Election Extended Community defined in [RFC8584].
NEW:
This document introduces a new flag called Time Synchronization indicated by "T" in the bitmap field of the DF Election Extended Community defined in [RFC8584] (see Figure 3).
END

s3.1/s4: What should happen if a PE with SCT capability is in process of recovering and a PE without SCT capability that was not previously in the redundancy group starts recovery? Doubtless a very rare occurrence but it might occur, for example, if a hardware replacement happened.

s6: This section needs to be redrafted in more conventional IANA Considerations format. There should not be a date column. It would be helpful to have references to the IANA registries in the Normative Refs.
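
For readers following the SCT/skew/Timer concern and the precision nit above, a minimal Python sketch may help. It is not taken from the draft: the function names, the sample numbers and the 8-octet layout (type 0x06, sub-type 0x0F, then 32-bit seconds and 16-bit fractional seconds) are assumptions made purely for illustration. The only facts borrowed from the review are the type and sub-type values, the 16-bit fraction, and that partner PEs act at SCT+skew with a negative skew.

```python
import struct

EVPN_EXT_COMM_TYPE = 0x06   # BGP EVPN Extended Community type (from the review)
SCT_SUB_TYPE = 0x0F         # Service Carving Timestamp sub-type (from the review)
FRACTION_BITS = 16          # width of the fractional-seconds field (s2.1, para 7)


def fraction_resolution_us(bits: int = FRACTION_BITS) -> float:
    """Resolution of a `bits`-wide fractional-seconds field, in microseconds."""
    return 2.0 ** -bits * 1e6


def encode_sct(seconds: int, fraction: int) -> bytes:
    """Pack type, sub-type, 32-bit seconds and 16-bit fraction into 8 octets.

    ASSUMED layout, for illustration only; the draft is authoritative.
    """
    return struct.pack("!BBIH", EVPN_EXT_COMM_TYPE, SCT_SUB_TYPE,
                       seconds & 0xFFFFFFFF, fraction & 0xFFFF)


def partner_action_time(sct: float, skew: float) -> float:
    """Absolute time at which partner PEs act: SCT + skew (skew is negative)."""
    return sct + skew


def sct_is_safe(now: float, sct: float, skew: float, timer: float) -> bool:
    """True if SCT + skew does not fall before the Timer expiry at now + timer."""
    return partner_action_time(sct, skew) >= now + timer


if __name__ == "__main__":
    print(f"16-bit fraction resolution: {fraction_resolution_us():.2f} us")
    print("encoded SCT:", encode_sct(103, 0).hex())
    # Hypothetical numbers: recovery completes at t=100 with a 3 s Timer.
    print("SCT = now + timer:", sct_is_safe(100.0, 103.0, -0.01, 3.0))
    print("SCT one second later:", sct_is_safe(100.0, 104.0, -0.01, 3.0))
```

With these made-up numbers, setting the SCT equal to the Timer expiry puts SCT+skew just before that expiry, which is the situation the minor issue describes. Treating the SCT offset and the Timer as separate parameters, as suggested in the Summary, avoids it.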