Re: Genart last call review of draft-ietf-rtgwg-backoff-algo-07

"Acee Lindem (acee)" <acee@cisco.com> Fri, 16 February 2018 18:40 UTC

From: "Acee Lindem (acee)" <acee@cisco.com>
To: Elwyn Davies <elwynd@dial.pipex.com>, "gen-art@ietf.org" <gen-art@ietf.org>
CC: "draft-ietf-rtgwg-backoff-algo.all@ietf.org" <draft-ietf-rtgwg-backoff-algo.all@ietf.org>, "ietf@ietf.org" <ietf@ietf.org>, "rtgwg@ietf.org" <rtgwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/kzS_fPSS7vWmnRAmGXiVj30KVLI>

Hi Elwyn, 

Thank you very much for your editorial comments as well. I must say I'm surprised that we didn't catch some of these before. We will adopt most of them. One thing I'm not clear on is why you believe we should change RECOMMENDED to lowercase in the deployment recommendations. Unless convinced otherwise, we'll leave this for the IESG to decide. See inline.

On 2/15/18, 2:12 PM, "Elwyn Davies" <elwynd@dial.pipex.com> wrote:

    Nits/editorial comments:
    General: The term 'back-off' may not be familiar to non-English mother tongue
    speakers and on first occurrence needs a little explanation for naive readers
    to indicate what it means and to what the back-off is being applied.  I have
    suggested some additional text to this end for the abstract and s1.
    
    Abstract:
    OLD:
       This document defines a standard algorithm to back-off link-state IGP
       Shortest Path First (SPF) computations.
    NEW:
       This document defines a standard algorithm to temporarily postpone or
       'back-off' link-state IGP Shortest Path First (SPF) computations to reduce
       the computational load on IGP nodes if network events occurring at closely
       spaced times would otherwise lead to multiple, essentially redundant
       recalculations of the routing tables.
    ENDS

This is a rather long sentence. I don't have a problem with adding "to temporarily postpone or".  However, I'd split the second clause into a new sentence and shorten it. 

       This reduces the computational load on IGP nodes when multiple network events trigger multiple SPF computations over a short time interval. 

Or simply: 
   
            This reduces the computational load on IGP nodes when multiple temporally close network events trigger multiple SPF computations. 

    
    s1, para 1: s/at the same time/essentially at the same time/

Ok.

    
    s1, para 2: s/new Shortest Path First (SPF)/new Shortest Path First (SPF)
    routing table/

We already changed this as per Benjamin's comment. 
    
    s1, para 2:
    OLD:
       experiencing multiple temporally close failures over a short
       period of time
    NEW:
       experiencing multiple temporally close failures (that is, eventuating over a
       short period of time)
    ENDS

I'm not sure "eventuating" is any clearer than "temporally" and the latter is more precise. 

    
    s1, para 2: There is a right bracket missing in the following and starting a
    clause with 'such as' and ending it with an ellipsis ('...') is redundant.
    >    such as LDP [RFC5036], RSVP-TE [RFC3209],
    >    BGP [RFC4271], Fast ReRoute computations (e.g.  Loop Free Alternates
    >    (LFA) [RFC5286], FIB updates...
    It is unclear to me where the bracket should go: maybe after [RFC5286] or at
    the end. Please clarify.

This should be
 (e.g.,  Loop Free Alternates   (LFA) [RFC5286], FIB updates, etc.). 

    
    s1, para 2: the phrase
    > This also reduces the churn on
    >    routers and in the network and.
    is useless, vague jargon.  The previous sentence expresses what I suspect is
    meant by 'churn'. so this is redundant and can be omitted.

We definitely want the explicit reference to microloops, and this clause sets the correct context. Perhaps we could shorten it to "This also reduces network churn and".

    
    s1, para 3:
    OLD:
    To allow for this, IGPs implement an SPF back-off algorithm.
    NEW:
    To allow for this, IGPs usually implement an SPF back-off algorithm that
    postpones or backs-off the running of the SPF calculation when the algorithm
    predicts that a run would be essentially redundant or even counter-productive
    because it appears that multiple closely timed routing-affecting events can be
    expected.
    ENDS

I think if you have read paragraph 2, the motivation is clear. However, I'd be okay with: 

     To allow for this, IGPs usually implement an SPF back-off algorithm that
     postpones or backs-off the SPF computation. 
    
    s1, para 3: s/choosen/chosen/

Ok. 
    
    s2, last bullet: SPF_DELAY is not defined at this point:
    s/SPF_DELAY timers values/values for any timers used to back-off SPF
    calculations/

Let's reference Section 3 instead: "SPF_DELAY [Section 3]".
    
    s2, last bullet:  s/Even though/This is important even though/

Ok. As you noticed, this wasn't really a complete sentence. 
    
    s3, para 1: Undesirable ellipsis:
    s/a metric change on a link or prefix.../and a metric change on a link or
    prefix./

Ok.
    
    s3: Need to expand SRLG on first use - it isn't deemed to be well-known.

Already handled in Benjamin's comments. 

    
    s3, INITIAL_SPF_DELAY bullet: s/A very small delay to quickly handle link
    failure/A very small delay to quickly handle a single isolated link failure/

Ok. 
    
    s3, SHORT_SPF_DELAY bullet:
    OLD:
        SHORT_SPF_DELAY: A small delay to have a fast convergence in case of
        a single failure (node, SRLG..), e.g., 50-100 milliseconds.
    NEW:
        SHORT_SPF_DELAY: A small delay to provide fast convergence in the case of
        a single component failure (node, SRLG..) that leads to multiple IGP events,
        e.g., 50-100 milliseconds.
    ENDS

Ok. 
    
    s5/s5.1: There is currently no text in s5: this is generally considered
    inappropriate.  Suggest removing the first sentence in s5.1 ("This section
    describes the state machine.") and adding to s5: NEW: This section describes
    the abstract finite state machine (FSM) intended to control the timing of the
    running of SPF calculations in response to IGP events.

Ok - I'd replace "running" with "execution" or "scheduling". 
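
For readers following the thread, here is a rough sketch of how such delays could control the scheduling of SPF computations. It is deliberately simplified and is not the draft's state machine: only the QUIET state and the INITIAL_SPF_DELAY and SHORT_SPF_DELAY names come from the text quoted above. The function name, the `quiet_after_ms` parameter, the 0 ms initial delay, and the 1000 ms quiet interval are assumptions made for illustration; 50 ms is taken from the 50-100 millisecond example.

```python
def spf_run_times(event_times_ms, initial_delay_ms=0, short_delay_ms=50,
                  quiet_after_ms=1000):
    """Return the times (ms) at which SPF would run for the given IGP events.

    Simplified illustration only. All default values except the 50 ms
    short delay are assumptions, not values taken from the draft.
    """
    runs = []
    pending = None      # time of the SPF run currently scheduled, if any
    last_event = None   # time of the most recent IGP event
    for t in sorted(event_times_ms):
        # A scheduled run whose time has been reached has already executed.
        if pending is not None and pending <= t:
            runs.append(pending)
            pending = None
        # QUIET: no IGP event seen for at least quiet_after_ms (assumed rule).
        quiet = last_event is None or t - last_event >= quiet_after_ms
        # Events arriving while a run is pending are absorbed by that run.
        if pending is None:
            pending = t + (initial_delay_ms if quiet else short_delay_ms)
        last_event = t
    if pending is not None:
        runs.append(pending)
    return runs


# Four closely spaced events cause two SPF runs instead of four;
# an isolated later event is handled with the initial delay.
print(spf_run_times([0, 10, 20, 30, 5000]))  # [0, 60, 5000]
```

The first event is handled almost immediately, while the events that follow within a short interval are batched behind a single delayed run.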
    
    s5.1, QUIET bullet: s/occured/occurred/

Ok. 
    
    s5.2:  There is no need for 3 expansions of FSM - the expansion can be moved to
    s5 as suggested above.

Ok. 
    
    s5.3 title: s/States/State/

Ok. 
    
    s6, next to last para: s/it's RECOMMENDED to play it safe/it is recommended
    that timer intervals should be chosen conservatively/ (this is an operational
    recommendation).

Ok with the rewording, but why not uppercase "RECOMMENDED"?
    
    s6, last para: s/RECOMMENDED/recommended/ (ditto).

Same question on uppercase. 
    
    s7, para 1: s/is based on/is dependent on/, s/RECOMMENDED/recommended/
    (operational again)

Ok with the wording, but same question on uppercase.
    
    s8: Other documents (e.g., from vendors) have used the terms SPF wait time and
    SPF hold time.  It might be useful to mention that this document essentially
    provides ways to implement these settings.

No. I can tell you with a fair amount of certainty that vendors will not change their existing SPF delay/backoff configuration definitions (at least not vendors with large deployments). Rather, this algorithm will be added as a new configuration option and, hopefully, become the default after some period of adoption and operational experience. Configuration of this standard algorithm is reflected as a separate container in the OSPF and IS-IS YANG models.

Thanks,
Acee