Re: [bmwg] Software Update Doc

Sarah Banks <sbanks@aerohive.com> Tue, 05 March 2013 01:14 UTC

From: Sarah Banks <sbanks@aerohive.com>
To: "MORTON JR., ALFRED (AL)" <acmorton@att.com>
Cc: "bmwg@ietf.org" <bmwg@ietf.org>

Hi Al,

Thanks for your feedback. Let me answer inline.

On Mar 4, 2013, at 11:14 AM, "MORTON JR., ALFRED (AL)" <acmorton@att.com> wrote:

>> On 2/25/13 2:16 PM, Sarah Banks wrote:
>> David Newman wrote:
>> ...
>>> Having said this, I think that if or when downgrades are covered, the
>>> methodology would be exactly the same. Would you agree?
>> 
>> Yes.
>> 
>> dn
> 
> I tend to agree, possibly without the download step in some DUTs
> (though this seems to be included in section 6 on rollback).
> 

Let me point out that the document scopes ISSU, not ISSD. The next rev (after we post on March 11) will add a discussion of WHAT makes an ISSU an ISSU; for example, it'll deal with questions like, "Can I upgrade just the Line Cards and not the Route Engines and still have it be an ISSU?" I thought we were clear on the scope, and I'd rather not get into which vendors do what for downgrades, as I think there's a good amount of misunderstanding and confusion there. I am not opposed to ISSD, but I do have reservations.

> I'm happy to see this draft in text form, after much discussion
> and support for the topic in general (when on the agenda in the past).
> 

+1. It's been a long road getting here. Thanks for your support!

> Editorial:  this paragraph indent is a little distracting,
> I assume this is a MSWord source?  Let's talk doc formatting…

Yes sir, it is indeed an MSWord source, with two active editors. :) Shall we take the formatting discussion offline?

> 
> 1. Introduction
> 
> ISSU is a technique implemented by forwarding devices to upgrade or
>   downgrade from one software version to another as applicable. The
> 
> -=-=-=-=-
> 
> Editorial:  it's a good idea to avoid wording that might lead to acceptance
> criteria in bmwg specs. I'll try to flag them all, but here's an example:
> 
> 3.1. Software Download
> In this first phase, ...
> ... Such separation
> allows an administrator to download the new code inside or outside
> of a maintenance window; it is expected that downloading new code
> and saving it to disk on the router will not impact operations.
> 
> s/expected/anticipated/
> 

OK

> -=-=-=-=-
> 
> later in 3.1, Verification is purely functional testing, did it happen as planned
> or not?
> 
> ... Verification should include both positive verification (ensuring
> that an ISSU action should be permitted) as well as negative tests
> (creation of scenarios where the verification mechanisms would
> report exceptions).
> 
> It would be good to separate procedural verifications as pre-requisites
> for the benchmarking tests that follow. All good benchmarks assume/require
> proper functions are performed. I see that similar verifications are
> included in section 3.2, Software Staging, because at that stage they are an
> inextricable part of the process.
> 

ok

> -=-=-=-=-
> 
> 3.3 Upgrade Run
> 
> 2nd paragraph
> This is the critical phase of the ISSU, where the control plane MUST
> not be impacted and any interruptions to the forwarding plane should
> be minimal to none.
> 
> This is a case where the ideal results have been expressed as a requirement
> for the DUT to meet, rather than an anticipated result.
> 
> I suggest NEW:
> There will be continuous monitoring of control-plane and forwarding-plane
> functions, anticipating that any interruptions SHOULD be characterized.
> 

Hrmm. The difference between the two statements is that the suggested wording would not exclude an interruption and would only require that it be captured; IMHO, an ISSU isn't an ISSU if there's a drop or break in the control plane. Is there an agreeable way to compromise here?

> -=-=-=-=-=-
> Editorial:
> 4.1 Test Topology
> The hardware configuration of the DUT (Device Under test) MUST be
> identical to the one expected to be or currently deployed in
> production.
> 
> suggest inserting the phrase below at end of first sentence:
> ... in order for the benchmark to have relevance in production.
> 

ok

> -=-=-=-=-=-
> 4.2 Load Model
> 2nd para
> For mixed protocol environments, frames SHOULD be distributed
> between all the different protocols. The distribution SHOULD
> approximate the network conditions of deployment. In all cases, the
> details of the mixed protocol distribution MUST be included in the
> reporting.
> 
> ?? what protocols are being mixed?  IPv4 and IPv6? please clarify
> 

It seems you're looking for us to name the specific protocols, whereas the section calls for whatever you run today; if you have an IPv4-only network today, then the test traffic would be IPv4 only. I think the spirit of the statement is quite different: we want you to use what's representative of your network today, not protocols a, b and c that make vendors a, b and c look good on paper. Looking good on paper is great, but if it's not representative of your network, as an operator, what's the point?
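
To make the idea concrete, here is a minimal sketch of splitting a test load across protocols in proportion to a production mix. The function name, the 70/30 split, and the frame count are all made up for illustration; they are assumptions, not anything the draft specifies.

```python
def distribute_frames(total_frames, percent_by_protocol):
    """Split total_frames across protocols by integer percentage.

    percent_by_protocol maps a protocol name to its share in percent;
    the shares must add up to 100. Largest-remainder rounding keeps the
    per-protocol counts summing exactly to total_frames.
    """
    if sum(percent_by_protocol.values()) != 100:
        raise ValueError("percentages must sum to 100")
    counts, remainders = {}, {}
    for proto, pct in percent_by_protocol.items():
        counts[proto], remainders[proto] = divmod(total_frames * pct, 100)
    leftover = total_frames - sum(counts.values())
    # Give the leftover frames to the protocols with the largest remainders.
    for proto in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        counts[proto] += 1
    return counts


# Hypothetical production mix; substitute what your network actually carries.
production_mix = {"IPv4": 70, "IPv6": 30}
print(distribute_frames(1000, production_mix))
```

An IPv4-only network would simply pass {"IPv4": 100}. Whatever mix is used, it is the distribution itself that goes into the report.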

> -=-=-=-=-=-
> 
> later, at the end of 4.2:
> 
> All in all, the load model should attempt to simulate the production
> network expectations to the greatest extent possible in order to
> maximize the applicability of the results generated.
> 
> s/expectations/conditions/
> 

ok

> -=-=-=-=-
> 
> 5.2 Software Staging
> 
> General comment:  It is simpler in a procedure like this to ask the tester
> to record the current time at different steps, then calculate the durations
> later.
> 
> Start time:                          T0
> Secondary RP stabilized to new code: T1
> Start of Upgrade Run:                T2
> Completion of Upgrade:               T3
> 
> Then when reporting (section 7), it looks like:
> 
>                                    Duration
> Software Download/Secondary Update: T1 - T0
> Upgrade Run:                        T3 - T2
> ...
> 
> worth a try?
> 
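
As a rough illustration of the record-timestamps-then-subtract approach, here is a minimal sketch. The checkpoint values and the helper name are hypothetical, and it assumes all four readings come from the same clock.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Hypothetical wall-clock readings noted by the tester at each checkpoint.
checkpoints = {
    "T0": "2013-03-04 10:00:00",  # start time
    "T1": "2013-03-04 10:12:30",  # secondary RP stabilized to new code
    "T2": "2013-03-04 10:15:00",  # start of upgrade run
    "T3": "2013-03-04 10:21:45",  # completion of upgrade
}


def duration_seconds(readings, start, end):
    """Return the elapsed seconds between two recorded checkpoints."""
    t_start = datetime.strptime(readings[start], FMT)
    t_end = datetime.strptime(readings[end], FMT)
    if t_end < t_start:
        raise ValueError(f"{end} precedes {start}")
    return (t_end - t_start).total_seconds()


# Durations are derived after the run, as in the proposed report layout.
report = {
    "Software Download/Secondary Update (T1 - T0)": duration_seconds(checkpoints, "T0", "T1"),
    "Upgrade Run (T3 - T2)": duration_seconds(checkpoints, "T2", "T3"),
}
for label, seconds in report.items():
    print(f"{label}: {seconds:.0f} s")
```
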
> -=-=-=-=-
> 
> 5.3 Upgrade Run
> 
> 4th para
> ... Examine the traffic
> generators for any indication of traffic loss over this interval. If
> the Test Set reported any traffic loss interval, note the duration
> of the outage as "TP".
> 
> The above requires more detail, so that outage durations are 
> reported the same way.  I think the recently approved MPLS
> protection methods can provide some help here:
> http://tools.ietf.org/html/draft-ietf-bmwg-protection-meth-14#page-12
> 

OK, we'll take a look.
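
For discussion purposes, one common way to arrive at an outage duration is to derive it from frame loss at a constant offered rate. The sketch below assumes exactly that (constant rate, one contiguous loss interval); the function name and the numbers are hypothetical, and this is not necessarily the method the draft will end up adopting for "TP".

```python
def outage_duration_seconds(frames_sent, frames_received, offered_rate_fps):
    """Estimate the outage duration from frame loss.

    Assumes a constant offered rate (frames per second) and that all loss
    occurred in a single contiguous interval.
    """
    if offered_rate_fps <= 0:
        raise ValueError("offered rate must be positive")
    lost = frames_sent - frames_received
    if lost < 0:
        raise ValueError("received more frames than were sent")
    return lost / offered_rate_fps


# Hypothetical counters read from the Test Set after the upgrade run.
tp = outage_duration_seconds(
    frames_sent=1_000_000,
    frames_received=999_250,
    offered_rate_fps=10_000,
)
print(f"TP = {tp * 1000:.1f} ms")
```

A result of zero would correspond to a hitless upgrade under these assumptions; loss spread across several short intervals would need per-interval timestamps instead.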

> -=-=-=-=-
> 
> 5.4 Post ISSU
> ... last bullet
> 
> * Document the hitless or outage dark windows detected based upon
> the (TP) counter value (provided by the Test Set)
> 
> update to reflect the metric chosen: although "outage dark windows"
> sounds pretty cool, I don't know how to measure them.
> 

I agree. We'll work this out after next rev.

Al, thanks for the great feedback. Overall, I think the agreement was to post THIS (the current version of the doc, the one you've been commenting on) on March 11th, and then revise thereafter. Is this still desirable? Let me reword: it's unlikely I'll be able to meet with my coauthors and work in changes, given our current geographic and timezone challenges. :) Can we still proceed as planned?

Thanks
Sarah