Re: [Dime] [dime] #46: Bad normative advice on not letting overload reports expire

Ben Campbell <ben@nostrum.com> Tue, 25 March 2014 23:16 UTC

From: Ben Campbell <ben@nostrum.com>
In-Reply-To: <533097D1.3090803@usdonovans.com>
Date: Tue, 25 Mar 2014 18:16:49 -0500
To: Steve Donovan <srdonovan@usdonovans.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/dime/fLE6x5xu570doOTbMpLJ2LqNwq0
Cc: dime@ietf.org
Subject: Re: [Dime] [dime] #46: Bad normative advice on not letting overload reports expire

I do not agree. While this fixes a related problem of using a zero validity-duration to signal the end of an overload condition, it still implies that one SHOULD NOT let a report "just expire". As I've argued before, I believe there are times when it is just as good, if not better, to let an overload condition expire naturally.

Here's a quote of my argument to that effect from further down the thread:

> I think it's reasonable to say that a reporting node should terminate an overload condition in a timely manner. But if it's about to expire anyway, then expiration might be just as timely as an explicit report. 
> 
> And of course, the definition of "timely" is somewhat a matter of policy. For example, I can imagine a deployment that had a large number of clients using fairly short validity durations, and _never_ explicitly signaling an end to an overload condition. This adds a bit of a "slow-start" to the recovery, since different clients will expire the overload condition at different times, and the load will ramp up gradually. I don't see anything wrong with that. Of course, it wouldn't work if one chose long validity durations, or if the signaling of overload to different clients happened in close synchronization.
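
To make the "slow-start" effect concrete, here is a minimal sketch. It is not taken from the draft. It assumes ten hypothetical clients that each hold an overload report with the same short validity duration, and it only counts how many of those reports have expired at a given time. The function name `recovered_fraction` and all of the numbers are illustrative assumptions.

```python
def recovered_fraction(report_times, validity, now):
    """Return the fraction of clients whose overload report has expired by `now`.

    report_times: time (in seconds) at which each client received its report.
    validity:     validity duration (in seconds) carried in every report.
    """
    expired = sum(1 for sent in report_times if now >= sent + validity)
    return expired / len(report_times)


VALIDITY = 5                      # assumed short validity duration, in seconds
staggered = list(range(10))       # assumed: one client reported per second, t = 0..9
synchronized = [0] * 10           # assumed: every client reported at t = 0
checkpoints = (4, 5, 9, 14)

# Staggered reports expire one by one, so the load comes back gradually.
staggered_ramp = [recovered_fraction(staggered, VALIDITY, t) for t in checkpoints]

# Synchronized reports all expire together, so there is no ramp.
synchronized_ramp = [recovered_fraction(synchronized, VALIDITY, t) for t in checkpoints]

print(staggered_ramp)
print(synchronized_ramp)
```

With staggered reports the offered load returns in steps; with synchronized reports it returns all at once, which is the case where relying on expiry alone stops helping.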

So, here's a different proposal for your first paragraph:

   "When a reporting node has recovered from overload, it SHOULD invalidate any existing overload reports in a timely manner. This can be achieved by sending an updated overload report (meaning the OLR contains a new sequence number) with the OC-Validity-Duration AVP value set to zero ("0"). If the overload report is about to expire naturally, the reporting node MAY choose to simply let it do so."

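The proposed wording can be sketched as a small decision function on the reporting node. This is only an illustration under stated assumptions, not an implementation of the draft: the `OverloadReport` type, the `end_of_overload_action` function, the `near_expiry` threshold, and the choice to obtain a new sequence number by adding one are all hypothetical. What counts as "about to expire" is left to policy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OverloadReport:
    """Hypothetical stand-in for an OLR; only the two fields discussed here."""
    sequence_number: int
    validity_duration: int  # seconds; zero signals the end of overload


def end_of_overload_action(last: OverloadReport, sent_at: int, now: int,
                           near_expiry: int = 2) -> Optional[OverloadReport]:
    """Decide what a recovered reporting node does with its outstanding report.

    Returns None when the report is about to expire anyway (the MAY case),
    otherwise a new report with validity duration zero (the SHOULD case).
    `near_expiry` is an assumed policy threshold in seconds.
    """
    remaining = sent_at + last.validity_duration - now
    if remaining <= near_expiry:
        return None  # let the report expire naturally
    # Assumption: "a new sequence number" is modeled as the previous one plus one.
    return OverloadReport(sequence_number=last.sequence_number + 1,
                          validity_duration=0)


last_report = OverloadReport(sequence_number=7, validity_duration=30)
early = end_of_overload_action(last_report, sent_at=100, now=110)  # 20 s left
late = end_of_overload_action(last_report, sent_at=100, now=129)   # 1 s left
print(early, late)
```

A deployment that never wants explicit termination could set `near_expiry` at or above its validity duration, which corresponds to the short-validity case described in the quoted argument.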

On Mar 24, 2014, at 3:38 PM, Steve Donovan <srdonovan@usdonovans.com> wrote:

> Here's some proposed wording that will hopefully let us close this issue:
> 
> Regards,
> 
> Steve
> 
> -----
> 
> Section 4.5., paragraph 3 - 
> 
> Current -02 wording:
> 
> As a general guidance for implementations it is RECOMMENDED never to
>    let any overload report to timeout.  Following to this rule, an
>    overload endpoint should explicitly signal the end of overload
>    condition and not rely on the expiration of the validity time of the
>    overload report in the reacting node.  This is achieved by sending an
>    updated overload report (meaning it must contain a new sequence
>    number) with the OC-Validity-Duration AVP value set to zero ("0").
> 
> Proposed wording:
> 
>    A reporting node SHOULD explicitly signal the end of overload
>    condition in a timely manner.  This is achieved by sending an
>    updated overload report (meaning the OLR contains a new sequence
>    number) with the OC-Validity-Duration AVP value set to zero ("0").
> 
>   A reacting node MUST invalidate and remove an overload report that
>   expires without an explicit overload report containing an OC-Validity-Duration
>   value set to zero ("0").
> 
> 
> On 2/11/14 4:31 PM, Jouni Korhonen wrote:
>> Fine with me.
>> 
>> - Jouni
>> 
>> On Feb 11, 2014, at 12:24 PM, Maria Cruz Bartolome <maria.cruz.bartolome@ericsson.com> wrote:
>> 
>> 
>>> Ben, Nirav,
>>> 
>>> I follow same argumentation.
>>> Regards
>>> /MCruz
>>> 
>>> -----Original Message-----
>>> From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Nirav Salot (nsalot)
>>> Sent: martes, 11 de febrero de 2014 11:23
>>> To: Ben Campbell; Jouni Korhonen
>>> Cc: dime@ietf.org list; draft-docdt-dime-ovli@tools.ietf.org
>>> 
>>> Subject: Re: [Dime] [dime] #46: Bad normative advice on not letting overload reports expire
>>> 
>>> Ben,
>>> 
>>> I resonate with your thinking below.
>>> 
>>> Regards,
>>> Nirav.
>>> 
>>> -----Original Message-----
>>> From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Ben Campbell
>>> Sent: Monday, February 10, 2014 9:54 PM
>>> To: Jouni Korhonen
>>> Cc: dime@ietf.org list; draft-docdt-dime-ovli@tools.ietf.org
>>> 
>>> Subject: Re: [Dime] [dime] #46: Bad normative advice on not letting overload reports expire
>>> 
>>> 
>>> On Feb 10, 2014, at 5:16 AM, Jouni Korhonen <jouni.nospam@gmail.com> wrote:
>>> 
>>> 
>>>> My reasoning for explicit termination was that knowing the 
>>>> implementation folks they will let overload conditions expire unless advised otherwise.
>>>> And having unnecessary stuff hanging around waiting for a cleanup is 
>>>> not a good thing in general. But I am open here for other options..
>>>> 
>>>> 
>>> I think it's reasonable to say that a reporting node should terminate an overload condition in a timely manner. But if it's about to expire anyway, then expiration might be just as timely as an explicit report. 
>>> 
>>> And of course, the definition of "timely" is somewhat a matter of policy. For example, I can imagine a deployment that had a large number of clients using fairly short validity durations, and _never_ explicitly signaling an end to an overload condition. This adds a bit of a "slow-start" to the recovery, since different clients will expire the overload condition at different times, and the load will ramp up gradually. I don't see anything wrong with that. Of course, it wouldn't work if one chose long validity durations, or if the signaling of overload to different clients happened in close synchronization.
>>> 
>>> _______________________________________________
>>> DiME mailing list
>>> 
>>> DiME@ietf.org
>>> https://www.ietf.org/mailman/listinfo/dime