Re: [Dime] New version of DOIC rate draft

Steve Donovan <srdonovan@usdonovans.com> Fri, 21 September 2018 16:01 UTC

To: Ben Campbell <ben@nostrum.com>, "Gunn, Janet P (NONUS)" <janet.gunn@gdit.com>, "dime@ietf.org" <dime@ietf.org>, Eric Noel <ecnoel@research.att.com>
References: <a7a5c833-427c-acd4-1502-675ce3c1bbac@usdonovans.com> <6725C490-0449-45BE-BF74-0B937D72CD96@nostrum.com> <2923159ae5604bc195c0bd6c644ecd75@CSRBTC1EXM029.corp.csra.com> <16FE21DA-469E-4923-B95F-2D72FD423364@nostrum.com>
From: Steve Donovan <srdonovan@usdonovans.com>
Message-ID: <77cf227c-4feb-9617-0c61-f57148fa47f9@usdonovans.com>
Date: Fri, 21 Sep 2018 11:02:31 -0500
In-Reply-To: <16FE21DA-469E-4923-B95F-2D72FD423364@nostrum.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/M9kY5RLelgEMJYC4ziPxdr01D7E>
Subject: Re: [Dime] New version of DOIC rate draft

Thanks for the continued review.  I'll deal with the first issue in this
email and handle the remaining issues in one or more separate emails.

For the first issue, I propose the following changes.

First, I propose adding the following wording between the sixth and
seventh paragraphs of the introduction section:

"It should be noted that one of the implications of the rate based
algorithm is that the reporting node needs to determine how it wants to
distribute it's load over the set of reacting nodes from which it is
receiving traffic.  For instance, if the reporting node is receiving
Diameter traffic from 10 reacting nodes and has a capacity of 100
transactions per second then the reporting node could choose to set the
rate for each of the reacting nodes to 10 transactions per second. 
This, of course, is assuming that each of the reacting nodes has equal
performance characteristics.  The reporting node could also choose to
have a high capacity reacting node send 55 transactions per second and
the remaining 9 low capacity reacting nodes send 5 transactions per
second.  The ability of the reporting node to specify the amount of
traffic on a per reacting node basis implies that the reporting node
must maintain state for each of the reacting nodes.  This state includes
the current allocation of Diameter traffic to that reacting node.  If
the number of reacting node changes, either because new nodes are added,
nodes are removed from service or nodes fail, then the reporting node
will need to redistribute the maximum Diameter transactions over the new
set of reacting nodes."
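
To illustrate the kind of local allocation policy this implies (a rough
sketch only, not proposed draft text; the node identifiers and weights
below are invented for illustration):

# Rough sketch only, not proposed draft text: one possible local policy
# for splitting a reporting node's total rate across reacting nodes.
def allocate_rates(total_tps, weights):
    """Split total_tps across reacting nodes in proportion to weight."""
    total_weight = sum(weights.values())
    return {node: total_tps * w / total_weight
            for node, w in weights.items()}

# 10 equally capable reacting nodes sharing 100 TPS -> 10 TPS each.
equal = allocate_rates(100, {"node%d" % i: 1 for i in range(10)})

# One high-capacity node (weight 11) and nine low-capacity nodes
# (weight 1) sharing 100 TPS -> 55 TPS for the big node and 5 TPS each
# for the others.
weights = {"big": 11}
weights.update({"small%d" % i: 1 for i in range(9)})
mixed = allocate_rates(100, weights)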

I also propose adding the following wording at the end of Section 5.3,
"Reporting Node Maintenance of Overload Control State":

" A reporting node that receives a capability announcement from a new
reacting node, meaning a reacting node for which it does not have an OCS
entry, and the reporting node chooses the rate algorithm for that
reacting node SHOULD recalculate the rate to be allocated to all
reacting nodes.  Any changed rate values will be communicated in the
next OLR sent to each reacting node."
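
A rough sketch of that behavior (again not proposed draft text; the
class and method names and the equal-share policy are just illustrative
assumptions):

# Informal sketch only: how a reporting node might handle a capability
# announcement from a reacting node for which it has no OCS entry.
class ReportingNodeOCS:
    def __init__(self, total_tps):
        self.total_tps = total_tps
        self.allocations = {}  # reacting node id -> allocated rate (TPS)

    def on_capability_announcement(self, node_id):
        if node_id not in self.allocations:
            # New reacting node: create an OCS entry for it and
            # recalculate the rate allocated to all reacting nodes.
            self.allocations[node_id] = 0
            self.recalculate()

    def recalculate(self):
        share = self.total_tps / len(self.allocations)
        for node_id in self.allocations:
            self.allocations[node_id] = share
        # Changed values go out in the next OLR to each reacting node.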

Steve
 
On 9/16/18 12:44 AM, Ben Campbell wrote:
>
>> On Sep 15, 2018, at 8:31 PM, Gunn, Janet P (NONUS) <janet.gunn@gdit.com> wrote:
>>
>> Just addressing the first point- deleted the rest
>>
>> -----Original Message-----
>> From: DiME <dime-bounces@ietf.org> On Behalf Of Ben Campbell
>> Sent: Saturday, September 15, 2018 7:57 PM
>> To: draft-ietf-dime-doic-rate-control.all@ietf.org
>> Cc: dime@ietf.org
>> Subject: Re: [Dime] New version of DOIC rate draft
>>
>> Substantive Comments:
>>
>> - General: I still think more discussion is needed about allocating rate to multiple input sources. I get that the actual allocation is a matter of local policy, but there are still implications that need discussion. I’m not sure I got my concern across in the previous discussion, so here’s another attempt:
>>
>> The issue I think needs elaboration is how the offered load varies with the number of sources times the (average) rate of each source. That is, if the number of sources changes, the reporting node may need to change the rate limits assigned to each existing source.
>>
>> As a hypothetical example, let’s assume a reporting node wants to limit its entire offered load to 1000 tps. Further assume it has 10 active reacting nodes (all supporting the rate algorithm). Local policy is to allocate the rate limit equally across sources, so it sends an OLR to each of those clients to give it a rate limit of 100 tps. Now, if another 10 reacting nodes become active, it needs to reallocate the load across all 20, giving each a limit of 50 tps. Similarly, if some of those reacting nodes go off-line, or simply reduce their activity beneath the limit for an extended period of time, the reporting node may need to increase the allocation to the remaining nodes.
>>
>> This is a fairly fundamental difference between rate and load; rate uses absolute numbers while load uses percentages.
>>
>> <JPG> I am not sure it is as fundamental a difference as you think (in both cases assuming each node has a relatively stable load).
>> If there are 10 active reacting nodes, it asks each node to reduce its traffic to  (say) 50%.
>> If 10 more nodes become active, it will need to ask each node to reduce its traffic to 25%
>> If some of those nodes go offline, or reduce their traffic, it will increase the (percentage) allocation to each remaining node.
>>
>> The "fundamental" difference is when each node has a fluctuating, unstable, load.
>> Under "load" a node which currently has a small load needs to cut its traffic by the SAME percentage as the node which currently has a large load.
>> Under "rate" a node which currently has a small load does NOT need to cut its load while the node which currently has a large load DOES need to cut its load.</JPG>
>>
> Hi,
>
> There can be more than one fundamental difference :-)
>
> I wasn’t talking about how clients behave; I was talking about how the server expresses its intention. If a server wants to reduce the current load by half using the loss algorithm, it doesn’t need to know how many clients there are. Of course, it may have to keep adjusting the percentage if the offered load is not stable. OTOH, if it wants to reduce the load to a specific rate using the rate algorithm, it does need to know how many clients there are.
>
>  I’m not saying this is a flaw in any way. I agree that in many cases an absolute TPS value may be more useful than a relative value. I’m just saying the draft should talk about it.
>
> Thanks,
>
> Ben.
>