Re: [Dime] New version of DOIC rate draft

Steve Donovan <srdonovan@usdonovans.com> Fri, 21 September 2018 17:01 UTC

To: Ben Campbell <ben@nostrum.com>
References: <a7a5c833-427c-acd4-1502-675ce3c1bbac@usdonovans.com> <6725C490-0449-45BE-BF74-0B937D72CD96@nostrum.com> <2923159ae5604bc195c0bd6c644ecd75@CSRBTC1EXM029.corp.csra.com> <16FE21DA-469E-4923-B95F-2D72FD423364@nostrum.com> <77cf227c-4feb-9617-0c61-f57148fa47f9@usdonovans.com> <AC3A6F0C-A162-4BFB-9FD7-D3811167D8D5@nostrum.com>
Cc: "Gunn, Janet P (NONUS)" <janet.gunn@gdit.com>, "dime@ietf.org" <dime@ietf.org>, Eric Noel <ecnoel@research.att.com>
From: Steve Donovan <srdonovan@usdonovans.com>
Message-ID: <b06a3ad3-4e11-6eb6-b144-5822ea2f6d6c@usdonovans.com>
Date: Fri, 21 Sep 2018 12:02:20 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <AC3A6F0C-A162-4BFB-9FD7-D3811167D8D5@nostrum.com>
Content-Type: multipart/alternative; boundary="------------603017D8DA0100185E2A24E4"
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/zs4IBugYnZPkWsC3MoagiPrxtfY>
Subject: Re: [Dime] New version of DOIC rate draft


On 9/21/18 11:06 AM, Ben Campbell wrote:
>
>
>> On Sep 21, 2018, at 11:02 AM, Steve Donovan <srdonovan@usdonovans.com> wrote:
>>
>> Thanks for the continued review.  I'll deal with the first issue in
>> this email.  I'll handle the remaining issues in (a) separate email(s).
>>
>> For the first issue, I propose the following changes:
>>
>> I propose adding the following wording between the 6th and 7th
>> paragraph of the introduction section:
>>
>> "It should be noted that one of the implications of the rate-based
>> algorithm is that the reporting node needs to determine how it wants
>> to distribute its load over the set of reacting nodes from which it
>> is receiving traffic.  For instance, if the reporting node is
>> receiving Diameter traffic from 10 reacting nodes and has a capacity
>> of 100 transactions per second then the reporting node could choose
>> to set the rate for each of the reacting nodes to 10 transactions per
>> second.  This, of course, is assuming that each of the reacting nodes
>> has equal performance characteristics.  The reporting node could also
>> choose to have a high-capacity reacting node send 55 transactions per
>> second and the remaining 9 low-capacity reacting nodes send 5
>> transactions per second each.  The ability of the reporting node to
>> specify the amount of traffic on a per-reacting-node basis implies
>> that the reporting node must maintain state for each of the reacting
>> nodes.  This state includes the current allocation of Diameter
>> traffic to that reacting node.  If the number of reacting nodes
>> changes, either because new nodes are added, nodes are removed from
>> service or nodes fail, then the reporting node will need to
>> redistribute the maximum Diameter transactions over the new set of
>> reacting nodes."
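
To make the arithmetic in the proposed text concrete, here is a minimal Python sketch of a weighted split. The function name `allocate_rates`, the node names and the use of relative weights are assumptions for illustration only; the draft leaves the actual allocation to local policy.

```python
def allocate_rates(capacity_tps, weights):
    """Split capacity_tps across reacting nodes in proportion to their weights.

    weights maps a (hypothetical) reacting-node identity to a relative weight.
    Returns a dict of node -> allowed transactions per second (integers).
    """
    if not weights:
        return {}
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Integer division rounds down, so the sum never exceeds the capacity.
    return {node: capacity_tps * w // total for node, w in weights.items()}


# Ten reacting nodes with equal performance characteristics: 10 tps each.
equal = allocate_rates(100, {f"node{i}": 1 for i in range(10)})

# One high-capacity node (weight 11) and nine low-capacity nodes (weight 1):
# 55 tps for the high-capacity node and 5 tps for each of the others.
skewed = allocate_rates(100, {"big": 11, **{f"small{i}": 1 for i in range(9)}})
```

Both splits add up to the 100 tps capacity; only the per-node state (the weight and the resulting rate) differs.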
>
> That's exactly what I had in mind, thanks!
>
>>
>> I also propose the following wording added at the end of section 5.3.
>> Reporting Node Maintenance of Overload Control State
>>
>> "A reporting node that receives a capability announcement from a new
>> reacting node, meaning a reacting node for which it does not have an
>> OCS entry, and that chooses the rate algorithm for that reacting
>> node SHOULD recalculate the rate to be allocated to all
>> reacting nodes.  Any changed rate values will be communicated in the
>> next OLR sent to each reacting node."
>
> I will leave it to you to decide, but I don’t know that the SHOULD is
> necessary. We could treat this as a statement of fact of which people
> need to be aware. (“may need to” or “needs to”).
I like this suggestion.  I'll change it to "may need to".
>
>>
>> Steve
>>  
>> On 9/16/18 12:44 AM, Ben Campbell wrote:
>>>> On Sep 15, 2018, at 8:31 PM, Gunn, Janet P (NONUS) <janet.gunn@gdit.com> wrote:
>>>>
>>>> Just addressing the first point- deleted the rest
>>>>
>>>> -----Original Message-----
>>>> From: DiME <dime-bounces@ietf.org> On Behalf Of Ben Campbell
>>>> Sent: Saturday, September 15, 2018 7:57 PM
>>>> To: draft-ietf-dime-doic-rate-control.all@ietf.org
>>>> Cc: dime@ietf.org
>>>> Subject: Re: [Dime] New version of DOIC rate draft
>>>>
>>>> Substantive Comments:
>>>>
>>>> - General: I still think more discussion is needed about allocating rate to multiple input sources. I get that the actual allocation is a matter of local policy, but there are still implications that need discussion. I’m not sure I got my concern across in previous discussion, so here’s another attempt:
>>>>
>>>> The issue I think needs elaboration is how the offered load varies with the number of sources times the (average) rate per source. That is, if the number of sources changes, the reporting node may need to change the rate limits assigned to each existing source.
>>>>
>>>> As a hypothetical example, let’s assume a reporting node wants to limit its entire offered load to 1000 tps. Further assume it has 10 active reacting nodes (all supporting the rate algorithm). Local policy is to allocate the rate limit equally across sources. So it sends an OLR to each of those clients to give it a rate limit of 100 tps. Now, if another 10 reacting nodes become active, it needs to reallocate the load across all 20, giving each a limit of 50 tps. Now, if some of those reacting nodes go off-line, or simply reduce their activity beneath the limit for an extended period of time, the reporting node may need to increase the allocation to the remaining nodes.
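
The hypothetical above can be sketched in Python as follows. This is only an illustration under the stated equal-split policy; the class `RateAllocator`, its method names and the `clientN` identities are made up for the example, and it does not model nodes that stay active but send less than their limit.

```python
class RateAllocator:
    """Tracks active reacting nodes and splits a total rate equally among them."""

    def __init__(self, total_tps):
        self.total_tps = total_tps
        self.rates = {}  # node identity -> currently allocated tps

    def _reallocate(self, nodes):
        old = self.rates
        share = self.total_tps // len(nodes) if nodes else 0
        self.rates = {node: share for node in nodes}
        # Only nodes whose rate changed need a new value in their next OLR.
        return {n: r for n, r in self.rates.items() if old.get(n) != r}

    def add_nodes(self, new_nodes):
        # dict.fromkeys keeps insertion order and drops duplicates.
        return self._reallocate(list(dict.fromkeys([*self.rates, *new_nodes])))

    def remove_nodes(self, gone):
        gone = set(gone)
        return self._reallocate([n for n in self.rates if n not in gone])


alloc = RateAllocator(1000)
first = alloc.add_nodes(f"client{i}" for i in range(10))           # 100 tps each
changed = alloc.add_nodes(f"client{i}" for i in range(10, 20))     # all 20 get 50 tps
restored = alloc.remove_nodes(f"client{i}" for i in range(15, 20)) # 15 nodes remain
```

Each call returns the nodes whose limit changed, which is the set that would need a new rate value in its next OLR. Note that every membership change here touches every remaining node.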
>>>>
>>>> This is a fairly fundamental difference between rate and load; rate uses absolute numbers while load uses percentages.
>>>>
>>>> <JPG> I am not sure it is as fundamental a difference as you think (in both cases assuming each node has a relatively stable load).
>>>> If there are 10 active reacting nodes, it asks each node to reduce its traffic to (say) 50%.
>>>> If 10 more nodes become active, it will need to ask each node to reduce its traffic to 25%.
>>>> If some of those nodes go offline, or reduce their traffic, it will increase the (percentage) allocation to each remaining node.
>>>>
>>>> The "fundamental" difference is when each node has a fluctuating, unstable, load.
>>>> Under "load" a node which currently has a small load needs to cut its traffic by the SAME percentage as the node which currently has a large load.
>>>> Under "rate" a node which currently has a small load does NOT need to cut its load while the node which currently has a large load DOES need to cut its load.</JPG>
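
A small Python sketch of that contrast, with two reacting nodes carrying unequal load. The function names and traffic figures are invented for illustration, and "loss" here stands for the percentage-based algorithm that the text above calls "load".

```python
def apply_loss(offered_tps, reduction_percent):
    """Percentage-based: every node sheds the same share of its offered traffic."""
    return {n: tps * (100 - reduction_percent) / 100 for n, tps in offered_tps.items()}


def apply_rate(offered_tps, limit_tps):
    """Rate-based: every node is capped at the same absolute limit."""
    return {n: min(tps, limit_tps) for n, tps in offered_tps.items()}


offered = {"quiet": 20, "busy": 180}

after_loss = apply_loss(offered, 50)   # quiet sends 10.0, busy sends 90.0
after_rate = apply_rate(offered, 100)  # quiet still sends 20, busy is capped at 100
```

With the percentage, the quiet node is throttled even though it is nowhere near a problem; with the absolute limit, only the busy node has to cut traffic.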
>>>>
>>> Hi,
>>>
>>> There can be more than one fundamental difference :-)
>>>
>>> I wasn’t talking about how clients behave; I was talking about how the server expresses its intention. If a server wants to reduce the current load by half using the loss algorithm, it doesn’t need to know how many clients there are. Of course, it may have to keep adjusting the percentage if the offered load is not stable. OTOH, if it wants to reduce the load to a specific rate using the rate algorithm, it does need to know how many clients there are.
>>>
>>> I’m not saying this is a flaw in any way. I agree that in many cases an absolute TPS value may be more useful than a relative value. I’m just saying the draft should talk about it.
>>>
>>> Thanks,
>>>
>>> Ben.