Re: [Dime] New version of DOIC rate draft

Ben Campbell <> Fri, 21 September 2018 16:07 UTC

Cc: "Gunn, Janet P (NONUS)" <>, "" <>, Eric Noel <>
To: Steve Donovan <>

> On Sep 21, 2018, at 11:02 AM, Steve Donovan <> wrote:
> Thanks for the continued review.  I'll deal with the first issue in this email.  I'll handle the remaining issues in (a) separate email(s).
> For the first issue, I propose the following changes:
> I propose adding the following wording between the 6th and 7th paragraph of the introduction section:
> "It should be noted that one of the implications of the rate based algorithm is that the reporting node needs to determine how it wants to distribute its load over the set of reacting nodes from which it is receiving traffic.  For instance, if the reporting node is receiving Diameter traffic from 10 reacting nodes and has a capacity of 100 transactions per second, then the reporting node could choose to set the rate for each of the reacting nodes to 10 transactions per second.  This, of course, assumes that each of the reacting nodes has equal performance characteristics.  The reporting node could also choose to have a high-capacity reacting node send 55 transactions per second and the remaining 9 low-capacity reacting nodes send 5 transactions per second each.  The ability of the reporting node to specify the amount of traffic on a per-reacting-node basis implies that the reporting node must maintain state for each of the reacting nodes.  This state includes the current allocation of Diameter traffic to that reacting node.  If the number of reacting nodes changes, either because new nodes are added, nodes are removed from service, or nodes fail, then the reporting node will need to redistribute the maximum Diameter transactions over the new set of reacting nodes."

That's exactly what I had in mind, thanks!
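Just to make sure we're on the same page, here is a rough sketch of the redistribution behavior the proposed text describes. This is purely illustrative: the class, the names (ReportingNode, capacity_tps, redistribute), and the equal-share policy are my own, not anything from the draft.

```python
# Illustrative sketch only: a reporting node that keeps per-reacting-node
# state and redistributes its total rate whenever the node set changes.
# All names here are hypothetical, not taken from the draft.

class ReportingNode:
    def __init__(self, capacity_tps):
        self.capacity_tps = capacity_tps
        self.allocations = {}  # reacting-node id -> allocated rate (tps)

    def redistribute(self, reacting_nodes):
        """Split the total capacity equally over the current node set."""
        if not reacting_nodes:
            self.allocations = {}
            return
        share = self.capacity_tps / len(reacting_nodes)
        self.allocations = {node: share for node in reacting_nodes}


reporting = ReportingNode(capacity_tps=100)
reporting.redistribute([f"reacting-{i}" for i in range(10)])
# 10 nodes -> 10 tps each; if a node fails or a new one appears,
# calling redistribute() over the new set rebalances the shares.
```

An unequal split (e.g. 55 tps for one high-capacity node and 5 tps each for the other 9) would just replace the equal-share line with a weighted policy; the state-keeping implication is the same either way.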

> I also propose the following wording, added at the end of Section 5.3, "Reporting Node Maintenance of Overload Control State":
> "A reporting node that receives a capability announcement from a new reacting node, meaning a reacting node for which it does not have an OCS entry, and that chooses the rate algorithm for that reacting node, SHOULD recalculate the rate to be allocated to all reacting nodes.  Any changed rate values will be communicated in the next OLR sent to each reacting node."

I will leave it to you to decide, but I don’t know that the SHOULD is necessary. We could treat this as a statement of fact of which people need to be aware. (“may need to” or “needs to”).
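Either way, the behavior that sentence describes might look something like this sketch (hypothetical names throughout; `ocs` stands in for the reporting node's per-reacting-node Overload Control State, and the equal-share recalculation is an assumption, not the draft's policy):

```python
# Illustrative sketch only: on a capability announcement from a reacting
# node with no OCS entry, create the entry and recalculate every node's
# rate over the enlarged node set. Names are hypothetical.

def on_capability_announcement(ocs, node_id, capacity_tps):
    """ocs: dict mapping reacting-node id -> allocated rate (tps)."""
    if node_id not in ocs:
        ocs[node_id] = 0.0                 # new node: create its OCS entry
        share = capacity_tps / len(ocs)
        for known in ocs:                  # recalc over the new node set
            ocs[known] = share
    return ocs
```

The changed values would then go out in the next OLR to each reacting node, as the proposed text already says, whether the recalculation is a SHOULD or just a stated fact.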

> Steve
> On 9/16/18 12:44 AM, Ben Campbell wrote:
>>> On Sep 15, 2018, at 8:31 PM, Gunn, Janet P (NONUS) <> wrote:
>>> Just addressing the first point- deleted the rest
>>> -----Original Message-----
>>> From: DiME <> On Behalf Of Ben Campbell
>>> Sent: Saturday, September 15, 2018 7:57 PM
>>> To: <>
>>> Cc: <>
>>> Subject: Re: [Dime] New version of DOIC rate draft
>>> Substantive Comments:
>>> - General: I still think more discussion is needed about allocating rate to multiple input sources. I get that the actual allocation is a matter of local policy, but there are still implications that need discussion. I’m not sure I got my concern across in previous discussion, so here’s another attempt:
>>> The issue I think needs elaboration is how the offered load varies with the number of sources times the (average) rate per source. That is, if the number of sources changes, the reporting node may need to change the rate limits assigned to each existing source.
>>> As a hypothetical example, let’s assume a reporting node wants to limit its entire offered load to 1000 tps. Further assume it has 10 active reacting nodes (all supporting the rate algorithm). Local policy is to allocate the rate limit equally across sources. So it sends an OLR to each of those clients to give it a rate limit of 100 tps. Now, if another 10 reacting nodes become active, it needs to reallocate the load across all 20, giving each a limit of 50 tps. And if some of those reacting nodes go off-line, or simply reduce their activity beneath the limit for an extended period of time, the reporting node may need to increase the allocation to the remaining nodes.
>>> This is a fairly fundamental difference between rate and load; rate uses absolute numbers while load uses percentages.
>>> <JPG> I am not sure it is as fundamental a difference as you think. (In both cases I am assuming each node has a relatively stable load.)
>>> If there are 10 active reacting nodes, it asks each node to reduce its traffic to (say) 50%.
>>> If 10 more nodes become active, it will need to ask each node to reduce its traffic to 25%.
>>> If some of those nodes go offline, or reduce their traffic, it will increase the (percentage) allocation to each remaining node.
>>> The "fundamental" difference is when each node has a fluctuating, unstable, load.
>>> Under "load" a node which currently has a small load needs to cut its traffic by the SAME percentage as the node which currently has a large load.
>>> Under "rate" a node which currently has a small load does NOT need to cut its load while the node which currently has a large load DOES need to cut its load.</JPG>
>> Hi,
>> There can be more than one fundamental difference :-)
>> I wasn’t talking about how clients behave, I was talking about how the server expresses its intention. If a server wants to reduce the current load by half using the loss algorithm, it doesn’t need to know how many clients there are. Of course, it may have to keep adjusting the percentage if the offered load is not stable. OTOH, if it wants to reduce the load to a specific rate using the rate algorithm, it does need to know how many clients there are.
>>  I’m not saying this is a flaw in any way. I agree that in many cases an absolute TPS value may be more useful than a relative value. I’m just saying the draft should talk about it.
>> Thanks,
>> Ben.