Re: [Dime] New version of DOIC rate draft

Steve Donovan <srdonovan@usdonovans.com> Fri, 21 September 2018 16:57 UTC

To: dime@ietf.org, Eric Noel <ecnoel@research.att.com>
References: <a7a5c833-427c-acd4-1502-675ce3c1bbac@usdonovans.com> <6725C490-0449-45BE-BF74-0B937D72CD96@nostrum.com>
From: Steve Donovan <srdonovan@usdonovans.com>
Message-ID: <ce806a23-48f8-dfb4-e184-3f809cfa182e@usdonovans.com>
Date: Fri, 21 Sep 2018 11:58:21 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <6725C490-0449-45BE-BF74-0B937D72CD96@nostrum.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/SefiaAxV55SkQGTSGBBrYQybxBk>
Subject: Re: [Dime] New version of DOIC rate draft

I've dealt with the majority of the remaining issues in this email.

There are a few questions that deal with section 7 that I have left for
Eric to address.

Steve

On 9/15/18 6:56 PM, Ben Campbell wrote:
> Hi, thanks for posting the update; I think it is making progress. However,  I still have some comments I would like to address before IETF LC:
>
> Thanks!
>
> Ben.
>
> Substantive Comments:
>
> - General: I still think more discussion is needed about allocating rate to multiple input sources. I get that the actual allocation is a matter of local policy, but there are still implications that need discussion. I’m not sure I got my concern across in previous discussion, so here’s another attempt:
>
> The issue I think needs elaboration is how the offered load varies with the number of sources times the (average) rate per source. That is, if the number of sources changes, the reporting node may need to change the rate limits assigned to each existing source.
>
> As a hypothetical example, let’s assume a reporting node wants to limit its entire offered load to 1000 tps. Further assume it has 10 active reacting nodes (all supporting the rate algorithm). Local policy is to allocate the rate limit equally across sources. So it sends an OLR to each of those clients to give it a rate limit of 100 tps. Now, if another 10 reacting nodes become active, it needs to reallocate the load across all 20, giving each a limit of 50 tps. Now, if some of those reacting nodes go off-line, or simply reduce their activity beneath the limit for an extended period of time, the reporting node may need to increase the allocation to the remaining nodes.
>
> This is a fairly fundamental difference between rate and load; rate uses absolute numbers while load uses percentages.
SRD> See previous email.
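
For reference, the equal-split arithmetic in the hypothetical above can be
sketched as follows.  This is only an illustration of one possible local
policy.  The function name allocate_equal_rate and the round-down
behavior are assumptions made for the example; neither comes from the
draft or from RFC7683.

```python
def allocate_equal_rate(total_rate_tps, active_nodes):
    """Split a reporting node's total rate limit equally across reacting nodes.

    Hypothetical helper: an equal split is just one possible local policy.
    Returns a dict mapping node id -> per-node limit in tps.
    """
    if not active_nodes:
        return {}
    # Assumption: round down so the sum never exceeds the total limit.
    per_node = total_rate_tps // len(active_nodes)
    return {node: per_node for node in active_nodes}


# 10 active reacting nodes sharing 1000 tps.
nodes = [f"client{i}" for i in range(10)]
print(allocate_equal_rate(1000, nodes)["client0"])

# Another 10 become active, so every existing allocation has to change.
nodes += [f"client{i}" for i in range(10, 20)]
print(allocate_equal_rate(1000, nodes)["client0"])
```

The point of the sketch is that the per-node value depends on the set of
active nodes, so any change in that set means recomputing every
allocation.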
> §5.1:
>
> - The text is still not clear when a reporting node must send the first OLR. I understand that the choice of when to set a rate limit is local policy, but some of the text in this section suggests to me that you expect a reporting node to send an OLR immediately when it selects the rate algorithm for a specific reacting node.
>
> For example, paragraph 6 says:
>
> "A reporting node that has selected the rate overload abatement algorithm MUST indicate the rate requested to be applied by DOIC reacting nodes in the OC-Maximum-Rate AVP included in the OC-OLR AVP."
>
> This needs to talk about _when_ it must do that. Without some comment, it seems like it means “when it selects the algorithm” (i.e. “when you indicate support”). That seems to conflict with the idea of this being local policy.
SRD> Well, it must indicate the rate any time an OLR is sent.  When an
OLR is sent is specified in RFC7683, Section 5.2.3.  Here's part of that
section (paragraph 1), included to illustrate that no additional
wording is needed in the rate draft:

   If there is an active OCS entry, then a reporting node SHOULD include
   the OC-OLR AVP in all answers to requests that contain the
   OC-Supported-Features AVP and that match the active OCS entry.
>
> - new 5th paragraph: Why SHOULD instead of MUST? Is there a situation where you have an OCS but no allocated rate? (e.g. you’ve selected the rate algorithm for the reacting node, but have not sent an OLR?)
SRD> There is a possible implementation that calculates the rate for each
reacting node by dividing the maximum rate by the number of active
nodes.  With this implementation the OCS would not need to store the
rate.  I did see an error in that paragraph, and as such, I've changed
it as follows:

Old text:

   The rate OCS entery SHOULD include the rate allocated to each
reacting note.

New text:

   The rate OCS entry SHOULD include the rate allocated to the reacting
node.

>
> §5.4, paragraph 1:
>
> Discussion indicated that the intent of this paragraph was that the reacting node keeps OCS for each server that indicated support for the rate algorithm. But I don’t see text that says when the reacting node needs to actually create the state entry. I think the answer is “immediately when a reporting node indicates support”, right?
SRD> This is covered by the following paragraph from RFC7683:

   If the received OLR is for a new overload condition, then a reacting
   node MUST generate a new OCS entry for the overload condition.
>
> §5.6, first two paragraphs:
>
> I still think the text talking about using different algorithms needs to say something normative about the characteristics of those algorithms. Janet’s comments indicated the normative text is in §7.3.1. But that’s part of the algorithm that they MAY use, so it would not be constraining against other algorithm choices. Perhaps the 2nd paragraph should be normative?
SRD> How about the following change in paragraph 2:

Old text:

      Note: Other algorithms for controlling the rate can be implemented
      by the reacting node as long as they result in the correct rate of
      traffic being sent to the reporting node.

New text:

   Other algorithms for controlling the rate MAY be implemented
   by the reacting node.  Any algorithm implemented MUST result
   in the correct rate of traffic being sent to the reporting node.
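
To illustrate what an alternative algorithm could look like on the
reacting node, here is a minimal sketch of a generic token bucket.  It is
not the algorithm described in Section 7.3.1 of the draft.  The class
name, the burst parameter, and the simulated clock are all assumptions
made for the example.

```python
class TokenBucketLimiter:
    """Generic token bucket; a hypothetical stand-in for a rate algorithm."""

    def __init__(self, rate_tps, burst, now=0.0):
        self.rate_tps = float(rate_tps)  # rate taken from the OLR (assumed)
        self.burst = float(burst)        # bucket capacity (assumed parameter)
        self.tokens = float(burst)       # start with a full bucket
        self.last = float(now)           # time of the previous update

    def allow(self, now):
        """Return True if a request arriving at time `now` may be sent."""
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at the capacity.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_tps)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_admitted(limiter, offered_tps, duration_s):
    """Offer evenly spaced requests and count how many are admitted."""
    admitted = 0
    for i in range(int(offered_tps * duration_s)):
        if limiter.allow(i / offered_tps):
            admitted += 1
    return admitted


# Offer 1000 requests over one second against a 100 tps limit.
limiter = TokenBucketLimiter(rate_tps=100, burst=10)
admitted = count_admitted(limiter, offered_tps=1000, duration_s=1.0)
print(admitted)
```

Over any window, a token bucket admits at most the burst size plus the
rate times the window length, so the sent traffic stays at or below the
requested rate apart from the configured burst.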
>
> §7.2: “ But the resulting request rate presented to the overloaded reporting node will converge towards the target Diameter request rate.”
>
> Wasn’t there discussion to change this to “... the target Diameter request rate or a lower rate”?
SRD> Agreed, change made.
>
> §7.3: "In situations where reacting nodes are configured with some knowledge
>    about the reporting node (e.g., operator pre-provisioning), it can be
>    beneficial to choose a value of TAU based on how many reacting nodes
>    will be sending requests to the reporting node.”
>
> I previously commented that this requires knowledge of other traffic sources, not just the reporting node. I did not see a response.
SRD> I don't understand the issue.  I'm hoping Eric can comment or you
can clarify.
>
> Editorial Comments:
>
> [I note a number of editorial comments that fell into the “made changes unless indicated otherwise” category did not seem to get changed. I included those here again.]
SRD> My apologies if I missed these in the last refresh.
>
> - There are still things reported by IDNits that need checking. Some are obviously noise, but some appear to be real. (Line length, references in abstract, and references to RFC 5226)
SRD> There were no issues reported by IDNits when I submitted the
document.  Is there a different IDnits check than the one used when
submitting drafts?
>
> - Was there a reason not to use the RFC 8174 boilerplate in the “Requirements” section? (I thought you had intended to do so.)
SRD> Yes, this slipped through the cracks.  Change made.
>
> §1:
> - first paragraph: There were some editorial fixes that I thought were agreed that did not appear in the new version:
> s/“protect the stability”/“ensure the stability”
SRD> Change made
> s/“subjected with”/“subjected to”
SRD> Change made
> (new comment): In the new last sentence, is there a reason for the all-caps? That’s normally reserved for normative keywords.
SRD> I don't remember.  I've changed it to all lowercase.
>
> §4:
> - first paragraph: Please consider active voice in the last sentence.
SRD> Old text:

   This document defines the rate abatement algorithm (referred to as
   rate in this document) feature.  Support for the rate feature by a
   DOIC node will be indicated by a new value of the OC-Feature-Vector
   AVP, as described in Section 6.1.1, per the rules defined in
   [RFC7683].

New Text:

   This document defines the rate abatement algorithm (referred to as
   rate in this document) feature.  A DOIC node indicates support for the
   rate feature by indicating a new value of the OC-Feature-Vector
   AVP, as described in Section 6.1.1, per the rules defined in
   [RFC7683].
>
> §5.1:
> - New 5th paragraph:
> s/entery/entry
SRD> Change made.
>
> §7.1, 2nd paragraph: “ signal one another support for rate-based overload
>   control”: This seems awkward; are there missing words? Perhaps there should be something like “their” or “that they” between “another” and “support”?
SRD> Changed from:

   Following the procedures defined in [RFC7683], the reacting node and
   reporting node signal one another support for rate-based overload
   control.

To:

   Following the procedures defined in [RFC7683], the reacting node and
   reporting node signal their support for rate-based overload
   control.
>
> §7.2, last two paragraphs: The MUSTs do not seem necessary. 2119 keywords should be used when there is some sort of choice or room for error. You don’t need them to define the basic operation of the protocol.
SRD> I'm okay with removing the MUSTs.  Here is the proposed change:

Old:

   Upon detection of overload, and the determination to invoke overload
   controls, the reporting node MUST follow the specifications in
   [RFC7683] to notify its clients of the allocated target maximum
   Diameter request rate and to notify them that the rate overload
   abatement is in effect.

   The reporting node MUST use the OC-Maximum-Rate AVP defined in this
   specification to communicate a target maximum Diameter request rate
   to each of its clients.

New:

   Upon detection of overload, and the determination to invoke overload
   controls, the reporting node follows the specifications in
   [RFC7683] to notify its clients of the allocated target maximum
   Diameter request rate and to notify them that the rate overload
   abatement is in effect.

   The reporting node uses the OC-Maximum-Rate AVP defined in this
   specification to communicate a target maximum Diameter request rate
   to each of its clients.
>
> §7.3.1: I found the text hard to follow. It would help to declare all the identifiers and initialization up front, and to present things in more of a stepwise fashion.
>
> - T is effectively a time interval, right? It would help to say that, especially later when you subtract a different time interval from it.
SRD> I'll leave this for Eric to handle.
>
> - paragraph 9: Should “admit” be “emit”?
>
> - the example code has several mentions of SIP requests.
SRD> These have been changed to Diameter requests
>
> §7.3.2:
>
> - “ Request candidates for reduction, requests not subject to reduction (except under extenuating circumstances when there aren’t any messages in the first category that can be reduced).”:
>
> That seems like an awkward way to say that the second category is the set of requests that is only subject to reduction if there are no messages left in the first category.
>
> - “ This can be generalized to n priorities using n thresholds for n>2 in the obvious way.”: I suggest you refrain from calling it “obvious”.
>
> §7.3.3: Paragraph starting with “ Then (only) if the arrival is admitted, increase the bucket by an amount…”: I think you increase the bucket _count_, right?
SRD> I'll leave these for Eric to handle.
>
>
>> On Sep 10, 2018, at 3:44 PM, Steve Donovan <srdonovan@usdonovans.com> wrote:
>>
>> I've posted a new version of the rate draft.
>>
>> I've attached the diff file.
>>
>> Regards,
>>
>> Steve
>> <Diff  draft-ietf-dime-doic-rate-control-08.txt - draft-ietf-dime-doic-rate-control-09.txt.html>
>> _______________________________________________
>> DiME mailing list
>> DiME@ietf.org
>> https://www.ietf.org/mailman/listinfo/dime