Re: [Dime] draft-ietf-dime-doic-rate-control-08

"Gunn, Janet P (CNV)" <Janet.Gunn@csra.com> Fri, 25 May 2018 15:17 UTC

From: "Gunn, Janet P (CNV)" <Janet.Gunn@csra.com>
To: Ben Campbell <ben@nostrum.com>, "draft-ietf-dime-doic-rate-control.all@ietf.org" <draft-ietf-dime-doic-rate-control.all@ietf.org>
CC: "dime@ietf.org" <dime@ietf.org>
Thread-Topic: [Dime] draft-ietf-dime-doic-rate-control-08
Thread-Index: AQHT7NclzbNywRlTQ0SaTYZnmhQyRKRAgGhQ
Date: Fri, 25 May 2018 15:17:12 +0000
Message-ID: <fd1b638dfc8f48b5b46b105ac40e5124@CSRRDU1EXM025.corp.csra.com>
References: <8FB01050-7B63-4A1B-B50A-974D0FA448C4@nostrum.com>
In-Reply-To: <8FB01050-7B63-4A1B-B50A-974D0FA448C4@nostrum.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/SNlz7v37HTU5tPOcAIv_BqDINxs>

I am not an author, but I have a strong interest in this I-D.  Comments inline.
Janet

-----Original Message-----
From: DiME <dime-bounces@ietf.org> On Behalf Of Ben Campbell
Sent: Wednesday, May 16, 2018 1:31 AM
To: draft-ietf-dime-doic-rate-control.all@ietf.org
Cc: dime@ietf.org
Subject: [Dime] draft-ietf-dime-doic-rate-control-08

Substantive Comments:

General:
The document seems inconsistent about whether rate limits are only reported during overload conditions, or in advance of overload conditions.

<JPG> I think that would be "local policy" of the serving (reporting) node, and independent of the protocol used to communicate it.  I think most cases would be reactive, but I can see situations where it could be proactive.<JPG>

I’d like to see the need to allocate the rate limit across all potential sources of traffic given some more emphasis. (Maybe a sub-section of its own?)

<JPG> I agree, but again I see that as "local policy" of the serving (reporting) node. In particular, there may be reacting nodes that do not support the rate abatement algorithm.<JPG>


§1:
- “ While this can effectively decrease the load handled by the
   server, it does not directly address cases where the rate of arrival
   of service requests increases quickly."

I think it fails to address cases where the load changes rapidly in either direction, right? At least, the following text seems to say that.

<JPG> I agree.  When there are rapid fluctuations in the offered load, the "loss" algorithm errs in both directions: it throttles TOO MUCH when there is a dip in offered load, and NOT ENOUGH when there is a spike in offered load.<JPG>
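
To illustrate the point with made-up numbers (a minimal sketch, not taken from the draft): assume a server that can handle 100 requests/second, a loss percentage of 50% that was chosen while the offered load was 200/second, and a rate limit of 100/second. The function names and all values are assumptions for illustration only.

```python
def admitted_with_loss(offered, reduction_pct):
    """Loss-style abatement: drop a fixed percentage of whatever is offered."""
    return offered * (100 - reduction_pct) // 100


def admitted_with_rate(offered, max_rate):
    """Rate-style abatement: never send more than max_rate per second."""
    return min(offered, max_rate)


# Hypothetical offered load per second: steady, then a dip, then a spike.
offered_per_second = [200, 100, 400]

loss = [admitted_with_loss(o, 50) for o in offered_per_second]
rate = [admitted_with_rate(o, 100) for o in offered_per_second]

print(loss)  # the dip is over-throttled, the spike is under-throttled
print(rate)  # the rate cap holds in all three cases
```

With these assumed numbers, the fixed percentage sends only 50/second during the dip (the server could have taken all 100) and 200/second during the spike (twice what the server can take), while the rate limit stays at 100/second throughout.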



§3: Does the need for future report types to consider the rate algorithm have IANA implications?

§5.1: The first paragraph indicates state should be kept for every reacting node to which it sends an OLR. But the 5th paragraph can be interpreted to say it sends an OLR to every reacting node with which it has negotiated use of the rate algorithm. (see general comment).

§5.4: The first paragraph seems to suggest the reacting node keeps OCS for every server that has indicated support for the rate algorithm, not just nodes that have sent OLRs. Is that the intent?

§5.6, first paragraph: The MAY seems weak here. I know and agree that we don’t want to force a particular application. But don’t we need to say that if an implementation uses a different algorithm, it MUST have the same behavior as the algorithm in section 7?

<JPG> I think it MUST "limit the message rate to the OC-Maximum-Rate AVP value in units of messages per second" (as stated in 7.3.1).  The algorithm described in the rest of 7.3.1 and 7.3.2 is somewhat more sophisticated, allowing for a smoothing factor (TAU) and prioritization.  I do not think we need to say that the selected algorithm MUST have those features.<JPG>

§7.2, third and 4th paragraphs: I don’t understand what this is trying to say. Please elaborate.

<JPG> 3rd para - Just as a "for instance": if the reacting node has 50/second low priority messages and 50/second high priority messages that it wants to send, and has a rate limit of 75/second, it will send 25/second low priority messages and 50/second high priority messages.  The limit of 75/second applies to the combined stream of high and low priority messages, even though only the low priority messages are being abated.<JPG>
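
The arithmetic in that example can be sketched as follows. This is only an illustration of the numbers above, assuming that high priority traffic is served first and low priority traffic gets whatever remains under the combined limit; the function name is made up.

```python
def abate_by_priority(high_offered, low_offered, max_rate):
    """Split one combined rate limit between two priority classes.

    Assumption: high priority messages are only abated once there are
    no low priority messages left to abate.
    """
    high_sent = min(high_offered, max_rate)
    low_sent = min(low_offered, max_rate - high_sent)
    return high_sent, low_sent


high_sent, low_sent = abate_by_priority(high_offered=50, low_offered=50, max_rate=75)
print(high_sent, low_sent)  # 50 25
```

The combined stream is 75/second, matching the limit, even though all of the abatement fell on the low priority messages.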

<JPG> 4th para - in the same example, it could be that the high priority messages typically require more processing resources (cpu, etc) than the low priority messages (or vice versa).  So cutting the rate to 75/sec may NOT produce the expected reduction in resource usage.<JPG>
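
A worked version of that point, using invented per-message costs (1 unit for low priority, 3 units for high priority; these weights are purely hypothetical and do not come from the draft):

```python
# Hypothetical processing cost per message, in arbitrary resource units.
COST = {"low": 1, "high": 3}


def resource_units(sent_per_second):
    """Total resource units consumed per second for a given traffic mix."""
    return sum(COST[prio] * rate for prio, rate in sent_per_second.items())


before = {"low": 50, "high": 50}   # unthrottled: 100 messages/second
after = {"low": 25, "high": 50}    # throttled to 75 messages/second

message_reduction = 1 - sum(after.values()) / sum(before.values())
resource_reduction = 1 - resource_units(after) / resource_units(before)

print(message_reduction)   # 0.25
print(resource_reduction)  # 0.125
```

Under these assumed weights, a 25% cut in message rate yields only a 12.5% cut in resource usage, because the messages that were dropped were the cheap ones.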


-6th paragraph: “  may receive requests at a rate below its target maximum Diameter  request rate while others above that target rate.  But the resulting request rate presented to the overloaded reporting node will converge towards the target Diameter request rate.”

Why do we expect traffic to converge to the rate limit? It seems like that won't happen if some reporting nodes are not sending at full capacity, unless work can be shifted from the high-rate sources to the slow-rate ones.

<JPG> It would probably be better to say that it "will converge toward a rate at or below the target Diameter request rate."<JPG>
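
A small sketch of why "at or below" is the safer wording, assuming a target of 100/second split evenly across four reacting nodes (the split and the offered loads are invented for illustration):

```python
def aggregate_rate(offered, limits):
    """Total rate reaching the reporting node when each reacting node
    sends the smaller of its offered load and its own rate limit."""
    return sum(min(o, l) for o, l in zip(offered, limits))


limits = [25, 25, 25, 25]          # hypothetical per-node limits, total 100

all_busy = [40, 40, 40, 40]        # every node has more to send than its limit
one_quiet = [10, 40, 40, 40]       # one node is well under its limit

print(aggregate_rate(all_busy, limits))   # 100
print(aggregate_rate(one_quiet, limits))  # 85
```

The aggregate only reaches the target when every reacting node has at least its share of traffic to send; otherwise it settles below the target unless the reporting node reallocates the unused share.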

§7.3.1: paragraph starting with “ In situations where reacting nodes are configured with some knowledge”

that requires knowledge of other traffic sources, not just knowledge of the reporting node.

The example code says to transmit a message if (Xp <= TAU). But the text said the limit was “T+TAU”.

<JPG> I think it is supposed to be "T+TAU"<JPG>
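
To make the difference between the two limits concrete, here is a minimal sketch of a leaky-bucket check in which the limit is a parameter. It is modeled loosely on the quantities named in this thread (T, TAU, Xp); the initial bucket value of 0, the update rule, and the arrival times are assumptions, and it is not a copy of the draft's example code.

```python
def admitted_arrivals(arrivals, T, limit, x0=0.0):
    """Return the arrival times that pass a leaky-bucket check.

    T     : interval between transmissions, 1 / maximum rate (seconds)
    limit : value the emptied bucket level Xp is compared against
    x0    : assumed initial bucket level
    """
    X = x0
    last_sent = arrivals[0]
    admitted = []
    for ta in arrivals:
        Xp = X - (ta - last_sent)      # drain the bucket since the last send
        if Xp <= limit:
            X = max(0.0, Xp) + T       # admitted: add one interval to the bucket
            last_sent = ta
            admitted.append(ta)
        # rejected arrivals leave X and last_sent unchanged
    return admitted


T, TAU = 1.0, 0.5                      # 1 message/second, hypothetical tolerance
arrivals = [0.0, 0.1, 0.2, 0.3, 1.0]

print(admitted_arrivals(arrivals, T, limit=TAU))      # [0.0, 1.0]
print(admitted_arrivals(arrivals, T, limit=T + TAU))  # [0.0, 0.1, 1.0]
```

With these assumed values, a limit of TAU admits no burst at all when TAU is smaller than T, while a limit of T+TAU lets one extra message through at the start. Whichever is intended, the text and the example code need to agree.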

§9: I think the security considerations need more thought. What are the security considerations specific to the rate algorithm? If there aren’t any, then please describe the rationale behind that. But I suspect there are. For example, can this be used for a DoS? Can it be used to help _mitigate_ a DoS? Could one reacting node cause others to be traffic starved?

<JPG> It is possible that a reacting node that does not support overload control could starve the nodes that do support overload control, but this is also true of the loss-based version.<JPG>

Editorial Comments:

General: IDNits returns several issues. Some of those may be errors on its part, but I’m pretty sure some of them are real. Please resolve these.

Requirements: There are instances of lower case “must” and “should”. Please use the new boilerplate from RFC 8174.

§1
- “protect the stability” seems awkward. Maybe “ensure the stability”?
- Also s/ “subjected with” / “subjected to”.

- Please cite the definitions for “reporting node” and “reacting node”. I know they are defined later, but these are somewhat non-intuitive concepts and people will likely stumble over the terms when they are used before they are defined.
- Please expand DOIC and SOC on first mention in the body. (Even if they were expanded in the abstract.)

§2:
 - Definitions of “Diameter Node” and “Diameter Endpoint”: Please use proper citations rather than just referring to the RFC in text. For example: “Diameter Node: A Diameter client, server, or agent.  [RFC6733]”

§4,
- first paragraph:
— “This extension defines”: I think this should say “This document defines…”
— Please consider active voice for the last sentence.

- 2nd paragraph: The first sentence seems awkward. Consider something to the effect of “Since all nodes that support DOIC are required to support the loss algorithm…”

- 3rd paragraph: This paragraph seems to belong as part of the previous paragraph.

- 4th paragraph: “ AVP in the sent to the DOIC reacting nodes”: Missing word(s)?

-5th paragraph: “A reporting node MAY select…” : Is that a new permission, or a statement of fact?

§5.1, third paragraph: The text is not clear whether this means OCS should be maintained per supported application, etc., or that it should maintain state when the rate algorithm is in use on a per supported application, etc., basis.
- 4th paragraph: s/overoload/overload

§5.3: 2nd paragraph: This seems like a redundant restatement of the first paragraph.
- third paragraph: The first sentence is convoluted; can it be broken into simpler sentences?

§6.1.1, definition of " OLR_RATE_ALGORITHM”: Two periods at end of sentence.


§7.1, 2nd paragraph: “ signal one another support for rate-based overload
   control”: This seems awkward; are there missing words?

§7.2, last two paragraphs: The MUSTs do not seem necessary. RFC 2119 keywords should be used when there is some sort of choice or room for error. You don’t need them to define the basic operation of the protocol.

§7.3.1: I found the text hard to follow. It would help to declare all the identifiers and initialization up front, and to present things in more of a stepwise fashion.

- T is effectively a time interval, right? It would help to say that, especially later when you subtract a different time interval from it.

- paragraph 9: Should “admit” be “emit”?

- the example code has several mentions of SIP requests.

§7.3.2: “ Request candidates for reduction, requests not subject to reduction (except under extenuating circumstances when there aren’t any messages in the first category that can be reduced).”: That seems like an awkward way to say that the second category is the set of requests that is only subject to reduction if there are no messages left in the first category.

<JPG> Yes, that is what it means.<JPG>

- “This can be generalized to n priorities using n thresholds for n>2 in the obvious way.”: I suggest you refrain from calling it “obvious”.
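
One possible reading of the n-threshold generalization, as a hedged sketch rather than the draft's definition: each priority level gets its own threshold, and higher priorities get a threshold at least as large, so they are still admitted when lower priorities are already being rejected. The threshold values, the initial bucket level, and the function name are all assumptions.

```python
def admit_with_priorities(arrivals, T, thresholds, x0=0.0):
    """Leaky-bucket check with one threshold per priority level.

    arrivals   : list of (arrival_time, priority), priority 0 is lowest
    T          : interval between transmissions, 1 / maximum rate (seconds)
    thresholds : thresholds[p] is the limit for priority p; assumed to be
                 non-decreasing so higher priorities tolerate a fuller bucket
    """
    X = x0
    last_sent = arrivals[0][0]
    admitted = []
    for ta, prio in arrivals:
        Xp = X - (ta - last_sent)          # drain the bucket since the last send
        if Xp <= thresholds[prio]:
            X = max(0.0, Xp) + T           # only admitted arrivals fill the bucket
            last_sent = ta
            admitted.append((ta, prio))
    return admitted


thresholds = [0.2, 1.0]                    # hypothetical: low, high
arrivals = [(0.0, 0), (0.1, 1), (0.2, 0), (0.3, 1), (2.0, 0)]

print(admit_with_priorities(arrivals, T=1.0, thresholds=thresholds))
# [(0.0, 0), (0.1, 1), (2.0, 0)]
```

Adding a third priority under this reading is just a third entry in the list, which is probably what the draft should spell out instead of calling it obvious.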

§7.3.3: Paragraph starting with “ Then (only) if the arrival is admitted, increase the bucket by an amount…”: I think you increase the bucket _count_, right?












