Re: [Dime] WGLC #1 for draft-ietf-dime-load-02
"Gunn, Janet P" <Janet.Gunn@csra.com> Tue, 21 June 2016 16:16 UTC
From: "Gunn, Janet P" <Janet.Gunn@csra.com>
To: Maria Cruz Bartolome <maria.cruz.bartolome@ericsson.com>, "jouni.nospam@gmail.com" <jouni.nospam@gmail.com>, "dime@ietf.org" <dime@ietf.org>
Thread-Topic: [Dime] WGLC #1 for draft-ietf-dime-load-02
Thread-Index: AQHRtdFbxNTUAf5iuEyzblc0BoJml5/yf3oAgAG3a2A=
Date: Tue, 21 Jun 2016 16:16:00 +0000
Message-ID: <3e2082d80d8e45caaca581c9dcc98468@CSRRDU1EXM025.corp.csra.com>
References: <5b31616d-efa3-ac03-8f1c-bd8883a35d65@gmail.com> <087A34937E64E74E848732CFF8354B9219758407@ESESSMB101.ericsson.se>
In-Reply-To: <087A34937E64E74E848732CFF8354B9219758407@ESESSMB101.ericsson.se>
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/agMSZp7nQgiISYHg-8J57FuVshY>
Subject: Re: [Dime] WGLC #1 for draft-ietf-dime-load-02
Comments in line <JPG>

-----Original Message-----
From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Maria Cruz Bartolome
Sent: Monday, June 20, 2016 5:14 AM
To: jouni.nospam@gmail.com; dime@ietf.org
Subject: Re: [Dime] WGLC #1 for draft-ietf-dime-load-02

Hello all,

I would like to provide some questions, proposed changes and typos, see in different sections to ease reading.

Best regards
/MCruz

=========== SOME QUESTIONS ===========:

Appendix A. Topology Scenarios

Does it really make sense to keep an appendix that states: "Nothing in this section should be construed to mean that a given scenario is in scope for this effort, or even a good idea."

I think we need to keep only the scenarios that are "in scope of this effort", what I understand as "suitable for load conveyance as stated in this draft". If some of them are not considered suitable for any reason, I presume they should be removed.

<JPG> Or note as (counter) examples of scenarios NOT suitable.</JPG>

A.10. Addition and removal of Nodes

Shouldn't this part of the annex be in the regular body of the draft?

=========== PROPOSED CHANGES ===========:

Abstract:

Now:
This document defines a mechanism for *sharing* of Diameter load information.

Proposed:
This document defines a mechanism for *conveying* Diameter load information.

Reasoning: *Sharing* may be a bit misleading.

<JPG> Agree. Conveying is better. </JPG>

1. Introduction:

Now:
In particular, DOIC does not fulfill Req 24, which requires a mechanism where Diameter nodes can indicate their *current load* , even if they are not currently overloaded. DOIC also does not fulfill Req 23, which requires that *nodes that divert traffic* away from overloaded nodes be provided with sufficient information to select targets that are most likely to have sufficient capacity.

Proposal: I think we need to include the exact requirement text from RFC7068, since the description you use does not keep the exact meaning. E.g.
*current load* should be replaced by *load levels*; *nodes that divert traffic* is in fact *nodes with traffic diversion capability*. Better, just list requirements. If an interpretation is required, this is fine, but the original text is important to be kept:

REQ 23: The solution MUST provide sufficient information to enable a load-balancing node to divert messages that are rejected or otherwise throttled by an overloaded upstream node to other upstream nodes that are the most likely to have sufficient capacity to process them.

REQ 24: The solution MUST provide a mechanism for indicating load levels, even when not in an overload condition, to assist nodes in making decisions to prevent overload conditions from occurring.

<JPG> Agree. It would make sense to have a section, or even an appendix, which lists the requirements, and notes which are/are not met. </JPG>

1. Introduction

Now:
There are several other requirements in [RFC7068] that mention both overload and load information that are only partially fulfilled by DOIC. [....] This document defines a mechanism that addresses the load-related requirements from RFC 7068.

Proposal:
We need to list the requirements we refer to. They are not listed anywhere, right? I think we refer to following Requirements:

REQ 1: The solution MUST provide a communication method for Diameter nodes to exchange load and overload information.

REQ 2: The solution MUST allow Diameter nodes to support overload control regardless of which Diameter applications they support. Diameter clients and agents must be able to use the received load and overload information to support graceful behavior during an overload condition. Graceful behavior under overload conditions is best described by REQ 3.

REQ 12: When a single network node fails, goes into overload, or suffers from reduced processing capacity, the solution MUST make it possible to limit the impact of the affected node on other nodes in the network.
This helps to prevent a small-scale failure from becoming a widespread outage.

REQ 34: The solution SHOULD provide a method for exchanging overload and load information between elements that are connected by intermediaries that do not support the solution.

<JPG> Agree. See above comment. </JPG>

2. Terminology and abbreviations

Now:
Load
The *relative capacity of a Diameter node*. A low load level indicates that the Diameter node is under utilized. A high load level indicates that the node is closer to being fully utilized.

Proposed:
Load
The *Diameter message processing capacity of a node*. A low load level indicates that the Diameter node is under utilized. A high load level indicates that the node is closer to being fully utilized.

Reasoning: I think using "relative" is misleading.

<JPG> I do not like either. "Capacity" is what the node can do. "Available capacity" is actually HIGH when there is a low load level, and LOW when there is a high load level. If you want to avoid "Utilization", which implies an explicit calculation, you could say "the relative usage of the Diameter message processing capacity". </JPG>

4.1

Now:
Second, Overload information, in the form of a DOIC Overload Report (OLR) [RFC7683] indicates an explicit request for action on the part of the reacting node. That is, the OLR requests that the reacting node reduce the offered load -- the actual traffic sent to the reporting node after overload abatement and routing decisions are made -- by an indicated amount *or to an indicated level*.

Proposed:
Second, Overload information, in the form of a DOIC Overload Report (OLR) [RFC7683] indicates an explicit request for action on the part of the reacting node.
That is, the OLR requests that the reacting node reduce the offered load -- the actual traffic sent to the reporting node after overload abatement and routing decisions are made -- by an indicated amount *(by default, or other optional abatement algorithms).*

- Or remove everything after "amount".

<JPG> RFC7683 is clear that the Overload Report may be used to trigger EITHER a loss based algorithm, or a different (e.g. rate based) algorithm. So the summary here should not be restricted to a loss-based description. Perhaps "--by an indicated amount (by default), or as prescribed by the selected abatement algorithm." </JPG>

4.1

Now:
None of this prevents a Diameter node from deciding to reduce the offered load based on load information.

Proposed: (remove)

Reasoning: This sentence is not properly linked to previous paragraph and it is covered by previous paragraph already.

<JPG> OK with this, though not sure it is necessary to delete.</JPG>

4.2

Now:
Req 24 discusses how Diameter load information might be used when no overload condition currently exists. Diameter nodes can use the load information to make decisions to try to avoid overload conditions in the first place. Normal load-balancing falls into this category. A node might also take other proactive steps to reduce offered load based on load information, so that the loaded node never goes into overload in the first place.

Proposed:
Req 24 discusses how Diameter load information might be used when no overload condition currently exists. Diameter nodes can use the load information to make decisions to try to avoid overload conditions in the first place. Normal load-balancing falls into this category, but the diameter node can take other proactive steps as well.

<JPG> Agree </JPG>

4.2

Now:
If the loaded nodes are Diameter servers (or clients in the case of server-to-client transactions), both of these uses are most effectively accomplished by a Diameter node that performs server selection.
Proposed:
If the loaded nodes are Diameter servers (or clients in the case of server-to-client transactions), both of these *load information* uses *should be* accomplished by a Diameter node that performs server selection.

Reasoning: Diverting traffic can only be performed by a node that performs server selection, or?

<JPG> Agree in principle, but I think that "..both of these uses of load information should be ..." reads better than "... both of these load information uses should be ...". </JPG>

5.

Now:
The second big difference between DOIC and Load is visibility of the DOIC or Load information within a Diameter network. DOIC information is sent end-to-end resulting in the ability of all nodes in the path of the answer message that carries the OC-OLR AVP to act on the information. The DOIC overload reports much remain in the message all the way from the reporting node to the node that is the target for the answer message. For the Load mechanism there are two types of load reports. The first is the load of the endpoint sending the answer message. This load report is carried end-to-end to enable any nodes that make server selection decisions to use the load status of the sending endpoint as part of the server selection decision. The second type of load report is a peer report. This report is used by Diameter nodes as part of the logic to select the next hop Diameter node and, as such, do not have significance beyond the peer node. These load reports are removed by the first supporting Diameter node to receive the report.

Proposed:
The second big difference between DOIC and Load is visibility of the DOIC or Load information within a Diameter network. DOIC information is sent end-to-end resulting in the ability of all nodes in the path of the answer message that carries the OC-OLR AVP to act on the information, *although only one node can actually consume the report*.
The DOIC overload reports much remain in the message all the way from the reporting node to the node that is the target for the answer message. *However,* for the Load mechanism there are two types of load reports *and only the first one is transmitted end-to-end*. The first is the load of the endpoint sending the answer message. This load report is carried end-to-end to enable any nodes that make server selection decisions to use the load status of the sending endpoint as part of the server selection decision. *More than one node may make use of the load information received* The second type of load report is a peer report. This report is used by Diameter nodes as part of the logic to select the next hop Diameter node and, as such, do not have significance beyond the peer node. These load reports are removed by the first supporting Diameter node to receive the report.

<JPG> Slightly different comment. I think the phrase "The DOIC overload reports much remain in the message..." is a typo and should be "The DOIC overload reports must (or MUST?) remain in the message.." </JPG>

5.

Now:
The goal is make it possible to use both the load values received as a part of the Diameter Load mechanism and weight values received as a result of a DNS SRV query. As a result, the Diameter load value has a range of 0-65535. This value and DNS SRV weight values are then used in a distribution algorithm similar to that specified in [RFC2782].

Comments: In order to have an efficient load balancing algorithm, it is not enough for the reacting node (for the node in charge of load balancing) to know the Load of each server, but it needs to know the load in relation to each server capacity. Unless we do so, the Load value of a server can't be compared with the Load of a Server with a different weight. Then, in my opinion, we need to find a way to provide a Load value that is in fact comparable with the rest of the Load values of the servers in the group.
Reflecting a bit longer on this, I think we need then to define a group of servers in the load-balancing group, like a load-balancing context, and then, for all servers in such a group we need to provide a relative value of dynamic Load.

<JPG> Agree with the thought- if "Little Server" is 30% utilized and "Big Server" is 50% utilized, it still makes sense to send more traffic to Big Server. But I am not sure if that is within the scope of this document. </JPG>

5.

Now:
The load report includes the relative load of the sending node. This relative load is specified in a manner consistent with that defined for DNS SRV [RFC2782].

Proposed:
The load report includes a value to identify the load of the sending node, specified in a manner consistent with that defined for DNS SRV [RFC2782].

<JPG> Agree. </JPG>

5.

Now:
The distribution algorithm used by Diameter nodes supporting the Diameter Load mechanism is an implementation decision but it needs to result in similar behavior as the algorithm specified in [RFC2782].

Proposed:
The distribution algorithm used by Diameter nodes supporting the Diameter Load mechanism is an implementation decision but it needs to result in similar behavior as the algorithm *described for the use of weigth values in* [RFC2782].

<JPG> Agree in principle. NIT- replace "similar behavior as" with "similar behavior to", and replace "weigth" with "weight". </JPG>

(End of my comments)

5.1

Now:
If Agent A4 supports the Load mechanism then it will verify that the load information received is valid. For a HOST load report this is achieved by matching the identity included in the load information with the identity of the host node from which the answer message was received.

Comments: A4 behaviour should be defined generically. In the example, we know S[n] is a peer of A4, but generically A4 will not know it when receiving a HOST report.
Then, for an AgentX the HOST load report is valid as long as it is responsible for server selection, as explained for A1 below:

A1's actions depend on whether A1 is responsible for doing server selection. If A1 is not doing server selection then A1 ignores the HOST load report. If A1 is responsible for doing server selection then it stores the load information for S[n] in its routing information for the handling of subsequent request messages. In both cases A1 leaves the HOST report in the message

6.1.1

Now:
The method for determining the load value included in the load report is an implementation decision.

Comments: In line with the comment above, I agree it should be implementation specific, but we need to provide some guidance to be able to provide a value that could be used to achieve a successful load balancing.

6.2

Now:
If the Diameter node is responsible for doing server selection then it SHOULD save the load value included in the Value AVP included in the Load AVP of type HOST in its routing information.

Proposed:
If the Diameter node is responsible for doing server selection then it SHOULD save the load value included in the Value AVP included in the Load AVP of type HOST.

Reasoning: It is a bit misleading to state that it should be stored "in its routing information". It has to be used for server selection, regardless of "how" and "where" it is stored.

7.3

Now:
The Load-Value AVP (AVP code TBD3) is of type Unsigned64. It is used to convey relative load information about the sender of the load report.

Comments: *Relative load* It seems it refers to what I commented before, about the "relative dynamic load", in that comment it is relative to the weight. But as the draft is now, I think it is misleading, since it is not clear to what it refers.

7.3

Now:
The Load-Value AVP is specified in a manner similar to the weight value in DNS SRV ([RFC2782]). The Load-Value has a range of 0-65535. A higher value indicates a lower load on the sending node.
A lower value indicates that the sending node is heavily loaded. Stated another way, a node that has zero load would have a load value of 65535. A node that is 100% loaded would have a load value of 0.

Comments: I think it could be easier to use a %. It is more straightforward to figure out what it means.

=========== TYPOS ===========:

2. Terminology and abbreviations

Routing Information Routing Information - Routing information referred to in this document can include the Routing and Peer tables defined in RFC 6733. It can also include other implementation specific tables used to store load information. This document does not define the structure of such tables.

Remove *Routing information* duplicated sentence.

4.1

At any given time that load *maybe* effectively zero

*May be*

5.1

Because the load report is *an* HOST load report, A4 leaves the load report in the message it relays.

5.1

A1 then calculates its own load information and inserts load information AVPs of type PEER in the message before sending the message to *A1*

*A1* should be C

6.1.1

For instance, if the only consumer of the load reports is the * endpoints peer* then the endpoint can choose to only include a load report when the load of the endpoint has changed by a meaningful percentage. If there are consumers of the endpoint load report other *thaen* the *endpoints peer* (this will be the case if other nodes are responsible for server selection) then the endpoint might choose to include load reports in all answer messages as a way of ensuring that all nodes doing server selection get accurate load information.

*endpoint's peer*

6.2

A Diameter node MUST be prepared to process load reports of type HOST *and* of type PEER

6.2

Note that the node needs to be able to handle messages with no load reports, messages with just a PEER load report, messages with just *an* HOST load report and messages with both types of load reports.
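
To make the Load-Value discussion in 7.3 and the "Little Server" / "Big Server" point above a bit more concrete, here is a minimal Python sketch, not taken from the draft. It assumes a linear mapping between utilization and the 0-65535 Load-Value (only the two endpoints are stated in the quoted text), and it assumes, purely for illustration, that a reacting node combines the Load-Value with a static capacity weight by multiplying them before doing a weighted random pick in the spirit of the RFC 2782 weight handling. The names load_value_from_utilization and pick_server, the capacity figures, and the seed are all hypothetical.

```python
import random

MAX_LOAD_VALUE = 65535  # top of the 0-65535 range quoted from section 7.3


def load_value_from_utilization(utilization):
    """Map utilization in [0.0, 1.0] to a Load-Value.

    Zero load maps to 65535 and 100% load maps to 0, as quoted in 7.3.
    The linear mapping in between is an assumption of this sketch.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return round((1.0 - utilization) * MAX_LOAD_VALUE)


def pick_server(candidates, rng):
    """Weighted random pick in the spirit of the RFC 2782 weight handling.

    candidates maps a server name to (load_value, capacity_weight).
    Multiplying the two is a hypothetical way to combine them; the draft
    text quoted in this thread does not specify it.
    """
    weights = {name: lv * cap for name, (lv, cap) in candidates.items()}
    total = sum(weights.values())
    if total == 0:
        # Every candidate is fully loaded or has zero weight: pick uniformly.
        return rng.choice(sorted(candidates))
    point = rng.randint(1, total)  # uniform integer in [1, total]
    running = 0
    for name in sorted(weights):
        running += weights[name]
        if running >= point:
            return name


# Hypothetical figures: Big Server has four times Little Server's capacity.
servers = {
    "little": (load_value_from_utilization(0.30), 1),
    "big": (load_value_from_utilization(0.50), 4),
}
rng = random.Random(2016)  # seeded so repeated runs give the same tallies
tally = {"little": 0, "big": 0}
for _ in range(10000):
    tally[pick_server(servers, rng)] += 1
print(tally)
```

Under these assumed figures Big Server ends up with the larger share of picks even though it reports the lower Load-Value, which is the behaviour the comparability comment is asking for. Whether the capacity weight should come from DNS SRV, from configuration, or be folded into the reported value is exactly the open question in the thread.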

-----Original Message-----
From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of Jouni Korhonen
Sent: Tuesday, May 24, 2016 17:30
To: dime@ietf.org
Subject: [Dime] WGLC #1 for draft-ietf-dime-load-02

Folks,

This email starts the WGLC #1 for draft-ietf-dime-load-02. Please, review the document, post your comments to the mailing list and also insert them into the Issue Tracker with your proposed resolution.

WGLC starts: 5/24/2016
ends: 6/7/2016 EOB PDT

- Jouni & Lionel

_______________________________________________
DiME mailing list
DiME@ietf.org
https://www.ietf.org/mailman/listinfo/dime