Re: [Dime] WGLC #1 for draft-ietf-dime-load-02

"Trottin, Jean-Jacques (Nokia - FR)" <jean-jacques.trottin@nokia.com> Thu, 30 June 2016 11:00 UTC

Return-Path: <jean-jacques.trottin@nokia.com>
X-Original-To: dime@ietfa.amsl.com
Delivered-To: dime@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2C90C12D0FD for <dime@ietfa.amsl.com>; Thu, 30 Jun 2016 04:00:17 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -6.902
X-Spam-Level:
X-Spam-Status: No, score=-6.902 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_HI=-5, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 6cE7bVsUZOp3 for <dime@ietfa.amsl.com>; Thu, 30 Jun 2016 04:00:14 -0700 (PDT)
Received: from smtp-fr.alcatel-lucent.com (fr-hpida-esg-02.alcatel-lucent.com [135.245.210.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 4DB1612D76F for <dime@ietf.org>; Thu, 30 Jun 2016 04:00:14 -0700 (PDT)
Received: from fr712umx3.dmz.alcatel-lucent.com (unknown [135.245.210.42]) by Websense Email Security Gateway with ESMTPS id 79DEBD33B7EB1; Thu, 30 Jun 2016 11:00:09 +0000 (GMT)
Received: from fr711usmtp1.zeu.alcatel-lucent.com (fr711usmtp1.zeu.alcatel-lucent.com [135.239.2.122]) by fr712umx3.dmz.alcatel-lucent.com (GMO-o) with ESMTP id u5UB0BAC015032 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Thu, 30 Jun 2016 11:00:11 GMT
Received: from FR712WXCHHUB03.zeu.alcatel-lucent.com (fr712wxchhub03.zeu.alcatel-lucent.com [135.239.2.74]) by fr711usmtp1.zeu.alcatel-lucent.com (GMO) with ESMTP id u5UB03we014640 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=FAIL); Thu, 30 Jun 2016 13:00:11 +0200
Received: from FR712WXCHMBA12.zeu.alcatel-lucent.com ([169.254.8.32]) by FR712WXCHHUB03.zeu.alcatel-lucent.com ([135.239.2.74]) with mapi id 14.03.0195.001; Thu, 30 Jun 2016 12:59:33 +0200
From: "Trottin, Jean-Jacques (Nokia - FR)" <jean-jacques.trottin@nokia.com>
To: Maria Cruz Bartolome <maria.cruz.bartolome@ericsson.com>, Steve Donovan <srdonovan@usdonovans.com>, "dime@ietf.org" <dime@ietf.org>
Thread-Topic: [Dime] WGLC #1 for draft-ietf-dime-load-02
Thread-Index: AQHRtdE4tMkqClz2NEuuQj0gjsd2sp/qPcJQgAnlegCAAXp7AIAHXWgAgAOaJgCAADIrgIABPp8g
Date: Thu, 30 Jun 2016 10:59:32 +0000
Message-ID: <E194C2E18676714DACA9C3A2516265D29D520AC0@FR712WXCHMBA12.zeu.alcatel-lucent.com>
References: <5b31616d-efa3-ac03-8f1c-bd8883a35d65@gmail.com> <087A34937E64E74E848732CFF8354B9219758407@ESESSMB101.ericsson.se> <3e2082d80d8e45caaca581c9dcc98468@CSRRDU1EXM025.corp.csra.com> <71796571-c370-cae8-d456-9d2dfb02544c@usdonovans.com> <087A34937E64E74E848732CFF8354B921975C3F4@ESESSMB101.ericsson.se> <71ffc339-37e0-e4fd-a16e-59da7fe23b6d@usdonovans.com> <087A34937E64E74E848732CFF8354B921975E5AB@ESESSMB101.ericsson.se>
In-Reply-To: <087A34937E64E74E848732CFF8354B921975E5AB@ESESSMB101.ericsson.se>
Accept-Language: fr-FR, en-US
Content-Language: fr-FR
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [135.239.27.41]
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/sg3rQk_3pxn0I-aeVp7_AmkgXus>
Subject: Re: [Dime] WGLC #1 for draft-ietf-dime-load-02
X-BeenThere: dime@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: Diameter Maintanence and Extentions Working Group <dime.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/dime>, <mailto:dime-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/dime/>
List-Post: <mailto:dime@ietf.org>
List-Help: <mailto:dime-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/dime>, <mailto:dime-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 30 Jun 2016 11:00:17 -0000

Dear all
 
About discussion regarding draft-ietf-dime-load-02, I was in line with the new 02 version Steve distributed some time ago. I here reviewed Maria Cruz comments and Steve's reactions.
 
Globally I remain in line with the Steve's hereafter comments. My main comment is on .5 about server capacity. The other updates have for me no protocol impact and were mainly wording enhancements  and  are worthwhile for me .
Please see my few comments in line (with JJ>). Main one is in 5. about the capacity topic
 
I would also take this opportunity to indicate that due to my job evolution, my colleague Balint Uveges will follow the load/overload dime aspects in IETF.

Best regards
 
JJacques

-----Message d'origine-----
De : DiME [mailto:dime-bounces@ietf.org] De la part de Maria Cruz Bartolome
Envoyé : mercredi 29 juin 2016 19:19
À : Steve Donovan; dime@ietf.org
Objet : Re: [Dime] WGLC #1 for draft-ietf-dime-load-02

Hello Steve,
Thanks for the responses, see some more comments below Best regards /MCruz


On 6/27/16 2:18 AM, Maria Cruz Bartolome wrote:
>
>> 4.1
>> Now:
>>      None of this prevents a Diameter node from deciding to reduce the
>>      offered load based on load information.   .
>>
>> Proposed
>>     (remove)
>>
>> Reasoning:
>> This sentence is not properly linked to previous paragraph and it is 
>> covered by previous paragraph already
>>
>> <JPG> OK with this, though not sure it is necessary to delete.</JPG>
> SRD> This sentence adds emphasis to the point that a similar result 
> SRD> can
> happen between load and overload, leading into the next sentence outlining the fundamental difference between the two.  I don't see the harm in leaving it, even if what is says is implied by the previous paragraph.
> MCRUZ> My problem with the sentence is that it is not straight forward to what refers "none of this". The reader will look above to check what it refers to... and it seems to be the whole paragraph, i.e. the differences between load and overload. But this sentence refers again to something that is mentioned above. Then, I think the sentence, as it is, is misleading that turns reading a bit unease.
SRD> How about: "A Diameter node can, however, decide to reduce offered
load based on load information."
MCRUZ> Fine

>> 5.
>> Now
>>      The second big difference between DOIC and Load is visibility of the
>>      DOIC or Load information within a Diameter network.  DOIC information
>>      is sent end-to-end resulting in the ability of all nodes in the path
>>      of the answer message that carries the OC-OLR AVP to act on the
>>      information.  The DOIC overload reports much remain in the message
>>      all the way from the reporting node to the node that is the target
>>      for the answer message.
>>
>>      For the Load mechanism there are two types of load reports.
>>
>>      The first is the load of the endpoint sending the answer message.
>>      This load report is carried end-to-end to enable any nodes that make
>>      server selection decisions to use the load status of the sending
>>      endpoint as part of  the server selection decision.
>>
>>      The second type of load report is a peer report.  This report is used
>>      by Diameter nodes as part of the logic to select the next hop
>>      Diameter node and, as such, do not have significance beyond the peer
>>      node.  These load reports are removed by the first supporting
>>      Diameter node to receive the report.
>>
>> Proposed:
>>      The second big difference between DOIC and Load is visibility of the
>>      DOIC or Load information within a Diameter network.  DOIC information
>>      is sent end-to-end resulting in the ability of all nodes in the path
>>      of the answer message that carries the OC-OLR AVP to act on the
>>      information, *although only one node can actually consume the report*.  The DOIC overload reports much remain in the message
>>      all the way from the reporting node to the node that is the target
>>      for the answer message.
> SRD> How about "although only one node actually reacts to the report",
> changing consume to react.
> MCRUZ> I think "consume" is better since it implies that from then on the report is removed.
SRD> How about consume and react?  "although only one node actually
consumes and reacts to the report"
MCRUZ> Fine
JJ> this would be the only place (here and I also think in DOIC RFC) where "consume" word is used and raising the question what "consume" means, in particular this does not imply to remove the report. "react" was OK for me but no opposition to "consume and react".  

>
>>      *However,* for the Load mechanism there are two types of load reports *and only the
>>       first one is transmitted end-to-end*.
> SRD> This is covered in the following paragraphs.
> MCRUZ> Yes, but I think we need an introduction for the analysis below, in order to understand we are going to compare. Trying to ease reading.
SRD> Okay.
...
>
> 5.
> Now
>     The goal is make it possible to use both the load values received as
>      a part of the Diameter Load mechanism and weight values received as a
>      result of a DNS SRV query.  As a result, the Diameter load value has
>      a range of 0-65535.  This value and DNS SRV weight values are then
>      used in a distribution algorithm similar to that specified in
>      [RFC2782].
>
> Comments:
> In order to have an efficient load balancing algorithm, it is not enough for the reacting node (for the node in charge of load balancing) to know the Load of each server, but it needs to know the load in relation to each server capacity. Unless we do so, the Load value of a server can't be compared with the Load of a Server with a different weight.
> Then, in my opinion, we need to find a way to provide a Load value that is in fact comparable with the rest of the Load values of the servers in the group.
> Reflecting a bit longer on this, I think we need then to define a group of servers in the load-balancing group, like a load-balancing context, and then, for all servers in such a group we need to provide a relative value of dynamic Load.
>
> <JPG> Agree with the thought- if "Little Server" is 30% utilized and 
> "Big Server" is 50% utilized, it still makes sense to send more 
> traffic to Big Server.  But I am not sure if that is withn the scope 
> of this document. </JPG>
> SRD> I don't understand the concern.  The load values supplied will be
> input into the route selection algorithm as specified in RFC2782.  If 
> a node isn't getting enough traffic it will change its load value to a 
> lower value and will start getting more traffic.
> MCRUZ> Unless the LOAD info provided is in fact a value that represents the available capacity, then the load balancing will not select the less loaded server. Being able to select the less loaded server is the whole purpose of this mechanism, then we need to find a way to provide a LOAD value from different servers that we are able to compare, i.e. the value provide must indicate the available capacity regardless the static capacity of each server.
SRD> I view the goal of this a little differently.  The goal is to make
sure that requests are delivered to nodes with available capacity.  It is not strictly necessary that every request goes to the least loaded node.
MCRUZ> Well, I do not agree. The whole purpose of providing LOAD info is to be able to choose a node with available  load (I agree), but among the node with available load we need to choose the least loaded (or one of the least loaded). It does not make sense, in my opinion, to simply select a node with available load, when we are providing info about load. The information provided should be valid to be able to select the least (or close to) loaded.


> Providing an example, let me use dynamic Load (say DL) in % (100% is totally loaded) that I found it easier for calculation:
> - Server1: weight=1500; DL= 2%
> - Server2: weight=55000; DL= 70%
> Then, if we only use DL in the LB algorithm, obviously Server 1 seems to be clearly less loaded, but however, taking into account its weight is much smaller it may be the other way around. In fact, if traffic is redirected to this server, it may get overloaded rapidly (due to its small capacity).
> One possible way to calculate the relative DL is  to divide it by the weight, then for this example:
> - Server1 RDL= 10000 * (2/1500) = 13.33
> - Server2 RDL= 10000 * (70/55000) = 12.73
> (I multiplied by 10000 simply to get rid of the decimals for our discussion).
> Then, we actually find out that available load for both servers is pretty similar. In fact, in this case, a correct load balancing should select Server2 as the less loaded server instead of server1.
> My proposal is to consider this reflection in the draft, and then make a clear distinction between dynamic load (DL) and RELATIVE DL. We need to provide the RDL in the message, not DL.
SRD> This is about how the load value is calculated which is explicitly 
stated as being an implementation decision.
MCRUZ> Not exactly. We need to reflect in the draft that the LOAD provided should be the relative available load, taking into account the static weight. This is the only way we are providing a load value that can possibly be used by a client to LOAD-balance. 
I could accept that we leave the way to do so up to implementations.
Proposal: "LOAD should be calculated in a way that reflects the available load independently of the weight of each server, in order to allow the Diameter node that performs server selection to accurateraly compare values from different servers, i.e. LOAD value identifies the same amount of available capacity, regardless the server that has calculate it. "

JJ> I think we can remain with only the relative load information  proposed in the draft even when servers have different capacity. If a small capacity node sees its incoming traffic increasing it will quickly react by sending a higher load value, which, when received by a sending node, will become higher than the values  received from other nodes with higher capacity. Sending node will then reduce the traffic towards this node to ensure load balancing and will continue to adjust according to the load values received. This seems a simple loopback mechanism ensuring a right load balancing.
Do you think it is not sufficient? To add capacity value (weight, group of servers) ..)increases the increases the complexity given this capacity may vary over time (eg a partial failure) so possibly requiring to be dynamically updated. 
More sophisticated behaviors can be introduced (implementation specific) without impacting the protocol and AVPs specified in the draft. If justified requirements drive to new AVPs, this could be part of future evolution, out of the scope of the present draft .
>
>
>> 5.
>> Now
>>      The load report includes the relative load of the sending node.  This
>>      relative load is specified in a manner consistent with that defined
>>      for DNS SRV [RFC2782].
>>
>> Proposed:
>>      The load report includes a value to identify the load of the sending node,
>>     specified in a manner consistent with that defined
>>      for DNS SRV [RFC2782].
>>
>> <JPG> Agree. </JPG>
> SRD> I don't understand the need for this change.
> MCRUZ> Using "relative" is misleading unless we clarify "relative to what".
SRD> Okay.  How about a small change to: "The load report includes a 
value **indicating** the load of the sending node..."
MCRUZ> Fine

JJ> "relative" also used in the proposed update definition of "load" in section 2, which is consistent with the definition of the Load-Value AVP in 7.3., so for me relative is not misleading.     
>>
...
>> 6.1.1
>> Now:
>>      The method for determining the load value included in the load report
>>      is an implementation decision.
>>
>> Comments:
>> In line to comment above, I agree it should be implementation specific, but we need to provide some guidance to be able to provide a value that could be used to achieve a successful load balancing.
> SRD> See my comment above about DNS SRV algorithm.
> MCRUZ> This is related to my comment above to 5, but to the part related to a way to provide a LOAD value that represents the available capacity of a server, taking into account its static capacity.
SRD> Okay, I'll propose some text, based on your example, in the next 
version of the draft.  This would be a non normative example of how 
someone might compute the load value.
MCRUZ> Including the example is fine. Although above I proposed some normative text as well that I think we need to consider.

JJ> see my above comments to 5. The sender adjusts its traffic according the evolution of the received load values.  

...
>
>> 7.3
>> Now:
>>      The Load-Value AVP is specified in a manner similar to the weight
>>      value in DNS SRV ([RFC2782]).
>>
>>      The Load-Value has a range of 0-65535.
>>
>>      A higher value indicates a lower load on the sending node.  A lower
>>      value indicates that the sending node is heavily loaded.
>>
>>         Stated another way, a node that has zero load would have a load
>>         value of 65535.  A node that is 100% loaded would have a load
>>         value of 0.
>>
>> Comments:
>> I think it could be easier to use a %. It is more straight forward to figure out what it means.
> SRD> Percentage can be mapped to the range 0-65535 if that is the
> internal implementation decision.  The goal here is to be consistent
> with RFC2782.
> MCRUZ> Why do we need to keep consistency to that RFC? I think it is clearer to use a percentage, it is more straight forward to identify the available load we refer to.
SRD> This was discussed and agreed to early in the process of writing 
this mechanism.  There a a couple of reasons, first, its an algorithm 
that has already been specified and implemented.  Second, it  allows 
someone who has already implemented the DNS SRV algorithm to reuse it.  
Third, while RFC6733 doesn't directly address load 
balancing/distribution, it does reference use of DNS SRV for handling 
dynamic connections.  It is not unreasonable to expect that there are 
implementations would use the DNS SRV value for nodes that don't support 
load, along with load values received.
MCRUZ> I do not remember a discussion about this, sorry. I had the impression it was incorporated without much discussion.
However, I do not see that it helps reusing DNS SRV. Does an implementation take profit of anything previously implemented for DNS SRV when deciding what load value include in the AVP? I think the server will calculate the LOAD, and then it needs to reflect a value from totally-available to totally-loaded. It is more straight forward and more intuitive to simply use 0-100 as you agreed below.
I did not consider the comment before, sorry, but I think now it can be easily changed and simplify all the implementations and interpretation of LOAD value, don't you think?

> E.g. 50% loaded, using SRV is 32767,5;  25% is 49151,25;  and so on.
> In the mechanism we are defining we do not have the need to keep using a complex value like this one, when we can simply use 0 to 100%, 0 (totally available), 100 (totally loaded).
> In fact, this is in line to the definition in the doc:
SRD> I really don't want to revisit this decision this late in the 
game.  While not as intuitive to a casual reader of the specification as 
a percentage value might be, Using the DNS SRV value works.
...
JJ> Preliminary version of the draft started with %, then Steve proposed to use the same definition as with SRV which didn't raise comments, I am OK to remain on Steve proposal.
_______________________________________________
DiME mailing list
DiME@ietf.org
https://www.ietf.org/mailman/listinfo/dime