[Dime] Fwd: Feedback on overload set of docs

Hannes Tschofenig <hannes.tschofenig@gmx.net> Mon, 29 July 2013 12:07 UTC

From: Hannes Tschofenig <hannes.tschofenig@gmx.net>
Date: Mon, 29 Jul 2013 14:06:37 +0200
To: dime@ietf.org
Cc: Sebastien Decugis <sdecugis@gmail.com>
Subject: [Dime] Fwd: Feedback on overload set of docs

Hi all, 

I asked Sebastien to provide feedback about the overload documents I recently submitted to the working group to determine how difficult it would be to implement them in freeDiameter. Sebastien is the person behind freeDiameter (as most of you know). 

Ciao
Hannes

> From: ext Sebastien
> Sent: 7/21/2013 17:35
> To: Tschofenig, Hannes (NSN - FI/Espoo)
> Subject: Feedback on overload set of docs
> 
> Hi Hannes,
> 
> I have read the set of documents for overload 
> (draft-ietf-dime-overload-reqs-07, 
> draft-tschofenig-dime-overload-arch-00, draft-tschofenig-dime-dlba-00, 
> draft-tschofenig-dime-overload-piggybacking-00)
> 
> Concerning the load-balancing application, it should be quite 
> straightforward to implement this. There are already some metrics 
> available in freeDiameter to get a hint of the current load, and there 
> is also all the support in freeDiameter to implement any load-balancing 
> algorithm you like. I even have an example of a load-balancing 
> application [1] that is based on the load of the different backend 
> servers (but measured by the number of pending requests for those 
> servers, not by information received remotely). Note that for the part 
> about PPID, you'd have to change some code in the lower layer of 
> freeDiameter, as such control is not provided through the normal API of 
> the framework (at the moment at least).
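
The idea of balancing on the number of pending requests per backend can be sketched as follows. This is a minimal illustration in Python, not the rt_load_balance code from [1]; the names `Backend`, `LeastPendingBalancer`, `pick` and `answered` are hypothetical and do not correspond to any freeDiameter API.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    pending: int = 0  # requests forwarded but not yet answered


class LeastPendingBalancer:
    """Hypothetical balancer: route each request to the least busy backend."""

    def __init__(self, names):
        self.backends = {name: Backend(name) for name in names}

    def pick(self):
        # Fewest pending requests wins; ties are broken by name so the
        # choice is deterministic.
        best = min(self.backends.values(), key=lambda b: (b.pending, b.name))
        best.pending += 1
        return best.name

    def answered(self, name):
        # Called when the answer for a forwarded request comes back.
        backend = self.backends[name]
        if backend.pending > 0:
            backend.pending -= 1
```

The load is measured locally, as in the example extension described above; a remotely reported load value could replace `pending` in the sort key if such information were available.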
> 
> The requirements document distinguishes between the "global" load of a 
> Diameter node (let's call this Base Protocol Load) and a specific 
> application load. All the metrics I have in freeDiameter concern only 
> the base protocol load. If a specific application wanted to use some 
> mechanism to report its own load, it would have to implement this logic 
> itself. On the other hand, I don't see how your DLBA fits with this 
> requirement either...
> 
> ...I guess your plan is to address this with the piggybacked AVPs in the 
> application's messages. Note that for this piggy-backed mechanism, I do 
> not really see the value of the "Supported-Feature" AVP: it is not 
> important for the client to signal the support of the mechanism to the 
> server, as it can ignore the piggybacked AVPs anyway if it does not 
> understand them.
> 
> For implementing the piggybacking, it is easy to add an AVP to a 
> message, but it is more difficult to know when / where to do it. Today 
> all the applications provided in freeDiameter consume the requests 
> synchronously, so their load state directly matches the "base protocol 
> load" (i.e. as soon as one application becomes overloaded, the complete 
> node is overloaded). If you want to have an application-specific load, 
> you need to quickly "consume" the request received by your application 
> callback and just store it for asynchronous processing. Then your 
> application load state will be the status of this asynchronous 
> processing queue. This requires some additional logic, and I do not have 
> any example available publicly at the moment -- although I know some 
> users have implemented such logic successfully already.
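
The "consume quickly, process asynchronously" pattern described above might look roughly like the sketch below. It is written in Python for brevity and assumes a hypothetical `AsyncApp` class; none of these names are freeDiameter functions. The 0 to 10 scale mirrors the Load AVP range mentioned later in this message, and mapping queue depth onto it linearly is an assumption of this sketch.

```python
import queue


class AsyncApp:
    """Hypothetical application that decouples reception from processing."""

    def __init__(self, max_pending):
        if max_pending < 1:
            raise ValueError("max_pending must be at least 1")
        self.pending = queue.Queue(maxsize=max_pending)

    def on_request(self, request):
        # Application callback: store the request and return immediately.
        # Returns False when the application queue is full (overloaded).
        try:
            self.pending.put_nowait(request)
            return True
        except queue.Full:
            return False

    def load(self):
        # Application load on a 0..10 scale, derived from queue depth.
        return (self.pending.qsize() * 10) // self.pending.maxsize

    def process_one(self, handler):
        # Would normally run in a worker thread; raises queue.Empty
        # when there is nothing to process.
        request = self.pending.get_nowait()
        return handler(request)
```

With this split, the value returned by `load()` is what an application could piggyback in its answers, independently of the base protocol load.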
> 
> I hope this answers your question :-)
> 
> I have some additional comments on 
> http://tools.ietf.org/html/draft-tschofenig-dime-overload-arch-00 (just 
> FYI, feel free to ignore those)
> 
> 3. Architectural Principles:
> I don't understand the issue with rejecting Requests ("requires 
> signaling to the Diameter clients") => this is exactly the same as when 
> the authentication fails for example. Today when I try to use my phone 
> during new year's eve, the call just fails, I don't receive a status 
> that the network is too busy (and I would not care). The information 
> that it is failing because of overload is only useful to the Diameter 
> nodes that could take a corrective action, such as sending to another 
> backend. As you know Diameter base protocol already has a status 
> DIAMETER_TOO_BUSY. An agent that receives this status can choose to 
> resend the request to another backend or forward the error to the 
> client. A Diameter client that receives this error can treat it in the 
> same way as a failure to authenticate, since it is an error, and just 
> deny the access to the user -- and an error message can be included as 
> well in the Diameter answer, that can be forwarded to the connecting device.
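
The agent behaviour described above (retry on another backend, otherwise forward the error) can be sketched as follows. The result code values 2001 (DIAMETER_SUCCESS) and 3004 (DIAMETER_TOO_BUSY) are those of the Diameter base protocol; the function `route_request`, the `send` callable and the dictionary-shaped answer are assumptions made purely for illustration.

```python
DIAMETER_SUCCESS = 2001   # Result-Code value from the Diameter base protocol
DIAMETER_TOO_BUSY = 3004  # Result-Code value from the Diameter base protocol


def route_request(request, backends, send):
    """Try each backend in order until one does not answer TOO_BUSY.

    `send(backend, request)` is a hypothetical callable returning a dict
    with at least a "result_code" key.
    """
    answer = None
    for backend in backends:
        answer = send(backend, request)
        if answer["result_code"] != DIAMETER_TOO_BUSY:
            return answer  # success or an unrelated error: stop here
    # Every backend was busy: forward the last error to the client
    # (None if no backend was configured at all).
    return answer
```

A client receiving the forwarded DIAMETER_TOO_BUSY answer can then handle it like any other failure, as argued above.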
> 
> 5.2.3. Overload AVP.
> I am not sure "increasing overload" and "decreasing overload" makes any 
> sense. You can quantify the load, e.g. from 0% to 100%; after that you are in 
> the overload state, where the Diameter server is not able to handle any new 
> request until it has processed the previous ones (because there is a 
> resource shortage). That's consistent with your Load AVP from 0 to 10. 
> Or we have a different definition of the overload state that I am not 
> familiar with.
> 
> In freeDiameter, the situation is as follows:
> - a given server has a given processing capability, resulting in a given 
> maximum throughput (the limit).
> - if the messages are incoming with a rate lower than this limit, 
> everything is well.
> - when there is a burst of messages that goes over the limit rate, the 
> messages start to be queued and the processing is slightly delayed. The 
> more the queue length increases, the more the messages are delayed.
> - there is a limit to the maximum length of the queue in freeDiameter, 
> once this limit is reached the framework does not read from the network 
> socket until the next message is processed -- the lower layer congestion 
> control will then start to kick in.
> - when I measure the "load" of the framework, it is the length of the 
> queue. I know I am overloaded if the length is equal to the maximum for 
> the queue.
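
The queueing behaviour in the list above can be modelled with a small bounded queue. This is only a toy model in Python under the assumptions stated in the list; `FrameworkQueue` and its methods are hypothetical names, not part of freeDiameter.

```python
from collections import deque


class FrameworkQueue:
    """Toy model of a bounded incoming-message queue."""

    def __init__(self, max_len):
        self.max_len = max_len
        self.items = deque()

    def can_read_socket(self):
        # Once the queue is full, the framework stops reading from the
        # socket and lower-layer congestion control takes over.
        return len(self.items) < self.max_len

    def receive(self, msg):
        if not self.can_read_socket():
            return False  # message stays in the transport buffers
        self.items.append(msg)
        return True

    def process_next(self):
        # Processing one message frees a slot and allows reading again.
        return self.items.popleft() if self.items else None

    def load(self):
        # "Load" is simply the current queue length.
        return len(self.items)

    def overloaded(self):
        # Overloaded means the queue has reached its maximum length.
        return len(self.items) == self.max_len
```

In this model there is no notion of "increasing" or "decreasing" overload: the node is either below the limit, with a measurable load, or at the limit.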
> 
> 
> I am also not sure that the Overload information can be useful for the 
> Diameter client (when you write: "The increase and decrease of the 
> sending rate refers to new requests to the same realm and the same 
> application ID as the message carrying the overload information."). In 
> some real-life setups I have seen, the requests are dispatched by an 
> agent to different backends in the same realm depending on, e.g., the 
> User-Name (some users having been migrated to a new server). So, even if 
> one backend is overloaded, it does not mean that the complete realm 
> cannot answer anymore. Do you expect an agent to intercept this AVP in 
> such situation?
> 
> [1] rt_load_balance extension. The complete code takes less than 100 
> lines with comments: 
> http://www.freediameter.net/hg/freeDiameter-proposed/file/e72c9dad62ac/extensions/rt_load_balance/rt_load_balance.c
> 
> Cheers,
> Sebastien.
>