Re: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control
<phil.m.williams@bt.com> Thu, 16 June 2011 11:03 UTC
From: <phil.m.williams@bt.com>
To: <ecnoel@research.att.com>
Date: Thu, 16 Jun 2011 12:03:25 +0100
Cc: cj5814@att.com, sip-overload@ietf.org
Subject: Re: [sip-overload] Issue: Multiple algorithms -> Single algorithm
for overload control
Eric,
Thanks for the offer.
Incidentally, one issue that you may already have taken account of is 'resonance' or 'synchronisation'. I was originally made aware of this in the papers by Leonard Forys et al, but I have modelled and verified it by simulation. This can occur with any naive implementation of a restriction algorithm, and occurs when there are a large number of client sources and the load is high and arises quickly.
In the specific case of leaky bucket ('rate-based') it shows up as the bucket fills at all sources being nearly identical, i.e. synchronised (and similarly for windowing).
In the case of proportional rejection ('loss-based') it occurs when the deterministic sequences of admissions and rejections at the sources are in phase.
The net result is that the arrival stream at the overloaded server can be extremely batched and peaky, giving very poor response times at best and lost messages at worst (which is potentially catastrophic).
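A toy illustration of the in-phase effect for proportional rejection, as a sketch only: it assumes slotted time, one request per source per slot, and an "admit every k-th request" sequence at each source. The function name and all numbers are hypothetical and are not taken from any draft.

```python
import random

def peak_and_mean(k, slots, phases):
    """Each source sees one request per slot and admits every k-th one.

    phases[i] is the point in its admit/reject sequence at which source i
    starts. Returns (largest per-slot admitted count, mean per-slot count).
    """
    per_slot = [
        sum(1 for p in phases if (t + p) % k == 0)
        for t in range(slots)
    ]
    return max(per_slot), sum(per_slot) / slots

NUM_SOURCES, K, SLOTS = 1000, 10, 200   # hypothetical values

# Every source starts its sequence at the same point (synchronised).
in_phase = [0] * NUM_SOURCES

# Every source starts at an independently chosen point in its sequence.
rng = random.Random(1)
dephased = [rng.randrange(K) for _ in range(NUM_SOURCES)]

sync_peak, sync_mean = peak_and_mean(K, SLOTS, in_phase)
rand_peak, rand_mean = peak_and_mean(K, SLOTS, dephased)
print(sync_peak, sync_mean, rand_peak, rand_mean)
```

Both cases admit the same mean load, but in the synchronised case the whole admitted load arrives in one slot out of every K, so the peak equals the number of sources.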
For the leaky bucket case a solution that I favour is to randomise the fill when control starts (about the initial fill) and when the bucket empties. The reason for the latter is to deal with the case of a mid-overload abatement followed by another surge, which can have the effect of re-synchronising the fill.
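A minimal sketch of that idea, under my own assumptions: the fill is drawn uniformly from [0, jitter] both when control starts and whenever the bucket is found empty. The class name, the parameter names and the uniform distribution are illustrative choices, not a description of any deployed implementation.

```python
import random

class JitteredLeakyBucket:
    """Leaky-bucket restrictor whose fill is randomised at start and on empty."""

    def __init__(self, leak_rate, capacity, jitter, rng, start_time=0.0):
        self.leak_rate = leak_rate   # sustained admitted requests per second
        self.capacity = capacity     # maximum fill (bounds the burst size)
        self.jitter = jitter         # assumed: keep jitter <= capacity - 1
        self.rng = rng               # per-source random.Random instance
        self.last = start_time
        # Randomise the fill when control starts.
        self.fill = rng.uniform(0.0, jitter)

    def admit(self, now):
        """Return True if a request arriving at time `now` is admitted."""
        # Drain the bucket for the time elapsed since the previous request.
        self.fill -= (now - self.last) * self.leak_rate
        self.last = now
        if self.fill <= 0.0:
            # Bucket emptied (e.g. mid-overload abatement): re-randomise
            # instead of resetting every source to the same zero fill.
            self.fill = self.rng.uniform(0.0, self.jitter)
        if self.fill + 1.0 <= self.capacity:
            self.fill += 1.0
            return True
        return False

# Usage: offer 1000 requests, 1 ms apart, to a bucket limited to 10 per second.
bucket = JitteredLeakyBucket(leak_rate=10.0, capacity=5.0, jitter=2.0,
                             rng=random.Random(7))
admitted = sum(bucket.admit(i * 0.001) for i in range(1000))
print(admitted)
```

Giving each source its own random generator means the fills start out of step, and the re-randomisation on empty keeps them out of step if the load abates and then surges again.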
This may not be the forum to go into this in detail (as a separate thread?), but I would be interested in your solutions...
Regards,
Phil Williams
-----Original Message-----
From: NOEL, ERIC C (ERIC C) [mailto:ecnoel@research.att.com]
Sent: 14 June 2011 15:17
To: Williams,PM,Phil,DEV6 R; vkg@bell-labs.com
Cc: sip-overload@ietf.org; DOLLY, MARTIN C (ATTSI); JOHNSON, CAROLYN R (ATTSI)
Subject: RE: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control
Folks,
If a decision is made to support a rate control algorithm, I could submit a contribution that describes the AT&T rate control algorithm developed in support of the Overload Design team work.
Thanks,
Eric Noel
LMTS
AT&T Labs, Inc.
Rethink Possible
Network Design and Performance Analysis
200 South Laurel Avenue, D5-3D19
Middletown, NJ 07748
P: 732.420.4174
ecnoel@att.com
-----Original Message-----
From: sip-overload-bounces@ietf.org [mailto:sip-overload-bounces@ietf.org] On Behalf Of phil.m.williams@bt.com
Sent: Tuesday, June 14, 2011 7:02 AM
To: vkg@bell-labs.com
Cc: sip-overload@ietf.org
Subject: Re: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control
Vijay,
Thanks for continuing to consider the arguments that I raised supporting multiple restriction algorithms. I would like to make some comments and provide some further proposals below.
First, taking the points that you raised:
'...the basic problems with multiple algorithms are:
1) Interoperability suffers as protocol gets complex.'
I'm not sure that I understand this one - could you perhaps elaborate? In fact I see interoperability as an argument for supporting multiple (in particular rate-based) controls. The usefulness of proportional rejection is not just limited by performance issues; perhaps more importantly, it is not well suited to applying capacity guarantees (SLAs), as I described in my original e-mail. I expect that as SIP becomes more ubiquitous across operator boundaries this will become increasingly important, and I suggest that we should be future-proofing SIP overload control for this reason.
'2) Get out one algorithm now (loss), and later if there is a need to
standardize more add the oc-algo parameter in a backwards
compatible manner.'
Realistically, if this is the approach taken then suppliers will only provide proportional rejection, and it will become difficult/expensive to introduce other restriction methods such as rate-based controls.
'3) Re-negotiating the chosen algorithm is not trivial.'
If support for restriction algorithms is mandatory, i.e. you either support all (3?) algorithms or none at all, then it would just be a case of indicating that overload control is supported. This does not seem to be complex.
'4) Servers will have to maintain negotiated algorithms as pieces
of state for a potentially large set of upstream clients.'
If upstream clients have to support multiple algorithms, then this is not necessary, since it is up to the overloaded server to decide which restriction algorithm to use. The server can be confident that if overload control is supported, then it can choose which algorithm (if no algorithms are supported this is an issue already with the current spec).
I would also like to re-iterate that the 3 client restriction algorithms have been around for a long time, and are already widely deployed (and subject to limited variation in their realisation). Therefore we would expect it to be less contentious to standardise these methods within SIP, but in any case we would expect less sensitivity to design variations.
Regarding another issue that was raised - the computational overhead of adaptation calculation performed by the overloaded server, from practical experience designing and deploying overload controls, I suggest that this should be insignificant compared to the other functions of the server. However there are much more critical constraints:
(1) The inaccuracy introduced by making updates too frequent, due to the stochastic nature of arrivals etc. The updates will be based upon measures such as counts of requests/messages and CPU occupancy. If the number of requests processed is 'too small' then there will be significant inaccuracy in the estimated throughput and therefore in the required control parameters. The main factors here are the inherent limit to the accuracy of measuring CPU occupancy (what is really required is a time-integrated measurement of all CPU activity between two updates), but also the effect of subsequent phases of session processing. By the latter I include mid-session processing and disconnects (which are not controlled by the sources). To overcome these issues it is necessary to measure for sufficiently long and to apply appropriate smoothing. [Note that the mix/complexity may change very suddenly due to surges, disconnect avalanches, etc. These must all be taken into account.]
(2) The number of sources (clients) and the way in which the restriction algorithm state is changed. For example, proportional rejection (loss-based) will typically be implemented as a deterministic sequence of alternating admissions and rejections. When the arrival rate at each source is very low this deterministic sequence is interrupted and has to be changed to reflect the modified control parameter, and how this is done may have a significant impact on the overall admission rate from all sources. If updates are very frequent and the number of sources very large, some sources may not even have received an arrival before the change is made!
Because of these factors there are fundamental limits to how quickly updates can be made. Generally the update time has to be longer when the server throughput is lower (more complex requests) and the number of sources is larger. The problem with proportional restriction is that during the time between updates it is possible that one or more rogue streams suddenly inflate. Although the precise performance depends upon the network configuration, in general an advantage of rate-based controls is that there will be an upper limit to the admission rate even before an update is made (not so for proportional rejection).
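To make the last point concrete, here is a simplified comparison of what a single source admits between two updates. It is only a sketch: it ignores the burst allowance of a real bucket, and the function names and all figures are hypothetical.

```python
def loss_based_admitted(offered_rate, reject_fraction):
    """Proportional rejection: admitted load scales with whatever is offered."""
    return offered_rate * (1.0 - reject_fraction)

def rate_based_admitted(offered_rate, rate_limit):
    """Rate-based control: admitted load is capped whatever is offered."""
    return min(offered_rate, rate_limit)

# Hypothetical figures: at the last update the source offered 200 requests/s
# and the server wanted about 100 requests/s from it.
reject_fraction = 0.5
rate_limit = 100.0

before = (loss_based_admitted(200.0, reject_fraction),
          rate_based_admitted(200.0, rate_limit))

# A rogue stream inflates the offered load tenfold before the next update.
during_surge = (loss_based_admitted(2000.0, reject_fraction),
                rate_based_admitted(2000.0, rate_limit))

print(before, during_surge)
```

Until the next update arrives, the loss-based source passes on half of whatever it is offered, whereas the rate-based source stays at its limit.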
If these factors are not taken into account and the update period is too short, then the likely outcome is unstable control, and this can have a dramatic (bad) effect on throughput.
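As a rough guide to how short is 'too short', the following sketch assumes Poisson arrivals, which is my simplifying assumption and not a claim about real traffic. It ignores the CPU measurement error and later session phases mentioned in (1), so it gives only a lower bound on the inaccuracy. The function names are illustrative.

```python
import math

def relative_error(throughput, update_period):
    """Relative standard deviation of a Poisson count over one update period.

    throughput is in requests per second, update_period in seconds.
    """
    return 1.0 / math.sqrt(throughput * update_period)

def prob_no_arrival(source_rate, update_period):
    """Probability that a Poisson source sees no arrival in one update period."""
    return math.exp(-source_rate * update_period)

# With the 500 ms default feedback period mentioned in this thread:
err = relative_error(throughput=20.0, update_period=0.5)
idle = prob_no_arrival(source_rate=1.0, update_period=0.5)
print(round(err, 2), round(idle, 2))
```

For a server processing 20 requests/s, a 500 ms estimate rests on about 10 requests and has a relative error of roughly 0.32. A source offering 1 request/s has a probability of roughly 0.61 of seeing no arrival at all before the next update.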
To progress the debate further I would like to offer to produce a modified version of draft-ietf-soc-overload-control-02 for comment. I could also (later) submit a recommendation for implementation of proportional rejection and rate-based methods if this is deemed a useful addition to the draft in order to facilitate adoption.
Regards,
Phil Williams
-----Original Message-----
From: sip-overload-bounces@ietf.org [mailto:sip-overload-bounces@ietf.org] On Behalf Of Vijay K. Gurbani
Sent: 10 June 2011 22:34
To: sip-overload@ietf.org
Subject: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control
Folks: During the Prague IETF, Mr. Kaplan reopened the
case for supporting multiple overload control algorithms [1].
Summarizing, the basic problems with multiple algorithms are:
1) Interoperability suffers as protocol gets complex.
2) Get out one algorithm now (loss), and later if there is a need to
standardize more add the oc-algo parameter in a backwards
compatible manner.
3) Re-negotiating the chosen algorithm is not trivial.
4) Servers will have to maintain negotiated algorithms as pieces
of state for a potentially large set of upstream clients.
I cannot say that I am unsympathetic to the above list. If we can
simplify processing, then we should do so.
The main reason to support multiple algorithms appears to have been
the need to not preclude rate-based algorithms [2,3]. However
subsequent discussions in the thread started off by [3] have
resulted in the understanding that rate-based algorithms may have
a slight advantage if the offered load at the client significantly
surges over a period of time << than the overload server's feedback
period. Since the overload server's feedback period is already short
(500ms by default), the advantage possessed by rate-based algorithms
may be moot [5].
Further, minimizing state in the server is an argument to pick
loss-based as the default algorithm instead of rate-based. The
latter requires the server to assign --- and keep track --- of a
share of its overall capacity with each upstream client.
So ... given all this, it may be a good idea to go with one
algorithm, and make that the loss based algorithm.
I would like to have a discussion so we can reach consensus on this
issue and close it for good.
[1] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00564.html
[2] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00565.html
[3] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00600.html
[4] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00606.html
[5] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00606.html
Thanks,
- vijay
--
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{bell-labs.com,acm.org} / vijay.gurbani@alcatel-lucent.com
Web: http://ect.bell-labs.com/who/vkg/
_______________________________________________
sip-overload mailing list
sip-overload@ietf.org
https://www.ietf.org/mailman/listinfo/sip-overload