Re: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control

<phil.m.williams@bt.com> Thu, 16 June 2011 13:01 UTC

From: <phil.m.williams@bt.com>
To: <vkg@bell-labs.com>
Date: Thu, 16 Jun 2011 14:01:08 +0100
Message-ID: <E4B3F0DC6D953D4EBEC223BC86FE322C4A4812B410@EMV04-UKBR.domain1.systemhost.net>
References: <4DF28DE3.6080804@bell-labs.com>
Cc: sip-overload@ietf.org
Subject: Re: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control

Vijay,

To follow up on my e-mail below, I am producing a proposed revision of draft-ietf-soc-overload-control-02 for comment, and I expect to send this out for consideration early next week.

So far it looks like there will be a considerable simplification, since essentially the overloaded SIP server is now master and there is no longer any need for negotiation of the restriction algorithm at the SIP client. It should also make future evolution more straightforward both for the spec and for deployments.

There may be some additional complexity in specifying the algorithms themselves, but it is not clear to me to what extent the spec is expected to nail them down.

Regards,

Phil Williams

-----Original Message-----
From: Williams,PM,Phil,DEV6 R 
Sent: 14 June 2011 12:02
To: 'Vijay K. Gurbani'
Cc: sip-overload@ietf.org
Subject: RE: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control

Vijay,

Thanks for continuing to consider the arguments that I raised in support of multiple restriction algorithms. I would like to make some comments and offer some further proposals below.

First, taking the points that you raised:

'...the basic problems with multiple algorithms are:

1) Interoperability suffers as protocol gets complex.'

I'm not sure that I understand this one; perhaps you could elaborate? In fact, I see interoperability as an argument for supporting multiple (in particular rate-based) controls. The usefulness of proportional rejection is limited not just by performance issues; perhaps more importantly, it is not well suited to applying capacity guarantees (SLAs), as I described in my original e-mail. I expect that this will become increasingly important as SIP becomes more ubiquitous across operator boundaries, and I suggest that we should future-proof SIP overload control for this reason.

'2) Get out one algorithm now (loss), and later if there is a need to
    standardize more add the oc-algo parameter in a backwards
    compatible manner.'

Realistically, if this is the approach taken then suppliers will only provide proportional rejection, and it will become difficult/expensive to introduce other restriction methods such as rate-based controls.

'3) Re-negotiating the chosen algorithm is not trivial.'

If support for restriction algorithms is mandatory, i.e. you either support all (3?) algorithms or none at all, then it would just be a case of indicating that overload control is supported. This does not seem to be complex.

'4) Servers will have to maintain negotiated algorithms as pieces
    of state for a potentially large set of upstream clients.'

If upstream clients have to support multiple algorithms, then this is not necessary, since it is up to the overloaded server to decide which restriction algorithm to use. The server can be confident that if overload control is supported, then it can choose which algorithm to use (if no algorithms are supported, this is already an issue with the current spec).

I would also like to re-iterate that the 3 client restriction algorithms have been around for a long time, and are already widely deployed (and subject to limited variation in their realisation). Therefore we would expect it to be less contentious to standardise these methods within SIP, but in any case we would expect less sensitivity to design variations.

Regarding another issue that was raised, the computational overhead of the adaptation calculation performed by the overloaded server: from practical experience designing and deploying overload controls, I suggest that this should be insignificant compared to the other functions of the server. However, there are much more critical constraints:

(1) The inaccuracy introduced by making updates too frequent, due to the stochastic nature of arrivals etc. The updates will be based upon measures such as counts of requests/messages and CPU occupancy. If the number of requests processed is 'too small' then there will be significant inaccuracy in the estimated throughput and therefore in the required control parameters. The main factors here are not only the inherent limit to the accuracy of measuring CPU occupancy (what is really required is a time-integrated measurement of all CPU activity between two updates), but also the effect of subsequent phases of session processing. By the latter I include mid-session processing and disconnects (which are not controlled by the sources). To overcome these issues it is necessary to measure for sufficiently long and to apply appropriate smoothing. [Note that the mix/complexity may change very suddenly due to surges, disconnect avalanches, etc. These must all be taken into account].

(2) The number of sources (clients) and the way in which the restriction algorithm state is changed. For example, proportional rejection (loss-based) will typically be implemented as a deterministic sequence of alternating admissions and rejections. When the arrival rate at each source is very low, this deterministic sequence is interrupted and has to be changed to reflect the modified control parameter, and how this is done may have a significant impact on the overall admission rate from all sources. If updates are very frequent and the number of sources very large, some sources may not even have received an arrival before the change is made!
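
To illustrate point (2), here is a minimal Python sketch of a deterministic proportional-rejection filter at one source. It is an assumption-laden illustration, not taken from the draft: the class and function names are hypothetical, the control parameter is assumed to be an integer loss percentage, and each source is assumed to see exactly one arrival per update interval.

```python
class DeterministicLossFilter:
    """Hypothetical source-side filter: rejects loss_percent of requests
    in an evenly spaced, deterministic pattern (no random numbers)."""

    def __init__(self, loss_percent: int) -> None:
        self.loss_percent = loss_percent
        self._credit = 0  # accumulated rejection credit, in percent

    def update(self, loss_percent: int, reset: bool) -> None:
        """Apply a new control parameter from the overloaded server.
        reset=True restarts the sequence; reset=False carries it over."""
        self.loss_percent = loss_percent
        if reset:
            self._credit = 0

    def admit(self) -> bool:
        """Return True to forward the request, False to reject it."""
        self._credit += self.loss_percent
        if self._credit >= 100:
            self._credit -= 100
            return False
        return True


def admitted_fraction(num_updates: int, loss_percent: int,
                      reset_on_update: bool) -> float:
    """Fraction admitted by a slow source that receives one arrival
    per update interval (an assumed worst case)."""
    source = DeterministicLossFilter(loss_percent)
    admitted = 0
    for _ in range(num_updates):
        source.update(loss_percent, reset=reset_on_update)
        admitted += source.admit()
    return admitted / num_updates


print(admitted_fraction(100, 50, reset_on_update=True))   # 1.0
print(admitted_fraction(100, 50, reset_on_update=False))  # 0.5
```

Under these assumptions, a source that restarts its sequence on every update never reaches a rejection, so a requested 50% loss yields no restriction at all, whereas carrying the state across updates gives the intended 50%.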

Because of these factors there are fundamental limits to how quickly updates can be made. Generally the update time has to be longer when the server throughput is lower (more complex requests) and the number of sources is larger. The problem with proportional restriction is that during the time between updates it is possible that one or more rogue streams suddenly inflate. Although the precise performance depends upon the network configuration, in general an advantage of rate-based controls is that there will be an upper limit to the admission rate even before an update is made (not so for proportional rejection).
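
The difference between the two methods during the interval between updates can be shown with a little arithmetic. The sketch below is illustrative only: the function names and the figures (a 50% loss setting, a cap of 100 requests/s, a surge from 200 to 1000 requests/s) are assumptions chosen for the example, not values from the draft.

```python
def admitted_with_loss(offered_rate: float, loss_percent: float) -> float:
    """Proportional rejection: the admitted rate scales with the offered rate."""
    return offered_rate * (100 - loss_percent) / 100


def admitted_with_rate_cap(offered_rate: float, max_rate: float) -> float:
    """Rate-based control: the admitted rate never exceeds the cap."""
    return min(offered_rate, max_rate)


# Control parameters set by the server at the last update (assumed values).
LOSS_PERCENT = 50
MAX_RATE = 100.0  # requests per second

for offered in (200.0, 1000.0):  # before and after a sudden surge
    print(offered,
          admitted_with_loss(offered, LOSS_PERCENT),
          admitted_with_rate_cap(offered, MAX_RATE))
# 200.0 100.0 100.0
# 1000.0 500.0 100.0
```

Both settings admit 100 requests/s before the surge. Until the next update arrives, the loss-based source then passes 500 requests/s to the server, while the rate-based source still passes at most 100.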

If these factors are not taken into account and the update period is too short, then the likely outcome is unstable control, and this can have a dramatic (bad) effect on throughput.
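
One possible way to respect constraint (1) is to let the measurement interval stretch until enough requests have been counted, and to smooth successive estimates. The following Python sketch assumes a hypothetical estimator class, an exponentially weighted moving average as the smoothing method, and arbitrary values for the minimum sample size and smoothing weight; none of these are prescribed by the draft.

```python
from typing import Optional


class SmoothedCostEstimator:
    """Hypothetical estimator of CPU seconds consumed per request.
    A new sample is accepted only after min_requests have been counted,
    and is then blended into the running estimate with weight alpha."""

    def __init__(self, min_requests: int = 200, alpha: float = 0.2) -> None:
        self.min_requests = min_requests
        self.alpha = alpha
        self.estimate: Optional[float] = None
        self._requests = 0
        self._cpu_seconds = 0.0

    def record(self, requests: int, cpu_seconds: float) -> bool:
        """Add one measurement period. Returns True when the estimate
        was updated, i.e. when a control update may be worth sending."""
        self._requests += requests
        self._cpu_seconds += cpu_seconds
        if self._requests < self.min_requests:
            return False  # sample too small: keep measuring
        sample = self._cpu_seconds / self._requests
        if self.estimate is None:
            self.estimate = sample
        else:
            self.estimate += self.alpha * (sample - self.estimate)
        self._requests = 0
        self._cpu_seconds = 0.0
        return True


est = SmoothedCostEstimator(min_requests=100, alpha=0.5)
print(est.record(40, 0.4))   # False: only 40 requests so far
print(est.record(60, 0.6))   # True: 100 requests accumulated
print(est.record(100, 3.0))  # True: a sudden change in cost is only half applied
```

The effect is that the update period lengthens automatically when throughput is low, which matches the observation that lower server throughput calls for longer update times.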

To progress the debate further I would like to offer to produce a modified version of draft-ietf-soc-overload-control-02 for comment. I could also (later) submit a recommendation for implementation of proportional rejection and rate-based methods if this is deemed a useful addition to the draft in order to facilitate adoption.

Regards,

Phil Williams

-----Original Message-----
From: sip-overload-bounces@ietf.org [mailto:sip-overload-bounces@ietf.org] On Behalf Of Vijay K. Gurbani
Sent: 10 June 2011 22:34
To: sip-overload@ietf.org
Subject: [sip-overload] Issue: Multiple algorithms -> Single algorithm for overload control

Folks: During the Prague IETF, Mr. Kaplan reopened the
case for supporting multiple overload control algorithms [1].

Summarizing, the basic problems with multiple algorithms are:

1) Interoperability suffers as protocol gets complex.
2) Get out one algorithm now (loss), and later if there is a need to
    standardize more add the oc-algo parameter in a backwards
    compatible manner.
3) Re-negotiating the chosen algorithm is not trivial.
4) Servers will have to maintain negotiated algorithms as pieces
    of state for a potentially large set of upstream clients.

I cannot say that I am unsympathetic to the above list.  If we can
simplify processing, then we should do so.

The main reason to support multiple algorithms appears to have been
the need to not preclude rate-based algorithms [2,3].  However
subsequent discussions in the thread started off by [3] have
resulted in the understanding that rate-based algorithms may have
a slight advantage if the offered load at the client significantly
surges over a period of time << than the overload server's feedback
period.  Since the overload server's feedback period is already short
(500ms by default), the advantage possessed by rate-based algorithms
may be moot [5].

Further, minimizing state in the server is an argument to pick
loss-based as the default algorithm instead of rate-based.  The
latter requires the server to assign --- and keep track --- of a
share of its overall capacity with each upstream client.

So ... given all this, it may be a good idea to go with one
algorithm, and make that the loss based algorithm.

I would like to have a discussion so we can reach consensus on this
issue and close it for good.

[1] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00564.html
[2] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00565.html
[3] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00600.html
[4] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00606.html
[5] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00606.html

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60566 (USA)
Email: vkg@{bell-labs.com,acm.org} / vijay.gurbani@alcatel-lucent.com
Web:   http://ect.bell-labs.com/who/vkg/