Re: [sip-overload] Proposal: Support for the restriction algorithms should be mandatory for clients (draft-ietf-soc-overload-control-02)

Volker Hilt <volker.hilt@alcatel-lucent.com> Fri, 29 April 2011 02:06 UTC

Message-ID: <4DBA1D11.7030001@alcatel-lucent.com>
Date: Thu, 28 Apr 2011 22:06:09 -0400
From: Volker Hilt <volker.hilt@alcatel-lucent.com>
To: <sip-overload@ietf.org>
References: <E4B3F0DC6D953D4EBEC223BC86FE322C4A42034FB7@EMV04-UKBR.domain1.systemhost.net>
In-Reply-To: <E4B3F0DC6D953D4EBEC223BC86FE322C4A42034FB7@EMV04-UKBR.domain1.systemhost.net>

Phil,

> SIP support for 3 restriction algorithms at client sources defined in
> soc-overload-control-02 is beneficial for the flexibility that it
> provides to potentially support a wide range of server overload controls
> and network applications. However, making proportional restriction,
> termed a 'loss-based' algorithm, mandatory for clients would severely
> limit the usefulness of this approach.
> I will summarise the reasons for this and why this algorithm has limited
> use in future generations of networks below.
> Practical issues
> -----------------------
> If client system suppliers are only required to support proportional
> restriction, then realistically it is likely that that is the only
> algorithm that will be provided by default, and the other algorithms
> will be seen as chargeable enhancements, making it difficult or
> expensive to deploy them.
> But, of the two functional ends of a closed adaptive overload control,
> it is the algorithm on the overloaded server that derives the control
> parameter that is not standardised and has dependency on node
> architecture, whereas the 3 client restriction algorithms are less
> subject to design variation, have been around for a long time, and are
> already widely deployed. Therefore we would expect it to be less
> contentious to standardise these methods within SIP, but in any case we
> would expect less sensitivity to design variations.
> It is not realistic to envisage a situation whereby an overloaded server
> supports several different client restriction algorithms simultaneously,
> because of the complexity in design of the algorithm, and because even
> if a unique solution is guaranteed, there will be issues of both speed
> of convergence to that solution, stability, and the need for the server
> capacity amongst upstream clients to be allocated in a predictable and
> precise way. This has implications for capacity guarantees, particularly
> at network boundaries (see below).
> If support for the 3 different restriction algorithms (or at least the 2
> in which we are interested) is mandated by the IETF rather than being
> optional, then a server subject to overload can guarantee that the
> sending entities will all use the same restriction algorithms, and can
> behave as predicted.

So you are arguing that a client should always implement the full set of 
restriction algorithms (e.g., loss, rate and window)?

If we mandated that, we would lose the ability to extend the
feedback types at a later point. And, of course, we would add the
complexity of requiring clients to implement multiple algorithms that
do the same thing.

> Performance limitations
> ----------------------------------
> The main performance limitation of proportional restriction is that it
> is vulnerable to sudden increases in the offered load at client sources,
> since the mean admitted rate after control is proportional to the mean
> arrival rate before control until an adaptation of the control parameter
> is made. But there will always be a delay to this control adaptation (at
> least in the closed loop method implied by the current specification),
> both because of the need for waiting sufficiently long at the overloaded
> server in order to obtain statistically accurate estimates before an
> adaptation is made, and the time and number of received messages
> required to distribute control to the clients.

Can you please point us to results that show this behavior?

I'd also recommend looking at the results of the SIP overload control
design team, which has investigated this problem.

> This vulnerability is
> worse when the number of clients is 'large' with a high capacity
> relative to the server. In contrast, a maximum rate-based control is
> generally not so vulnerable to short term surges in load.
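
To make the two behaviours under discussion concrete, here is a minimal
sketch. It is an illustration only, not a result from the draft or from the
design team; the function names and all numbers are hypothetical. It assumes
the control parameter stays fixed while the offered load at one client
surges, i.e. it looks at the interval before the server has adapted.

```python
def admitted_loss_based(offered_rate: float, loss_fraction: float) -> float:
    """Loss-based (proportional) restriction: drop a fixed fraction of requests."""
    return offered_rate * (1.0 - loss_fraction)


def admitted_rate_based(offered_rate: float, max_rate: float) -> float:
    """Rate-based restriction: never admit more than max_rate."""
    return min(offered_rate, max_rate)


# Hypothetical numbers: the client normally offers 200 req/s and the server
# wants 100 req/s from it, so it signals either 50% loss or a 100 req/s cap.
normal, surge = 200.0, 800.0
loss_fraction, max_rate = 0.5, 100.0

for offered in (normal, surge):
    print(offered,
          admitted_loss_based(offered, loss_fraction),
          admitted_rate_based(offered, max_rate))
```

With these assumed numbers the loss-based client forwards 400 req/s during
the surge until the server sends a new loss percentage, while the rate-based
client stays at 100 req/s. How long that interval lasts, and how much it
matters in practice, is exactly what measurements would have to show.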

A rate-based mechanism needs to split the server's capacity across its
upstream neighbors. If new clients arrive, the split has to be adjusted.
In particular, if you are dealing with a large set of clients, some of
which may be inactive for some time, this requires a quick readjustment
of the allocated capacity. This topic has been discussed in the design
considerations draft.
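
A minimal sketch of the readjustment problem, assuming an equal split
across the currently active clients. The equal-split policy, the function
name and the client names are assumptions made for illustration; the draft
does not prescribe how a server divides its capacity.

```python
from __future__ import annotations


def split_capacity(server_capacity: float,
                   active_clients: list[str]) -> dict[str, float]:
    """Divide server capacity equally (assumed policy) among active clients."""
    if not active_clients:
        return {}
    share = server_capacity / len(active_clients)
    return {client: share for client in active_clients}


# Two clients are active, then three previously inactive ones start sending.
before = split_capacity(500.0, ["c1", "c2"])
after = split_capacity(500.0, ["c1", "c2", "c3", "c4", "c5"])

print(before["c1"])  # per-client rate with two active clients
print(after["c1"])   # per-client rate once five clients are active
```

Every change in the set of active clients changes every client's rate, so
the server has to push new feedback to all of them before the old, larger
allocations add up to more than it can handle.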

> The need to support precise capacity guarantees
> ------------------------------------------------------------------------
> It is common practice for agreements concerning capacity to be provided
> at network operator boundaries [from now on I'll use SP or Service
> Provider as a generic term encompassing Operators etc], and in many
> realistic applications this is essential (I recall that this requirement
> is absent from the original list of SIP overload control requirements?).
> It is also possible to want to provide guarantees to sub-streams of SP
> traffic.
> These guarantees must be practically useful, so that they can form the
> basis of a service level agreement between SPs. I.e. they must be simple
> enough to be easily understood by all parties, and above all they must
> be clear and precise in the sense that the behaviour of the capacity is
> predictable/deterministic (in a stochastic sense). SPs are not
> interested in technicalities of restriction algorithms, but will want
> the policies to be defined in terms of traffic characteristics that are
> straightforward to interpret and agree.
> Clearly the policies must also be efficient in the sense that they imply that
> the available capacity can be fully utilised. I suggest that these are
> most easily expressed as a guaranteed minimum rate and a precise way in
> which 'spare' capacity not being used by a client originated stream is
> distributed over the other SPs, e.g. in terms of maximum rates
> determined by agreed proportions of the available unused allowances. Of
> course other policies are possible (but they may be less precise or more
> complex). Whatever policies are chosen, realising this is an inherent
> part of the server overload control, but the difficulty and complexity
> is dependent upon the method of call restriction in the clients.
> With proportional restriction, note that the percentages have nothing
> directly to do with the proportions of server capacity allocated to
> different clients. So there is no natural and simple way to map between
> the parameters of the agreement and the control parameters. If the same
> control level were applied to all client traffic, then the changes in
> the offered traffic from one client will always imply changes to the
> traffic admitted by another, (and in particular this applies to sudden
> large increases). To apply maximum rate-based guarantees would require
> monitoring of the received rate from each source separately in order
> that the offered traffic can be derived implicitly and thereby
> percentages derived for each specific source.
> In contrast, with a rate-based restriction, it is much simpler to
> implement policies defined in terms of maximum rates, even though these
> are adapted according to minimum guarantees and use of unused allowances
> in a precise and predictable way.
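
The policy quoted above (a guaranteed minimum rate per SP, with unused
allowance redistributed in agreed proportions) could be sketched roughly as
follows. This is one possible single-pass reading of it, not something
defined in the draft: the function name, the weights and the assumption
that the server knows each SP's offered rate are all hypothetical.

```python
from __future__ import annotations


def max_rates(guarantee: dict[str, float],
              offered: dict[str, float],
              weight: dict[str, float]) -> dict[str, float]:
    """Per-SP maximum rate: guarantee plus a weighted share of unused allowance."""
    # Allowance left unused by SPs that offer less than their guarantee.
    unused = sum(max(0.0, guarantee[sp] - offered[sp]) for sp in guarantee)
    # SPs that want more than their guarantee share the unused allowance.
    hungry = [sp for sp in guarantee if offered[sp] > guarantee[sp]]
    total_weight = sum(weight[sp] for sp in hungry)
    rates = {}
    for sp in guarantee:
        extra = unused * weight[sp] / total_weight if sp in hungry else 0.0
        rates[sp] = guarantee[sp] + extra
    return rates


# Hypothetical example: four SPs, 250 req/s guaranteed each, equal weights.
guarantee = {"A": 250.0, "B": 250.0, "C": 250.0, "D": 250.0}
offered = {"A": 100.0, "B": 400.0, "C": 300.0, "D": 250.0}
weight = {"A": 1.0, "B": 1.0, "C": 1.0, "D": 1.0}

rates = max_rates(guarantee, offered, weight)
admitted = {sp: min(offered[sp], rates[sp]) for sp in guarantee}
print(rates)
print(sum(admitted.values()))
```

In this sketch A leaves 150 req/s unused, which B and C share equally, and
the total admitted load stays within the 1000 req/s that the guarantees add
up to. Note that the sketch assumes the guarantees sum to no more than the
capacity actually available.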

I don't think overload control is the right tool to police SLAs.

Let's assume for a second you are in fact using overload control for 
this purpose and are configuring your overload control rates to match 
your SLAs. Say you have two servers A and B (each with capacity 500 
req/s) and four upstream neighbors. The SLA with each of them is that 
you accept 250 req/s.

If one of your servers goes down, your capacity is cut in half. If you
have configured overload control to honor your SLAs, your remaining
server will melt down within milliseconds. Your only choice to survive
this situation is to use overload control and cut the rates below the SLA.
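
The arithmetic of this example, as a short sketch. The capacities, the
number of neighbors and the SLA rate are the ones from the example; the
helper name and the equal split across neighbors are assumptions made for
illustration.

```python
from __future__ import annotations


def sustainable_rate(server_capacities: list[float], n_neighbors: int) -> float:
    """Per-neighbor rate the working servers can sustain, split equally."""
    return sum(server_capacities) / n_neighbors


sla_rate = 250.0   # req/s promised to each upstream neighbor
n_neighbors = 4

both_up = sustainable_rate([500.0, 500.0], n_neighbors)
one_down = sustainable_rate([500.0], n_neighbors)

# Load on the surviving server if every neighbor still sends at its SLA rate,
# expressed as a multiple of that server's capacity.
overload_factor = (n_neighbors * sla_rate) / 500.0

print(both_up, one_down, overload_factor)
```

With both servers up, the sustainable rate matches the SLA of 250 req/s per
neighbor. With one server down it drops to 125 req/s, and holding the
neighbors at the SLA rate would offer the surviving server twice its
capacity.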

Thanks,

Volker (as individual)




> Comments please!
> Phil Williams