Re: [MEDIACTRL] [sip-overload] WGLC: draft-ietf-soc-overload-design
Janet P Gunn <jgunn6@csc.com> Tue, 31 August 2010 15:24 UTC
To: mediactrl@ietf.org, sip-overload@ietf.org
Comments on draft-ietf-soc-overload-design-01

Intro, third paragraph says:

“For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a 488 (Not Acceptable Here) response [RFC4412].”

While it is true that 4412 DOES say to use 488 in this case, we have found that, in the real world, this can lead to incorrect mapping back to ISUP. In at least some contexts, “503 with a Reason header field Q.850 cause value of 34 (no circuit available)” may be used instead of 488. (I believe this is covered in a PTSC document.)

So I suggest:

“For example, a PSTN gateway that runs out of trunk lines but still has plenty of capacity to process SIP messages should reject incoming INVITEs using a response such as 488 (Not Acceptable Here), as described in RFC4412.”

After this paragraph, I would add a new paragraph saying something like:

“There are other failure cases in which a SIP server also serves non-SIP traffic (e.g., RTP packets, database queries and updates, event handling) which can lead to server overload. These other loads may, or may not, be correlated with the SIP message volume. The server is unable to process all SIP requests due to resource constraints, but simply reducing the flow of SIP messages may not sufficiently reduce the load to avoid congestion collapse. In this context, it is to be expected that the server has some other method of overload control addressing these other sources of load. However, the specifics of the overload control for other traffic types, and the coordination of the different overload controls, are out of scope for this document.”

This should address Partha’s, and others’ concerns.

Fourth paragraph.

In addition to the other problems with 503 and Retry-After, 503 is used for other situations (with or without Retry-After), not just SIP Server overload.
A SIP Overload Control process based on 503 would have to specify exactly which cause values trigger the Overload Control.

Section 2

Even when SIP messages are not dropped, significant delay can cause time-outs which lead to retransmission. I would change the second sentence to:

“When SIP is running over the UDP protocol, it will retransmit messages that were dropped or excessively delayed by a SIP server due to overload and thereby increase the offered load for the already overloaded server.”

At the end of section 2 you say:

“Another challenge for SIP overload control is that the rate of the true traffic source usually cannot be controlled. Overload is often caused by a large number of UAs each of which creates only a single message. These UAs cannot be rate controlled as they only send one message. However, the sum of their traffic can overload a SIP server.”

In fact, the various wireless technologies DO have methods for controlling the load “caused by a large number of UAs each of which creates only a single message.” Some of these are of the form “pick a random number and see if it exceeds the threshold you have been given”. Examples include Access Class Barring and Access Persistence Mechanism. It would be possible to do something similar at the SIP level, though it would probably be redundant.

My suggested rewording would be:

“Another challenge for SIP overload control is controlling the rate of the true traffic source. Overload is often caused by a large number of UAs each of which creates only a single message. However, the sum of their traffic can overload a SIP server. The overload mechanisms suitable for controlling a SIP server (e.g., rate control) may not be effective for individual UAs. In some cases, there are other non-SIP mechanisms for limiting the load from the UAs. These may operate independently from, or in conjunction with, the SIP overload mechanisms described here.
In either case, they are out of scope for this document.”

Section 4

Your model is built on the premise of a “sending entity” and a “receiving entity”. In the real world, not only is Server A sending SIP messages to Server B, but Server B is also sending SIP messages to Server A. I don’t think you should clutter up your model by trying to address both directions at once, but you should state somewhere in the text that you have made that simplification/abstraction for ease of comprehension, and that any mechanism must work in the context of “SIP messages going both ways”.

My suggestion would be to add another sentence after:

“The model in Figure 1 shows a scenario with one sending and one receiving entity. In a more realistic scenario a receiving entity will receive traffic from multiple sending entities and vice versa (see Section 6).”

The added sentence would be:

“In addition, in a more realistic scenario, SIP messages will be going both directions, from B to A as well as A to B. However, the overload control mechanisms in each direction can be considered independently.”

Then, in section 5.1, change:

“Each control loop between two servers is completely independent of the control loop between other servers further up- or downstream.”

To:

“Each control loop between two servers is completely independent of the control loop between other servers further up- or downstream, and of the control loop between the two servers in the other direction.”

Section 8, second paragraph

After:

“An overload control mechanism should ensure that the delay encountered by a SIP message is not increased significantly during periods of overload.”

Add:

“Significantly increased delay can lead to time-outs, and retransmission of SIP messages, making the overload worse.”

“Reactiveness” doesn’t seem the right word to me. “Responsiveness” sounds better to me.

End of section 8

Another important metric is the CPU load used by the overload “monitor” and “actuator”.
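
To make the retransmission concern raised for Sections 2 and 8 concrete, here is a minimal sketch of how often a single unanswered INVITE is sent over UDP. It assumes the RFC 3261 default timers (T1 = 0.5 s, Timer A doubling after each retransmission, Timer B = 64*T1) and a server that sends no response at all; the function name is hypothetical and not taken from the draft.

```python
# Sketch only: transmission times of one unanswered INVITE over UDP,
# assuming RFC 3261 default timers. invite_send_times is a hypothetical name.

def invite_send_times(t1=0.5):
    """Return the times (in seconds) at which the INVITE is put on the wire."""
    timer_b = 64 * t1      # the client transaction gives up at this time
    times = [0.0]          # the original transmission
    interval = t1          # Timer A starts at T1
    next_send = interval
    while next_send < timer_b:
        times.append(next_send)
        interval *= 2      # Timer A doubles after every retransmission
        next_send += interval
    return times


sends = invite_send_times()
print(sends)       # [0.0, 0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
print(len(sends))  # 7
```

Under these assumptions, a server that is slow enough to let the transaction time out sees the same request seven times rather than once, which is the load amplification the suggested sentences describe.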
End of section 9

Suggest changing:

“Explicit overload control mechanisms can be differentiated based on the type of information conveyed in the overload control feedback and whether the control function is in the receiving or sending entity (receiver- vs. sender- based overload control).”

To:

“Explicit overload control mechanisms can be differentiated based on the type of information conveyed in the overload control feedback and whether the control function is in the receiving or sending entity (receiver- vs. sender- based overload control), or both.”

In 9.2, I think:

“A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by a percentage X. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects X percent of the traffic that is destined for this server.”

Should be:

“A loss percentage enables a SIP server to ask an upstream neighbor to reduce the number of requests it would normally forward to this server by X%. For example, a SIP server can ask an upstream neighbor to reduce the number of requests this neighbor would normally send by 10%. The upstream neighbor then redirects or rejects 10% of the traffic that is destined for this server.”

End of 9.2

WRT:

“Thus, percentage throttling requires an adjustment of the throttling percentage in response to the traffic received and may not always be able to prevent a server from encountering brief periods of overload in extreme cases.”

This is not unique to percentage throttling. It is possible in rate based and window based methods as well. In all cases, it is heavily dependent on the frequency of updates by the control mechanism. But that needs to be balanced against the load generated by the control mechanism.
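
As a rough illustration of the loss-percentage behaviour quoted from 9.2, the sketch below shows an upstream neighbor rejecting X% of requests at random, and why a fixed percentage does not cap the load once arrivals surge. All names (`throttle`, `offered_after_throttle`) are hypothetical, and random per-request selection is only one assumed way an upstream neighbor might pick which requests to reject; the draft does not prescribe it.

```python
import random

# Sketch only: hypothetical helpers, not part of the draft or any SIP stack.

def throttle(requests, loss_percent, rng=None):
    """Split requests into (forwarded, rejected), rejecting about loss_percent %."""
    rng = rng or random.Random()
    forwarded, rejected = [], []
    for req in requests:
        # Each request is rejected independently with probability loss_percent/100.
        if rng.random() * 100 < loss_percent:
            rejected.append(req)
        else:
            forwarded.append(req)
    return forwarded, rejected


def offered_after_throttle(arrival_rate, loss_percent):
    """Expected request rate reaching the server after throttling."""
    return arrival_rate * (100 - loss_percent) / 100


# The same 10% setting yields very different loads as arrivals change.
print(offered_after_throttle(100, 10))  # 90.0
print(offered_after_throttle(500, 10))  # 450.0
```

The second pair of numbers is the point at issue: until the server sends an updated percentage, a surge from 100 to 500 requests per second passes through almost undiminished, so how well any such feedback works depends on how often it is refreshed.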
I am not sure whether it makes sense to make this point about update frequency under each method, or to put it up front as a general comment.

Sec 9.4

Here again, remember that there are many other things that can generate 503, with or without Retry-After.

Sec 11, last paragraph

Add:

“Conversely, the semantics of any proposed approach should permit a variety of different algorithms.”

Nits/wordsmithing

Note at end of section 6: change “different than” to “different from”.

Section 12, first para

Change:

“Overload control can require a SIP server to prioritize requests and select requests that need to be rejected or redirected.”

To:

“Overload control can require a SIP server to prioritize requests and select requests to be rejected or redirected.”

Sec 12, third para

Change:

“Responses should not be targeted when a SIP server is trying to reduce load for a number of reasons.”

To:

“For a number of reasons, SIP responses should not be dropped in order to reduce SIP processing load.”

Janet