[sip-overload] Updated loss-based algorithm

"Vijay K. Gurbani" <vkg@bell-labs.com> Mon, 02 July 2012 18:52 UTC

Folks: The last iteration of the loss-based draft's default algorithm
[1] was presented at the Paris IETF.  List discussions right before the
meeting [2] indicated that the algorithm would need to go through one
more revision.  The reasons for this revision are outlined in the
last two bullet items of [2].

I believe that to address these two bullet items, we have to make the
algorithm responsive to the traffic mix, i.e., category 1 and category
2 prioritization markers need to reflect the mix of the traffic at any
point in time.  That way, the algorithm can respond adequately to any
situation.  In an emergency, category 2 requests will dominate, so the
reduction will come from them.  In non-emergency situations, normal
messages will dominate the traffic mix, and it makes sense to cull as
many requests as we can from category 1 before moving to category 2.

An updated algorithm that takes the changing traffic mix into
account is presented below.  I would appreciate review of the updated
algorithm by the WG to make sure it is on the right track.

Here is the updated algorithm that will go in -09:

cat1 := 80.0          // Category 1 --- subject to reduction
cat2 := 100.0 - cat1  // Category 2 --- reduced only after category 1
                      // is exhausted.  80/20 mix.
// Note that the above ratio is simply a reasonable default.  The actual
// values will change through periodic sampling as the traffic mix
// changes over time.

while (true) {
   // We're modeling message processing as a single work queue
   // that contains both incoming and outgoing messages.
   sip_msg := get_next_message_from_work_queue()

   update_mix(cat1, cat2)  // See Note below

   switch (sip_msg.type) {

     case outbound request:
       destination := get_next_hop(sip_msg)
       oc_context := get_oc_context(destination)

       if (oc_context == null)  {
           send_to_network(sip_msg) // Process it normally by sending the
           // request to the next hop since this particular destination
           // is not subject to overload
       }
       else  {
          // Determine if server wants to enter in overload or is in
          // overload
          in_oc := extract_in_oc(oc_context)

          oc_value := extract_oc(oc_context)
          oc_validity := extract_oc_validity(oc_context)

          if (in_oc == false or oc_validity is not in effect)  {
             send_to_network(sip_msg) // Process it normally by sending
             // the request to the next hop since this particular
             // destination is not subject to overload.  Optionally,
             // clear the oc context for this server (not shown).
          }
          else  {  // Begin perform overload control
             r := random()  // Assumed uniform in [0, 100), since it is
                            // compared against percentages below
             drop_msg := false

             if (cat1 >= cat2) {
                 category := assign_msg_to_category(sip_msg)
                 pct_to_reduce_cat2 := 0
                 pct_to_reduce_cat1 := oc_value / cat1 * 100
                 if (pct_to_reduce_cat1 > 100)  {
                    // Get remaining messages from category 2: the
                    // shortfall (oc_value - cat1) as a percentage of
                    // category 2 traffic
                    pct_to_reduce_cat2 := (oc_value - cat1) / cat2 * 100
                    pct_to_reduce_cat1 := 100
                 }

                 if (category == cat1)  {
                    if (r <= pct_to_reduce_cat1)  {
                       drop_msg := true
                    }
                 }
                 else {  // Message from category 2
                    if (r <= pct_to_reduce_cat2)  {
                       drop_msg := true
                    }
                 }
             }
             else  { // More category 2 messages than category 1;
                     // indicative of an emergency situation.  Since
                     // there are more category 2 messages, we
                     // simply treat category 1 and category 2 equal
                     // for discard purposes.
                 if (r <= oc_value)
                    drop_msg := true
             }

             if (drop_msg == false) {
                 send_to_network(sip_msg) // Process it normally by
                 // sending the request to the next hop
             }
             else  {
                // Do not send request downstream, handle locally by
                // generating response (if a proxy) or treating as
                // an error (if a user agent).
             }
          }  // End perform overload control
       }

     end case // outbound request

     case outbound response:
       if (we are in overload) {
         add_overload_parameters(sip_msg)
       }
       send_to_network(sip_msg)

     end case // outbound response

     case inbound response:

        if (sip_msg has oc parameter values)  {
            create_or_update_oc_context()  // For the specific server
            // that sent the response, create or update the oc context;
            // i.e., extract the values of the oc-related parameters
            // and store them for later use.
        }
        process_msg(sip_msg)

     end case // inbound response

     case inbound request:

       if (we are not in overload)  {
          process_msg(sip_msg)
       }
       else {  // We are in overload
          if (sip_msg has oc parameters)  {  // Upstream client supports
             process_msg(sip_msg)  // oc; only sends important requests
          }
          else {  // Upstream client does not support oc
             if (local_policy(sip_msg) says process message)  {
                process_msg(sip_msg)
             }
             else  {
                send_response(sip_msg, 503)
             }
          }
       }
     end case // inbound request
   }
}
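
To make the discard arithmetic in the outbound-request case easier to
check, here is a rough Python sketch of just that step.  It is not text
from the draft.  The names drop_probability and should_drop are
hypothetical, and the sketch assumes that when category 1 cannot absorb
the whole reduction, the shortfall (oc_value - cat1) is taken from
category 2.

```python
import random


def drop_probability(is_cat1: bool, oc_value: float,
                     cat1: float, cat2: float) -> float:
    """Probability (0.0 to 1.0) of dropping one outbound request.

    oc_value, cat1 and cat2 are percentages of total traffic, with
    cat1 + cat2 == 100.  All names here are illustrative.
    """
    if cat1 < cat2:
        # More category 2 than category 1 traffic: treat both
        # categories equally for discard purposes.
        return oc_value / 100.0

    pct_cat1 = oc_value / cat1 * 100.0
    pct_cat2 = 0.0
    if pct_cat1 > 100.0:
        # Category 1 alone cannot absorb the reduction.  Assumption:
        # the shortfall is taken from category 2.
        if cat2 > 0.0:
            pct_cat2 = min((oc_value - cat1) / cat2 * 100.0, 100.0)
        pct_cat1 = 100.0
    return (pct_cat1 if is_cat1 else pct_cat2) / 100.0


def should_drop(is_cat1: bool, oc_value: float,
                cat1: float, cat2: float, rng=random.random) -> bool:
    """Randomized discard decision; rng returns a float in [0.0, 1.0)."""
    return rng() < drop_probability(is_cat1, oc_value, cat1, cat2)
```

With an 80/20 mix and oc=40, this drops half of category 1 and nothing
from category 2.  With oc=90, it drops all of category 1 and half of
category 2, which again adds up to 90% of the total traffic.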

Note: A simple way to sample the traffic mix for category 1 and
category 2 is to associate a counter with each category of message.
Periodically (every 5-10s) get the value of the counters and calculate
the ratio of category 1 messages to category 2 messages since the last
calculation.

Example: In the last 5 seconds, a total of 500 requests arrived at the
client.  Assume that 450 out of 500 were requests subject to reduction
and 50 out of 500 were classified as requests not subject to reduction.
Based on this ratio, cat1 := 90 and cat2 := 10, or a 90/10 mix will be
used in overload calculations for the next time period.
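
The counter-based sampling described in the note could look roughly
like the Python sketch below.  The class and method names are
hypothetical, the 80/20 starting point is the default from the
algorithm above, and the periodic timer (every 5-10s) is left to the
caller.  Keeping the previous mix when no requests were seen in a
period is an assumption of this sketch, not something the note
specifies.

```python
class TrafficMixSampler:
    """Tracks the category 1 / category 2 mix from per-category counters."""

    def __init__(self, cat1: float = 80.0) -> None:
        self.cat1 = cat1            # percent of traffic in category 1
        self.cat2 = 100.0 - cat1    # percent of traffic in category 2
        self._count1 = 0
        self._count2 = 0

    def count(self, is_cat1: bool) -> None:
        """Record one request in the counter for its category."""
        if is_cat1:
            self._count1 += 1
        else:
            self._count2 += 1

    def update_mix(self) -> tuple[float, float]:
        """Recompute the mix from the counters, then reset them.

        Intended to be called once per sampling period.
        """
        total = self._count1 + self._count2
        if total > 0:
            self.cat1 = 100.0 * self._count1 / total
            self.cat2 = 100.0 - self.cat1
        # Assumption: with no traffic in the period, keep the old mix.
        self._count1 = 0
        self._count2 = 0
        return self.cat1, self.cat2
```

Feeding it the 500 requests from the example (450 in category 1, 50 in
category 2) and then calling update_mix() gives a 90/10 mix.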

[1] http://tools.ietf.org/html/draft-ietf-soc-overload-control-08#section-6.3
[2] http://www.ietf.org/mail-archive/web/sip-overload/current/msg00764.html

Comments and feedback are eagerly sought.

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60563 (USA)
Email: vkg@{bell-labs.com,acm.org} / vijay.gurbani@alcatel-lucent.com
Web:   http://ect.bell-labs.com/who/vkg/