Re: [sip-overload] draft-ietf-soc-overload-control-07 section 6.3 algorithm

"Vijay K. Gurbani" <vkg@bell-labs.com> Thu, 08 March 2012 18:03 UTC


Adam, Eric: Thank you for looking over the algorithm.  I was hoping
that more eyes would look at it and uncover problems.  The
baroqueness you refer to is an artefact of trying to derive a
cohesive algorithm from a simple textual description [1], from the
ensuing discussions on the list about how to specify local policy when
the server is overloaded, and from some unstated underlying
assumptions.

In any case, I have modified the algorithm based on your suggested
changes, with the unstated assumptions filled out.  For instance,
in the algorithm below, I have included more details on the message
prioritization that occurs for category 2 messages under overload.
In the earlier iteration, I had simply stated in text that if there
are not enough messages in category 1, then the client may use
"local policies" to target messages in category 2.

With the algorithm relatively fresh in our minds, let's see if what
appears below addresses your comments.  Note that I have also fixed
category 1 / category 2 messages to be an 80/20 mix instead of the
earlier 40/60 mix.  It seems appropriate that the corpus of messages
that are subject to reduction be larger than the corpus that is not
(unless an emergency situation is occurring).
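
To make the effect of the mix concrete, here is a small Python sketch of
the pct_to_reduce computation that appears in the algorithm below,
min(100, oc_value / cat1 * 100), evaluated for both mixes.  The function
name and the sample oc values are illustrative only.

```python
def pct_to_reduce(oc_value: float, cat1: float) -> float:
    """Percentage of messages to mark for drop, capped at 100."""
    return min(100.0, oc_value / cat1 * 100.0)


# Compare the 80/20 mix with the earlier 40/60 mix for a few oc values.
for oc in (20.0, 40.0, 60.0):
    print(oc, pct_to_reduce(oc, 80.0), pct_to_reduce(oc, 40.0))
```

With cat1 at 40, the computed percentage already reaches the 100 cap at
an oc of 40, whereas with cat1 at 80 the cap is not reached until oc is 80.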

Thank you for pointing out where the errors lie!  Please take a look
at the updated algorithm below and let's see if we can work on making
it better.

cat1 := 80.0         // Category 1 --- subject to reduction
cat2 := 100.0 - cat1 // Category 2 --- Not subject to reduction
                      // 80/20 mix.
while (true)  {
    sip_msg := get_sip_msg() // Next SIP message to process, could be a
                             // request or a response

    if (is_response(sip_msg)) {
        if (sip_msg has oc parameter values)  {
            create_or_update_oc_context()  // For the specific server
            // that sent the response, create or update the oc context;
            // i.e., extract the values of the oc-related parameters
            // and store them for later use.
        }
        process_msg(sip_msg)  // Rest of normal processing occurs on
        // the response, which may include consuming it if the client
        // is a user agent or sending it upstream if a proxy
    }
    else if (is_request(sip_msg))  {
       destination := get_next_hop(sip_msg)
       oc_context := get_oc_context(destination)

       if (oc_context == null)  {
           process_msg(sip_msg) // Process it normally by sending the
           // request downstream since this particular destination
           // is not subject to overload
       }
       else  {
          // Determine if server wants to enter in overload or is in
          // overload
          in_oc := extract_in_oc(oc_context)

          oc_value := extract_oc(oc_context)
          oc_validity := extract_oc_validity(oc_context)

          if (in_oc == false or oc_validity is not in effect)  {
             process_msg(sip_msg) // Process it normally by sending the
             // request downstream since this particular destination
             // is not subject to overload.  Optionally, clear
             // the oc context for this server (not shown).
          }
          else  {
             category := assign_msg_to_category(sip_msg)
             drop_msg := 0
             pct_to_reduce := min(100, oc_value / cat1 * 100)

             r := random()  // Assumed uniform over [0, 100], i.e., on
                            // the same scale as pct_to_reduce
             if (r <= pct_to_reduce)  {
                drop_msg := 1
             }

             if (category is category 2 && drop_msg == 1)  {
                if (local_policy(sip_msg, oc_value) says process message) {
                    drop_msg := 0  // See Note 1 below
                }
             }

             if (drop_msg == 0) {
                 process_msg(sip_msg) // Process it normally by sending
                // the request downstream
             }
             else  {
                // Do not send request downstream, handle locally by
                // generating response.
             }
          }
       }
    } // is_request(sip_msg)
}
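
For anyone who wants to experiment with the drop decision, the following
is a minimal Python sketch of the innermost branch only (the part reached
when the server is in overload).  It is not a definitive implementation.
The names should_drop, CAT1_SHARE and rng are made up for illustration,
local_policy is reduced to a callable taking only the oc value, and the
random draw is assumed to be on the same 0 to 100 scale as pct_to_reduce.

```python
import random
from typing import Callable, Optional

CAT1_SHARE = 80.0  # percent of traffic assumed to be category 1


def should_drop(oc_value: float,
                category: int,
                local_policy: Callable[[float], bool],
                rng: Optional[random.Random] = None) -> bool:
    """Return True if the request is to be handled locally, not forwarded.

    category is 1 or 2.  local_policy(oc_value) returns True when a
    category 2 request that was marked for drop should be sent anyway.
    """
    if rng is None:
        rng = random.Random()
    pct_to_reduce = min(100.0, oc_value / CAT1_SHARE * 100.0)
    # Assumption: the draw is uniform on 0..100, like pct_to_reduce.
    r = rng.uniform(0.0, 100.0)
    drop = r <= pct_to_reduce
    # A category 2 request marked for drop gets a second chance.
    if drop and category == 2 and local_policy(oc_value):
        drop = False
    return drop
```

With oc_value at 80 the computed percentage is 100, so every category 1
request is dropped and category 2 requests survive only if the policy
callable says so.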

Note 1: local_policy() will have to decide whether to allow a category
  2 request downstream if that request has been marked for discard.
  Some discussion on how to make this decision is captured in Section
  5.10.1.

  There are four cases to consider in figuring out how local_policy()
  should behave.  These are enumerated below.  In these cases, t is
  the rate at which local_policy() is invoked (in invocations per
  second) and oc is the value of the "oc" parameter.

  Case 1: t is small (<= 10 times/sec?) and oc is small (<20%?)
  Case 2: t is large (>= 500 times/sec?) and oc is large (>70%?)
  Case 3: t is small and oc is large
  Case 4: t is large and oc is small

  The decision in cases 1 and 3 seems simple.  In case 1, local_policy()
  is not invoked often and the oc value is small.  On the few
  occasions that local_policy() is invoked, it could allow the request
  to be sent to the server.

  In case 3, local_policy() is not invoked often but the oc value
  is large.  This implies that enough category 1 messages are
  being dropped.  On the few occasions that local_policy() is
  invoked, it could allow the request to be sent to the server.

  It is in cases 2 and 4 that local_policy() should do something more
  intelligent.

  In case 2, local_policy() is getting invoked very
  often and the oc is also large.  This implies that category 1
  requests are being dropped as much as possible and it will help
  to drop a good number of category 2 requests as well.  Thus,
  it seems reasonable to drop all but the SOS URN [RFC5031]
  requests and high priority RPH content requests.

  In case 4, local_policy() is getting invoked very often, but the
  oc value is small.  This implies that the bulk of traffic to be
  dropped consists of category 2 requests.  So here, it seems
  reasonable to drop all but the SOS URN [RFC5031] requests.
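
  As a rough illustration of the four cases, here is a Python sketch of
  one way local_policy() could be written.  It rests on several
  assumptions that are not settled above: the thresholds are the
  tentative figures from the case list, the invocation rate is passed in
  as an extra argument, the request is reduced to two hypothetical
  flags, and anything below the high-rate threshold is treated like
  cases 1 and 3.

```python
HIGH_RATE = 500.0  # invocations/sec; tentative figure from the case list
LARGE_OC = 70.0    # percent; tentative figure from the case list


def local_policy(is_sos: bool,
                 high_priority_rph: bool,
                 oc_value: float,
                 rate: float) -> bool:
    """Return True to send a category 2 request downstream anyway.

    is_sos: the request targets an SOS URN [RFC5031] (hypothetical flag).
    high_priority_rph: the request carries high priority RPH content
    (hypothetical flag).
    """
    if rate < HIGH_RATE:
        # Cases 1 and 3: invoked rarely, so let the request through.
        # Assumption: rates between the two thresholds are treated alike.
        return True
    if oc_value > LARGE_OC:
        # Case 2: keep only SOS and high priority RPH requests.
        return is_sos or high_priority_rph
    # Case 4 (and, by assumption, middling oc values): keep only SOS.
    return is_sos
```

  For example, a high priority RPH request is allowed at a rate of 600
  per second with oc at 90 (case 2), but not with oc at 10 (case 4).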

[1] www.ietf.org/mail-archive/web/sip-overload/current/msg00318.html

Thanks,

- vijay
-- 
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
1960 Lucent Lane, Rm. 9C-533, Naperville, Illinois 60563 (USA)
Email: vkg@{bell-labs.com,acm.org} / vijay.gurbani@alcatel-lucent.com
Web:   http://ect.bell-labs.com/who/vkg/