Re: [sip-overload] [dispatch] draft-shen-soc-avalanche-restart-overload-07

worley@ariadne.com (Dale R. Worley) Fri, 07 March 2014 18:14 UTC

Date: Fri, 07 Mar 2014 13:13:45 -0500
Message-Id: <201403071813.s27IDjOM6453480@shell01.TheWorld.com>
From: worley@ariadne.com
Sender: worley@ariadne.com
To: Charles Shen <charles@cs.columbia.edu>
In-reply-to: <CAPSQ9ZXaJFO-HAjTi6fkvLoy+xBOtPxWg_CvpbMF0ijNtSFLGQ@mail.gmail.com> (charles@cs.columbia.edu)
References: <CAPSQ9ZXvxc+tV2_Zd9e5LNf_UVrz9Tfp_sTVu-BC5pXB_RkUmg@mail.gmail.com> <CAPSQ9ZXaJFO-HAjTi6fkvLoy+xBOtPxWg_CvpbMF0ijNtSFLGQ@mail.gmail.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/sip-overload/X3oPIpUYUK-cwYUJRQFDFDrYG9Y
Cc: sip-overload@ietf.org
Subject: Re: [sip-overload] [dispatch] draft-shen-soc-avalanche-restart-overload-07

> From: Charles Shen <charles@cs.columbia.edu>
> 
> Dear all, I'd like to re-send the message below, and would appreciate very
> much your opinions!
> 
> Charles
> 
> 
> On Tue, Feb 18, 2014 at 12:37 AM, Charles Shen <charles@cs.columbia.edu> wrote:
> 
> > Dear all,
> >
> > I am writing about the draft
> >
> > A Mechanism for Session Initiation Protocol (SIP) Avalanche Restart
> > Overload Control
> > [ http://tools.ietf.org/id/draft-shen-soc-avalanche-restart-overload-07.txt ]
[...]

It is generally helpful to specify the particular version number when
talking about a draft, because inevitably your e-mail will be read
months later, and it will be unclear which version you are referring
to.  I've edited the Subject line and the URL to give the version
number in question.

To resend comments which I sent to Dispatch (which wasn't the place to
discuss them):

> From: worley@ariadne.com (Dale R. Worley)
> 
> I do believe that this draft should be advanced, as this sort of
> registration storm causes a problem in practice, though I don't have
> any opinion on what would be the proper track for it.
> 
> It would be interesting to know if there have been any large-scale
> implementations, and what the results have been in practice.
> 
> In regard to the draft itself, I think a few tweaks would improve it.
> 
> - Clarify that the Restart-Timer value is associated with the URI that
>   is being registered (for REGISTER) or the URI/event that is being
>   subscribed to (for SUBSCRIBE and PUBLISH).
> 
> - You may want a more clever way of handling multiple Restart-Timer
>   values received from different servers during a boot sequence that
>   sends requests to several servers (which may be incompletely
>   coordinated with each other).  E.g., if the registration server has
>   a Restart-Timer of 300 and the voicemail server also has a
>   Restart-Timer of 300, it seems that the UA could safely wait
>   rand(0:300) then register and subscribe.  If the VM server has a zero
>   restart timer, the UA probably wants to wait until the registration
>   is done anyway before subscribing to VM.  But if the VM server has a
>   Restart-Timer of 600, there probably should be an additional delay
>   between registration and subscribing.
> 
>   The trouble is that rand(0:300)+rand(0:300) doesn't have the same
>   distribution as rand(0:600), so you may not want to say "Wait a random
>   fraction of the difference between the two Restart-Timers."
> 
>   Perhaps a workable algorithm is "Choose a random real number
>   uniformly between zero and one.  Each bootup operation may be
>   executed no earlier than (the random number) * (the specified
>   Restart-Timer for that operation) seconds after power-up."  That
>   causes each server to see the time-distribution of requests that it
>   expects.

... and also ensures that the UA will access a server with a
Restart-Timer of 300 before it accesses a server with a Restart-Timer
of 600, which is probably what the administrators of the servers desire
to happen.
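
The single-random-number rule above can be sketched as follows.  This
is only an illustration, not text from the draft: the function name
schedule_boot_operations, the operation labels, and the dictionary
layout are hypothetical, and the timer values are the ones used in the
example above.

```python
import random


def schedule_boot_operations(restart_timers, rng=random):
    """Return the earliest start offset, in seconds after power-up,
    for each boot-time operation.

    restart_timers maps a (hypothetical) operation label to the
    Restart-Timer value saved for that operation's target.
    """
    # One uniform draw in [0, 1) is shared by every operation, so each
    # server sees requests spread uniformly over its own Restart-Timer.
    fraction = rng.random()
    return {op: fraction * timer for op, timer in restart_timers.items()}


# Assumed example values: registrar at 300 s, voicemail server at 600 s.
offsets = schedule_boot_operations(
    {"register": 300, "subscribe-vm": 600},
    random.Random(),
)
```

Because every offset is the same fraction of its own timer, the
operation with the smaller Restart-Timer is never scheduled after the
one with the larger timer.  For comparison, the sum of two independent
rand(0:300) delays is triangular on 0..600 rather than uniform: the
chance of it being at most 150 is 0.125, against 0.25 for rand(0:600).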

> - The text suggests that the stored Restart-Timer value expires when
>   the registration expires.  ("The validity duration of the
>   Restart-Timer header is the same as that of the corresponding
>   registration operation.")  That doesn't work at all, because the
>   power failure may exceed the length of all the registrations.  The
>   Restart-Timer value has to be saved until another value is
>   received for that same target.
> 
> - You may want to allow/require the UA to place an upper limit on the
>   Restart-Timer value.  At the least, Restart-Timer should not exceed
>   the maximum registration/subscription duration the UA requests and
>   the server provides.

... because the server is already prepared to handle registrations
that are as frequent as that interval.

> - I expect that you want to require that if a REGISTER/SUBSCRIBE
>   response is received that does not contain Restart-Timer, then the
>   saved Restart-Timer value is set to zero.  That causes the
>   expected behavior when a server that supports Restart-Timer is
>   replaced with one that does not.
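
Taken together, the three storage rules suggested above (keep the
value until a new one arrives for the same target, cap it at the
granted registration/subscription duration, and reset it to zero when
a response omits the header) might look like the sketch below.  The
class name RestartTimerStore and its method names are hypothetical,
and the in-memory dictionary merely stands in for whatever
non-volatile storage a real UA would need in order to survive a power
failure.

```python
class RestartTimerStore:
    """Per-target saved Restart-Timer values (illustrative only)."""

    def __init__(self):
        # Assumption: a real UA would persist this across power loss.
        self._timers = {}

    def on_response(self, target, restart_timer, granted_expires):
        """Update the saved value from a REGISTER/SUBSCRIBE response.

        restart_timer is None when the response carried no
        Restart-Timer header; granted_expires is the duration the
        server granted for this registration or subscription.
        """
        if restart_timer is None:
            # Server does not (or no longer does) support the header.
            value = 0
        else:
            # Never wait longer than the refresh interval the server
            # is already prepared to handle.
            value = min(restart_timer, granted_expires)
        self._timers[target] = value

    def get(self, target):
        # The value does not expire with the registration; it is kept
        # until another response for the same target replaces it.
        return self._timers.get(target, 0)
```

Here "target" is whatever key identifies the registered URI or the
subscribed URI/event, in line with the first suggestion above.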
> 
> Dale