Re: [sip-overload] AD review: draft-ietf-soc-overload-design-04

Volker Hilt <volker.hilt@alcatel-lucent.com> Fri, 11 March 2011 02:41 UTC

Robert,

thanks for the detailed review. More inline.

On 3/4/2011 4:30 PM, Robert Sparks wrote:
> There are a few things to address before moving this document into IETF LC.
>
> 1) "This document is a product of the SIP overload control design team." should be
> adjusted to reflect its genesis in the design team and production in the working
> group.
>
Done.

> 2) The document has several passages that are telephony centric - speaking of
> failed calls, call attempts, and having the mechanism reject calls. This language
> should be generalized - the intended mechanism applies to all uses of SIP.
>
Done.

> 3) I don't think the distinction the last paragraph of 5.2 tries to draw is
> constructed correctly. The TCP packets between two routers are analogous to the individual
> SIP messages between the SIP servers you describe - the routers will see many
> TCP packets with different source and destination addresses. The whole point is that
> congestion is managed by the behavior of the endpoints in a given stream. A direct
> comparison would be against mechanisms that only affected the behaviors of a
> particular pair of SIP endpoints - not changing the behavior of the SIP servers at all.
> Please adjust or remove this paragraph.
>
I've clarified this paragraph. The new version is below:

       Overload control for SIP servers is different from end-to-end
       congestion control used by transport protocols such as TCP. The
       traffic exchanged between SIP servers consists of many
       individual SIP messages. Each SIP message is typically created
       individually by a SIP UA (e.g., to start setting up a call) and
       each message has its own source and destination address. Even
       SIP messages containing identical SIP URIs (e.g., a SUBSCRIBE
       and an INVITE message to the same SIP URI) can be routed to
       different destinations. This is different from TCP which
       controls a large flow of packets (e.g., to transmit a file)
       between a single source and a single destination. If congestion
       occurs in a router, the sources can detect this condition and
       adjust the rate at which new packets are transmitted.

> 4) The end of section 6 talks about asking UAs to wait using 503/retry-after. What
> 503/retry-after does is ask the next upstream element to wait. I think it's the intention
> to scope this discussion to the case where the UA _is_ the next upstream element, but that
> needs to be made even more clear. The section should also discuss how an element knows
> that the next upstream element is a UA and not another proxy.
>
For this case, it does not matter whether the upstream neighbor is a UA or
a proxy. The key is that there are many upstream neighbors that each
contribute only a few requests. Edge proxy to UA is probably the common 
scenario for this case. I've clarified the text.
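
For illustration only (this is not text from the draft), the 503/Retry-After
response discussed here could be built roughly as follows. It is a minimal
Python sketch: build_503, the header values, and the 30-second interval are
made-up assumptions, and the request is reduced to a plain dict.

```python
def build_503(request_headers, retry_after_s):
    """Build a 503 response that echoes the request's transaction headers.

    Simplified: a real server would also add a To tag and copy every Via value.
    """
    lines = ["SIP/2.0 503 Service Unavailable"]
    for name in ("Via", "From", "To", "Call-ID", "CSeq"):
        lines.append(f"{name}: {request_headers[name]}")
    # Retry-After asks whoever sent the request (the next upstream element)
    # to wait this many seconds before trying this server again.
    lines.append(f"Retry-After: {retry_after_s}")
    lines.append("Content-Length: 0")
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical request headers, for illustration only.
request_headers = {
    "Via": "SIP/2.0/UDP ua1.example.com;branch=z9hG4bK776asdhds",
    "From": "<sip:alice@example.com>;tag=1928301774",
    "To": "<sip:bob@example.com>",
    "Call-ID": "a84b4c76e66710@ua1.example.com",
    "CSeq": "314159 INVITE",
}
response = build_503(request_headers, 30)
```

With many upstream neighbors that each send only a few requests, every
neighbor receives its own response of this kind.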

> 5) It would help to clarify in Section 9.1 (particularly paragraph 2) that "next request"
> is "the start of another transaction", and not "retransmissions in any ongoing transaction".
>
Done.

> 6) In section 9.3, I disagree that 100 Trying does not provide confirmation of receipt
> of a message. That is _exactly_ what 100 Trying does, and its point is to affect the
> transaction state machine's reliability mechanisms. It means it has been accepted for
> processing - this hop is taking responsibility for it now.  If you have an implementation
> that would cause transaction reliability to fail by sending a 100 early in the processing
> as you suggest, that implementation is not conformant to the specification. Are you
> instead trying to say that the 100 Trying does not indicate that the message has
> already been forwarded?
>
This is based on a discussion in the design team. It was pointed out 
that it is common practice in SIP implementations to create 100 Trying 
very early in the processing and that a message can still be dropped 
(e.g., from an overflowing buffer).

Maybe whoever raised the original comment can weigh in on this. I'd be
happy to make the change as needed.
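
To make the scenario concrete, here is a minimal Python sketch of the
behavior described above, under assumptions: the receive function, the
one-slot queue, and the request names are hypothetical and do not describe
any particular SIP stack. The 100 Trying is generated before the request is
handed to a bounded buffer, so a request can be acknowledged and still be
dropped.

```python
import queue


def receive(request_id, work_queue, sent):
    """Answer 100 Trying first, then hand the request off for processing."""
    sent.append((request_id, "100 Trying"))
    try:
        work_queue.put_nowait(request_id)
        return True
    except queue.Full:
        # The buffer overflowed after the 100 Trying was already sent.
        return False


work_queue = queue.Queue(maxsize=1)  # tiny buffer to force an overflow
sent = []
accepted = [receive(r, work_queue, sent) for r in ("req-1", "req-2")]
```

In this sketch both requests get a 100 Trying, but only the first one is
actually queued for processing.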

> 7) Did you mean dialog instead of session in the first paragraph of section 12?

Yes.

> Surely we
> have actual research backing up the claim "As a general rule". Can we point to that please?
>
This recommendation is driven by user experience.
Dropping mid-dialog requests will create a situation in which a user 
was, e.g., able to set up a call but is unable to modify it after that 
or terminate it. Targeting requests that initiate a dialog will, e.g., 
lead to calls that cannot be set up, which is more familiar to a user.
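
A minimal Python sketch of such a policy, under assumptions: the function
names are hypothetical, and a request is treated as mid-dialog when its To
header carries a tag parameter (requests sent outside a dialog have none).
Method-specific handling is left out.

```python
def is_in_dialog(to_header):
    """Return True if the To header value carries a tag parameter."""
    # Header parameters follow the closing '>' of the name-addr form; in the
    # bare addr-spec form they follow the first ';'.
    params = to_header.rsplit(">", 1)[-1]
    return any(p.strip().lower().startswith("tag=") for p in params.split(";"))


def should_reject(to_header, overloaded):
    """Under overload, reject only requests that are not part of a dialog."""
    return overloaded and not is_in_dialog(to_header)


# Hypothetical To header values, for illustration only.
new_invite = "<sip:bob@example.com>"
mid_dialog_bye = "<sip:bob@example.com>;tag=a6c85cf"
```

Here should_reject(new_invite, True) is true, while
should_reject(mid_dialog_bye, True) is false, so an established call can
still be modified or terminated.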

> 8) The security consideration section should note what key security properties
> each of the possible models have and what influence that could have on the mechanism
> chosen, particularly when specific mechanics have been discussed in a model's section.
> For instance, the mechanism detailed in the 4th paragraph of 9.1 is exposed if an attacker
> can easily make a new server appear to appear (by sending one message perhaps), cutting
> traffic at legitimate servers down by 1/n with each new malicious appearance.
>
I will address this comment in the next version of the draft.

> Nits:
>
> Section 2 Paragraph 1 Sentence 1: s/to the SIP congestion collapse/to SIP congestion collapse/
>
> Section 2 Paragraph 3:
> - s/spend/spent/g. Suggest changing this sentence to "Discarding a SIP
>    message after spending the resources to parse it is expensive."
> - s/less and less/fewer/,  s/more and more/more/
> - s/slope/rate/
>
> Section 6 Paragraph 5: s/determine much traffic/determine how much traffic/
>
Done.

Thanks!

Volker
