[sip-overload] AD review: draft-ietf-soc-overload-design-04

Robert Sparks <rjsparks@nostrum.com> Fri, 04 March 2011 21:29 UTC

There are a few things to address before moving this document into IETF LC.

1) "This document is a product of the SIP overload control design team." should be
adjusted to reflect its genesis in the design team and production in the working
group.

2) The document has several passages that are telephony-centric - speaking of
failed calls, call attempts, and having the mechanism reject calls. This language
should be generalized - the intended mechanism applies to all uses of SIP.

3) I don't think the distinction the last paragraph of 5.2 tries to draw is
constructed correctly. The TCP packets between two routers are analogous to the individual
SIP messages between the SIP servers you describe - the routers will see many
TCP packets with different source and destination addresses. The whole point is that
congestion is managed by the behavior of the endpoints in a given stream. A direct
comparison would be against mechanisms that only affected the behaviors of a 
particular pair of SIP endpoints - not changing the behavior of the SIP servers at all.
Please adjust or remove this paragraph.

4) The end of section 6 talks about asking UAs to wait using 503/retry-after. What
503/retry-after does is ask the next upstream element to wait. I think it's the intention
to scope this discussion to the case where the UA _is_ the next upstream element, but that
needs to be made even more clear. The section should also discuss how an element knows
that the next upstream element is a UA and not another proxy.

5) It would help to clarify in Section 9.1 (particularly paragraph 2) that "next request"
is "the start of another transaction", and not "retransmissions in any ongoing transaction".
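
To illustrate the distinction in comment 5, here is a minimal sketch of a counter that treats only requests starting a new transaction as "next requests" and ignores retransmissions. It assumes the simplified RFC 3261 server-transaction matching key (top Via branch, sent-by, and method) and deliberately leaves out the ACK special case and transaction expiry. The class and field names are hypothetical and do not come from the draft.

```python
from typing import NamedTuple, Set


class TransactionKey(NamedTuple):
    """Simplified matching key: top Via branch, sent-by, request method."""
    branch: str
    sent_by: str
    method: str


class NewTransactionCounter:
    """Counts requests that start a new transaction (hypothetical helper)."""

    def __init__(self) -> None:
        self._seen: Set[TransactionKey] = set()
        self.new_transactions = 0
        self.retransmissions = 0

    def on_request(self, branch: str, sent_by: str, method: str) -> bool:
        """Return True if this request starts a new transaction."""
        key = TransactionKey(branch, sent_by.lower(), method.upper())
        if key in self._seen:
            # Same key as an earlier request: a retransmission within an
            # ongoing transaction, so it is not counted as a "next request".
            self.retransmissions += 1
            return False
        self._seen.add(key)
        self.new_transactions += 1
        return True
```

Under this reading, any rate limit described in 9.1 would be applied to `new_transactions` only.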

6) In section 9.3, I disagree that 100 Trying does not provide confirmation of receipt
of a message. That is _exactly_ what 100 Trying does, and its point is to affect the
transaction state machine's reliability mechanisms. It means it has been accepted for
processing - this hop is taking responsibility for it now. If you have an implementation
that would cause transaction reliability to fail by sending a 100 early in the processing
as you suggest, that implementation is not conformant to the specification. Are you
instead trying to say that the 100 Trying does not indicate that the message has 
already been forwarded?
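
As a reminder of the transaction behavior comment 6 relies on, here is a minimal sketch of the RFC 3261 INVITE client transaction over an unreliable transport. Any provisional response, including 100 Trying, moves the transaction from Calling to Proceeding, and request retransmissions stop there. Timers and ACK handling are omitted, and the class and method names are hypothetical.

```python
class InviteClientTransaction:
    """Reduced INVITE client transaction: states only, no timers or ACKs."""

    def __init__(self) -> None:
        # The transaction enters Calling when the INVITE is first sent.
        self.state = "Calling"

    def should_retransmit(self) -> bool:
        # The request is retransmitted only while no response has arrived.
        return self.state == "Calling"

    def on_response(self, status_code: int) -> None:
        if self.state not in ("Calling", "Proceeding"):
            return  # Completed/Terminated handling is out of scope here.
        if 100 <= status_code <= 199:
            # A provisional response (e.g. 100 Trying) confirms receipt.
            self.state = "Proceeding"
        elif 200 <= status_code <= 299:
            self.state = "Terminated"
        elif 300 <= status_code <= 699:
            self.state = "Completed"
```

Note that nothing in this state machine depends on whether the next hop has already forwarded the request.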

7) Did you mean dialog instead of session in the first paragraph of section 12? Surely we
have actual research backing up the claim "As a general rule". Can we point to that please?

8) The security considerations section should note what key security properties
each of the possible models has and what influence that could have on the mechanism
chosen, particularly when specific mechanics have been discussed in a model's section.
For instance, the mechanism detailed in the 4th paragraph of 9.1 is exposed if an attacker
can easily make a new server appear (by sending one message perhaps), cutting
traffic at legitimate servers down by 1/n with each new malicious appearance.
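
The arithmetic behind the example in comment 8 can be sketched as follows. This assumes the overloaded server splits its capacity equally among every upstream neighbor it currently believes exists. That equal split is an assumption made for illustration, not a quote from the draft, and the function name and parameters are hypothetical.

```python
from fractions import Fraction


def per_server_share(capacity: int, legitimate: int, spoofed: int = 0) -> Fraction:
    """Rate granted to each upstream neighbor under an equal split.

    capacity   -- total request rate the overloaded server will accept
    legitimate -- number of real upstream servers
    spoofed    -- number of fake upstream servers an attacker made appear
    """
    return Fraction(capacity, legitimate + spoofed)


# With 4 legitimate servers and capacity 100, each is granted 25.
baseline = per_server_share(100, legitimate=4)
# One spoofed server reduces each legitimate server's grant to 20.
under_attack = per_server_share(100, legitimate=4, spoofed=1)
```

Each additional spoofed neighbor shrinks the share of every legitimate server further, and the attacker pays only the cost of making one more server appear.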

Nits:

Section 2 Paragraph 1 Sentence 1: s/to the SIP congestion collapse/to SIP congestion collapse/

Section 2 Paragraph 3: 
- s/spend/spent/g. Suggest changing this sentence to "Discarding a SIP 
  message after spending the resources to parse it is expensive." 
- s/less and less/fewer/,  s/more and more/more/
- s/slope/rate/

Section 6 Paragraph 5: s/determine much traffic/determine how much traffic/