[BLISS] AD review: draft-ietf-bliss-call-completion-14

Robert Sparks <rjsparks@nostrum.com> Fri, 13 January 2012 22:41 UTC

Summary: This document has issues that need to be addressed
          before progressing to IETF Last Call.

This document was very difficult to review. Please call out anyone
who provided a substantive review in the document quality section
of an updated shepherd writeup - they deserve the acknowledgement!

I think there might be places where the structure of the document
became stressed as the design changed over time. Please consider
an editorial pass focusing on organizing the resulting implementation
requirements in ways that are easier to reference.

Major issues

   - The document's proposed use of PUBLISH is not consistent with the
     semantics of that method. It attempts to use PUBLISH to affect
     the subscription state, not the state of the event being
     subscribed to (it's telling that PUBLISHing to this event package
     doesn't allow setting the state being subscribed to). Among other
     things, this prevents separating the state agent from the state
     authority. The document modifies how PUBLISH identifies the
     resource being manipulated by looking at the From URI and not
     only at the Request-URI.  How the Callee's agent responds to the
     request to change this subscription state is underspecified -
     when can it reject a request? What is the caller supposed to do
     if the request fails?

   - As written, the first paragraph of section 9.7 asks this package
     to violate the basic mechanics of RFC3265. It is a violation of
     the architecture to completely ignore the expiration time
     value requested in an initial or refresh SUBSCRIBE request. The
     responder may choose an expiration time less than or equal to the
     value there. It may not choose a longer expiration time for the
     subscription.

   - There is some important conversation missing from the security
     considerations section.

       - The dialog event package requires authentication, and digest
         authentication is mandatory to implement. This package
         doesn't appear to require any authentication other than
         presenting a (possibly well known) URI. More discussion of
         the policy for accepting subscriptions is needed to allow
         implementers to protect the privacy of the callee. Otherwise,
         it becomes trivial to use this package to obtain, for
         instance, information about the callee's phone usage.
         Similarly, the presence event package has a rich
         authorization model, and discusses the security (particularly
         privacy) implications of having the authorization settings
         too open.

       - As written, there does not appear to be any protection
         against an attacker causing everyone else that might be in a
         queue to be marked not-available, ensuring his call moves to
         the front of the queue. He only needs to know the AoRs of the
         callers he might be competing with and send PUBLISH requests
         with those AoRs in the From header field.

       - What keeps a new caller from just adding the m= attribute to
         a new INVITE in order to get the preferential treatment by
         the network and the callee's UA described in several sections
         of the document? Was an approach that used a temp-GRUU
         considered instead? It would not have the property of being
         as easy to guess as adding an m= URI parameter to an AoR.

       - A malicious callee could return several (many) NOTIFYs with
         different to-tags, each containing a different cc-URI,
         leading the caller to parallel-fork a large number of
         subscriptions to a victim.
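
To illustrate the RFC3265 expiration rule raised in the second major
issue, here is a minimal Python sketch. The function name and the
package_default and notifier_max parameters are hypothetical and do not
come from the draft; the only behavior taken from RFC3265 is that the
granted duration may be shorter than, but never longer than, the value
requested in the SUBSCRIBE.

```python
from typing import Optional


def granted_expiration(requested: Optional[int],
                       package_default: int,
                       notifier_max: int) -> int:
    """Return the subscription duration (in seconds) a notifier may grant.

    requested       -- Expires value from the SUBSCRIBE, or None if absent
    package_default -- hypothetical default duration for the event package
    notifier_max    -- hypothetical local upper bound set by notifier policy
    """
    if requested is None:
        # No Expires header field: the package default applies,
        # still bounded by local policy.
        return min(package_default, notifier_max)
    # The notifier may shorten the requested duration but must not
    # lengthen it, so the request itself is an upper bound.
    return min(requested, notifier_max)
```

A notifier built this way can still enforce a call-completion service
timer through notifier_max without ever granting more than the
subscriber asked for.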

Some questions

   - Section 9.10 calls out that subscribers need to be prepared to
     get NOTIFYs from multiple places due to forks in the SUBSCRIBE,
     but nothing in the document explores how this affects the
     call-completion application. What keeps the following scenario
     from occurring: Adam tries to call me, but I'm busy (on my desk
     phone). He subscribes for call completion, and the subscribe gets
     forked to both my desk and home phone. My home phone is not busy,
     so it sends a NOTIFY with "ready" right away. Adam's phone calls
     my home phone.

   - What keeps this from happening? Adam calls and I reject his call
     because I'm waiting for another call (I press X and the phone just
     reports that I'm busy). Adam's phone subscribes for
     call-completion and gets a NOTIFY of "ready" - his phone calls
     mine again, forcing me to re-reject him. This repeats until I
     take my phone off the hook (or engage a global DND) causing me to
     not be able to receive the call I was waiting for.

   - Would a callee ever want to subscribe to call-completion.winfo to
     see who's in his queue? Will the current design prevent
     implementing a server for call-completion.winfo?

Remaining issues (mostly in document order)

   - application/call-completion needs to be sent to type review.

   - It would be useful to more carefully describe exactly what the
     resource being subscribed to is.

   - Please call out how this document updates RFC3261 in the
     introduction.

   - Section 4.2 paragraph 1: Is 100rel required? recommended?

   - It's not easy to understand from the text why the subscribing UA
     is attempting to subscribe to multiple URIs (the first occurrence
     is in 4.2 paragraph 4). Some additional motivating text would
     help.

   - The document mischaracterizes 'merged' requests as being those
     that share the same Call-ID. As Section 8.2.2.2 of RFC3261
     defines, it's more than that - the things that have to be the
     same are the From tag, Call-ID, and CSeq. This occurs in several
     places in the document: 6.2 second paragraph, the description of
     the example in section 8, and 9.7 third paragraph. It's worth
     noting that the UA core in 8.2.2.2 does this merge detection - you
     are restating a requirement, not adding one - you should probably
     just note that the UA will behave as required by that section of
     RFC3261.

   - It's worth explicitly calling out (at least in section 6.2) that
     you are expecting the subscribing UA to fork its own requests (so
     that the merge behavior you are describing can take place). This
     means keeping more than the Call-ID constant. An implementer will
     have to select or develop a SIP implementation that allows them
     to do that.

   - There needs to be additional clarity to the specification of the
     use of the service-retention indication. What is the caller's
     (the subscriber's) endpoint supposed to do differently when it
     sees the service-retention option arrive in a NOTIFY? The
     difference in the behavior of the callee's system is hard to
     extract - the most salient description is the last paragraph of
     4.2.

   - Section 6.2 first paragraph: m parameter of a SUBSCRIBE SHOULD
     match the m parameter passed through the Call-Info header. Why is
     this not a MUST?

   - Why does the document specify a request-disposition of no-cancel
     for SUBSCRIBE requests? An intermediary cannot send a CANCEL to
     forked legs of a SUBSCRIBE request in the first place.

   - In section 6.2 paragraph 4, you mean to say the caller's agent
     must be prepared to receive multiple NOTIFYs establishing
     different dialogs for each initial SUBSCRIBE request it sends. It
     is not possible for the agent to receive multiple (final)
     responses to the SUBSCRIBE request itself.

   - The string 'cc-state' appears for the first time in section 6.3
     with no context. The discussion of state before that in the
     document is a superset of the states represented with cc-state.
     Please at least provide a forward pointer. It would be better to
     explicitly describe what cc-state is before you get to this
     section.

   - The first sentence in section 6.3 is hard to parse. Could it be
     broken into more than one sentence? Why are the SHOULDs in this
     section not MUSTs?

   - In section 7.1, why is the callee's monitor required to send at
     least one non-100 provisional (with a Call-Info in it)? Is it
     because the final response might not be delivered to the calling
     endpoint due to forking? If so, don't you need to require 100rel?

   - Why is the SHOULD in 7.1 paragraph 3 not a MUST?

   - Why does 7.1 paragraph 4 start "When applicable,"?

   - In this version of the document, the last paragraph of 7.1 is the
     only definition of the possible values for the m= URI parameter.
     It would help to list them with the definition of the parameter
     itself.

   - The requirements around forking in section 7.2 paragraph 2 belong
     in section 9. Why is the requirement to respond with a 482 to all
     but one fork a SHOULD and not a MUST?

   - Why is the SHOULD in 7.3 paragraph 2 not a MUST?

   - Subsections of Section 7 use SHALL instead of MUST - it would be
     better to be consistent throughout the document.

   - In 7.4 paragraph 1, where you say "if the CC call fails", it
     would be better to say "if the CC call is not accepted". The call
     could fail without the callee's monitor seeing any of the
     signalling.

   - In 7.4 paragraph 1, last sentence, in what circumstance would the
     callee's monitor NOT terminate the relevant subscription?

   - 7.4 paragraph 2 (which assumes the UA can only handle one call at
     a time) should be made consistent with 7.3 paragraph 3 (which
     allows UAs that can support multiple calls).

   - 7.6 paragraph 1 says "SHALL process the queue as described in
     subclause 7.3". But 7.3 does not talk about processing queues.

   - In the example, you show a 487 to the INVITE and motivate it by
     some proxy having generated a CANCEL. That proxy would have
     received a 487, but assuming it got no better responses from any
     other leg, it would most likely send a 480. If there weren't
     intervening proxies, the response might be one of several
     400-class responses (perhaps a 408). Please call out that there
     may be many variations in this failure response.

   - Proxies will not aggregate Call-Info header fields from multiple
     final responses into the response they send upstream. In a
     general deployment, the only time you will see that the callee
     supports call-completion (at least given how the capability is
     signaled in this document) is if its final response is chosen as
     "best" by every proxy in the chain. It's worth pointing out that
     some 4xx responses from the callee's UA are more likely to be
     chosen as "best" than others. It's also probably worth pointing
     out that in situations like the one you allude to in the example in
     section 8, when proxies cancel legs, the 487s they stimulate from
     the callee's UAs are not likely to be chosen as "best".

   - The third paragraph of section 9.4 is very unclear. I can't parse
     the first sentence at all. In the second sentence, it might be
     clearer to say "can never" instead of "cannot" (assuming my guess
     at what the paragraph is trying to say is correct). The third
     sentence doesn't make sense, and I wonder if the text matched a
     previous design better? Moving between available and
     not-available (using PUBLISH) doesn't affect the subscription
     duration - what is the sentence trying to talk about when it
     mentions granting a duration as part of resuming a subscription?

   - Section 9.5 third paragraph points to a format described in
     section 8. It means to point to section 10.

   - The description of NOTIFY bodies in section 9.5 allows bodies of
     type application/sdp to be sent in NOTIFYs as long as that type
     occurs in the Accept header field of the most recent SUBSCRIBE
     request on the dialog. Is that intentional?

   - Section 9.6 is vague about a call-completion service specific
     timer. It points into 9.4 claiming the timer is described there,
     but 9.4 is talking about subscription duration, only noting that
     the duration default value is chosen based on a timer value from
     other specifications. Why is this MAY important? What are the
     implementations supposed to do with this implication?

   - In the second paragraph of section 9.7, should the 480 include a
     Retry-After? Why was 403 chosen for long-term-denial _error_
     situations? Why isn't that a 500?

   - The first sentence of section 9.8 would be much more effective if
     it said (or pointed to text that describes) what the event
     triggering conditions actually are.

   - The third paragraph of section 9.8 has a MUST requirement that is
     conditional on an agent initiating an INVITE "promptly", but
     there's no characterization of "promptly" in the document. How
     does it account for the time it takes to reconfirm the caller is
     actually present and available before initiating the INVITE due
     to a recall? (This should also be accounted for in the first part
     of the security considerations).

   - Section 9.9 (corresponding to section 4.4.8 of RFC3265) is not
     adequate. It needs to actually describe the package specific
     subscription processing (including how the state is built), or
     provide a finer reference to where that specification lies than
     "in this and possibly in other documents". Section 7 has most of
     this information, but it's fairly widely scattered. Please
     consider consolidating the normative behavior into one place.

   - Section 9.11 claims the service typically involves a single
     notification per notifier per subscription. This cannot be the
     case. There will typically be three - the initial notify in
     response to the subscribe request, the notify representing the
     state transition from queued to ready, and the notify
     corresponding to the termination of the subscription. (It is not
     clear from the document when you expect the notification of
     "ready" to immediately terminate the subscription, if ever.)

   - The timing restrictions in section 9.11 seem artificial, and
     interact badly with the application this package is intended to
     support (the implication is that the server should delay sending a
     "ready", for example). Can the document explain how these
     restrictions were chosen?

   - Why does the call-completion information format make a provision
     for X- headers since you ignore lines with unknown names?

   - Instead of saying "Two lines with the same name MUST NOT be
     present, except where specifically permitted", consider saying
     "The header lines defined in this document can occur at most once
     in any given call-completion document. Extensions must define
     whether defined lines may occur more than once." How likely is
     this format to be extended? Do these need to be put in a registry?

   - Why does the syntax for cc-URI allow cc-URI header line
     parameters? You certainly want the URI to be able to contain URI
     parameters, but when would you ever use the header line
     parameters? What you have now allows

     cc-URI: random display text <sip:name@domain;uri-param=uri-value>;cc-uri-header-param-name=cc-uri-header-param-value

     How is having that display text ever useful? When would you ever use
     a cc-uri-header-param? In other words, why isn't this simply
     cc-URI = "cc-URI" HCOLON addr-spec?

   - Item 2 in the security considerations section is unclear. It
     seems to be placing a requirement on the subscriber (the caller),
     but it's not clear what that requirement is (don't suspend any
     subscriptions longer than a typical call? than some duration a
     user entered for _this_ call? or what?). What's the subscriber
     supposed to do if it would have suspended a subscription that
     long - terminate the subscription? How does this protect the
     privacy of the callee?

   - The media-type form sections should point to specific sections in
     this document. Consider calling out the most important
     interoperability and security considerations.
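
As an illustration of the merged-request comment above (Section 8.2.2.2
of RFC3261), here is a rough Python sketch of the check a UAS core
performs. The Request class and its field names are hypothetical, and
comparing only the top Via branch is a simplification of the transaction
matching rules in Section 17.2.3 of RFC3261. The point it shows is that
the From tag, Call-ID, and CSeq must all match; a shared Call-ID alone
does not make a request merged.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Request:
    """Hypothetical, minimal view of the fields the merge check needs."""
    call_id: str
    from_tag: str
    cseq: str                     # sequence number and method, e.g. "1 SUBSCRIBE"
    branch: str                   # top Via branch; stands in for transaction identity
    to_tag: Optional[str] = None  # absent on out-of-dialog requests


def is_merged(req: Request, ongoing: Iterable[Request]) -> bool:
    """Return True if req is a merged copy of a request already in progress."""
    if req.to_tag is not None:
        # The check only applies to requests with no tag in the To header field.
        return False
    key = (req.from_tag, req.call_id, req.cseq)
    for other in ongoing:
        same_key = (other.from_tag, other.call_id, other.cseq) == key
        # Same From tag, Call-ID and CSeq, but a different transaction:
        # the request arrived over another path and would get a 482.
        if same_key and other.branch != req.branch:
            return True
    return False
```

For this check to trigger on the callee's side, the subscribing UA has
to hold the From tag and CSeq constant across the SUBSCRIBE requests it
forks, not just the Call-ID.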