Re: [BLISS] AD review: draft-ietf-bliss-call-completion-14

Robert Sparks <rjsparks@nostrum.com> Thu, 09 August 2012 18:27 UTC

To: Martin.Huelsemann@telekom.de
Cc: draft-ietf-bliss-call-completion@tools.ietf.org, bliss@ietf.org, alexeitsev@teleflash.com

Martin -

I haven't seen a response to this message, and the updates in -15 do not
address the issues called out here with the use of PUBLISH as you
currently specify it. Could you please reply to those sections below?

RjS

On 2/21/12 9:37 AM, Robert Sparks wrote:
> Thanks Martin - more comments inline.
>
> Will you be at the Paris IETF meeting?
>
> On 2/17/12 10:41 AM, Martin.Huelsemann@telekom.de wrote:
>> Hi Robert,
>>
>> thank you very much for the detailed review.
>>
>> We will provide a new version considering your comments.
>>
>> Please find already some answers inside.
>>
>>
>> Best regards, Martin
>>
>>
>>
>>
>>> -----Original Message-----
>>> From: bliss-bounces@ietf.org [mailto:bliss-bounces@ietf.org]
>>> On Behalf Of Robert Sparks
>>> Sent: Friday, 13 January 2012 23:41
>>> To: bliss@ietf.org; draft-ietf-bliss-call-completion@tools.ietf.org
>>> Subject: [BLISS] AD review: draft-ietf-bliss-call-completion-14
>>>
>>> Summary: This document has issues that need to be addressed
>>>            before progressing to IETF Last Call.
>>>
>>> This document was very difficult to review. Please call out
>>> anyone who provided a substantive review in the document
>>> quality section of an updated shepherd writeup - they deserve
>>> the acknowledgement!
>>>
>>> I think there might be places where the structure of the
>>> document became stressed as the design changed over time.
>>> Please consider an editorial pass focusing on organizing the
>>> resulting implementation requirements in ways that are easier
>>> to reference.
>>>
>>> Major issues
>>>
>>>     - The document's proposed use of PUBLISH is not consistent with the
>>>       semantics of that method. It attempts to use PUBLISH to affect
>>>       the subscription state, not the state of the event being
>>>       subscribed to (it's telling that PUBLISHing to this event package
>>>       doesn't allow setting the state being subscribed to). Among other
>>>       things, this prevents separating the state agent from the state
>>>       authority.
>> The concept is the following: The caller sends its own status
>> information in the PUBLISH request. In this case the PUBLISH request
>> does not directly change the state of the CC monitor (the queue);
>> rather, the state changes indirectly, based on the information about
>> the caller's status. The CC monitor acts as a state composer,
>> composing the state of the queue from the states of the different
>> callers.
> This isn't composition. It's conflation of unrelated things. You have
> an application that's operating on two unrelated sources of
> information. If you really want to use SIP events to carry the
> information about the caller, you should be considering a separate
> event package for it. If that feels wrong, it's another sign that
> PUBLISH is not the hammer you're looking for to pound on this
> particular nail.
>
> Speaking more generally to the events architecture, an element that 
> has enough information to act as a publisher necessarily has the 
> information it needs to be able to serve subscriptions directly. It is 
> choosing to publish to a state agent to deal with matters outside of 
> the contents of the package itself (not always being connected, having 
> limited bandwidth, etc.). Even in an idealized presence system where 
> the ultimate presence document is composed from many contributing 
> publishers, the individual sources have enough information (and it 
> makes sense within the definition of the package) to accept 
> subscriptions to their part of the total state. Further,
> subscribing to presence.winfo at the state agent will provide 
> meaningful data to each of the contributing sources.
>
>>
>>
>>
>>
>>>       The document modifies how PUBLISH identifies the
>>>       resource being manipulated by looking at the From URI and not
>>>       only at the Request-URI.  How the Callee's agent responds to the
>>>       request to change this subscription state is underspecified -
>>>       when can it reject a request? What is the caller supposed to do
>>>       if the request fails?
>> As mentioned, the PUBLISH is not an RPC; the CC agent on behalf of the
>> callee publishes information about the reachability of the callee for
>> CC recalls. The monitor composes the state. This state can be used to
>> evaluate whether it is useful to start a CC recall or not.
> I think you were still trying to respond to the first point above here.
> This is a different question, and it will stand independent of what
> technique you use to make the request to change the information about
> the caller. What happens when that request needs to fail (due to
> parsing, overload, authorization, application-specific logic, or any of
> the other reasons a UAS has to reject requests)?
>>
>>
>>
>>>     - As written, the first paragraph of section 9.7 asks this package
>>>       to violate the basic mechanics of RFC3265. It is a violation of
>>>       the architecture to completely ignore the expiration time
>>>       value requested in an initial or refresh SUBSCRIBE request. The
>>>       responder may choose an expiration time less than or equal to the
>>>       value there. It may not choose a longer expiration time for the
>>>       subscription.
>> Yes, this is unclear. What is meant is that the duration in the
>> Expires value of the response shall not exceed the remaining CC timer.
>>
>>
>>>     - There is some important conversation missing from the security
>>>       considerations section.
>>>
>>>         - The dialog event package requires authentication, and digest
>>>           authentication is mandatory to implement. This package
>>>           doesn't appear to require any authentication other than
>>>           presenting a (possibly well known) URI. More discussion of
>>>           the policy for accepting subscriptions is needed to allow
>>>           implementers to protect the privacy of the callee. Otherwise,
>>>           it becomes trivial to use this package to obtain, for
>>>           instance, information about the callee's phone usage.
>>>           Similarly, the presence event package has a rich
>>>           authorization model, and discusses the security (particularly
>>>           privacy) implications of having the authorization settings
>>>           too open.
>>
>>>         - As written, there does not appear to be any protection
>>>           against an attacker causing everyone else that might be in a
>>>           queue to be marked not-available, ensuring his call moves to
>>>           the front of the queue. He only needs to know the AoRs of the
>>>           callers he might be competing with and send PUBLISH requests
>>>           with those AoRs in the From header field.
>>>         - What keeps a new caller from just adding the m= attribute to
>>>           a new INVITE in order to get the preferential treatment by
>>>           the network and the callee's UA described in several sections
>>>           of the document? Was an approach that used a temp-GRUU
>>>           considered instead? It would not have the property of being
>>>           as easy to guess as adding an m= URI parameter to an AoR.
>> As with many mechanisms in the CC draft, here too we had to consider
>> the interworking with the PSTN. There we have only a very basic CC
>> prioritization indicator in the IAM, saying not much more than 'this
>> IAM is prioritized for CC'. That's why the callee's monitor MUST
>> record the From URI from the initial call, so that it can check it
>> against incoming INVITEs and verify whether an INVITE is in fact
>> for CC. Clarifying text for this is needed in the draft.
> That clarifying text needs to capture the scenarios above, and answer 
> the questions. For instance, if there is nothing to keep an attacker 
> from marking everyone else in the queue as not available, the document 
> needs to call that out as a security consideration. (I hope this isn't 
> what the group plans to end with).
>>
>>>         - A malicious callee could return several (many) NOTIFYs with
>>>           different to-tags, each containing a different cc-URI,
>>>           leading the caller to parallel-fork a large number of
>>>           subscriptions to a victim.
>>> Some questions
>>>
>>>     - Section 9.10 calls out that subscribers need to be prepared to
>>>       get NOTIFYs from multiple places due to forks in the SUBSCRIBE,
>>>       but nothing in the document explores how this affects the
>>>       call-completion application. What keeps the following scenario
>>>       from occurring: Adam tries to call me, but I'm busy (on my desk
>>>       phone). He subscribes for call completion, and the subscribe gets
>>>       forked to both my desk and home phone. My home phone is not busy,
>>>       so it sends a NOTIFY with "ready" right away. Adam's phone calls
>>>       my home phone.
>> To avoid this situation Adam should add the 'm' (mode) URI parameter,
>> in this case set to 'BS' (busy subscriber). In this case a CC recall
>> is triggered when a busy condition at a callee UA has ended. Of
>> course this situation also depends a little bit on the forking proxy:
>> for the initial INVITE, does it send back the 180 from your home phone
>> (indicating CC possible with 'NR') and also the 486 (indicating CC
>> possible with 'BS')? Clarifying text is needed.
>>
>>
>>>     - What keeps this from happening? Adam calls and I reject his call
>>>       because I'm waiting for another call (I press X and the phone just
>>>       reports that I'm busy). Adam's phone subscribes for
>>>       call-completion and gets a NOTIFY of "ready" - his phone calls
>>>       mine again, forcing me to re-reject him. This repeats until I
>>>       take my phone off the hook (or engage a global DND) causing me to
>>>       not be able to receive the call I was waiting for.
>> If the callee's monitor does not want to enable the caller to make 
>> use of the CC service, it will not insert a Call-Info header field 
>> with "purpose=call-completion" in the final response message.
> Where is the text in the document that defines that behavior?
>> Of course Adam's phone could try to subscribe for CC at your phone,
>> but in this case your phone simply rejects the subscription, which 
>> should not affect your reachability for the call you are waiting for.
> Similarly, where in the text is this made clear? It is separable from 
> the scenario above - Adam could write a script to subscribe to
> every address at an enterprise. What normative text causes all those 
> subscriptions to be rejected?
>>
>>
>>>     - Would a callee ever want to subscribe to call-completion.winfo to
>>>       see who's in his queue? Will the current design prevent
>>>       implementing a server for call-completion.winfo?
>> Actually we never discussed this option. We will check it.
>>
>>
>>> Remaining issues (mostly in document order)
>>>
>>>     - application/call-completion needs to be sent to type review.
>>>     - It would be useful to more carefully describe exactly what the
>>>       resource being subscribed to is.
>>>     - Please call out how this document updates RFC3261 in the
>>>       introduction.
>>>     - Section 4.2 paragraph 1: Is 100rel required? recommended?
>>>     - It's not easy to understand from the text why the subscribing UA
>>>       is attempting to subscribe to multiple URIs (the first occurrence
>>>       is in 4.2 paragraph 4). Some additional motivating text would
>>>       help.
>>>     - The document mischaracterizes 'merged' requests as being those
>>>       that share the same Call-ID. As Section 8.2.2.2 of RFC3261
>>>       defines, it's more than that - the things that have to be the
>>>       same are the From tag, Call-ID, and CSeq. This occurs in several
>>>       places in the document:  6.2 second paragraph, description of
>>>       example in section 8, 9.7 third paragraph. It's worth noting that
>>>       the UA core in 8.2.2.2 does this merge detection - you are
>>>       restating a requirement, not adding one -  you should probably
>>>       just note that the UA will behave as required by that section of
>>>       RFC3261.
>>>     - It's worth explicitly calling out (at least in section 6.2) that
>>>       you are expecting the subscribing UA to fork its own requests (so
>>>       that the merge behavior you are describing can take place). This
>>>       means keeping more than the Call-ID constant. An implementer will
>>>       have to select or develop a SIP implementation that allows them
>>>       to do that.
>>>     - There needs to be additional clarity to the specification of the
>>>       use of the service-retention indication. What is the caller's
>>>       (the subscriber's) endpoint supposed to do differently when it
>>>       sees the service-retention option arrive in a NOTIFY? The
>>>       difference in the behavior of the callee's system is hard to
>>>       extract - the most salient description is the last paragraph of
>>>       4.2.
>>>     - Section 6.2 first paragraph: m parameter of a SUBSCRIBE SHOULD
>>>       match the m parameter passed through the Call-Info header. Why is
>>>       this not MUST?
>> Again because of the PSTN interworking: in TCAP there isn't an
>> equivalent for all the m-parameter values.
>>
>> I see there are more questions about why there is a SHOULD and not a
>> MUST. I still have to check them individually. But as I said, most of
>> the softening in the draft is there to enable interworking with the
>> PSTN CC service. More information on the latter can be found in ETSI
>> ETS 300 356-18 and ITU-T Q.733.
> Please watch for opportunities to make this clear in the text.
>>>     - Why does the document specify a request-disposition of no-cancel
>>>       for SUBSCRIBE requests? An intermediary cannot send a CANCEL to
>>>       forked legs of a SUBSCRIBE request in the first place.
>>
>>>     - In section 6.2 paragraph 4, you mean to say the caller's agent
>>>       must be prepared to receive multiple NOTIFYs establishing
>>>       different dialogs for each initial SUBSCRIBE request it sends. It
>>>       is not possible for the agent to receive multiple (final)
>>>       responses to the SUBSCRIBE request itself.
>>
>>>     - The string 'cc-state' appears for the first time in section 6.3
>>>       with no context. The discussion of state before that in the
>>>       document is a superset of the states represented with cc-state.
>>>       Please at least provide a forward pointer. It would be better to
>>>       explicitly describe what cc-state is before you get to this
>>>       section.
>>
>>>     - The first sentence in section 6.3 is hard to parse. Could it be
>>>       broken into more than one sentence? Why are the SHOULDs in this
>>>       section not MUSTs?
>>
>>>     - In section 7.1, why is the callee's monitor required to send at
>>>       least one non-100 provisional (with a Call-Info in it)? Is it
>>>       because the final response might not be delivered to the calling
>>>       endpoint due to forking? If so, don't you need to require 100rel?
>>
>>>     - Why is the SHOULD in 7.1 paragraph 3 not a MUST?
>>
>>>     - Why does 7.1 paragraph 4 start "When applicable,"?
>> Means simply 'if CC is offered'.
>>
>>>     - In this version of the document, the last paragraph of 7.1 is the
>>>       only definition of the possible values for the m= URI parameter.
>>>       It would help to list them with the definition of the parameter
>>>       itself.
>>>     - The requirements around forking in section 7.2 paragraph 2 belong
>>>       in section 9. Why is the requirement to respond with a 482 to all
>>>       but one fork a SHOULD and not a MUST?
>>
>>>     - Why is the SHOULD in 7.3 paragraph 2 not a MUST?
>>
>>>     - Subsections of Section 7 use SHALL instead of MUST - it would be
>>>       better to be consistent throughout the document.
>>>     - In 7.4 paragraph one, where you say "if the CC call fails", it
>>>       would be better to say "if the CC call is not accepted". The call
>>>       could fail without the callee's monitor seeing any of the
>>>       signalling.
>>>     - In 7.4 paragraph 1, last sentence, in what circumstance would the
>>>       callee's monitor NOT terminate the relevant subscription?
>> I think this SHOULD should, or rather must, be a MUST.
>>
>>>     - 7.4 paragraph 2 (which assumes the UA can only handle one call at
>>>       a time) should be made consistent with 7.3 paragraph 3 (which
>>>       allows UAs that can support multiple calls)
>>>     - 7.6 paragraph 1 says "SHALL process the queue as described in
>>>       subclause 7.3". But 7.3 does not talk about processing queues.
>>
>>>     - In the example, you show a 487 to the invite and motivate it by
>>>       some proxy having generated a CANCEL. That proxy would have
>>>       received a 487, but assuming it got no better responses from any
>>>       other leg, it would most likely send a 480. If there weren't
>>>       intervening proxies, the response might be one of several
>>>       400-class responses (perhaps a 408). Please call out that there
>>>       may be many variations in this failure response.
>>
>>>     - Proxies will not aggregate Call-Info header fields from multiple
>>>       final responses into the response they send upstream. In a
>>>       general deployment, the only time you will see that the callee
>>>       supports call-completion (at least given how the capability is
>>>       signaled in this document) is if its final response is chosen as
>>>       "best" by every proxy in the chain. It's worth pointing out that
>>>       some 4xx responses from the callee's UA are more likely to be
>>>       chosen as "best" than others. It's also probably worth pointing
>>>       out that in situations like the one you allude to in the example in
>>>       section 8, when proxies cancel legs, the 487s they stimulate from
>>>       the callee's UAs are not likely to be chosen as "best".
>> Text for forking proxies is needed; see above.
>>
>>
>>>     - The third paragraph of section 9.4 is very unclear. I can't parse
>>>       the first sentence at all. In the second sentence, it might be
>>>       clearer to say "can never" instead of "cannot" (assuming my guess
>>>       at what the paragraph is trying to say is correct). The third
>>>       sentence doesn't make sense, and I wonder if the text matched a
>>>       previous design better? Moving between available and
>>>       not-available (using PUBLISH) doesn't affect the subscription
>>>       duration - what is the sentence trying to talk about when it
>>>       mentions granting a duration as part of resuming a subscription?
>> For example, if it is clear that you leave your office at 8 in the
>> evening at the latest, and a subscription for CC arrives at your desk
>> phone at 7:45 with Expires set to 1 hour, Expires in the response
>> should be set to 15 minutes.
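
The rule under discussion here (the granted duration may be shorter than the requested Expires value but never longer, and should not run past whatever limit the notifier knows about) can be sketched as below. This is only an illustration, not text from the draft: the function name `granted_expires` and the parameter `notifier_limit` are hypothetical, and all durations are assumed to be in seconds.

```python
def granted_expires(requested: int, notifier_limit: int) -> int:
    """Pick the subscription duration to return in the response.

    Hypothetical helper. `requested` is the Expires value from the
    SUBSCRIBE; `notifier_limit` is an upper bound known to the notifier
    (for example the remaining CC timer). The result never exceeds
    either value, so the requested duration is never lengthened.
    """
    if requested < 0 or notifier_limit < 0:
        raise ValueError("durations must be non-negative")
    return min(requested, notifier_limit)


# The office example: the SUBSCRIBE arrives at 7:45 asking for 1 hour,
# and only 15 minutes remain until 8:00.
print(granted_expires(requested=3600, notifier_limit=15 * 60))  # 900
```

With a request shorter than the limit, the requested value is returned unchanged, which keeps the behaviour inside what the subscriber asked for.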
>>
>>>     - Section 9.5 third paragraph points to a format described in
>>>       section 8. It means to point to section 10.
>>>
>>>     - The description of NOTIFY bodies in section 9.5 allows bodies of
>>>       type application/sdp to be sent in notifies as long as that type
>>>       occurs in the Accept header field of the most recent SUBSCRIBE
>>>       request on the dialog. Is that intentional?
>>>
>>>     - Section 9.6 is vague about a call-completion service specific
>>>       timer. It points into 9.4 claiming the timer is described there,
>>>       but 9.4 is talking about subscription duration, only noting that
>>>       the duration default value is chosen based on a timer value from
>>>       other specifications. Why is this MAY important? What are the
>>>       implementations supposed to do with this implication?
>>>
>>>     - In the second paragraph of section 9.7, should the 480 include a
>>>       Retry-After? Why was 403 chosen for long-term-denial _error_
>>>       situations? Why isn't that a 500?
>>>
>>>     - The first sentence of section 9.8 would be much more effective if
>>>       it said (or pointed to text that describes) what the event
>>>       triggering conditions actually are.
>>>
>>>     - The third paragraph of section 9.8 has a MUST requirement that is
>>>       conditional on an agent initiating an INVITE "promptly", but
>>>       there's no characterization of "promptly" in the document. How
>>>       does it account for the time it takes to reconfirm the caller is
>>>       actually present and available before initiating the INVITE due
>>>       to a recall? (This should also be accounted for in the first part
>>>       of the security considerations).
>>>
>>>     - Section 9.9 (corresponding to section 4.4.8 of RFC3265) is not
>>>       adequate. It needs to actually describe the package specific
>>>       subscription processing (including how the state is built), or
>>>       provide a finer reference to where that specification lies than
>>>       "in this and possibly in other documents". Section 7 has most of
>>>       this information, but it's fairly widely scattered. Please
>>>       consider consolidating the normative behavior into one place.
>>>
>>>     - Section 9.11 claims the service typically involves a single
>>>       notification per notifier per subscription. This cannot be the
>>>       case. There will typically be three - the initial notify in
>>>       response to the subscribe request, the notify representing the
>>>       state transition from queued to ready, and the notify
>>>       corresponding to the termination of the subscription. (It is not
>>>       clear from the document when you expect the notification of
>>>       "ready" to immediately terminate the subscription, if ever.)
>>>
>>>     - The timing restrictions in section 9.11 seem artificial, and
>>>       interact badly with the application this package is intended to
>>>       support (the implication is that the server should delay sending a
>>>       "ready", for example). Can the document explain how these
>>>       restrictions were chosen?
>>>
>>>     - Why does the call-completion information format make a provision
>>>       for X- headers since you ignore lines with unknown names?
>>>
>>>     - Instead of saying "Two lines with the same name MUST NOT be
>>>       present, except where specifically permitted", consider saying
>>>       "The header lines defined in this document can occur at most once
>>>       in any given call-completion document. Extensions must define
>>>       whether defined lines may occur more than once." How likely is
>>>       this format to be extended? Do these need to be put in a
>>>       registry?
>>>
>>>     - Why does the syntax for cc-URI allow cc-URI header line
>>>       parameters? You certainly want the URI to be able to contain URI
>>>       parameters, but when would you ever use the header line
>>>       parameters? What you have now allows
>>>
>>>       cc-URI: random display text <sip:name@domain;uri-param=uri-value>;cc-uri-header-param-name=cc-uri-header-param-value
>>>
>>>       How is having that display text ever useful? When would you ever
>>>       use a cc-uri-header-param? In other words, why isn't this simply
>>>       cc-URI = "cc-URI" HCOLON addr-spec?
>>>
>>>     - Item 2 in the security considerations section is unclear. It
>>>       seems to be placing a requirement on the subscriber (the caller),
>>>       but it's not clear what that requirement is (don't suspend any
>>>       subscriptions longer than a typical call? than some duration a
>>>       user entered for _this_ call? or what?). What's the subscriber
>>>       supposed to do if it would have suspended a subscription that
>>>       long - terminate the subscription? How does this protect the
>>>       privacy of the callee?
>>>
>>>     - The media-type form sections should point to specific sections in
>>>       this document. Consider calling out the most important
>>>       interoperability and security considerations.
>>>
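
The merge-detection point raised in the review (a merged request is identified by the From tag, Call-ID, and CSeq together, not by the Call-ID alone) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not an implementation of RFC3261: the names `Request` and `UasCore` are hypothetical, transaction matching is reduced to comparing the top Via branch, and a return value of 200 simply stands in for "process normally".

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass(frozen=True)
class Request:
    from_tag: str
    call_id: str
    cseq: str                     # sequence number and method, e.g. "1 SUBSCRIBE"
    branch: str                   # top Via branch, used here as the transaction id
    to_tag: Optional[str] = None  # absent on dialog-creating requests


class UasCore:
    """Hypothetical UAS core that tracks ongoing transactions."""

    def __init__(self) -> None:
        # (From tag, Call-ID, CSeq) -> branches of ongoing transactions
        self._ongoing: Dict[Tuple[str, str, str], Set[str]] = {}

    def receive(self, req: Request) -> int:
        if req.to_tag is not None:
            return 200  # in-dialog request: the merge check does not apply
        key = (req.from_tag, req.call_id, req.cseq)
        branches = self._ongoing.setdefault(key, set())
        if branches and req.branch not in branches:
            # Same From tag, Call-ID and CSeq, but a different
            # transaction: the request arrived over another fork.
            return 482  # Loop Detected
        branches.add(req.branch)  # new transaction or a retransmission
        return 200


core = UasCore()
first = Request("ft1", "call-1", "1 SUBSCRIBE", "z9hG4bK-a")
fork = Request("ft1", "call-1", "1 SUBSCRIBE", "z9hG4bK-b")
other = Request("ft2", "call-1", "1 SUBSCRIBE", "z9hG4bK-c")
print(core.receive(first), core.receive(fork), core.receive(other))  # 200 482 200
```

The third request shares only the Call-ID with the first and is not treated as merged, which is the distinction the review asks the draft to state correctly.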