Re: [sipcore] Review of draft-ietf-sipcore-keep-05

Christer Holmberg <christer.holmberg@ericsson.com> Mon, 09 August 2010 07:20 UTC

Return-Path: <christer.holmberg@ericsson.com>
X-Original-To: sipcore@core3.amsl.com
Delivered-To: sipcore@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id C41F73A6A75 for <sipcore@core3.amsl.com>; Mon, 9 Aug 2010 00:20:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.996
X-Spam-Level:
X-Spam-Status: No, score=-3.996 tagged_above=-999 required=5 tests=[AWL=-0.597, BAYES_50=0.001, J_CHICKENPOX_41=0.6, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id G3O7VRI-zmPb for <sipcore@core3.amsl.com>; Mon, 9 Aug 2010 00:20:50 -0700 (PDT)
Received: from mailgw10.se.ericsson.net (mailgw10.se.ericsson.net [193.180.251.61]) by core3.amsl.com (Postfix) with ESMTP id 6A32E3A67F6 for <sipcore@ietf.org>; Mon, 9 Aug 2010 00:20:49 -0700 (PDT)
X-AuditID: c1b4fb3d-b7b90ae00000278d-51-4c5fac7261b1
Received: from esessmw0184.eemea.ericsson.se (Unknown_Domain [153.88.253.124]) by mailgw10.se.ericsson.net (Symantec Mail Security) with SMTP id 5C.DB.10125.27CAF5C4; Mon, 9 Aug 2010 09:21:22 +0200 (CEST)
Received: from ESESSCMS0356.eemea.ericsson.se ([169.254.1.104]) by esessmw0184.eemea.ericsson.se ([153.88.115.81]) with mapi; Mon, 9 Aug 2010 09:21:21 +0200
From: Christer Holmberg <christer.holmberg@ericsson.com>
To: "WORLEY, Dale R (Dale)" <dworley@avaya.com>, "sipcore@ietf.org" <sipcore@ietf.org>
Date: Mon, 09 Aug 2010 09:21:20 +0200
Thread-Topic: Review of draft-ietf-sipcore-keep-05
Thread-Index: AQHLNxQfbKRBGdvoDEWm/nBr9CW465LX0d7F
Message-ID: <7F2072F1E0DE894DA4B517B93C6A05850CB1D6@ESESSCMS0356.eemea.ericsson.se>
References: <CD5674C3CD99574EBA7432465FC13C1B21FE98EF86@DC-US1MBEX4.global.avaya.com>
In-Reply-To: <CD5674C3CD99574EBA7432465FC13C1B21FE98EF86@DC-US1MBEX4.global.avaya.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-Brightmail-Tracker: AAAAAA==
Subject: Re: [sipcore] Review of draft-ietf-sipcore-keep-05
X-BeenThere: sipcore@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: SIP Core Working Group <sipcore.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/sipcore>, <mailto:sipcore-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/sipcore>
List-Post: <mailto:sipcore@ietf.org>
List-Help: <mailto:sipcore-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/sipcore>, <mailto:sipcore-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 09 Aug 2010 07:20:51 -0000

Hi Dale,

Thanks for your comments. See inline.

>I've not been in on the technical discussion, so I can't comment on
>the mechanism design.  But the exposition in the draft could be
>improved in a few places.

Some of your comments ARE related to issues that were discussed during WGLC, and I will indicate it when it occurs.


--------------


>The running title is "STUN-keep", which is surely not what is intended.

I can change that to "keep-alive".


--------------


>Section 1.1
>
>Uses "session" twice, but I think "dialog" is more correct.  The
>session (media) contains the RTP, which will probably keep the NAT
>open; the problem is keeping the NAT open for the signaling (dialog).

I can change "session" to "dialog".


--------------


>Section 3
>
>"Edge proxy"
>
>It's not clear that "edge proxy" has any normative significance in
>this I-D, as this mechanism can be used between any two entities.
>It seems that this definition and the accompanying Note could be
>removed at no loss.  (The definition needs revision, also.)

Section 5 talks about the edge proxy, where it describes that the usage of Flow-Timer is only defined between a UA and an edge proxy, so I think it is useful to have the definition.

I guess I could also add the following sentence to section 1:

	"In addition, [RFC5626] only allows keep-alives to be negotiated between a UA and an edge proxy, and not between other SIP entities."


--------------


>"Keep-alives"
>
>It would help if this definition mentioned that both CRLF and STUN
>keepalives were intended.  Also, the phrase "refers to" should be
>removed.  E.g.:
>
>    Keep-alives:  The keep-alive messages defined in SIP Outbound
>    [RFC5626], including both "CRLF" and STUN keep-alive messages.

People have asked me to remove text, and make it shorter and easier to read, so I would not like to start adding text unless something is broken or very unclear. In this case I think that sections 4.3 and 4.4 makes it clear that both CRLF and STUN are included.

But, I DO agree that "refers to" can be removed:

    	"Keep-alives:  The keep-alive messages defined in SIP Outbound [RFC5626]."

--------------

>'"keep" parameter'
>
>Strictly speaking, this entry is a description not a definition; the
>phrase '"keep" parameter' identifies the parameter unambiguously.  It
>would be better to remove this entry and add its information to
>section 4.1, which needs to be a bit more explicit about the
>mechanism.  (See below.)

The definition text used to be even longer, but parts were removed. However, the terminology is used in the documents, so I think it is good to have it in the definitions, instead of writing "the Via header field "keep" parameter" everywhere.


--------------


>Section 4.1, 2nd paragraph is:
>
>   SIP entities indicate willingness to send keep-alives towards the
>   adjacent downstream SIP entity using SIP requests.  The associated
>   responses are used by SIP entities to indicate willingness to receive
>   keep-alives.  SIP entities that indicate willingness to receive keep-
>   alives can provide a recommended keep-alive frequency.
>
>This should be made more explicit.  (And can be clarified using the
>singular.)
>
>   A SIP entity indicates willingness to send keep-alives towards the
>   adjacent downstream SIP entity using SIP requests by applying a
>   "keep" header parameter to the topmost Via (which is the Via that
>   it inserts).  The associated response is used by the adjacent
>   downstream SIP entity to indicate willingness to receive
>  keep-alives by applying to that topmost Via a "keep" header
>   parameter with an integer value.  The integer value, if not zero,
>   indicates the keep-alive frequency that is recommended by the
>   adjacent downstream SIP entity.
>
>That gives the outline of the mechanism in one paragraph.

During WGLC, there were comments/discussions about removing text, and making it shorter and easier to read. I don't want to start adding text again at this point, unless something is unclear or wrong.

Also, in this case, section 4.1 is a general overview section. The details on how to indicate willingness to send and receive keep-alives are described in sections 4.3. and 4.4, so there is no need to repeat the same text in 4.1.


--------------


>The 3rd paragraph should probably be expanded to make explicit a
>somewhat subtle point:
>
>   The procedures to negotiate usage of keep-alives are identical for
>   SIP UAs and proxies.  However, it is expected that the sender of
>   keep-alives is the sender of an initial request (REGISTER or
>   INVITE), which in normal usage is a UA, and so a proxy will not
>   initiate negotiation for sending keep-alives. 

The mechanism allows any SIP entity to negotiate sending of keep-alives with its adjacent downstream entity.

>   Thus, a UA which is
>   willing to send keep-alives SHOULD always indicate that it is
>   willing to send keep-alives even if it is not aware of any
>   necessity of doing so, allowing the adjacent downstream entity to
>   indicate if it knows that keep-alives are necessary (by responding
>   with a non-zero value).  Similarly, an entity which is willing to
>   receive keep-alives SHOULD always respond with a keep value of zero
>   when it is not aware of any necessity for keep-alives, thus leaving
>   the sender to send keep-alives if the sender is aware of any
>   necessity to do so.
>
>Urgh.  I suspect that these additions should instead be added to 4.3
>and 4.4 as appropriate.  But the mechanism needs to be used correctly
>by both entities in order to ensure that keep-alives are sent when
>either entity knows that they are needed, and the draft doesn't make
>explicit how to do that.

I would not like to start adding normative text on how and when entities shall use the mechanism, because it might depend on the entity types, and the environments where they are deployed.

However, I guess we could add some general guideline text:

	"In general, it can be useful for SIP entities to indicate willingness to send keep-alives, even if they 
	are not aware of any necessity for them to send keep-alives, since the adjacent downstream SIP entity might 
	have knowledge about the necessity. Similarly, it can be useful for SIP entities to indicate willingess to 
	receive keep-alives, even if they are not aware of any necessity for the adjacent upstream SIP entity to 
	send them."


--------------


>Section 4.3
>
>It's not clear to me what keep-alives the upstream entity must send
>once keep-alives are negotiated.  The first paragraph says that the
>entity must be able to send both STUN and CRLF keep-alives, but that
>seems to be not entirely accurate, as the sender seems to have the
>choice of what type of keep-alive to send, and so can choose to send
>only one type.  I suspect that there are rules elsewhere regarding
>what sort of keep-alive is to be sent in what circumstances; if so,
>there should be a reference to them.  Otherwise, a note should be
>added to indicate that the sender has the choice of what sort of
>keep-alive to add, and thus does not need to implement both.  (The
>recipient is required to implement both, of course.)

The text refers to RFC 5626 regarding the STUN and CRLF keep-alive messages, and RFC 5626 defines when each is used.

Also, remember that the purpose of the daft is not to change the keep-alive mechanism in RFC 5626, only to provide a mechanism to explicitly negotiate the usage of it, so we should not repeat mechanism text that is already defined in RFC 5626.

BUT, in order to try to make everyone happy, I could add the following sentence to the first section of 4.3:

	"[RFC5626] defines when to use STUN, respectively CRLF, for keep-alives."


--------------


>Section 4.4
>
>In every place where "frequency" is used, it would be more correct to
>use "period".  (A frequency is events-per-second, whereas a period is
>seconds-per-event.)
>
>In some place it must be stated that the period is specified in
>seconds, which is not done anywhere in the normative part of this
>draft.

"frequency" is used in Outbound, and I've been asked to use the same terminology. 

I DO agree, however, that it should be indictated that the frequency is provided in seconds.

I suggest to change the last sentence of the second paragraph to:

	"The parameter can contain a recommended keep-alive frequency, given in seconds, or a zero value."

I suggest to add "given in seconds" also to the first paragraph of section 5:

	"If a SIP entity receives a SIP response, where its Via header field
   	contains a "keep" parameter with a non-zero value that indicates a
   	recommended keep-alive frequency, given in seconds, it MUST use the 
	procedures defined for the Flow-Timer header field [RFC5626]."


--------------


>Section 5
>
>In the first paragraph, a few places could be phrased more clearly:
>
>   If a SIP entity receives a SIP response >>whose topmost<< Via
>   header field contains a "keep" parameter with a non-zero value
>   >>(which indicates a recommended keep-alive period)<<, it MUST use
>   the procedures defined >>in [RFC5626], using the value as if it was
>   the value of the Flow-Timer header field<<.  According to >>those<<
>   procedures, the SIP entity must send keep-alives at least as often
>   as the indicated recommended keep-alive >>period<<, >>and the SIP
>   entity should<< send its keep-alives so that the interval between
>   >>keep-alives<< is randomly distributed between 80% and 100% of the
>   recommended keep-alive >>period<<.

>After this paragraph should be a note saying when the receiving SIP
>entity is allowed to assume the connection is dead.  Copying from RFC
>5626, we could say:
>
>   Following [RFC5626], if a recommended keep-alive period has been
>   negotiated, the receiving SIP entity may, after not receiving a
>   keep-alive for that period, consider the flow to be dead.  Note
>   that the entity should wait for a time larger than the period in
>   order to have a grace period to account for transport delay.

I don't think we can use "flow", since that is Outbound specific.

Maybe we could simply say:

	"This specification does not specify actions to take if negotiated keep-alives are not
	received. As defined in [RFC5626], the receiving SIP entity may consider a connection to be
	dead in such situations."


--------------


>3rd paragraph
>
>It makes sense that if a single entity inserts both a keep=n parameter
>and a Flow-Timer: n header that they should contain the same value.
>But that is of no use to (and is not enforceable by) an upstream SIP
>entity, since from the upstream entity's point of view, the Flow-Timer
>header may have been inserted by an entity further downstream, not its
>adjacent downstream entity.  That is, an entity expects to see
>responses with Flow-Timer and keep=n which disagree, and (I think)
>must use the keep value instead of the Flow-Timer value.

The text previously said that one MUST NOT insert Flow-Timer if it inserts keep=n.

However, based on WGLC comments from Paul we agreed to remove that restriction, in order for this draft to not make restrictions to Outbound (in cases where both Outbound and this draft is used).


--------------


>Actually, I think the situation is probably more complex than I
>realize.  The Flow-Timer value seems to apply only between the edge
>proxy and the UA, and it it seems that only the UAC should pay
>attention to its value.  But it's not clear who can add the Flow-Timer
>header.  It appears that it can only appear in REGISTER responses.  It
>can be added by the UAS (obviously), but it appears from RFC 5626
>section 5.4 that it can also be added to the response by intermediate
>proxies.
>
>This section needs revision to make clear the interaction between
>Flow-Timer and keep=n.  I suspect the only workable answer is that an
>entity must examine both values (if it supports both mechanisms),
>determine whether Flow-Timer applies to this hop, and if so, send
>keep-alives according to the minimum of the two specified periods.
>Since the two periods might be specified by different entities, there
>doesn't seem to be any workable alternative rule.

My understanding of 5626 is that only an edge proxy can add the Flow-Timer header field, since Outbound only defines the usage of keep-alives between the UA and edge proxy.

Outbound itself wouldn't work if multiple entities were allowed to insert Flow-Timer.


--------------


>Section 6
>
>All examples use STUN keep-alives.  It would probably be beneficial to
>change one example to use CRLF, or to add a note explaining whether
>and when CRLF might be used in the example situations.

I can do something like that.

>It is possible for two proxies to use this mechanism on a hop between
>themselves.  Would it be useful to add an example of this?

The mechanism allows proxy-to-proxy keep-alives, and there also used to be an example (or, at least a use-case description) about proxy-to-proxy. However, it was removed, because people confused it with proxy-to-proxy heartbeats, which is a different thing than keep-alives.



Regards,

Christer