[sipcore] Review of draft-ietf-sipcore-keep-05

"WORLEY, Dale R (Dale)" <dworley@avaya.com> Sun, 08 August 2010 16:11 UTC

I've not been in on the technical discussion, so I can't comment on
the mechanism design.  But the exposition in the draft could be
improved in a few places.

The running title is "STUN-keep", which is surely not what is intended.

Section 1.1

Uses "session" twice, but I think "dialog" is more correct.  The
session (media) contains the RTP, which will probably keep the NAT
open; the problem is keeping the NAT open for the signaling (dialog).

Section 3

"Edge proxy"

It's not clear that "edge proxy" has any normative significance in
this I-D, as this mechanism can be used between any two entities.
It seems that this definition and the accompanying Note could be
removed at no loss.  (The definition needs revision, also.)

"Keep-alives"

It would help if this definition mentioned that both CRLF and STUN
keep-alives are intended.  Also, the phrase "refers to" should be
removed.  E.g.:

    Keep-alives:  The keep-alive messages defined in SIP Outbound
    [RFC5626], including both "CRLF" and STUN keep-alive messages.

'"keep" parameter'

Strictly speaking, this entry is a description, not a definition; the
phrase '"keep" parameter' identifies the parameter unambiguously.  It
would be better to remove this entry and add its information to
section 4.1, which needs to be a bit more explicit about the
mechanism.  (See below.)

Section 4.1, 2nd paragraph is:

   SIP entities indicate willingness to send keep-alives towards the
   adjacent downstream SIP entity using SIP requests.  The associated
   responses are used by SIP entities to indicate willingness to receive
   keep-alives.  SIP entities that indicate willingness to receive keep-
   alives can provide a recommended keep-alive frequency.

This should be made more explicit.  (And can be clarified using the
singular.)

   A SIP entity indicates willingness to send keep-alives towards the
   adjacent downstream SIP entity using SIP requests by applying a
   "keep" header parameter to the topmost Via (which is the Via that
   it inserts).  The associated response is used by the adjacent
   downstream SIP entity to indicate willingness to receive
   keep-alives by applying to that topmost Via a "keep" header
   parameter with an integer value.  The integer value, if not zero,
   indicates the keep-alive frequency that is recommended by the
   adjacent downstream SIP entity.

That gives the outline of the mechanism in one paragraph.
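
The exchange outlined above can be sketched in a few lines of Python.
This is only an illustration under simplifying assumptions: the function
names are hypothetical, the Via value is split naively on ";" (a real
parser must handle quoting and comma-separated Via values), and the
example host and branch are made up.

```python
def offers_keep(via):
    """True if the Via value carries a "keep" parameter, with or without a value."""
    params = [p.strip() for p in via.split(";")[1:]]
    return any(p == "keep" or p.startswith("keep=") for p in params)


def accept_keep(via, period=0):
    """Downstream side: turn a bare "keep" into "keep=<period>" for the response.

    A period of 0 signals willingness to receive keep-alives without
    recommending any particular period.
    """
    head, *params = [p.strip() for p in via.split(";")]
    answered = [f"keep={period}" if p == "keep" else p for p in params]
    return ";".join([head] + answered)


def recommended_period(via):
    """Upstream side: return the non-zero "keep" value, or None if there is none."""
    for param in via.split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name == "keep" and value.isdigit() and int(value) > 0:
            return int(value)
    return None


# Hypothetical topmost Via as inserted by the upstream entity.
request_via = "SIP/2.0/TCP alice.example.com:5060;branch=z9hG4bK74bf9;keep"
response_via = accept_keep(request_via, period=30)
```

Here the upstream entity would see "keep=30" in the topmost Via of the
response and treat 30 as the recommended value.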

The 3rd paragraph should probably be expanded to make explicit a
somewhat subtle point:

   The procedures to negotiate usage of keep-alives are identical for
   SIP UAs and proxies.  However, it is expected that the sender of
   keep-alives is the sender of an initial request (REGISTER or
   INVITE), which in normal usage is a UA, and so a proxy will not
   initiate negotiation for sending keep-alives.  Thus, a UA which is
   willing to send keep-alives SHOULD always indicate that it is
   willing to send keep-alives even if it is not aware of any
   necessity of doing so, allowing the adjacent downstream entity to
   indicate if it knows that keep-alives are necessary (by responding
   with a non-zero value).  Similarly, an entity which is willing to
   receive keep-alives SHOULD always respond with a keep value of zero
   when it is not aware of any necessity for keep-alives, thus leaving
   the sender to send keep-alives if the sender is aware of any
   necessity to do so.

Urgh.  I suspect that these additions should instead be added to 4.3
and 4.4 as appropriate.  But the mechanism needs to be used correctly
by both entities in order to ensure that keep-alives are sent when
either entity knows that they are needed, and the draft doesn't make
explicit how to do that.

Section 4.3

It's not clear to me what keep-alives the upstream entity must send
once keep-alives are negotiated.  The first paragraph says that the
entity must be able to send both STUN and CRLF keep-alives, but that
seems to be not entirely accurate, as the sender seems to have the
choice of what type of keep-alive to send, and so can choose to send
only one type.  I suspect that there are rules elsewhere regarding
what sort of keep-alive is to be sent in what circumstances; if so,
there should be a reference to them.  Otherwise, a note should be
added to indicate that the sender has the choice of what sort of
keep-alive to send, and thus does not need to implement both.  (The
recipient is required to implement both, of course.)

Section 4.4

In every place where "frequency" is used, it would be more correct to
use "period".  (A frequency is events-per-second, whereas a period is
seconds-per-event.)

Somewhere it must be stated that the period is specified in seconds;
this is not done anywhere in the normative part of the draft.

Section 5

In the first paragraph, a few places could be phrased more clearly:

   If a SIP entity receives a SIP response >>whose topmost<< Via
   header field contains a "keep" parameter with a non-zero value
   >>(which indicates a recommended keep-alive period)<<, it MUST use
   the procedures defined >>in [RFC5626], using the value as if it was
   the value of the Flow-Timer header field<<.  According to >>those<<
   procedures, the SIP entity must send keep-alives at least as often
   as the indicated recommended keep-alive >>period<<, >>and the SIP
   entity should<< send its keep-alives so that the interval between
   >>keep-alives<< is randomly distributed between 80% and 100% of the
   recommended keep-alive >>period<<.

After this paragraph there should be a note saying when the receiving SIP
entity is allowed to assume the connection is dead.  Copying from RFC
5626, we could say:

   Following [RFC5626], if a recommended keep-alive period has been
   negotiated, the receiving SIP entity may, after not receiving a
   keep-alive for that period, consider the flow to be dead.  Note
   that the entity should wait for a time larger than the period in
   order to have a grace period to account for transport delay.
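
To make the timing rules in the two passages above concrete, here is a
minimal Python sketch.  The function names are hypothetical, and the
5-second default grace value is an arbitrary assumption; the text only
says the receiver should wait for some time larger than the period.

```python
import random


def next_keepalive_interval(period_s, rng=random):
    """Sender side: delay before the next keep-alive.

    Drawn uniformly from 80%..100% of the recommended period, so the
    sender never waits longer than the period itself.
    """
    return rng.uniform(0.8 * period_s, period_s)


def flow_is_dead(last_keepalive_at, now, period_s, grace_s=5.0):
    """Receiver side: True once no keep-alive has arrived for period + grace.

    grace_s is an assumed allowance for transport delay, not a value
    taken from any specification.
    """
    return (now - last_keepalive_at) > period_s + grace_s
```

With a recommended period of 30 seconds, the sender would wait between
24 and 30 seconds before each keep-alive, and the receiver in this
sketch would give up only after more than 35 seconds of silence.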

3rd paragraph

It makes sense that if a single entity inserts both a keep=n parameter
and a Flow-Timer: n header that they should contain the same value.
But that is of no use to (and is not enforceable by) an upstream SIP
entity, since from the upstream entity's point of view, the Flow-Timer
header may have been inserted by an entity further downstream, not its
adjacent downstream entity.  That is, an entity expects to see
responses with Flow-Timer and keep=n which disagree, and (I think)
must use the keep value instead of the Flow-Timer value.

Actually, I think the situation is probably more complex than I
realize.  The Flow-Timer value seems to apply only between the edge
proxy and the UA, and it seems that only the UAC should pay
attention to its value.  But it's not clear who can add the Flow-Timer
header.  It appears that it can only appear in REGISTER responses.  It
can be added by the UAS (obviously), but it appears from RFC 5626
section 5.4 that it can also be added to the response by intermediate
proxies.

This section needs revision to make clear the interaction between
Flow-Timer and keep=n.  I suspect the only workable answer is that an
entity must examine both values (if it supports both mechanisms),
determine whether Flow-Timer applies to this hop, and if so, send
keep-alives according to the minimum of the two specified periods.
Since the two periods might be specified by different entities, there
doesn't seem to be any workable alternative rule.
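
The rule suggested above could look like the following Python sketch.
It illustrates this review's suggestion only, not behavior defined by
the draft; the function name is hypothetical, and deciding whether
Flow-Timer applies to this hop is left to the caller as an input.

```python
def effective_period(keep_value, flow_timer, flow_timer_applies):
    """Pick the keep-alive period from the two possible sources.

    keep_value: integer from the "keep" Via parameter, or None/0 if no
        period was recommended.
    flow_timer: integer from the Flow-Timer header, or None if absent.
    flow_timer_applies: whether the caller has determined that the
        Flow-Timer value is meant for this hop.
    Returns the smaller applicable period, or None if neither applies.
    """
    candidates = []
    if keep_value:
        candidates.append(keep_value)
    if flow_timer_applies and flow_timer:
        candidates.append(flow_timer)
    return min(candidates) if candidates else None
```

For example, with keep=30 and an applicable Flow-Timer of 120, the
entity would send keep-alives using the 30-second period; if the
Flow-Timer value does not apply to this hop, it is ignored.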

Section 6

All examples use STUN keep-alives.  It would probably be beneficial to
change one example to use CRLF, or to add a note explaining whether
and when CRLF might be used in the example situations.

It is possible for two proxies to use this mechanism on a hop between
themselves.  Would it be useful to add an example of this?

Dale