Re: [hybi] NAT reset recovery? Was: Extensibility mechanisms?

Jamie Lokier <jamie@shareable.org> Tue, 20 April 2010 19:44 UTC

Thomson, Martin wrote:
> However, negotiation seems like overkill for this sort of mechanism.
> Each peer has some expectations and constraints that it knows best.
> Let them each take whatever action they need to ensure that the
> connection stays alive.

If they don't negotiate, the only option is for both sides to send
ping requests.

That is the most expensive keep-alive (KA) pattern.

Remember that KA is the *biggest* bandwidth cost in many applications.

Negotiating makes the biggest difference when client and server both
need to rapidly detect a stale connection.  (It depends on the
application - see below.)
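
As a rough illustration of why un-negotiated keep-alive costs more,
the sketch below counts frames per keep-alive interval under two
patterns.  The frame counts are an assumed reading of the argument
(each side pinging independently versus one agreed side pinging), and
the function and pattern names are hypothetical.

```python
# Assumed frame counts per keep-alive interval (illustrative only).
FRAMES_PER_INTERVAL = {
    # Each side sends its own ping and answers the other's: 2 pings + 2 pongs.
    "both_ping": 4,
    # One agreed side pings; the ping/pong pair shows liveness to both ends.
    "one_side_ping": 2,
}


def frames_per_hour(pattern, interval_s):
    """Keep-alive frames exchanged per hour for a given pattern and interval."""
    return FRAMES_PER_INTERVAL[pattern] * 3600 // interval_s


if __name__ == "__main__":
    for name in FRAMES_PER_INTERVAL:
        print(name, frames_per_hour(name, interval_s=30))
```

Under these assumptions the un-negotiated pattern sends twice as many
frames for the same detection interval.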

> If a keep-alive frame is defined, either peer could unilaterally
> initiate it.  There's no technical reason why this shouldn't be
> allowed.  If the server cares, and it doesn't see any activity from
> the client, it could use the keep-alive too.

I agree that a ping request should be defined, and allowed to be sent at
any time by either side.

It is important to treat the ping as a *request*, with the attributes
any generic request might have, such as a way to match each response
to its request.

Otherwise you can send a ping request and get a ping response, but the
response might be an *old* one from 30 seconds ago (queued somewhere),
which doesn't tell you the endpoint is alive now.
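
A minimal sketch of request/response matching, assuming each ping
carries an identifier that the peer echoes back.  The class and method
names are hypothetical and not part of any proposed frame format; the
point is only that a late reply to an old ping is not accepted as
proof of current liveness.

```python
import itertools
import time


class PingTracker:
    """Tracks outstanding pings so stale responses can be told apart."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._ids = itertools.count(1)
        self._pending = {}  # ping id -> time the ping was sent

    def send_ping(self):
        """Record a new ping and return the id to put in the request."""
        ping_id = next(self._ids)
        self._pending[ping_id] = self._clock()
        return ping_id

    def on_pong(self, ping_id, max_age):
        """True only if this pong answers a ping sent within max_age seconds."""
        sent_at = self._pending.pop(ping_id, None)
        if sent_at is None:
            return False  # unknown id, or already answered
        return self._clock() - sent_at <= max_age
```

A response that arrives 30 seconds after its request fails the
`max_age` check, while a prompt response to a newer ping passes.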

> As Jamie points out, the server frequently doesn't care.  More to
> the point - the server can do little about it if the connection does
> go away.

Quite untrue.  The server cares a lot.  It *usually* has a less
aggressive timing requirement than the client - but not always, if the
application running on top has "presence" needs or some kind of cloudy
write-back caching (let your imagination run wild ;-).

If it sees no activity, the server must close the connection to avoid
running out of memory in the long run.  It can't make a new connection
(as usual there are exceptions for some applications), but it may
still do something useful, for example deleting an entry from a
"users online" list.
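
A minimal sketch of that server-side cleanup, assuming the server
records the last time it saw activity from each user.  The class name,
the timeout parameter and the in-memory "online" set are all
hypothetical; a real server would also close the underlying socket.

```python
import time


class IdleReaper:
    """Drops connections that have shown no activity for too long."""

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout
        self._clock = clock
        self._last_seen = {}  # user -> time of last observed activity
        self.online = set()   # stand-in for a "users online" list

    def on_activity(self, user):
        """Call whenever any frame (data or keep-alive) arrives from user."""
        self._last_seen[user] = self._clock()
        self.online.add(user)

    def reap(self):
        """Remove users idle for longer than idle_timeout; return them."""
        now = self._clock()
        stale = [u for u, seen in self._last_seen.items()
                 if now - seen > self.idle_timeout]
        for user in stale:
            del self._last_seen[user]    # free per-connection state
            self.online.discard(user)    # update presence
        return stale
```

Calling `reap()` periodically bounds the memory held for dead
connections and keeps the presence list accurate.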

> (I can imagine use of other channels to prod the client into
> wakefulness, but that's not something inherent to WS.)

Mobile phones, SMS signalling...  I'm guessing that is a big power
saver in mostly-idle applications.  Not inherent to WS, but WS should
be designed to make those methods possible to use.

> Rather than outright prohibition, I'd instead prefer to discourage
> server use of keep-alive (SHOULD NOT).

I'd discourage the use of SHOULD NOT, because it will only encourage
widely deployed clients to skip implementing the KA response, making
it impossible for any server to use the feature should the need arise.

Just let implementors decide what to use, and give them good advice.

> Servers might know better - we have a server that has a 5 minute TCP
> "go dead" interval on the load balancer in front of it.  That server
> should not be outright prevented from sending a 4:55 keep-alive if
> it sees nothing from a client.

There is no point in sending the 4:55 keepalive, if you *know* all
working clients send KAs at shorter intervals.  Or is there something
I've missed?

(4:55 is too late, btw, on a mobile network with 6s latency...)
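
The arithmetic behind that remark can be sketched as follows, assuming
the keep-alive must reach the device enforcing the idle timeout before
the timeout expires and that the 6s figure is the delay before it gets
there.  The function names and the optional safety margin are
hypothetical.

```python
def latest_send_time(idle_timeout_s, latency_s, margin_s=0.0):
    """Latest idle time (seconds) at which a keep-alive can still be sent."""
    return idle_timeout_s - latency_s - margin_s


def is_too_late(planned_send_s, idle_timeout_s, latency_s, margin_s=0.0):
    """True if a keep-alive sent at planned_send_s would arrive after the timeout."""
    return planned_send_s > latest_send_time(idle_timeout_s, latency_s, margin_s)


if __name__ == "__main__":
    timeout = 5 * 60        # 5 minute "go dead" interval
    planned = 4 * 60 + 55   # keep-alive sent at 4:55
    print(latest_send_time(timeout, latency_s=6))
    print(is_too_late(planned, timeout, latency_s=6))
```

With a 300s timeout and 6s of latency the keep-alive has to leave by
294s of idle time, so one sent at 295s arrives after the cutoff.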

-- Jamie