Re: [hybi] Ping/Pong body (was Re: TSV-Directorate review of draft-ietf-hybi-thewebsocketprotocol-07)

Jamie Lokier <jamie@shareable.org> Fri, 20 May 2011 21:26 UTC

Date: Fri, 20 May 2011 22:26:18 +0100
From: Jamie Lokier <jamie@shareable.org>
To: Pieter Hintjens <ph@imatix.com>
Cc: "hybi@ietf.org" <hybi@ietf.org>, Greg Wilkins <gregw@intalio.com>, Gabriel Montenegro <Gabriel.Montenegro@microsoft.com>

Pieter Hintjens wrote:
> On Fri, May 20, 2011 at 1:28 AM, Jamie Lokier <jamie@shareable.org> wrote:
> 
> > If the client had to wait for pong before sending another ping, that
> > would cause spurious "offline" statuses whenever there was a lot of
> > data transiting in server-to-client direction.
> 
> We had exactly this problem in AMQP and resolved it quite effectively
> by treating _any_ incoming data as a liveness indicator. Any reason
> you'd not do this?

(I was going to agree, but then I remembered complicated things from
coherent systems that I had hoped to avoid here.  So yes and no!)

That's better, but it addresses a different issue than this thread.

That's a good way to keep a TCP connection alive.  Optimal seems to be
to decouple the directions and, for each direction, send a one-way
keepalive when there hasn't been any transmission for a fixed time; all
received data indicates liveness.
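
A minimal sketch of that decoupled scheme, for illustration only.  The
class name, method names and the two limits are assumptions, not
anything taken from draft -07 or from AMQP.  Each endpoint keeps one
timer per direction: outbound silence triggers a one-way keepalive, and
any inbound data at all refreshes the liveness timestamp.

```python
class OneWayKeepalive:
    """Per-endpoint timers for a decoupled keepalive (illustrative names)."""

    def __init__(self, idle_limit, dead_limit, now=0.0):
        self.idle_limit = idle_limit    # outbound silence before a keepalive is due
        self.dead_limit = dead_limit    # inbound silence before the peer is suspect
        self.last_sent = now
        self.last_received = now

    def on_send(self, now):
        # Any transmission counts, whether data or keepalive.
        self.last_sent = now

    def on_receive(self, now):
        # All received data indicates liveness.
        self.last_received = now

    def should_send_keepalive(self, now):
        return now - self.last_sent >= self.idle_limit

    def peer_seems_alive(self, now):
        return now - self.last_received < self.dead_limit
```

Note that no reply is ever awaited: the two directions never block on
each other, so heavy traffic one way cannot delay the keepalive the
other way.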

But keeping a connection alive isn't quite the same as reliably
detecting a dead one!  (To reconnect, break a transaction, etc.)

With the example timings I gave, the client has to send keepalives
every 20 seconds _even if_ it treats any data from the server as
indicating liveness, otherwise the server will flag the client as
"offline" spuriously.

For "quiet" connections, explicit ping/pong work, but the client
must send pings 20 seconds apart from _each other_, not wait 20
seconds after receiving a pong (an easy coding error).
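
To make that coding error concrete, here is a small sketch comparing
the two schedules.  The function names and the round-trip times in the
usage note are hypothetical; only the 20-second interval comes from the
example above.

```python
def ping_times_fixed(interval, count, start=0.0):
    """Send times when pings are spaced from each other."""
    return [start + i * interval for i in range(count)]


def ping_times_after_pong(interval, rtts, start=0.0):
    """Send times when the timer restarts only after each pong arrives.

    rtts[i] is the assumed round-trip time of the i-th ping/pong pair.
    """
    times = []
    t = start
    for rtt in rtts:
        times.append(t)
        t = t + rtt + interval   # the bug: the round trip is added to every gap
    return times
```

With a 5-second round trip, ping_times_after_pong(20, [5, 5, 5, 5])
gives pings at 0, 25, 50 and 75 seconds, so the real gap is 25 seconds
rather than 20, and it grows with every increase in network delay.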

If we discard the whole concept of ping/pong, I think you and I are
in sharp agreement.

But this sub-thread is about whether it's ever useful to overlap
ping/pong.  My reply is:

   If you insist on using ping/pong, perhaps feeling the need to use
   the features of WebSocket draft -07 instead of inventing your own,
   then yes sometimes they need to overlap.  But it's even better to
   abandon ping/pong and use a more robust keepalive method.

In a way, you have highlighted a problem with the explicit ping/pong
strategy implicitly advocated in draft -07.

Unfortunately, treating all data as liveness has a different set of
problems (good for keeping a connection alive, but unreliable at
detecting connection loss in a bounded time), which is often
ignorable, but breaks things that depend on "confirm the client was
definitely alive in the last 30 seconds".  Think about variable
network delays.  The only robust solution I've come up with combines
all-data-liveness and round-tripped sequence stamps.  Hence yes and no
to your question.
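
One possible reading of "all-data-liveness plus round-tripped sequence
stamps", sketched under assumptions: every outgoing frame carries a
stamp, the peer echoes the latest stamp it has seen on whatever it
sends next, and an echo proves the peer was alive at some point after
that stamp left.  All names are hypothetical and this is not
necessarily the exact design meant above.

```python
class StampedLiveness:
    """Bounds how stale the peer's last confirmed liveness can be."""

    def __init__(self):
        self.next_stamp = 0
        self.sent_at = {}               # stamp -> local send time
        self.confirmed_alive_at = None  # peer was alive at or after this time

    def stamp_outgoing(self, now):
        stamp = self.next_stamp
        self.next_stamp += 1
        self.sent_at[stamp] = now
        return stamp

    def on_incoming(self, echoed_stamp):
        # Any incoming frame counts, as long as it echoes a known stamp.
        if echoed_stamp not in self.sent_at:
            return
        sent = self.sent_at[echoed_stamp]
        # Retire this stamp and all older ones.
        for s in [s for s in self.sent_at if s <= echoed_stamp]:
            del self.sent_at[s]
        if self.confirmed_alive_at is None or sent > self.confirmed_alive_at:
            self.confirmed_alive_at = sent

    def alive_within(self, window, now):
        return (self.confirmed_alive_at is not None
                and now - self.confirmed_alive_at <= window)
```

The point of the stamp is that the bound is measured from the local
send time, not from when the echo arrived, so data that sat in a slow
network path cannot make a long-dead peer look recently alive.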

-- Jamie