Re: [hybi] I-D Action:draft-ietf-hybi-thewebsocketprotocol-01.txt

John Tamplin <jat@google.com> Thu, 02 September 2010 17:45 UTC

In-Reply-To: <4C7F8EE7.1040106@opera.com>
References: <20100901224502.0519B3A687C@core3.amsl.com> <4C7F8EE7.1040106@opera.com>
From: John Tamplin <jat@google.com>
Date: Thu, 02 Sep 2010 13:19:12 -0400
Message-ID: <AANLkTimg1dgWFThSHXhpu4q1XFx_6kLQe1cMy1dq_dWa@mail.gmail.com>
To: James Graham <jgraham@opera.com>
Cc: hybi@ietf.org
Subject: Re: [hybi] I-D Action:draft-ietf-hybi-thewebsocketprotocol-01.txt

On Thu, Sep 2, 2010 at 7:47 AM, James Graham <jgraham@opera.com> wrote:
> The framing sections seem to have a lot of SHOULDs. This is worrying as
> SHOULD-level conditions can't really be tested (it is not an error to
> violate them) and can be a source of interoperability problems. I would
> prefer that we make all behaviour mandatory unless there is a good reason to
> do otherwise.

I think we can tighten those up as we get consensus.  As an example,
there were people that wanted keep-alives, and people that didn't.
The language in Ping/Pong is intended to mean that those who want them
can have them, while the ones who don't, don't.  I agree it would be
better to have fewer options left up to the implementation, but I
think we need more debate to get there.

> Some specific clauses seem problematic. For example:
>
> """A receiver MUST be prepared to accept arbitrarily fragmented
>      messages, even if the sender sent the message in a single frame."""
>
> "be prepared to accept" seems like poor wording. I would just say "Clients
> and servers MUST support receiving both fragmented and unfragmented
> messages". The clause "even if the sender sent the message..." seems odd
> because it is not clear how the recipient can know how the message is
> originally sent. In any case it seems redundant. This whole clause could be
> avoided with normative processing requirements that require support for
> fragmented and unfragmented messages (see below).

So, the case I was thinking of was someone writing the JS code for
their app and their own stand-alone server.  Without some statement to
that effect, they might assume that if they send a single message the
server will receive it in a single frame, but the browser or some
intermediary might decide to fragment it anyway.  I am open to
different ways of wording it, but I think there needs to be some
clarification that just because a message was sent in one fragment
doesn't mean it will stay that way.
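
To make the concern concrete, here is a minimal Python sketch of a
receiver that reassembles a message no matter how it was split in
transit.  The Frame tuple, its "more" flag, the TEXT value, and the use
of opcode 0 for continuation frames are simplifying assumptions for
illustration, not the draft's wire format.  Validation is deliberately
left out.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional, Tuple

CONTINUATION = 0  # assumed opcode for non-initial fragments
TEXT = 4          # hypothetical opcode value for a text message


class Frame(NamedTuple):
    opcode: int     # message type on the first fragment
    more: bool      # True if further fragments of this message follow
    payload: bytes


def reassemble(frames: Iterable[Frame]) -> Iterator[Tuple[int, bytes]]:
    """Yield (opcode, payload) for each complete message."""
    opcode: Optional[int] = None
    parts: List[bytes] = []
    for frame in frames:
        if opcode is None:
            opcode = frame.opcode  # the first fragment carries the type
        parts.append(frame.payload)
        if not frame.more:
            yield opcode, b"".join(parts)
            opcode, parts = None, []


# The same message, sent whole and re-fragmented along the way.
single = [Frame(TEXT, False, b"hello world")]
split = [Frame(TEXT, True, b"hello"), Frame(CONTINUATION, False, b" world")]
assert list(reassemble(single)) == list(reassemble(split))
```

A server written this way does not care whether the browser or an
intermediary re-fragmented what the application sent.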

> """Ping
>
>      Upon receipt of a Ping message, an endpoint SHOULD send a Pong
>      response as soon as is practical.  The Pong response MUST contain
>      the payload provided in the Ping message, though an implementation
>      MAY truncate the message at an implementation-defined size which
>      MUST be at least 8 _(TBD)_ bytes."""
>
> It seems simpler and less error prone to require truncation always in case
> servers come to depend on the behaviour of specific implementations here.

So what should that size be?  Also, the simplest case is you already
have the ping frame in memory, and you just immediately send it back
out -- it would actually be more work to truncate it first.
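
The two options can be sketched in a few lines of Python.  The function
name and the "limit" parameter are hypothetical; the minimum of 8 bytes
is the draft's placeholder value and is still TBD.

```python
from typing import Optional

MIN_ECHO_BYTES = 8  # placeholder minimum from the draft ("8 (TBD)")


def pong_payload(ping_payload: bytes, limit: Optional[int] = None) -> bytes:
    """Return the payload for the Pong that answers a Ping.

    limit=None echoes the payload untouched, which is the cheapest path
    when the Ping frame is already in memory.  Otherwise the payload is
    cut to at most `limit` bytes.
    """
    if limit is None:
        return ping_payload
    if limit < MIN_ECHO_BYTES:
        raise ValueError("truncation limit is below the required minimum")
    return ping_payload[:limit]
```

Mandatory truncation would amount to removing the limit=None branch and
fixing "limit" to whatever size the group agrees on.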

> """      Ping frames MAY be sent as a keep-alive mechanism, but if so the
> interval SHOULD be configurable."""
>
> I have no idea what this SHOULD is trying to require. Is it supposed to be a
> requirement on browser UI? On server implementations? Something else? In any
> case, we can't really require that things are configurable.

See above.  If we can get agreement on what a keep-alive mechanism
should look like (which I think would be good so that they could be
aggregated on muxed connections or rate-limited on mobile networks,
for example), then I think it would be great to specify exactly how
that happens.

The attempt here was to capture that part of the framing which we
appear to have agreed on, and use that as a base for finishing the
rest of it.

> """Close:
>
>      Upon receipt of a close frame, an endpoint SHOULD send a Close
>      frame to the remote recipient, if it has not already done so,
>      deliver a close event to the application if necessary, and then
>      close the WebSocket."""
>
> It is not clear what the scope of the SHOULD here is. Since this is nested
> inside a MUST clause I guess "and then close the connection" is supposed to
> be a requirement. I'm not sure why sending the close frame is "SHOULD". I
> don't think it is necessary to talk about "deliver[ing] a close event to the
> application" since the interaction between the application and the protocol
> is for the application to determine. For the specific case of the JS API we
> just need to ensure we use the right terminology so that the description in
> the API document is correct (i.e. this needs to match section 7.3 of the
> protocol draft and section 5 of the API draft).

I don't know if it is reasonable to require the close frame in all
cases -- some implementations may not make it easy to close inbound
while keeping outbound traffic alive to send the close frame, or the
architecture makes it awkward.  The original closer has to be prepared
to not get a response anyway, in case the connection goes down.

Regarding the application layer, it seems like you would want to
define the interaction between the WebSocket layer and the application
layer, but I agree perhaps the discussion of framing isn't the right
place for it.

> For example it is unclear what happens if the first frame a client receives has
> opcode=0 (aside: it seems like we can design around this particular problem by
> e.g. ditching opcode=0 and just ignoring opcode after the first frame in a
> fragment).

That seems ripe for interoperability problems -- some implementations
may wind up just using the opcode of the last frame if they are all
sent the same, and an attacker could take advantage of differing
implementations to sneak some data past a filtering intermediary by
having a different opcode on an intermediate frame.  It seems better
to avoid those problems by mandating a Continuation opcode, and it
also gets you useful error checking (similar to when two bits were
used for fragmentation support).
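
As a rough illustration of the divergence, consider two lenient
implementations that each pick the message type from a different
fragment.  The fragment model and the opcode values below are
hypothetical.

```python
from typing import List, Tuple

Fragment = Tuple[int, bytes]  # (opcode, payload); hypothetical model

TEXT, BINARY = 4, 5  # hypothetical opcode values


def type_from_first(fragments: List[Fragment]) -> int:
    """Lenient reading A: trust the first fragment, ignore the rest."""
    return fragments[0][0]


def type_from_last(fragments: List[Fragment]) -> int:
    """Lenient reading B: keep whatever opcode was seen most recently."""
    return fragments[-1][0]


# A crafted message whose later fragment carries a different opcode.
crafted = [(TEXT, b"looks "), (BINARY, b"harmless")]

# A filter using reading A and an endpoint using reading B disagree
# about what kind of message this is.
assert type_from_first(crafted) != type_from_last(crafted)
```

If every non-initial fragment must carry the Continuation opcode, the
crafted message is simply invalid and neither reading is needed.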

> It is also not defined what should happen if any of the reserved
> bits are set or if the opcode is in the reserved range. All of these things
> are essential for interoperable implementations.

Ok.  I would suggest mandating that on any framing error, such as a
non-Continuation opcode on a non-initial frame or the use of reserved
bits when no extension defining them was negotiated, the connection
should simply be dropped.  Maybe we want to have a Control frame for
"Framing error", which might be part of the Close frame, or maybe just
drop it.
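
A minimal Python sketch of such strict checking follows.  The Frame
fields, the reserved-bit mask, and the use of opcode 0 for Continuation
are assumptions for illustration, and control frames interleaved within
a fragmented message are not modelled.

```python
from typing import NamedTuple, Set

CONTINUATION = 0  # assumed opcode for non-initial fragments


class FramingError(Exception):
    """Signals that the connection should be dropped."""


class Frame(NamedTuple):
    opcode: int
    more: bool          # True if further fragments follow
    reserved_bits: int  # reserved header bits, as a bit mask


class FrameValidator:
    def __init__(self, known_opcodes: Set[int], allowed_reserved: int = 0):
        self.known_opcodes = known_opcodes
        # Mask of reserved bits defined by negotiated extensions.
        self.allowed_reserved = allowed_reserved
        self.in_message = False

    def check(self, frame: Frame) -> None:
        """Raise FramingError if the frame violates the framing rules."""
        if frame.reserved_bits & ~self.allowed_reserved:
            raise FramingError("reserved bit set without an extension")
        if self.in_message:
            if frame.opcode != CONTINUATION:
                raise FramingError("non-Continuation opcode on a "
                                   "non-initial frame")
        elif frame.opcode == CONTINUATION:
            raise FramingError("Continuation opcode on an initial frame")
        elif frame.opcode not in self.known_opcodes:
            raise FramingError("opcode in the reserved range")
        self.in_message = frame.more
```

The caller would catch FramingError and close the underlying
connection, optionally after sending whatever error indication we
settle on.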

-- 
John A. Tamplin
Software Engineer (GWT), Google