Re: [hybi] preliminary WebSockets compression experiments

John Tamplin <jat@google.com> Fri, 23 April 2010 22:11 UTC


On Fri, Apr 23, 2010 at 5:48 PM, Mike Belshe <mike@belshe.com> wrote:

> *Memory Consumption:*
> BTW - zlib, using the configuration you specified, might use a surprising
> amount of RAM (to the tune of ~250KB!).  I suspect you want to decrease the
> window size. Brian Olson already ran this experiment a while back with SPDY
> and his results are here.  It would be interesting to see if you reach the
> same conclusion.
>
> groups.google.com/group/spdy-dev/browse_thread/thread/dfaf498542fac792?pli=1
>
> As a result of Brian's work, SPDY decreased the windowBits from 15 to 11.
> This reduces the memory footprint substantially, with only a modest
> reduction in compression ratio.
>

Thanks for the pointer -- I will try that out.
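
As a rough check on the numbers quoted above, here is a minimal Python sketch
of that experiment. It is not code from this thread. The memory estimate uses
the formula given in the comments of zlib's zconf.h; the helper names, the
memLevel of 8 (zlib's default) and the sample data are assumptions made for
illustration only.

```python
import zlib

def deflate_state_bytes(window_bits: int, mem_level: int = 8) -> int:
    # Approximate deflate memory per zlib's zconf.h:
    # (1 << (windowBits + 2)) + (1 << (memLevel + 9)), plus a few KB of small objects.
    return (1 << (window_bits + 2)) + (1 << (mem_level + 9))

def compressed_size(data: bytes, window_bits: int) -> int:
    # Compress the whole buffer with the requested window size.
    c = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, window_bits, 8)
    return len(c.compress(data) + c.flush())

# Made-up sample traffic: many small, similar text messages.
sample = b"".join(
    b'{"id": %d, "event": "move", "x": %d, "y": %d}\n' % (i, i * 7 % 640, i * 13 % 480)
    for i in range(2000)
)

for wbits in (15, 11):
    print(wbits, deflate_state_bytes(wbits), compressed_size(sample, wbits))
```

With memLevel 8 the estimate is 256KB at windowBits 15, which is in line with
the ~250KB figure above; how much ratio is lost at 11 depends entirely on the
data being sent.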


> *On Generically Compressing A Stream:*
> While we wanted to have SPDY provide required compression, in the end we
> find it likely a bad idea.  Compressing the stream is never as good as
> compressing the content.  For example, if the content can compress the
> image, there is no point in having the stream redundantly (and likely
> less-effectively) compress it.  Since most developers will get the content
> compressed correctly, that is the right layer to do so; rather than having
> the channel blindly compress what is already done.  We might revisit this
> approach if new data arises, though.
>
> SSL has taken a similar approach.  When the protocol was designed, the
> authors had the foresight to include compression negotiation in addition to
> cipher negotiation as part of the handshake.  They did this because
> compression should be applied before encryption (compressing encrypted data
> is never good).  But, what do browsers today negotiate?  The major browsers
> all advertise the empty-set of supported compression algorithms (meaning
> don't do compression).  Why?  Because compressing the stream blindly when
> the application content is already compressed is inefficient.
>
> Anyway - those are two protocols which are opting not to do stream-based
> compression and instead requiring the content provider to make the choice
> (like HTTP does).  I think this matches your conclusion that you shouldn't
> just compress everything.
>

I agree with the basic premise that the application can do a better job
compressing the data than some generic algorithm -- my compression
background is chess endgame databases where application-specific compression
algorithms are much better than generic ones.  However, I don't know about
trying to implement that in JS along with a corresponding server-side
implementation.

For HTTP, most servers are configured to compress responses automatically if
the client supports it, typically with a blacklist of file types not to
compress (granted, you can pre-compress static content and serve that, and
some servers may be intentionally configured not to compress other data).
Clearly, gzip compression of HTTP traffic is generally a good thing,
especially on mobile or other low-bandwidth networks.
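
The server behaviour described above can be sketched roughly as follows. This
is a simplified illustration in Python, not any particular server's logic: the
function name and the skip list are hypothetical, and the Accept-Encoding
parsing ignores q-values.

```python
import gzip

# Illustrative blacklist of types that are already compressed (an assumption,
# not taken from any real server configuration).
SKIP_TYPES = ("image/jpeg", "image/png", "image/gif", "application/zip", "application/gzip")

def maybe_gzip(body: bytes, content_type: str, accept_encoding: str):
    """Return (body, content_encoding); content_encoding is None if sent as-is."""
    offered = [tok.split(";")[0].strip().lower() for tok in accept_encoding.split(",")]
    if "gzip" not in offered or content_type.lower().startswith(SKIP_TYPES):
        return body, None
    return gzip.compress(body), "gzip"

page = b"<p>hello</p>" * 200
wire, encoding = maybe_gzip(page, "text/html", "gzip, deflate")
print(encoding, len(page), len(wire))
```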

As the GWT Quake example shows, even data that you wouldn't think is
compressible (and isn't if you don't maintain compression state) compresses
quite well.  With the safety valve of allowing uncompressed frames even when
compression has been negotiated, I don't see how it can be a bad thing.  The
server-side API might allow a way for the particular service on the other
end of the socket to deny compression if it knows its data can't benefit,
but that is really beyond the scope of the wire format.
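
To make the two ideas in the paragraph above concrete (compression state kept
across frames, plus the option to send a frame uncompressed), here is a
minimal Python sketch. It is not the wire format under discussion: the class
names and the boolean "compressed" flag are hypothetical, and windowBits 11 is
just the value mentioned earlier in the thread. Each frame is trial-compressed
on a copy of the shared deflate context; the copy is adopted only if the
result is smaller, so the receiver's state stays in step with the sender's.

```python
import zlib

class FrameCompressor:
    """Hypothetical sender side: one raw-deflate context per connection."""

    def __init__(self, window_bits: int = 11):
        self._ctx = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -window_bits)

    def encode(self, payload: bytes):
        # Try the frame on a copy so a rejected attempt leaves the shared state untouched.
        trial = self._ctx.copy()
        out = trial.compress(payload) + trial.flush(zlib.Z_SYNC_FLUSH)
        if len(out) < len(payload):
            self._ctx = trial          # adopt the state that produced `out`
            return True, out
        return False, payload          # safety valve: send the frame as-is

class FrameDecompressor:
    """Hypothetical receiver side, mirroring the sender's context."""

    def __init__(self, window_bits: int = 11):
        self._ctx = zlib.decompressobj(-window_bits)

    def decode(self, compressed: bool, data: bytes) -> bytes:
        return self._ctx.decompress(data) if compressed else data

enc, dec = FrameCompressor(), FrameDecompressor()
update = b"entity=17;pos=120,44,9;vel=0,0,0;entity=18;pos=121,44,9;vel=0,0,0;" * 4
for frame in (update, bytes(range(256)), update):
    flag, wire = enc.encode(frame)
    assert dec.decode(flag, wire) == frame
    print(flag, len(frame), len(wire))
```

One limitation of this particular sketch: a frame sent uncompressed never
enters the shared history, so later frames cannot refer back to it.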

It would be feasible to provide an API for JS to call for compressing data
and sending it itself, but there are a couple of problems:
 - JS currently doesn't have any efficient way to manipulate binary data
 - in the common case, it just adds extra work for the developer and extra
code to download

That might be worth doing once the binary data problem is solved, but I think
allowing reasonable compression on the stream automatically is a good
default.

-- 
John A. Tamplin
Software Engineer (GWT), Google