Re: Header Compression - streaming proposal
Martin Thomson <martin.thomson@gmail.com> Fri, 05 July 2013 17:40 UTC
In-Reply-To: <CA+KJw_4zqU7jdZNs9NpfA3HbjAcnhRLgMKG0Apf_nzyK9VrkHg@mail.gmail.com>
References: <CA+KJw_5xfvnCYM7QmtLQebPDO-fJbZz6D47mjHEWui3=fiHUoQ@mail.gmail.com> <CA+KJw_4zqU7jdZNs9NpfA3HbjAcnhRLgMKG0Apf_nzyK9VrkHg@mail.gmail.com>
Date: Fri, 05 Jul 2013 10:38:47 -0700
Message-ID: <CABkgnnVqWjWrGWuP+eZniGJe+WWL7Ekt+88wJ8xO9tkHqzhNfA@mail.gmail.com>
From: Martin Thomson <martin.thomson@gmail.com>
To: Gábor Molnár <gabor.molnar@sch.bme.hu>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Subject: Re: Header Compression - streaming proposal
Archived-At: <http://www.w3.org/mid/CABkgnnVqWjWrGWuP+eZniGJe+WWL7Ekt+88wJ8xO9tkHqzhNfA@mail.gmail.com>
In terms of simplicity, doing toggles first, then literals (w/ or w/o
changes to the table), makes a lot of sense to me.

That shifts some complexity to the encoder. An encoder will have to
run two passes over its headers. It also makes routing decisions a
little harder for intermediation, since routing information (usually
:path, but the other :-headers need to be checked too) is no longer at
the head of the line if we assume that :path changes
request-by-request.

I'm just pointing out the trade-off. Those costs do seem manageable.

On 5 July 2013 01:51, Gábor Molnár <gabor.molnar@sch.bme.hu> wrote:
> An important detail was left out:
>   3.3. step: if the entry was inserted, set the reference flag to
>   true on it.
>
> 2013/7/5 Gábor Molnár <gabor.molnar@sch.bme.hu>
>>
>> This is a proposal for a seemingly minor change that could make it
>> possible to implement a streaming encoder/decoder for the
>> compression spec, and make the decoding process simpler. It would
>> also eliminate certain corner cases, like the shadowing problem.
>>
>> There has been a lot of talk recently on enforcing the memory usage
>> limits of the compression spec. There is one component, however,
>> that we don't take into account when computing the memory usage of
>> compression implementations: the Working Set. The problem is that it
>> can grow without bounds since, as far as I know, HTTP does not
>> impose limits on the size of the header set. I tried to come up with
>> a decoder implementation architecture for the compression spec that
>> would not have to store the whole set in memory.
>>
>> Such a decoder would instead stream the output of the decoding
>> process, header by header. This seems to be a legitimate approach,
>> since most of the memory-conscious parsers I know are implemented as
>> streaming parsers (streaming json, xml, http, ... parsers). Gzip,
>> the base of the previously used header compression mechanism, is a
>> streaming compressor/decompressor as well, of course.
>>
>> It turns out that it is not possible to implement the current spec
>> as a streaming parser. The only reason is this: if an entry gets
>> inserted into the working set, it is not guaranteed that it will
>> remain there until the end of the decompression process, since it
>> could be deleted at any time. Because of this, it is not possible to
>> emit any headers until the end of the process.
>>
>> I propose a simple change that would guarantee this: in header
>> blocks, Indexed Representations should come first. This would
>> guarantee that after the Indexed Representations are over, there
>> will be no deletion from the Working Set. This is the only thing
>> that would have to be changed. The existing decoding process can be
>> applied as if nothing had changed.
>>
>> But it is now possible to implement a streaming and, as a side
>> effect, much simpler decoder like this:
>>
>> 0. There's only one component: the Header Table. An entry in the
>>    Header Table is a name-value pair with an index (just like
>>    before), and a 'reference' flag that is not set by default.
>> 1. First phase of decoding: dealing with indexed representations.
>>    Indexed representations simply flip the 'reference' flag on the
>>    entry they reference.
>> 2. Second phase of decoding: before starting the processing of
>>    literal representations, emit every name-value pair that is
>>    flagged in the Header Table.
>> 3. Third phase of decoding: for every literal representation:
>>    1. emit the name-value pair
>>    2. insert it in the table if needed (incremental or substitution
>>       indexing with table size enforcement)
>> 4. When a new header block arrives, jump to 1.
>>
>> It is maybe not obvious at first, but this process is equivalent to
>> the current decoding process if indexed representations come first.
>> Please point out corner cases if you find any.
>>
>> I think that the 'Indexed Representations come first' pattern is
>> something that comes naturally when implementing an encoder. Even
>> examples in the spec can remain unchanged, since they follow this
>> pattern already.
>>
>> Regards,
>> Gábor
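
For readers following the thread, the decoding steps quoted above
(including the 3.3 amendment) can be sketched roughly as follows. This
is only an illustration, not code from the proposal or the draft. The
tuple shapes ("indexed", i) and ("literal", name, value, indexing), the
names Entry and StreamingDecoder, the 4096-byte limit, the 32-byte
per-entry overhead and the evict-oldest-first policy are all
assumptions made for the example. Substitution indexing is left out to
keep it short.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Entry:
    name: str
    value: str
    referenced: bool = False  # step 0: the flag is not set by default


class StreamingDecoder:
    """Hypothetical streaming decoder; emits headers as they are decoded."""

    def __init__(self, max_size: int = 4096) -> None:
        self.table: List[Entry] = []  # the only component: the Header Table
        self.max_size = max_size      # assumed limit, for illustration

    def _size(self) -> int:
        # Assumed accounting: name + value + 32 bytes of overhead per entry.
        return sum(len(e.name) + len(e.value) + 32 for e in self.table)

    def _insert(self, entry: Entry) -> None:
        self.table.append(entry)
        # Assumed eviction policy: drop the oldest entries until we fit.
        while self.table and self._size() > self.max_size:
            self.table.pop(0)

    def decode(self, block: Iterable[tuple]) -> Iterator[Tuple[str, str]]:
        reps = iter(block)
        pending = None
        # Step 1: indexed representations flip the 'reference' flag.
        for rep in reps:
            if rep[0] != "indexed":
                pending = rep  # first literal ends the first phase
                break
            entry = self.table[rep[1]]
            entry.referenced = not entry.referenced
        # Step 2: emit every flagged entry before touching any literal.
        for entry in self.table:
            if entry.referenced:
                yield (entry.name, entry.value)
        # Step 3: emit each literal, then index it if requested.
        while pending is not None:
            if pending[0] == "indexed":
                raise ValueError("indexed representations must come first")
            _, name, value, indexing = pending
            yield (name, value)
            if indexing == "incremental":
                # Step 3.3: an inserted entry starts out referenced.
                self._insert(Entry(name, value, referenced=True))
            pending = next(reps, None)
        # Step 4: the caller invokes decode() again for the next block.
```

Because decode() is a generator, the caller receives each header as
soon as it is known and only the Header Table is retained between
blocks, which is the memory property the proposal is after. A block
that places an indexed representation after a literal is rejected
here; how a real decoder should treat that case is not settled in this
thread.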