Re: Header Compression - streaming proposal
Gábor Molnár <gabor.molnar@sch.bme.hu> Wed, 10 July 2013 07:56 UTC
Return-Path: <ietf-http-wg-request@listhub.w3.org>
X-Original-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Delivered-To: ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A594D21F9CBF for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Wed, 10 Jul 2013 00:56:02 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -9.649
X-Spam-Level:
X-Spam-Status: No, score=-9.649 tagged_above=-999 required=5 tests=[AWL=0.028, BAYES_00=-2.599, FM_FORGED_GMAIL=0.622, MIME_8BIT_HEADER=0.3, RCVD_IN_DNSWL_HI=-8]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Rb7iBDo59Wv8 for <ietfarch-httpbisa-archive-bis2Juki@ietfa.amsl.com>; Wed, 10 Jul 2013 00:55:57 -0700 (PDT)
Received: from frink.w3.org (frink.w3.org [128.30.52.56]) by ietfa.amsl.com (Postfix) with ESMTP id E27E221F9CDC for <httpbisa-archive-bis2Juki@lists.ietf.org>; Wed, 10 Jul 2013 00:55:56 -0700 (PDT)
Received: from lists by frink.w3.org with local (Exim 4.72) (envelope-from <ietf-http-wg-request@listhub.w3.org>) id 1UwpEN-0001Tj-Qd for ietf-http-wg-dist@listhub.w3.org; Wed, 10 Jul 2013 07:54:31 +0000
Resent-Date: Wed, 10 Jul 2013 07:54:31 +0000
Resent-Message-Id: <E1UwpEN-0001Tj-Qd@frink.w3.org>
Received: from lisa.w3.org ([128.30.52.41]) by frink.w3.org with esmtp (Exim 4.72) (envelope-from <gabor.molnar@sch.bme.hu>) id 1UwpEF-0001Qp-0Z for ietf-http-wg@listhub.w3.org; Wed, 10 Jul 2013 07:54:23 +0000
Received: from balu.sch.bme.hu ([152.66.208.40]) by lisa.w3.org with esmtp (Exim 4.72) (envelope-from <gabor.molnar@sch.bme.hu>) id 1UwpED-00077L-Lm for ietf-http-wg@w3.org; Wed, 10 Jul 2013 07:54:22 +0000
Received: from mail-ie0-f174.google.com (mail-ie0-f174.google.com [209.85.223.174]) by balu.sch.bme.hu (Sun Java System Messaging Server 6.2-7.05 (built Sep 5 2006)) with ESMTPSA id <0MPP00IBXN9TDX00@balu.sch.bme.hu> for ietf-http-wg@w3.org; Wed, 10 Jul 2013 09:53:55 +0200 (CEST)
Received: by mail-ie0-f174.google.com with SMTP id 9so14564802iec.19 for <ietf-http-wg@w3.org>; Wed, 10 Jul 2013 00:53:52 -0700 (PDT)
Received: by 10.64.24.210 with HTTP; Wed, 10 Jul 2013 00:53:32 -0700 (PDT)
X-Received: by 10.50.117.38 with SMTP id kb6mr11116637igb.57.1373442832450; Wed, 10 Jul 2013 00:53:52 -0700 (PDT)
Date: Wed, 10 Jul 2013 09:53:32 +0200
From: Gábor Molnár <gabor.molnar@sch.bme.hu>
In-reply-to: <CA+KJw_6W_J=cLP3AsVQK3GV-PuvM2MsDW7BVUeH=7=s_rk4gGw@mail.gmail.com>
To: Martin Thomson <martin.thomson@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Message-id: <CA+KJw_6Dfgr7AWVHtHbuphqveYOPKuD_R55QscYrrtsaa0XT5A@mail.gmail.com>
MIME-version: 1.0
Content-type: text/plain; charset="UTF-8"
Content-transfer-encoding: quoted-printable
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type:content-transfer-encoding; bh=5gYd5x5IDEw+0W2gQORWAJ0KK5GTSpKLiNAdA+1lqMw=; b=AfSVIf1mLy5aGiEi8xao4pUtaqOnwxDBCVXHaq11Y1yHF/ZLfSbuWv6k7G3uoGNtxU +vEpC3OJ3CO7K4ljJvfmHpCxGfkixKzLdt1eLHWdnmh0Gi2MrKBcKWm5CqN9IjbE0b3K w11TSBGgfc/VLK4rvGCQZvup9OTwMGANiuSjr8cLx9F4jTNXMuRGN9YanDZ80SPPFipR UnR1POnrmxIST+YxXP+u4PzzWkDAzsC8ZL2+ijWbhf8XBzcBTxreD71BRRo7VD4GPdhm 4llJQQnnCn8czIr0fIt0em+qIo3HTnbrgq2wfwkhqr/yDcZ5upQBBD57S3lGRUYQsAlI R2Lw==
References: <CA+KJw_5xfvnCYM7QmtLQebPDO-fJbZz6D47mjHEWui3=fiHUoQ@mail.gmail.com> <CA+KJw_4zqU7jdZNs9NpfA3HbjAcnhRLgMKG0Apf_nzyK9VrkHg@mail.gmail.com> <CABkgnnVqWjWrGWuP+eZniGJe+WWL7Ekt+88wJ8xO9tkHqzhNfA@mail.gmail.com> <CA+KJw_6W_J=cLP3AsVQK3GV-PuvM2MsDW7BVUeH=7=s_rk4gGw@mail.gmail.com>
Received-SPF: pass client-ip=152.66.208.40; envelope-from=gabor.molnar@sch.bme.hu; helo=balu.sch.bme.hu
X-W3C-Hub-Spam-Status: No, score=-4.4
X-W3C-Hub-Spam-Report: AWL=-1.759, RCVD_IN_DNSWL_MED=-2.3, RP_MATCHES_RCVD=-0.303, SPF_PASS=-0.001
X-W3C-Scan-Sig: lisa.w3.org 1UwpED-00077L-Lm 36438706726a4ba37d54abfb1c6b34e7
X-Original-To: ietf-http-wg@w3.org
Subject: Re: Header Compression - streaming proposal
Archived-At: <http://www.w3.org/mid/CA+KJw_6Dfgr7AWVHtHbuphqveYOPKuD_R55QscYrrtsaa0XT5A@mail.gmail.com>
Resent-From: ietf-http-wg@w3.org
X-Mailing-List: <ietf-http-wg@w3.org> archive/latest/18666
X-Loop: ietf-http-wg@w3.org
Resent-Sender: ietf-http-wg-request@w3.org
Precedence: list
List-Id: <ietf-http-wg.w3.org>
List-Help: <http://www.w3.org/Mail/>
List-Post: <mailto:ietf-http-wg@w3.org>
List-Unsubscribe: <mailto:ietf-http-wg-request@w3.org?subject=unsubscribe>
I opened an issue on GitHub for this proposal with minor modification to the original text: https://github.com/http2/compression-spec/issues/11 If I get more feedback, I will be happy to draft a PR as well. 2013/7/8 Gábor Molnár <gabor.molnar@sch.bme.hu>: > 2013/7/5 Martin Thomson <martin.thomson@gmail.com> >> >> In terms of simplicity, doing toggles first, then literals (w/ or w/o >> changes to the table), makes a lot of sense to me. >> >> That shifts some complexity to the encoder. An encoder will have to >> run two passes over its headers. > > > I've been thinking about this for a while: is it worth moving complexity > from > the decoder to the encoder, when a resource constrained client have to > implement both (for request/response)? I think the answer is yes, because > 1. it *remains possible* to implement encoding with minimal resource > usage: just use literal representations without indexing for all > headers. > 2. it *becomes possible* to implement decoding with minimal resource > usage (streaming decoder). > >> >> It also makes routing decisions a little harder for intermediation, >> since routing information (usually :path, but the other :-headers need >> to be checked too) are no longer at the head of the line if we assume >> that :path changes request-by-request. > > > I don't think it becomes harder. If an output buffer is used, and at the end > the headers are reordered, then we get back to the situation we have now. > When implementing a routing logic that can handle streaming, then the > routing > decision can be made somewhat sooner than the end of the decompressing > process. I agree that implementing such a streaming capable routing logic > might be more difficult and might not be worth it (especially given that > proxies are usually not memory-constrained). > >> >> I'm just pointing out the trade-off. Those costs do seem manageable. > > > Thanks for the feedback! > >> >> On 5 July 2013 01:51, Gábor Molnár <gabor.molnar@sch.bme.hu> wrote: >> > An important detail was left out: >> > 3.3. step: if the entry was inserted, set the reference flag to true >> > on >> > it. >> > >> > >> > 2013/7/5 Gábor Molnár <gabor.molnar@sch.bme.hu> >> >> >> >> This a proposal for a seemingly minor change, that could make it >> >> possible >> >> to implement >> >> a streaming encoder/decoder for the compression spec, and make the >> >> decoding process >> >> simpler. It would also eliminate certain corner cases, like the >> >> shadowing >> >> problem. >> >> >> >> There's a lot of talk recently on enforcing the memory usage limits of >> >> the >> >> compression >> >> spec. There's one component however, that we don't take into account >> >> when >> >> computing >> >> the memory usage of compression implementations: it's the Working Set. >> >> The >> >> problem >> >> is that it can grow without bounds, since as far as I know, HTTP does >> >> not >> >> impose limits >> >> on the size of the header set. I tried to come up with a decoder >> >> implementation >> >> architecture for the compression spec that would not have to store the >> >> whole set in the >> >> memory. >> >> >> >> Such a decoder would instead stream the output of the decoding process, >> >> header by >> >> header. This seems to be a legitimate approach, since most of the >> >> memory-conscious >> >> parsers I know are implemented as streaming parsers (streaming json, >> >> xml, >> >> http, ... parsers). Gzip, the base of the previously used header >> >> compression mechanism >> >> is a streaming compressor/decompressor as well, of course. >> >> >> >> It turns out that it is not possible to implement the current spec as a >> >> streaming parser. >> >> The only reason is this: if an entry gets inserted into the working >> >> set, >> >> it is not guaranteed >> >> that it will remain there until the end of the decompression process, >> >> since it could be >> >> deleted any time. Because of this, it is not possible to emit any >> >> headers >> >> until the end >> >> of the process. >> >> >> >> I propose a simple change, that could, however, guarantee this: in >> >> header >> >> blocks, Indexed >> >> Representations should come first. This would guarantee that after the >> >> Indexed >> >> Representations are over, there will be no deletion from the Working >> >> Set. >> >> This is the only >> >> thing that would have to be changed. Existing decoding process can be >> >> applied as if nothing >> >> would change. >> >> >> >> But it is now possible to implement a streaming, and - as a side effect >> >> - >> >> much simpler >> >> decoder like this: >> >> >> >> 0. There's only one component: the Header Table. An entry in the Header >> >> Table is a >> >> name-value pair with an index (just like before), and a 'reference' >> >> flag that is not set by >> >> default. >> >> 1. First phase of decoding: dealing with indexed representations. >> >> Indexed >> >> representations >> >> simply flip the 'reference' flag on the entry they reference. >> >> 2. Second phase of decoding: before starting the processing of literal >> >> representations, emit >> >> every name-value pair that is flagged in the Header Table. >> >> 3. Third phase of decoding: for every literal representations: >> >> 1. emit the name-value pair >> >> 2. insert it in the table if needed (incremental or substitution >> >> indexing with table size >> >> enforcement) >> >> 4. When a new header block arrives, jump to 1. >> >> >> >> It is maybe not obvious at first, but this process is equivalent the >> >> the >> >> current decoding process, >> >> if indexed representations come first. Please point out corner cases if >> >> you find any. >> >> >> >> I think that the 'Indexed Representations come first' pattern is >> >> something >> >> that comes naturally >> >> when implementing an encoder. Even examples in the spec can remain >> >> unchanged, since they >> >> follow this pattern already. >> >> >> >> Regards, >> >> Gábor >> > >> > > >
- Header Compression - streaming proposal Gábor Molnár
- Re: Header Compression - streaming proposal Gábor Molnár
- Re: Header Compression - streaming proposal Martin Thomson
- Re: Header Compression - streaming proposal Gábor Molnár
- Re: Header Compression - streaming proposal Gábor Molnár
- Re: Header Compression - streaming proposal Martin Thomson
- RE: Header Compression - streaming proposal RUELLAN Herve
- Re: Header Compression - streaming proposal Roberto Peon