Re: Header compression question: duplicate header entry and current index on computing working set
Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com> Thu, 18 July 2013 15:51 UTC
In-Reply-To: <CAP+FsNfDRW4qjVsO-KBS5n0DGRdPehmA9fJw7+MhJciDU_0SwA@mail.gmail.com>
References: <CAPyZ6=+sFhOuSSE6OEB9bJAMYm0zt4SJwgmdMjUTprmiAj3ujg@mail.gmail.com> <CABkgnnWP=TgVZ2pruTCWUbZuLkBMwWa3rCH-Kiup8j-c=hKcLg@mail.gmail.com> <CAPyZ6=JJONW8uciYMmApEuUaM+DQZ3PDaS7n-JaXhz7DQuytQA@mail.gmail.com> <5056ccb2fb804632bd5425e9f7b49e14@BY2PR03MB025.namprd03.prod.outlook.com> <CA+pLO_i1iPWBibZZanTDMCB+TQC5xHEVUbO_ZegUWsvSb8skfQ@mail.gmail.com> <CAP+FsNfDRW4qjVsO-KBS5n0DGRdPehmA9fJw7+MhJciDU_0SwA@mail.gmail.com>
From: Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
Date: Fri, 19 Jul 2013 00:48:34 +0900
Message-ID: <CAPyZ6=LoOgACf39ziUXAT6yMpL-1hcLUsSnoG3ra58j8e7qu4Q@mail.gmail.com>
To: Roberto Peon <grmocg@gmail.com>
Cc: Jeff Pinner <jpinner@twitter.com>, Mike Bishop <Michael.Bishop@microsoft.com>, Martin Thomson <martin.thomson@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
Subject: Re: Header compression question: duplicate header entry and current index on computing working set
Archived-At: <http://www.w3.org/mid/CAPyZ6=LoOgACf39ziUXAT6yMpL-1hcLUsSnoG3ra58j8e7qu4Q@mail.gmail.com>
From the responses, I understand that the reference set is a set of references to entries in the header table, not a set of name/value pairs. I am now convinced that duplicate entries cause no problem with this method, as long as both the encoder and the decoder use it.

However, the spec seems to imply that the reference set is just a set of name/value pairs (Appendix B shows only name/value pairs, not indices), so one may conclude that the reference set holds name/value pairs and sweep the header table to find the index. node-http2 is implemented this way. If the encoder uses the method described above and the decoder uses a sweep, they may get out of sync when the header table contains duplicates.

Personally, I much prefer the first method since it avoids sweeps and string matching. But this problem only occurs when the encoder adds a duplicate entry to the header table, which is considered a bug, so as long as the encoder is good enough not to do this, there is no problem.

Best regards,

Tatsuhiro Tsujikawa

On Thu, Jul 18, 2013 at 4:39 AM, Roberto Peon <grmocg@gmail.com> wrote:

> Not necessarily-- If the encoder/compressor says that element 5 is k,v,
> and then it also appends k,v as element 7, for example.
>
> When this happens things are less efficient, yes, but that is it.
>
> There has been little optimization around duplicate k,v entries because
> 1) It is very uncommon
> 2) It doesn't break anything
> 3) There doesn't seem to be a good reason to encourage the behavior by
> optimizing for it.
> -=R
>
> On Wed, Jul 17, 2013 at 10:08 AM, Jeff Pinner <jpinner@twitter.com> wrote:
>
>> Doesn't the decompressor have to sweep the table to create the new
>> reference set and compare the (index, name, value) entries?
>>
>> On Wed, Jul 17, 2013 at 9:59 AM, Mike Bishop <
>> Michael.Bishop@microsoft.com> wrote:
>>
>>> How did the object get into the reference set? Because the compressor
>>> referenced an object by index, or included it as a literal and added it to
>>> the table.
>>>
>>> So the object in the reference set points to the entry in the table it
>>> was added with. If there happens to be another identical entry in the
>>> table, nothing says that the decompressor will even notice that. I don’t
>>> recall anything that requires the decompressor to sweep the header table
>>> looking for matches – that’s the compressor’s job.
>>>
>>> *From:* Tatsuhiro Tsujikawa [mailto:tatsuhiro.t@gmail.com]
>>> *Sent:* Wednesday, July 17, 2013 9:52 AM
>>> *To:* Martin Thomson
>>> *Cc:* ietf-http-wg@w3.org
>>> *Subject:* Re: Header compression question: duplicate header entry and
>>> current index on computing working set
>>>
>>> On Thu, Jul 18, 2013 at 1:36 AM, Martin Thomson <
>>> martin.thomson@gmail.com> wrote:
>>>
>>> On 17 July 2013 08:56, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
>>> wrote:
>>> > In 3.4, to compute working set from reference set of headers, the index of
>>> > entry in header table is required.
>>> > The question is, when the duplicate entries are in the header table, which
>>> > index is used as the index of working set?
>>>
>>> If, for some strange reason, a compressor created multiple identical
>>> entries in the table, the decompressor is required to respect that
>>> choice, even if it is likely to be a bug. This prevents the
>>> decompressor and compressor from getting out of sync.
>>>
>>> The compressor can then reference any of the entries when using an index.
>>>
>>> If the choice is arbitrary, then the compressor and decompressor may
>>> choose different index and
>>> can get out of sync.
>>>
>>> For example,
>>>
>>> Current header table:
>>>
>>> |0|name1|value1|
>>> |1|name1|value1|
>>>
>>> If name1/value1 is in reference set, compressor chooses index 0, and
>>> decompressor chooses index 1.
>>> compressor wants to remove name1/value1, so reference index 0.
>>> In decompressor side, however, seeing index header representation with
>>> index 0 and it is not in its reference set
>>> (because name1/value1 is index 1), retrieve index 0 from header table
>>> and add it to working set.
>>> Maybe I misunderstand the draft.
>>>
>>> If multiple identical entries are considered as a bug, then it would be
>>> better to
>>> prohibit it in the spec and we are happy to not to consider these things.
>>>
>>> Best regards,
>>>
>>> Tatsuhiro Tsujikawa
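
The out-of-sync case discussed in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the draft or from node-http2: the function names are hypothetical, the header table is modeled as a plain list whose positions are the indices, and the sweep is assumed to return the last matching entry (a first-match sweep would happen to agree with the encoder in this particular example).

```python
# Hypothetical model: the header table is a list of (name, value) tuples,
# and the position in the list is the entry's index.

def toggle_by_index(ref_indices, index):
    """Reference set holds table indices; an indexed representation toggles membership."""
    updated = set(ref_indices)
    if index in updated:
        updated.remove(index)   # already referenced: the header is removed
    else:
        updated.add(index)      # not referenced: the header is added
    return updated

def sweep_index(table, pair):
    """Recover an index for a name/value pair by sweeping the table.

    Assumption for illustration: the last matching entry wins.
    """
    for i in range(len(table) - 1, -1, -1):
        if table[i] == pair:
            return i
    raise KeyError(pair)

def toggle_by_sweep(table, ref_pairs, index):
    """Reference set holds name/value pairs; indices are recovered by sweeping."""
    ref_indices = {sweep_index(table, pair) for pair in ref_pairs}
    updated = toggle_by_index(ref_indices, index)
    return {table[i] for i in updated}

# Header table with a duplicate entry, as in the example quoted above.
table = [("name1", "value1"), ("name1", "value1")]

# Encoder: references entry 0, then emits index 0 to remove it.
encoder_refs = toggle_by_index({0}, 0)

# Decoder: stores the pair, sweeps to index 1, so index 0 looks like an addition.
decoder_refs = toggle_by_sweep(table, {("name1", "value1")}, 0)

# With no duplicates in the table, both approaches agree.
unique_refs = toggle_by_sweep([("name1", "value1")], {("name1", "value1")}, 0)
```

In this sketch the encoder ends with an empty reference set while the sweeping decoder still holds name1/value1, which is the mismatch described above. With a table free of duplicates the two approaches give the same result.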
- Header compression question: duplicate header ent… Tatsuhiro Tsujikawa
- Re: Header compression question: duplicate header… Martin Thomson
- Re: Header compression question: duplicate header… Tatsuhiro Tsujikawa
- RE: Header compression question: duplicate header… Mike Bishop
- Re: Header compression question: duplicate header… Jeff Pinner
- Re: Header compression question: duplicate header… Roberto Peon
- Re: Header compression question: duplicate header… Tatsuhiro Tsujikawa