Re: Header compression question: duplicate header entry and current index on computing working set

Roberto Peon <grmocg@gmail.com> Wed, 17 July 2013 19:42 UTC

From: Roberto Peon <grmocg@gmail.com>
To: Jeff Pinner <jpinner@twitter.com>
Cc: Mike Bishop <Michael.Bishop@microsoft.com>, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>, Martin Thomson <martin.thomson@gmail.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>

Not necessarily. Suppose, for example, that the encoder/compressor says that
element 5 is k,v, and then it also appends k,v as element 7.

When this happens things are less efficient, yes, but that is it.
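
A rough sketch of why duplicates stay harmless, assuming a simplified model in
which the reference set holds table indices (not name/value pairs) and an
indexed representation toggles membership, as described in the quoted thread.
The class and method names here are hypothetical and are not taken from the
draft:

```python
class Endpoint:
    """Toy model of one side (compressor or decompressor). Hypothetical names."""

    def __init__(self):
        self.table = []             # list of (name, value); position is the index
        self.reference_set = set()  # table indices, never (name, value) pairs

    def append(self, name, value):
        """Add a literal to the table and reference it; duplicates are allowed."""
        self.table.append((name, value))
        index = len(self.table) - 1
        self.reference_set.add(index)
        return index

    def indexed(self, index):
        """Toggle: drop the index if it is referenced, otherwise add it."""
        if index in self.reference_set:
            self.reference_set.remove(index)
        else:
            self.reference_set.add(index)

    def headers(self):
        """Headers currently implied by the reference set."""
        return [self.table[i] for i in sorted(self.reference_set)]


def replay(ops):
    """Apply the same operation stream to both sides, as the wire would."""
    compressor, decompressor = Endpoint(), Endpoint()
    for side in (compressor, decompressor):
        for op, *args in ops:
            getattr(side, op)(*args)
    return compressor, decompressor


# The compressor adds the same pair twice, then un-references index 0.
ops = [
    ("append", "name1", "value1"),
    ("append", "name1", "value1"),
    ("indexed", 0),
]
compressor, decompressor = replay(ops)
```

Because both sides key the reference set on the index carried in the stream,
neither has to search the table for matching entries, so they cannot pick
different duplicates. The only cost in this model is the table slot spent on
the second copy.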

There has been little optimization around duplicate k,v entries because:
1) They are very uncommon.
2) They don't break anything.
3) There doesn't seem to be a good reason to encourage the behavior by
optimizing for it.
-=R


On Wed, Jul 17, 2013 at 10:08 AM, Jeff Pinner <jpinner@twitter.com> wrote:

> Doesn't the decompressor have to sweep the table to create the new
> reference set and compare the (index, name, value) entries?
>
>
> On Wed, Jul 17, 2013 at 9:59 AM, Mike Bishop <Michael.Bishop@microsoft.com> wrote:
>
>> How did the object get into the reference set?  Because the compressor
>> referenced an object by index, or included it as a literal and added it to
>> the table.
>>
>> So the object in the reference set points to the entry in the table it
>> was added with.  If there happens to be another identical entry in the
>> table, nothing says that the decompressor will even notice that.  I don’t
>> recall anything that requires the decompressor to sweep the header table
>> looking for matches – that’s the compressor’s job.
>>
>> *From:* Tatsuhiro Tsujikawa [mailto:tatsuhiro.t@gmail.com]
>> *Sent:* Wednesday, July 17, 2013 9:52 AM
>> *To:* Martin Thomson
>> *Cc:* ietf-http-wg@w3.org
>> *Subject:* Re: Header compression question: duplicate header entry and
>> current index on computing working set
>>
>> On Thu, Jul 18, 2013 at 1:36 AM, Martin Thomson <martin.thomson@gmail.com>
>> wrote:
>>
>> On 17 July 2013 08:56, Tatsuhiro Tsujikawa <tatsuhiro.t@gmail.com>
>> wrote:
>> > In 3.4, to compute the working set from the reference set of headers,
>> > the index of the entry in the header table is required.
>> > The question is: when there are duplicate entries in the header table,
>> > which index is used as the index in the working set?
>>
>> If, for some strange reason, a compressor created multiple identical
>> entries in the table, the decompressor is required to respect that
>> choice, even if it is likely to be a bug.  This prevents the
>> decompressor and compressor from getting out of sync.
>>
>> The compressor can then reference any of the entries when using an index.
>>
>> If the choice is arbitrary, then the compressor and decompressor may
>> choose different indices and get out of sync.
>>
>> For example,
>>
>> Current header table:
>>
>> |0|name1|value1|
>> |1|name1|value1|
>>
>> If name1/value1 is in the reference set, suppose the compressor chooses
>> index 0 and the decompressor chooses index 1.
>>
>> The compressor wants to remove name1/value1, so it references index 0.
>>
>> On the decompressor side, however, on seeing an indexed header
>> representation with index 0 that is not in its reference set (because
>> name1/value1 is at index 1), it retrieves index 0 from the header table
>> and adds it to the working set.
>>
>> Maybe I misunderstand the draft.
>>
>> If multiple identical entries are considered a bug, then it would be
>> better to prohibit them in the spec, and we would be happy not to have
>> to consider these cases.
>>
>> Best regards,
>>
>> Tatsuhiro Tsujikawa
>>
>
>