Re: QCRAM Review Request: PR #1141

Martin Thomson <martin.thomson@gmail.com> Tue, 13 March 2018 02:40 UTC

Return-Path: <martin.thomson@gmail.com>
X-Original-To: quic@ietfa.amsl.com
Delivered-To: quic@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id BA89A124E15 for <quic@ietfa.amsl.com>; Mon, 12 Mar 2018 19:40:50 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.697
X-Spam-Level:
X-Spam-Status: No, score=-2.697 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_FONT_LOW_CONTRAST=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id W74L_axZrFM1 for <quic@ietfa.amsl.com>; Mon, 12 Mar 2018 19:40:47 -0700 (PDT)
Received: from mail-oi0-x22a.google.com (mail-oi0-x22a.google.com [IPv6:2607:f8b0:4003:c06::22a]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 454D4120726 for <quic@ietf.org>; Mon, 12 Mar 2018 19:40:47 -0700 (PDT)
Received: by mail-oi0-x22a.google.com with SMTP id j79so14093545oib.12 for <quic@ietf.org>; Mon, 12 Mar 2018 19:40:47 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=12oIGmR0vbnffzcF5LsNVcalG1c2WfvO+kIUGF94ybA=; b=egOhnOTANzOAjre9qWQcQe9nSU6E0x9nSZGAf4oYw2x7AAv7e9O6QtccXeUKQNXOfP tUgmGJvm1q26KiiRwxvU+Drm0twYK1u3vQcvqZitBsUcuO+YBeA9AMQBGGX01BZOBCse Up3q4AyK2HSn6bF9sBBESizvUpBLAC5p00xBJWsOvHmNrH+yxEuRvcR0rGviSHHzNRLn L5uhRbytqCDDf01XZKH07+yLao0JZPKNG6YLB1TpnxWITJ8tdYGJg7gTxbe4P4zFWH+E SlLCucS894PsPFbe9BqKcpG4pDc7oJAqqLMs4mCpjrdRE1T2Sl3zlFN/dPD1YVTTMNMx AYZQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=12oIGmR0vbnffzcF5LsNVcalG1c2WfvO+kIUGF94ybA=; b=IEFbR+Armki3GDgPlVRmhe3bZ+IGNiQC0NV7gkvBvPhEencIky4ALhau5R7c+2tya5 Yy8NENeM1qdoTvF5p/zuE8fXCwj1tvIh3L6OBwm7YAlw2SnKcui8zxiokVuEjxCdvjtd PAwV/5dcuwZVqH8OJvz2JhwFVeLxF55uyOIK0+Gf0FBOpeFb2Q+BJPumtfCzCN59UJcP JW3QRPfUpwM9IkV6S28c3MCOg5+CENJb/9aBGQJuKHLcZzOaXV5nqeAT+F9SlNwOyxZE 0DwmXJgq3lCrilN9VL+ICBLC51hxrYEH3TuxqvSVeZk/p2J/vOT7StEqweGeyymDsJnB XMKg==
X-Gm-Message-State: AElRT7FKH2Ji2+H24FXqc7/IqTc+r9fCiO71t0b3ScCHgdmdoSyvPgrs DxPIv7LDu+wn/clRRvTa/WEFFTSNdUbQFo5pfsY=
X-Google-Smtp-Source: AG47ELvFcnuFmr1cx8bJHfOZ8xhvZ3jnK9XGvxIEp5wNq5RisfzvIu4/+lPaw5mGgVMnKV+ndL8uDouCtbzJDPs9nCg=
X-Received: by 10.202.84.132 with SMTP id i126mr5960255oib.295.1520908846513; Mon, 12 Mar 2018 19:40:46 -0700 (PDT)
MIME-Version: 1.0
Received: by 2002:a9d:1055:0:0:0:0:0 with HTTP; Mon, 12 Mar 2018 19:40:45 -0700 (PDT)
Received: by 2002:a9d:1055:0:0:0:0:0 with HTTP; Mon, 12 Mar 2018 19:40:45 -0700 (PDT)
In-Reply-To: <CAKcm_gOTLFBO5+rZMa98eXiWB1avh=-x+p90-dMFHbmg4X+PEA@mail.gmail.com>
References: <MWHPR08MB24325CADE88C887A869F65F1DAC60@MWHPR08MB2432.namprd08.prod.outlook.com> <CABkgnnWGWA7+e1DgmWT+MBDDSScoOyOg_RghhXgCNLaAOtTtPw@mail.gmail.com> <8FDB7B3D-E206-435F-9D74-22A18DC516F8@fb.com> <CAKcm_gOTLFBO5+rZMa98eXiWB1avh=-x+p90-dMFHbmg4X+PEA@mail.gmail.com>
From: Martin Thomson <martin.thomson@gmail.com>
Date: Tue, 13 Mar 2018 02:40:45 +0000
Message-ID: <CABkgnnV4Z_XkVAuiW8mfTRmVPOmEnRDOUAcB_xNTN8gsz6AJaw@mail.gmail.com>
Subject: Re: QCRAM Review Request: PR #1141
To: Ian Swett <ianswett@google.com>
Cc: Alan Frindell <afrind@fb.com>, Mike Bishop <mbishop@evequefou.be>, IETF QUIC WG <quic@ietf.org>
Content-Type: multipart/alternative; boundary="001a113d2c728845d00567422f9d"
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/XgwLulE6smiauz2fFbYydfGW5xY>
X-BeenThere: quic@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/quic>, <mailto:quic-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/quic/>
List-Post: <mailto:quic@ietf.org>
List-Help: <mailto:quic-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/quic>, <mailto:quic-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 13 Mar 2018 02:40:51 -0000

Efficiency gains are not especially relevant here. I think that the primary
concern is which choice is best for those implementing the spec.

FWIW, I disagree with Alan, I think that the impact of a change is minimal.
We might discuss ways to wind back some of the changes Mike suggests, but I
would be sad if we didn't at least fix the but where you have to discard
invalid instructions. That is where this started. I also very much like
using the bit to separate static and dynamic. That will let us fix that
rotten static table.  I care much less about the changes to strings though.

(I expect the gains to be miniscule.)

On 12 Mar. 2018 4:23 pm, "Ian Swett" <ianswett@google.com> wrote:

> The table above is great, but is it also possible to estimate 'small
> efficiency gains' in terms of %?
>
> On Mon, Mar 12, 2018 at 11:58 AM Alan Frindell <afrind@fb.com> wrote:
>
>> I mentioned this on issue #1123, but will repeat it on the list.  I think
>> redoing the instructions and string literal representation will limit code
>> re-use between HPACK and QPACK (formerly called QCRAM).  Initial simulation
>> of QPACK shows that it is pretty close to HPACK compression as is.  I’m
>> more swayed by eliminating the possibility of sending an instruction in an
>> invalid context (eg: a literal on the control stream), but on balance I
>> prefer to keep the same instructions as HPACK.
>>
>>
>>
>> It would be great to hear from other implementers (eg folks not on the
>> compression design committee) how they value code-reuse versus small
>> efficiency gains and fewer error checks.
>>
>>
>>
>> -Alan
>>
>>
>>
>> *From: *QUIC <quic-bounces@ietf.org> on behalf of Martin Thomson <
>> martin.thomson@gmail.com>
>> *Date: *Thursday, March 1, 2018 at 8:15 PM
>> *To: *Mike Bishop <mbishop@evequefou.be>
>> *Cc: *QUIC WG <quic@ietf.org>
>> *Subject: *Re: QCRAM Review Request: PR #1141
>>
>>
>>
>> I was a little surprised to see the savings here, modest though they
>> are.  Overall, this is overwhelmingly in the right direction.
>>
>>
>>
>> We might quibble on some of the details, but I would prefer to see this
>> larger change made and then iterate where the details are less good (for
>> instance, we might revisit the notion of octet alignment in other ways, or
>> go back to strings starting on octet boundaries).
>>
>>
>>
>> On Thu, Mar 1, 2018 at 11:39 AM, Mike Bishop <mbishop@evequefou.be>
>> wrote:
>>
>> https://github.com/quicwg/base-drafts/pull/1141
>>
>>
>>
>> One suggestion from Martin’s implementation feedback that has been
>> mentioned before is reworking the QCRAM instruction space.  The QCRAM draft
>> uses the HPACK instructions, except that it cannibalizes table size changes
>> for the Duplicate instruction.  That leads to some inefficiency with the
>> separate streams, since you have to validate that instructions aren’t used
>> on the wrong stream and some of the opcode space is wasted on each stream.
>>
>>
>>
>> This PR observes that each stream has a fully disjoint set of commands
>> from the other:
>>
>>    - Table updates
>>
>>
>>    - Insert with Name Reference from Static Table
>>       - Insert with Name Reference from Dynamic Table
>>       - Insert without Name Reference
>>       - Duplicate
>>       - Change table size
>>
>>
>>    - Header blocks
>>
>>
>>    - Indexed entry from Static
>>       - Indexed entry from Dynamic
>>       - Literal with Name Reference from Static Table
>>       - Literal with Name Reference from Static Table, Never Indexed
>>       - Literal with Name Reference from Dynamic Table
>>       - Literal with Name Reference from Dynamic Table, Never Indexed
>>       - Literal without Name Reference
>>       - Literal without Name Reference, Never Indexed
>>
>>
>>
>> The PR takes advantage of the disjoint space to rework the opcodes in
>> hopes of saving bytes on common instructions.  It does this in two ways:
>>
>>    - Uses a bit in the opcode to differentiate static versus dynamic
>>    tables, instead of concatenating them
>>
>>
>>    - Gives us the future option to make the static table longer without
>>       hurting the dynamic table performance
>>
>>
>>    - Uses a bit in the opcode to identify string literals, instead of
>>    using a sentinel value of zero
>>
>>
>>    - Requires modifying the HPACK string definition to not require byte
>>       alignment at the beginning
>>       - Gives us the future option to make the tables zero-based instead
>>       of one-based
>>       - Makes indexing easier to explain in a future PR
>>
>>
>>
>> Here’s the impact:
>>
>>
>>
>> *Operation*
>>
>> *HPACK*
>>
>> *QCRAM*
>>
>> *PR#1141*
>>
>> *Impact*
>>
>> *Table updates*
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *Insert with Name Reference from Static Table*
>>
>> “01” + index < 62
>>
>> “01” + index < 62
>>
>> “11” + index
>>
>> Same
>>
>> *Insert with Name Reference from Dynamic Table*
>>
>> “01” + index >= 62
>>
>> (one byte for first 2 entries, then 2 bytes)
>>
>> “01” + index >= 62
>>
>> (one byte for first 2 entries, then 2 bytes)
>>
>> “10” + index
>>
>> (one byte for first 64 entries)
>>
>> One byte saved for 62 of 64 most recent entries
>>
>> *Insert without Name Reference*
>>
>> “01000000”
>>
>> “01000000”
>>
>> “01”
>>
>>
>>
>> One byte saved for header names < 32 octets
>>
>> *Duplicate*
>>
>> Not supported
>>
>> “001” + index >= 62
>>
>> “000” + index
>>
>> Moot; you’re unlikely to duplicate recent entries
>>
>> *Change table size*
>>
>> “001”
>>
>> Not supported
>>
>> “001”
>>
>> Same
>>
>> *Header blocks*
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *Indexed entry from Static*
>>
>> “1” + index < 62
>>
>> “1” + index < 62
>>
>> “11” + index
>>
>> Same
>>
>> *Indexed entry from Dynamic*
>>
>> “1” + index >= 62
>>
>> “1” + index >= 62
>>
>> “10” + index
>>
>> One byte longer for indices 62-63
>>
>> *Literal with Name Reference from Static Table*
>>
>> “0000” + index < 62
>>
>> (two bytes for index > 15)
>>
>> “0000” + index < 62
>>
>> (two bytes for index > 15)
>>
>> “0001” + index
>>
>> Same
>>
>> *Literal with Name Reference from Static Table, Never Indexed*
>>
>> “0001” + index < 62
>>
>> (two bytes for index > 15)
>>
>> “0001” + index < 62
>>
>> (two bytes for index > 15)
>>
>> “0011” + index
>>
>> Same
>>
>> *Literal with Name Reference from Dynamic Table*
>>
>> “0000” + index >= 62
>>
>> (always 2+ bytes)
>>
>> “0000” + index >= 62
>>
>> (always 2+ bytes)
>>
>> “0000” + index
>>
>> One byte saved for first 15 entries
>>
>> *Literal with Name Reference from Dynamic Table, Never Indexed*
>>
>> “0001” + index > 62
>>
>> (always 2+ bytes)
>>
>> “0001” + index > 62
>>
>> (always 2+ bytes)
>>
>> “0010” + index
>>
>> One byte saved for first 15 entries
>>
>> *Literal without Name Reference*
>>
>> “00000000”
>>
>> “00000000”
>>
>> “010”
>>
>> One byte saved for header names < 16 octets
>>
>> *Literal without Name Reference, Never Indexed*
>>
>> “00010000”
>>
>> “00010000”
>>
>> “011”
>>
>> One byte saved for header names < 16 octets
>>
>>
>>
>> There’s been some churn on the PR today as Martin and I worked some of
>> this out, so please give it a read (or another read) now and provide
>> feedback.
>>
>>
>>
>> Thanks!
>>
>>
>>
>