RE: QPACK - proposal to optimize compression

Mike Bishop <mbishop@evequefou.be> Tue, 08 May 2018 21:57 UTC

From: Mike Bishop <mbishop@evequefou.be>
To: Brian Swander <briansw=40microsoft.com@dmarc.ietf.org>, Alan Frindell <afrind@fb.com>, "quic@ietf.org" <quic@ietf.org>
Subject: RE: QPACK - proposal to optimize compression
Date: Tue, 08 May 2018 21:57:52 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/SRmsYgcKA3sutuVkuTuwbc_5eGY>

This is part of why I didn’t attempt to merge Alan’s post-base design into #1141, but left it as a separate PR.  That helps to identify exactly which instructions have been degraded by adding these two additional instructions.  The impact turned out to be:

  *   “Depends” was resurrected from the dead, one byte per header block
  *   Literal headers without name references now require one more byte for names which are 7-14 octets long

While the impact is certainly non-zero, it seems fairly contained.  I don’t have a problem with two-pass encoding; that was actually my original design, and I still think it makes a lot of sense.  But part of the consensus at the Seattle interim was that we want to allow the intelligence and choices to live in the encoder, and I see permitting the encoder to choose single-pass versus double-pass as an example of that.

Also, double-pass has its own potential pitfalls – noting that you’ll refer to an entry which is in the dynamic table during your first pass, but then discovering that entry has been pushed out due to insertions when you try to actually reference it on the second pass, for example.  There are mitigations for these, of course, but I don’t see either as dramatically simpler than the other.
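The eviction pitfall described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDynamicTable` and `stale_references` are hypothetical names, the table is limited by entry count rather than by size in bytes, and absolute indices simply count insertions starting at 1. None of this is taken from the draft or from any implementation.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Header = Tuple[str, str]


class ToyDynamicTable:
    """FIFO table limited by entry count (an assumption; a real table is limited by size)."""

    def __init__(self, max_entries: int) -> None:
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.entries: Deque[Tuple[int, Header]] = deque()  # (absolute index, header)
        self.insert_count = 0

    def insert(self, header: Header) -> int:
        """Insert a header, evicting the oldest entry if the table is full."""
        if len(self.entries) == self.max_entries:
            self.entries.popleft()
        self.insert_count += 1
        self.entries.append((self.insert_count, header))
        return self.insert_count

    def find(self, header: Header) -> Optional[int]:
        """Return the absolute index of a header still in the table, else None."""
        for index, candidate in self.entries:
            if candidate == header:
                return index
        return None

    def contains_index(self, absolute: int) -> bool:
        return any(index == absolute for index, _ in self.entries)


def stale_references(table: ToyDynamicTable, planned: List[int]) -> List[int]:
    """Return the planned absolute indices that no longer exist in the table."""
    return [index for index in planned if not table.contains_index(index)]


# State before the header block is encoded: the table is already full.
table = ToyDynamicTable(max_entries=2)
table.insert(("cookie", "a"))
table.insert(("x-old", "1"))

# Pass 1: plan a reference to an existing entry, then perform a new insert.
planned = [table.find(("cookie", "a"))]
table.insert(("x-new", "1"))  # evicts ("cookie", "a")

# Pass 2 would now emit a reference to an entry that is gone.
print(stale_references(table, planned))
```

In this toy run the planned reference to the first entry is reported as stale, so a two-pass encoder needs some check of this kind (or an equivalent safeguard) between its passes.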

From: QUIC <quic-bounces@ietf.org> On Behalf Of Brian Swander
Sent: Tuesday, May 8, 2018 1:50 PM
To: Alan Frindell <afrind@fb.com>; quic@ietf.org
Subject: RE: QPACK - proposal to optimize compression

Of course single pass isn’t required.  But the PostBase is there only to support single pass.  So there are definitely (probably fringe) cases where the encoded size will be larger because of PostBase.

So IMO, it comes down to the tradeoff of the possible speed improvement of single pass vs. compression size.  And I just don’t see the speed of the single pass being very compelling either.  For instance, you can cache the indices, etc. from pass 1, and just quickly burn thru the array again, encoding the indices determined in pass 1.
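A minimal sketch of the "cache the indices from pass 1" idea follows. It is a toy under loud assumptions: `two_pass_encode` is a hypothetical function, the table is a plain dict from header to a 1-based absolute index, eviction, the static table and literal encodings are left out, and a reference is expressed as `base_index - absolute_index`, which is this sketch's convention rather than the draft's wire format.

```python
from typing import Dict, List, Tuple

Header = Tuple[str, str]


def two_pass_encode(
    headers: List[Header], table: Dict[Header, int]
) -> Tuple[List[Header], int, List[int]]:
    """Toy two-pass encoder; `table` is updated in place."""
    inserts: List[Header] = []  # instructions destined for the control stream
    plan: List[int] = []        # cached absolute index for each header

    # Pass 1: decide what to insert and remember every index.
    for header in headers:
        if header not in table:
            table[header] = len(table) + 1
            inserts.append(header)
        plan.append(table[header])

    # After pass 1 every referenced entry is at or below the base index,
    # so no post-base form is needed in this model.
    base_index = len(table)

    # Pass 2: replay the cached plan as offsets back from the base.
    references = [base_index - absolute for absolute in plan]
    return inserts, base_index, references


table = {(":method", "GET"): 1}
block = [(":method", "GET"), ("x-a", "1"), ("x-b", "2"), ("x-a", "1")]
print(two_pass_encode(block, table))
```

The second pass does no table lookups at all; it only walks the cached plan, which is why its cost is expected to be small relative to the first pass.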

Personally, given that the speed improvement of single pass is probably marginal at best, I’d err on the side of less complexity – and slightly better compression.

But I’m not passionate about this issue.   So I’m fine with the current draft, if everyone else likes it.   I just prefer simplicity when it seems like complexity isn’t buying us much.  And in this case, simplicity also gives slightly better compression.

bs

From: Alan Frindell <afrind@fb.com>
Sent: Tuesday, May 8, 2018 1:13 PM
To: Brian Swander <briansw@microsoft.com>; quic@ietf.org
Subject: Re: QPACK - proposal to optimize compression

Single-pass encoding is not required; an encoder has the option to encode using one or two passes.  The only wire penalty for a two-pass encoder is one extra byte per header block in the prefix, as largest reference and base index can be the same in two-pass encoding.  The original specification (which used Depends) had a one-bit flag in the HTTP frame that allowed the encoder to save this byte as well, but the consensus seemed to be that it was better to have a fixed format for the prefix.  Without post-base, there could also be one more bit available for the length of a literal header name without adding an extra byte (header names with 7 <= length < 15).  We can of course simulate it, but I expect the difference to be in the noise.
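The name-length effect can be checked with the HPACK prefix-integer encoding (RFC 7541, Section 5.1). The sketch below assumes, based only on the 7 <= length < 15 range given above, that the literal name length uses a 3-bit prefix with post-base and would use a 4-bit prefix without it; `encode_prefix_int` is a hypothetical helper and the flag bits sharing the first byte are left as zero.

```python
def encode_prefix_int(value: int, prefix_bits: int) -> bytes:
    """Encode a non-negative integer with an N-bit prefix (RFC 7541, Section 5.1)."""
    if not 1 <= prefix_bits <= 8:
        raise ValueError("prefix_bits must be between 1 and 8")
    if value < 0:
        raise ValueError("value must be non-negative")

    limit = (1 << prefix_bits) - 1
    if value < limit:
        # The value fits entirely in the prefix bits of the first byte.
        return bytes([value])

    # Otherwise fill the prefix with ones and continue in 7-bit groups.
    out = [limit]
    value -= limit
    while value >= 128:
        out.append((value % 128) + 128)
        value //= 128
    out.append(value)
    return bytes(out)


# Compare the bytes needed for a name length under each assumed prefix width.
for length in (6, 7, 14, 15):
    three_bit = len(encode_prefix_int(length, 3))
    four_bit = len(encode_prefix_int(length, 4))
    print(length, three_bit, four_bit)
```

Under these assumptions, lengths 7 through 14 take two bytes with the 3-bit prefix and one byte with the 4-bit prefix, while lengths below 7 or at 15 and above cost the same either way.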

Decoders do need to support both kinds of instructions, but I don’t find it to be terribly complex.  The proxygen implementation of the latest draft including one-pass encoding/decoding should be synced to github this week.

I’m not sure what you mean when you say “Many of the headers are going to be encoded in both streams.”  If you mean it will encode an insert on the control stream and an index reference on the request stream, that is true, and the post-base encoding of the references occupies only one byte for the first 15 such headers per block.

-Alan


From: QUIC <quic-bounces@ietf.org> on behalf of Brian Swander <briansw=40microsoft.com@dmarc.ietf.org>
Date: Monday, May 7, 2018 at 2:35 PM
To: "quic@ietf.org" <quic@ietf.org>
Subject: QPACK - proposal to optimize compression

Please correct me if I’m wrong here.

It looks like the latest QPACK draft has been optimized to allow encoding of both control and data streams in a single pass.
Because of this, we now must support post-base encoding, and the multiple encoding formats that use Post-Base.
I.e. Literal Header Field With Name Reference and Literal Header Field With Post-Base Name Reference, and similarly for Indexed Header Field.

To me, it seems like the simpler, more efficient option of removing post-base entirely could be better.   At the cost of requiring 2-pass encode, we can save a few bits for most headers since we would never need the post base header prefix.

So it’s a question of looping thru the headers twice, vs. being less efficient on the wire.

I haven’t profiled it, but I’d guess that a double pass thru the headers isn’t going to be that much slower than a single pass.   Many of the headers are going to be encoded in both streams.   So the more efficient wire format may be the better choice.

bs