Re: Broader discussion - limit dictionary encoding to one compression algorithm?
Jyrki Alakuijala <jyrki@google.com> Wed, 22 May 2024 10:42 UTC
From: Jyrki Alakuijala <jyrki@google.com>
Date: Tue, 21 May 2024 19:00:06 +0200
Message-ID: <CAPapA7SXmcN6HYhyNumzjJngoS-wSQOorjPD0hcRFQJc8xVHOw@mail.gmail.com>
To: Patrick Meenan <patmeenan@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Archived-At: <https://www.w3.org/mid/CAPapA7SXmcN6HYhyNumzjJngoS-wSQOorjPD0hcRFQJc8xVHOw@mail.gmail.com>

On Tue, May 21, 2024 at 5:05 PM Patrick Meenan <patmeenan@gmail.com> wrote:

> - Brotli is limited to 50MB dictionaries, Zstandard can go up to 128MB.

These are artificial limitations. We can change Brotli to support
gigabyte-sized dictionaries if we'd like. Allowing larger dictionaries
naturally increases memory use.

> - Brotli uses 16MB of ram for the window while compressing/decompressing
> independent of the dictionary size, Zstandard requires a window (RAM) as
> large as the resource being compressed (for the delta case).

The same applies here. Brotli has a large-window mode with no artificial
16 MB limitation. I only added that limitation originally because Chrome
gave it as a launch criterion.

> - Brotli at max compression is ~10-20% smaller than Zstandard at max
> compression with dictionary (current implementations).

This is partially because of context modelling. Brotli's dictionary
mechanism has a human-readable-dictionary mode that is not yet used in
dictionary generation (outside of the internal dictionary). When we use
that, it will make dictionaries about 25% more efficient than a usual
dictionary for human-readable languages (such as Armenian, Vietnamese,
etc., which are not covered by Brotli's static dictionary).

> - Zstandard benefits from dictionary use across all compression levels,
> Brotli only benefits from dictionaries at level 5 and above (current
> implementations).

These are encoder-only decisions and can be changed.

> As things stand right now, if you have resources > 50MB and < 128MB you
> can't use brotli to delta-encode them (even in the web case we have already
> seen this with some large WASM apps).

We can easily change this.

> If you have static resources < 50MB and can do the compression at build
> time you would benefit from an additional 10-20% savings by using brotli
> (current cli anyway).

Brotli has context modeling, which helps quite a bit in compression.
Also, Brotli has a more complex 'entropy code dance' where it can very
cheaply switch between entropy codes.

> If you are compressing dynamic responses and need to limit CPU, you may
> benefit from using Zstandard at low compression levels (the amount of
> brotli level-1 that is on the web may indicate this is a common constraint).

I know of no technical obstacles that would prevent Brotli compression
from being optimized similarly. The representations are very similar.
Luca Versari is working on a Rust-based Brotli:5 encoder that is
supposedly about 2x faster than the current C++ version.

> If you have existing infrastructure plumbed (security approved, etc) to
> support one or the other, your preference might be to use the dictionary
> version of the same algorithm rather than pull in a new library.
>
> Thanks,
>
> -Pat
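
For readers unfamiliar with the "delta case" discussed above, here is a minimal sketch of the idea: a previous version of a resource serves as the dictionary when compressing the new version. It is an illustration only and makes several assumptions. It uses the DEFLATE preset-dictionary support in Python's standard zlib module as a stand-in, not Brotli or Zstandard; the helper names and the synthetic data are hypothetical. zlib's window is 32 KB, so only the last 32 KB of a dictionary can be referenced, which is a small-scale analogue of the window limits debated in this thread.

```python
import random
import zlib


def compress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Compress data using a preset dictionary (hypothetical helper, DEFLATE only)."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Reverse compress_with_dictionary; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# Synthetic "old" resource: pseudo-random bytes, so it barely compresses alone.
# It is kept under 32 KB so the whole dictionary fits in the DEFLATE window.
rng = random.Random(42)
old = bytes(rng.getrandbits(8) for _ in range(20000))

# Synthetic "new" resource: the old one with a small patch in the middle.
new = old[:10000] + b"patched" + old[10000:]

plain = zlib.compress(new, 9)               # no dictionary
delta = compress_with_dictionary(new, old)  # old version as dictionary

print("without dictionary:", len(plain), "bytes")
print("with dictionary:   ", len(delta), "bytes")
```

Both sides must hold the identical dictionary, which is why the dictionary size and window limits of the chosen format determine which resources can be delta-encoded at all.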