Re: Broader discussion - limit dictionary encoding to one compression algorithm?

Jyrki Alakuijala <jyrki@google.com> Wed, 22 May 2024 08:00 UTC

References: <CAJV+MGzjUnZZ=XFn5veOvuhVWyZNP2b9U0fxpS3UmrDC_bc_wQ@mail.gmail.com>
In-Reply-To: <CAJV+MGzjUnZZ=XFn5veOvuhVWyZNP2b9U0fxpS3UmrDC_bc_wQ@mail.gmail.com>
From: Jyrki Alakuijala <jyrki@google.com>
Date: Wed, 22 May 2024 09:58:15 +0200
Message-ID: <CAPapA7TZ9QUqNxErR4b0v3tuwZ4+gA=FFP5BjFaO0R9UFRYw+w@mail.gmail.com>
To: Patrick Meenan <patmeenan@gmail.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Archived-At: <https://www.w3.org/mid/CAPapA7TZ9QUqNxErR4b0v3tuwZ4+gA=FFP5BjFaO0R9UFRYw+w@mail.gmail.com>

On Tue, May 21, 2024 at 5:05 PM Patrick Meenan <patmeenan@gmail.com> wrote:

> As things stand right now, if you have resources > 50MB and < 128MB you
> can't use brotli to delta-encode them (even in the web case we have already
> seen this with some large WASM apps).
>

One cool compression improvement would be to allow the range request to
return a dictionary range, so that decompression uses only the bytes within
that range of the dictionary. That way decompression memory use could be
much lower (for 100 MB of content, from 200 MB down to ~5 MB), and for the
wasm app patching case it would improve the compression density slightly
(~10 % or so), as the codings that point into the dictionary would have less
entropy, and they can be the most substantial entropy source in such
codings.
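
To make the idea concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the op format (`lit` / `copy`) and the names `apply_delta` and `offset_bits` are hypothetical and are not brotli's actual bitstream or API. The 100 MB and 5 MB sizes are the ones from the paragraph above, and the bit counts assume every offset is equally likely, which real entropy coding will not match exactly.

```python
import math

MB = 1024 * 1024


def offset_bits(addressable_bytes: int) -> float:
    """Bits needed to name one byte offset if all offsets were equally likely."""
    return math.log2(addressable_bytes)


def apply_delta(dictionary: bytes, dict_range: tuple[int, int], ops) -> bytes:
    """Decode a toy delta stream that may only copy from dictionary[start:end].

    Hypothetical op format (not brotli):
      ("lit", data)            append literal bytes
      ("copy", offset, length) copy from the dictionary range, with the
                               offset counted from the start of the range
    """
    start, end = dict_range
    window = dictionary[start:end]  # only this slice has to stay in memory
    out = bytearray()
    for op in ops:
        if op[0] == "lit":
            out += op[1]
        else:
            _, offset, length = op
            if offset < 0 or offset + length > len(window):
                raise ValueError("copy outside the dictionary range")
            out += window[offset:offset + length]
    return bytes(out)


# Upper bound on the cost of one dictionary offset, full dictionary vs. range.
full_bits = offset_bits(100 * MB)    # about 26.6 bits
ranged_bits = offset_bits(5 * MB)    # about 22.3 bits

# Tiny usage example: the range (4, 12) exposes only b"456789ab".
patched = apply_delta(
    b"0123456789abcdef",
    (4, 12),
    [("copy", 2, 3), ("lit", b"-"), ("copy", 0, 2)],
)
```

Under the uniform assumption each dictionary reference gets about 4.3 bits cheaper when the addressable range shrinks from 100 MB to 5 MB. How much of that shows up in the final size depends on how large a share of the stream those references are.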