Re: New I-D: Security Considerations Regarding Compression Dictionaries

Watson Ladd <> Tue, 29 October 2019 23:56 UTC

From: Watson Ladd <>
Date: Tue, 29 Oct 2019 16:54:12 -0700
Message-ID: <>
To: "W. Felix Handte" <>
Cc: HTTP Working Group <>

On Tue, Oct 29, 2019 at 4:23 PM W. Felix Handte <> wrote:
> Hello all,
> At IETF 104, I presented a teaser of the exploratory work I've been
> doing into dictionary-based compression for HTTP [0]. At the time, I
> promised that I would follow up with an analysis of the security
> properties of dictionary-based compression.
> That time has come! I've just uploaded a draft [1] that attempts to
> address that need and provide a useful survey of the interactions
> between dictionaries, internet protocols, and security.
> I would eventually like for this document to find a home in the HTTP WG;
> your feedback and thoughts are greatly appreciated.

I'm not sure I appreciate the distinction you draw in the draft
between "dictionary-based" compression and other compression algorithms.
The BREACH attack didn't look at changes to the Huffman table, which
was dominated by good old ETAOIN SHRDLU. Instead it changed the length
of matches back into the data stream, and thus the length of the
observed output. DEFLATE has no separate dictionary to match
substrings against.

A perfect compression algorithm reveals the Kolmogorov complexity of
the input. This is enough (if you can compute Kolmogorov complexity)
to reveal the differences between "hunter2 h" and "hunter2 z", and
then "hunter2 hu" and "hunter2 ha", etc.
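
To make the length leak concrete, here is a rough sketch using plain
DEFLATE via Python's zlib, with no preset dictionary. The page layout,
the secret value, and the helper names are all made up for
illustration. A single-character difference can be hidden by rounding
to whole bytes, so the sketch compares a fully correct guess against a
wrong guess of the same length.

```python
import zlib

def compressed_len(body: bytes) -> int:
    # Raw DEFLATE stream (wbits=-15): no zlib header, no checksum,
    # and no preset dictionary (zdict is not used).
    c = zlib.compressobj(9, zlib.DEFLATED, -15)
    return len(c.compress(body) + c.flush())

# Hypothetical secret embedded in every response.
SECRET = b"password=hunter2-Xq7Lm9Zr4Kd"

def response(reflected: bytes) -> bytes:
    # Hypothetical page that echoes attacker-controlled input
    # into the same compressed stream as the secret.
    return b"<p>" + SECRET + b"</p><p>query: " + reflected + b"</p>"

# Same-length guesses: one matches the secret, one does not.
right = compressed_len(response(b"password=hunter2-Xq7Lm9Zr4Kd"))
wrong = compressed_len(response(b"password=Jb3Wc8Tn5Vg1Yp6Fs0h"))

# The correct guess collapses into one long back-reference,
# so the observed output is shorter.
print(right, wrong, right < wrong)
```

The only "dictionary" involved is the earlier part of the same data
stream.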

It's true that a static Huffman tree isn't vulnerable to this problem,
but that's because the Huffman tree compresses character by character
using source statistics that don't change as the message is processed.
A dynamic Huffman tree (or range encoder) that models only symbol
frequencies (not per context) would also leak the overall symbol
counts, while one with context-dependent probabilities would leak
quite a bit more. No dictionary here!
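
A toy comparison of the two cases, as a sketch only. The static code
lengths below (5 bits for the ETAOIN SHRDLU letters and space, 10 bits
for everything else) are invented for illustration, and the adaptive
coder is idealized as an order-0 model charging -log2(p) bits per
symbol rather than a real Huffman or range coder.

```python
import math

# Hypothetical static table: 5 bits for common bytes, 10 for the rest.
# (13/32 + 243/1024 < 1, so a prefix code with these lengths exists.)
COMMON = set(b"etaoinshrdlu ")

def static_cost(msg: bytes) -> int:
    # Each symbol costs a fixed number of bits, whatever came before it.
    return sum(5 if b in COMMON else 10 for b in msg)

def adaptive_cost(msg: bytes) -> float:
    # Idealized order-0 adaptive model: counts start at 1 for every
    # byte value and are updated after each symbol is coded.
    counts = [1] * 256
    total = 256
    bits = 0.0
    for b in msg:
        bits += -math.log2(counts[b] / total)
        counts[b] += 1
        total += 1
    return bits

def marginal(cost, secret: bytes, guess: bytes):
    # Extra output caused by appending the guess after the secret.
    return cost(secret + guess) - cost(secret)

guess = b"hunter2"
static_a = marginal(static_cost, b"hunter2", guess)
static_b = marginal(static_cost, b"zzzzzzz", guess)
adaptive_a = marginal(adaptive_cost, b"hunter2", guess)
adaptive_b = marginal(adaptive_cost, b"zzzzzzz", guess)

print(static_a, static_b)      # identical: the secret does not matter
print(adaptive_a, adaptive_b)  # cheaper when the secret shares symbols
```

With the static table the cost of the guess is the same whatever the
secret was. With the adaptive model the guess is cheaper when the
secret already contained the same symbols, and that difference is what
leaks.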

> I look forward to seeing you all in Singapore!
> Thanks,
> Felix
> [0]
> [1]