[TLS] Re: Bytes server -> client

"D. J. Bernstein" <djb@cr.yp.to> Sat, 09 November 2024 15:05 UTC

Date: Sat, 09 Nov 2024 15:05:16 -0000
Message-ID: <20241109150516.177070.qmail@cr.yp.to>
Mail-Followup-To: tls@ietf.org, pqc@ietf.org
From: "D. J. Bernstein" <djb@cr.yp.to>
To: tls@ietf.org, pqc@ietf.org
In-Reply-To: <CAMjbhoUdt53ypQMFgNDh6YM9kDpP8pEB7Ost1nBPFF=kwi-gsA@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tls/dLqNWZnnlqJq7pxxmGgJNoFxUD0>

> This vast difference between median and average indicates that a small
> fraction of data-heavy connections skew the average.

Hmmm. Why not describe this as "a large number of short sessions skew
the median, making the median fail to reflect total data usage"?

The total cost of all sessions is the _average_ cost per session times
the number of sessions. The total cryptographic cost of all sessions is
the average cryptographic cost per session times the number of sessions.

Example:

   * A news site sends you a 20MB video in 1 big session, plus 99 tiny
     sessions each with 0.01MB. The total data it's sending is about
     21MB (20.99MB, to be exact).

   * If you add 0.01MB to each session for crypto, then you're adding
     1MB across the 100 sessions. That's not much compared to 21MB.

   * In terms of averages, the average data per session is about
     0.21MB, and you're adding 0.01MB on top of that. Same 1/21 ratio
     (although if you don't know the total number of sessions then you
     can't compare this to other expenditures).

   * The _median_ data per session is just 0.01MB. This is wildly
     misleading: it completely misses the big video, while incorrectly
     making the crypto sound as if it's doubling costs.
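
The arithmetic in the example above can be checked with a short Python sketch. This is only an illustration: the session sizes and the 0.01MB per-session overhead are the figures from the example, while the names `sessions`, `CRYPTO_OVERHEAD_MB`, and `overhead_ratios` are hypothetical and not taken from any real measurement tool.

```python
from statistics import mean, median

# Session sizes in MB, as in the example: one 20MB video session
# plus 99 tiny sessions of 0.01MB each.
sessions = [20.0] + [0.01] * 99

# Per-session crypto overhead in MB (the figure assumed in the example).
CRYPTO_OVERHEAD_MB = 0.01


def overhead_ratios(sizes, overhead):
    """Return the crypto overhead relative to total, mean, and median data."""
    return {
        # Total overhead divided by total data sent.
        "vs_total": overhead * len(sizes) / sum(sizes),
        # Per-session overhead divided by the average session size.
        "vs_mean": overhead / mean(sizes),
        # Per-session overhead divided by the median session size.
        "vs_median": overhead / median(sizes),
    }


ratios = overhead_ratios(sessions, CRYPTO_OVERHEAD_MB)
for name, value in ratios.items():
    print(f"{name}: {value:.4f}")
```

Under these assumptions the ratio against the total and the ratio against the mean agree (roughly 1/21), while the ratio against the median comes out as 1, i.e. the overhead looks like a doubling.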

To be clear, I do recommend looking at more of the distribution than
just the average. Seeing variations opens up possibilities such as (1)
being able to convince people to use stronger crypto for the longer
sessions, and (2) batching the short sessions to reduce their overhead
(not just for crypto), in cases where aggregate overhead is an issue.

---D. J. Bernstein