Re: Packet number encryption

Mikkel Fahnøe Jørgensen <> Wed, 21 February 2018 13:53 UTC

From: =?UTF-8?Q?Mikkel_Fahn=C3=B8e_J=C3=B8rgensen?= <>
Date: Wed, 21 Feb 2018 08:53:22 -0500
Subject: Re: Packet number encryption
To: Victor Vasiliev <>, Kazuho Oku <>
Cc: Praveen Balasubramanian <>, "" <>, Marten Seemann <>, huitema <>
List-Id: Main mailing list of the IETF QUIC working group <>

I suppose that CCM is not a major concern since the nonce would be computed
from the encrypted packet number and thus not reveal anything surprising.
It does, however, add a fair bit of overhead:

16 octets of CCM header including nonce and length data. The authenticated
data (the packet header, which includes the now-redundant encrypted packet
number) is 0-padded and so consumes a 16-octet block. At the end there is a
0-15 octet padding, which might be 7.5 octets on average, or 0 if the packet
fills a block perfectly, and of course the 16-octet auth tag.

This means that CCM mode will use at least 3x16 octets in addition to the
encrypted payload and UDP headers.
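To make the framing arithmetic above concrete, here is a rough sketch in
Python; the 16-octet figures are the ones stated above, and the helper name
is only illustrative:

```python
# Rough per-packet CCM framing overhead, as described above.
# All figures are illustrative, not measurements.

BLOCK = 16  # AES block size in octets

def ccm_overhead(payload_len: int) -> int:
    """Octets of CCM framing added on top of the encrypted payload."""
    header = BLOCK                  # CCM header incl. nonce and length data
    auth_block = BLOCK              # 0-padded authenticated data (packet header)
    pad = (-payload_len) % BLOCK    # 0-15 octets of trailing padding
    tag = BLOCK                     # authentication tag
    return header + auth_block + pad + tag

# A perfectly block-aligned payload still pays 3 blocks = 48 octets:
assert ccm_overhead(1024) == 48
# A worst-case payload pays 48 + 15 = 63 octets:
assert ccm_overhead(1025) == 63
```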

It would probably be easy to remove 32 octets of that overhead, but it
would require a special adaptation of CCM for QUIC that might not be
worthwhile while we wait for devices that include GCM hardware support.

Kind Regards,
Mikkel Fahnøe Jørgensen

On 21 February 2018 at 13.42.12, Mikkel Fahnøe Jørgensen ( wrote:

Maybe I’m missing something (such as exactly how TLS/QUIC maps encryption
headers to packets), but it appears that the packet number encryption /
linkage discussion is focused entirely on AES-GCM.

The CCM mode which is part of TLS 1.3 has a header of 16 octets that
includes a nonce. This nonce would appear to be the packet number.

In summary: CCM has the unencrypted form H, LA, A, PA, LM, M, PM, T, where
H is a 16-octet header including the nonce, LA is the length of the
authenticated data A, LM is the length of the plaintext data M, PA and PM
are 0-padding up to the block length, and T is a tag of up to 16 octets.
Some fields are optional depending on the first octet (flags) in H. T is
computed as a chain of AES encryptions. Encryption is applied to LM, M, PM,
T in CTR mode. (I suppose there is also padding after A.)
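As a sketch, the wire size of that layout could be modelled as below; the
widths of LA and LM and the grouping of the padded regions are assumptions
where the description above leaves them open, not values from the CCM spec:

```python
# Hypothetical model of the CCM layout H, LA, A, PA, LM, M, PM, T described
# above. LA/LM widths are illustrative assumptions.

BLOCK = 16

def pad_len(n: int) -> int:
    """0-padding (PA or PM) needed to reach a block boundary."""
    return (-n) % BLOCK

def ccm_wire_len(a_len: int, m_len: int, la: int = 2, lm: int = 2,
                 tag: int = 16) -> int:
    """Total octets of H + LA + A + PA + LM + M + PM + T."""
    h = BLOCK  # header including flags and nonce
    return (h + la + a_len + pad_len(la + a_len)
              + lm + m_len + pad_len(lm + m_len) + tag)

# e.g. 13 octets of authenticated data and a 100-octet plaintext:
assert ccm_wire_len(13, 100) == 160
```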

CCM is relevant to devices that have AES acceleration but no CLMUL
instruction for the GHASH needed by efficient AES-GCM implementations, e.g.
ESP32 microcontrollers.

Of course, the CCM header could be partly or fully removed if the
authenticated-data and encrypted-data lengths can be implied otherwise, or
part of it could be encrypted.

On 12 February 2018 at 12.49.38, Mikkel Fahnøe Jørgensen ( wrote:

I can add some more numbers here:

> The numbers deduced from the Solarflare report are
> interesting, however I am not sure if that reflects the ordinary use
> of a protocol; my understanding is that HFT is exceptional.

HFT is the extreme case which is why the benchmark is interesting, not
necessarily the use case itself.

The same concern applies to how fast a distributed database can achieve
consensus, which is something I do care about.

On 12 February 2018 at 12.44.01, Kazuho Oku ( wrote:

Victor, Mikkel, thank you for the estimations.

To me, the estimated overhead of 0.1% seems like a natural number that we
would see in a production environment, where the average packet size is not
small.

I also agree that it would be worthwhile to look at the case where small
packets are exchanged. The numbers deduced from the Solarflare report are
interesting, however I am not sure if that reflects the ordinary use
of a protocol; my understanding is that HFT is exceptional.

One benchmark that might give us a more meaningful number is DNS. There is
a published benchmark of several authoritative DNS servers; its TLD
benchmark and Hosting (10k) benchmark give us a rough estimate of how much
PPS we can achieve when exchanging small packets.

The numbers we see in the benchmark are roughly 2M RPS (4M pps) on a
16-core (8x2) Intel Xeon processor running with HT enabled, when the
fastest DNS server is being used.

Running `openssl speed -evp` on a similar CPU gives me the following result:

$ /usr/local/openssl-1.1.0/bin/openssl speed -evp aes128
Doing aes-128-cbc for 3s on 16 size blocks: 133387877 aes-128-cbc's in 3.00s

This shows that roughly 40M AES block operations can be run on a single
core per second, which in turn means that a server with 16 cores can
perform 640M AES block operations per second (or 1,280M if HT is enabled
and AES is not the bottleneck).

To summarize, if DNS had packet encryption, the overhead would be
somewhere from 0.3% to 0.6%.
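The 0.3% to 0.6% range above follows from the rounded figures in this
message (~40M AES block ops per core per second, 16 cores, ~4M pps); a
quick back-of-envelope check:

```python
# Back-of-envelope version of the estimate above, using the rounded
# figures from the text; all inputs are that message's assumptions.

blocks_per_core = 40_000_000      # rounded from the openssl speed result
capacity = blocks_per_core * 16   # 640M AES block ops/s without HT
capacity_ht = capacity * 2        # 1,280M ops/s with HT (optimistic)

pps = 4_000_000                   # ~4M packets/s from the DNS benchmark
# One extra AES block operation per packet for packet number encryption:
low = pps / capacity_ht           # ~0.3%
high = pps / capacity             # ~0.6%
print(f"estimated AES overhead: {low:.2%} to {high:.2%}")
```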

Considering that, I would anticipate that the overhead of packet
number encryption will be negligible even for short-packet workloads.

2018-02-12 16:19 GMT+09:00 Victor Vasiliev <>:
> I have no idea how you get that number (by which I mean, I have a lot of
> guesses, but none of them are solid or useful here). I looked at the
> profiles of various workloads we're running in production, and I estimate
> that none of them would be impacted by PN encryption by worse than 0.1%.
> On Sat, Feb 10, 2018 at 12:12 PM, Praveen Balasubramanian
> <> wrote:
>> Makes sense. The 1%+ overhead I had quoted was in comparison to full
>> protocol processing all the way from app to NIC. This is with an
>> early QUIC implementation and a not fully optimized UDP/IP stack (we have
>> focused much more on optimizing TCP in the past). After software
>> optimizations, the crypto share of the cost will only keep going up, for
>> which we have no technique other than to offload when such support is
>> available in hardware, and offload is way more challenging when the
>> workload is hosted in VMs and containers.
>> From: Mikkel Fahnøe Jørgensen []
>> Sent: Saturday, February 10, 2018 3:36 AM
>> To: Victor Vasiliev <>
>> Cc: Praveen Balasubramanian <>om>;; Marten
>> Seemann <>om>; huitema <>
>> Subject: Re: Packet number encryption
>> To put numbers into perspective using Intel 2015 data:
>> A 64 byte message in AES-GCM AEAD in HW would use 1.03 cycles per byte or
>> 66 cycles total, or 22ns on a 3GHz core.
>> For packet numbers we use the CBC encrypt numbers because here AES cannot
>> exploit block parallelism.
>> Here we see 4.44 cycles/byte in HW or 71 cycles per block. With a 3GHz
>> setup that would amount to about 24ns overhead for packet encryption.
>> Clearly it makes no sense that AES-GCM is faster than a single AES block
>> encryption, but these are only approximate numbers and CBC mode might have
>> a little overhead, so we clamp packet numbers to 22ns.
>> Taking the 98ns overhead from the Solarflare report, we get a total
>> (simplified) processing time of 98ns non-crypto, 22ns for packet number
>> encryption, and 22ns for AEAD, totalling 142ns. So the packet number
>> encryption overhead would be 22/(98+22)*100% = 18%. The numbers ignore
>> other QUIC processing, but that can be done on other cores or outside the
>> latency-critical path.
>> This does not take into account that the AEAD operation may operate less
>> than optimally because the packet number must be extracted first. On the
>> other hand, it is also not a disastrous overhead if no good alternative
>> can be found.
>> Earlier AES-NI numbers I've seen from Intel docs suggest around 100 cycles
>> in HW for a single AES-128 block, which would be 33ns per packet number in
>> the above example.
>> On 10 February 2018 at 06.18.25, Mikkel Fahnøe Jørgensen
>> ( wrote:
>> 98ns for 68 byte messages

Kazuho Oku
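The cycle arithmetic quoted in the thread above can be reproduced with a
short sketch; every input figure (1.03 and 4.44 cycles/byte, 3GHz, 98ns)
is an assumption stated in those messages, not a measurement of ours:

```python
# Sketch of the per-packet latency arithmetic from the quoted messages.

GHZ = 3.0                          # assumed core clock
aead_ns = 64 * 1.03 / GHZ          # 64-byte AES-GCM AEAD: ~22 ns
pn_ns = 16 * 4.44 / GHZ            # one CBC-style AES block: ~24 ns
pn_ns = min(pn_ns, aead_ns)        # clamp to 22 ns, as the message does

base_ns = 98                       # non-crypto processing (Solarflare figure)
total_ns = base_ns + pn_ns + aead_ns            # ~142 ns
overhead = pn_ns / (base_ns + aead_ns)          # ~18%
print(f"total ~ {total_ns:.0f} ns, PN encryption overhead ~ {overhead:.0%}")
```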