Re: Hardware acceleration and packet number encryption

Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> Sun, 25 March 2018 16:52 UTC

From: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
Date: Sun, 25 Mar 2018 12:52:00 -0400
Message-ID: <CAN1APdcLhhR5Y0L28Q-DcO6X0Kpcoqd0H_NMzHopd+1k62b2Yg@mail.gmail.com>
Subject: Re: Hardware acceleration and packet number encryption
To: Christian Huitema <huitema@huitema.net>
Cc: Kazuho Oku <kazuhooku@gmail.com>, Eric Rescorla <ekr@rtfm.com>, IETF QUIC WG <quic@ietf.org>, Subodh Iyengar <subodh@fb.com>, "Deval, Manasi" <manasi.deval@intel.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/DONGaeW0BgaNoPvlLWVyit_Hnpc>

SPARX will keep the buffer clean and it is also reasonably fast at about
500 cycles per byte, but it is going to add significantly more latency
than a full AES-NI block encryption.

Consider the following PN encryption scheme, which is not very different
from the current proposal but which avoids touching any AEAD data:

Derive a key K_pn and encrypt the last 16 octets of the AEAD tag: X_pn =
E(K_pn, tail(tag, 16)).
Then C_pn = (PN XOR X_pn) truncated to len(PN).
Store C_pn as the encrypted packet number.

I don’t think there is any repeat nonce / XOR weakness in this, but even if
there is, the protected secret is not that critical.

The encrypted packet number is now only consuming between 1 and 4 octets
and can be decrypted in the exact same manner.
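
A minimal sketch of the masking step described above, in Python. Several
things here are assumptions and not part of the proposal: HMAC-SHA256
truncated to 16 octets stands in for the single AES block encryption E (only
to keep the sketch free of third-party libraries), the mask is truncated by
taking its leading octets, and the names block_prf and mask_pn are
hypothetical.

```python
import hashlib
import hmac


def block_prf(key: bytes, block: bytes) -> bytes:
    """Stand-in for E(K_pn, block): keyed PRF returning 16 octets.

    The mail uses one AES block encryption here. HMAC-SHA256 is an
    assumption made only so the sketch runs with the standard library.
    """
    assert len(block) == 16
    return hmac.new(key, block, hashlib.sha256).digest()[:16]


def mask_pn(k_pn: bytes, tag: bytes, pn_bytes: bytes) -> bytes:
    """XOR the packet number octets with the leading octets of X_pn.

    X_pn = E(K_pn, tail(tag, 16)). Because the operation is an XOR with a
    mask derived only from the tag, the same call encrypts and decrypts.
    """
    x_pn = block_prf(k_pn, tag[-16:])  # tail(tag, 16)
    # zip() stops at the shorter input, which truncates the mask to len(PN).
    return bytes(p ^ x for p, x in zip(pn_bytes, x_pn))


# Hypothetical example values.
k_pn = bytes(range(16))
tag = bytes.fromhex("00112233445566778899aabbccddeeff")
pn = (0x1234).to_bytes(2, "big")

c_pn = mask_pn(k_pn, tag, pn)          # sender side
assert len(c_pn) == len(pn)            # 1 to 4 octets on the wire
assert mask_pn(k_pn, tag, c_pn) == pn  # receiver side, same operation
```

The receiver only needs the tag, which sits at the end of the packet, to
recover the packet number, so no other part of the buffer is touched.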

This will not work out of the box because the AEAD input includes the packet
number, so we would need to modify the buffer. However, including it isn’t
required, so we can move the packet number out of the AEAD. But this is not
so easy, because the rest of the header needs to be included and we don’t
want the packet number to be first in the packet. Placing it last could
actually be an option, since it is useless until we see the AEAD tag anyway.
In particular, you would have all the required data in a single cache line,
which matters. (If you want to key duplicates, you can use the first
encrypted bytes in the packet rather than the packet number.)

This approach still has the downside of consuming a full AES block
encryption that cannot be parallelised, so we are at about 24 ns on current
Intel cores. However, any other encryption scheme that does not depend on
guessing that the packet number is in a given range is likely to be slower.

I still think this is not worthwhile compared to segmented packet numbers,
but if it has to be, it might be workable.


Kind Regards,
Mikkel Fahnøe Jørgensen


On 25 March 2018 at 18.10.04, Christian Huitema (huitema@huitema.net) wrote:

If we are exploring research ideas, one possibility would be to use 64 bit
sequence numbers, and encrypt them using a modern 64 bit cipher like SPARX (
https://www.cryptolux.org/index.php/SPARX). We can exclude the PN bits from
the authenticated data, since the actual sequence number is part of the
AEAD nonce.  With that, 64 bit encryption of the PN and AEAD encryption of
the payload can proceed in parallel. Decryption requires first decrypting
the PN to initialize the AEAD nonce, but that can be done without double
buffering.

Of course, the cost of that is header overhead, since the PN always
occupies 64 bits. So we are trading some overhead for hardware
acceleration. And we have to have some faith in the 64 bit encryption
algorithm. (SPARX was suggested to me by Jean-Philippe Aumasson, the author
of IPCrypt.)

-- Christian Huitema

On Mar 25, 2018, at 8:48 AM, Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
wrote:

The tag used as IV in ECB mode for PN encryption will use a full block
size, which is 16 octets. The proposal was to encrypt the tag and XOR the
result over the packet number and what follows.

If this is what is meant by “tag as IV”, it is problematic for what I assume
is meant by double buffering, i.e. the need to modify the packet buffer
during decryption. This is because the packet number and what follows must
be un-XORed before verification can take place.

You could keep the packet number out of AEAD, but you cannot afford to
waste the additional 16-4=12 octets or more that an AES block encryption
uses, so you are stuck with modifying the buffer post AEAD.

Finding alternative nonces won’t fix this problem. If you encrypted the
header completely separately from the body, you could do something, but
then you waste space on extra header tags.


My suggestion with GF(2^n) will not work because, even if it works in
principle (finding an ideal in GF(2^32) and multiplying a seed with the
packet number modulo the ideal), it is easy to brute force 2^32.
Alternatively you can do chained hashing similar to how GCM’s GHASH works,
but then it is not a unique mapping, and that is no better than CTR mode
encryption PRNG style, and likely slower. Why would you do this at all, if
it worked? Because it allows you to stick to encrypting only the packet
number, which can stay outside AEAD and thus avoid buffer modification. But
I don’t see how it can work.
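
To illustrate the brute-force concern, here is a toy sketch in Python. It is
an assumption-laden miniature and not the construction from the mail: it uses
GF(2^8) with the AES reduction polynomial instead of a 32-bit field, it
assumes the attacker knows or guesses one packet number, and the names
gf256_mul, encode_pn and brute_force_seed are hypothetical.

```python
from typing import List


def gf256_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    result = 0
    while b:
        if b & 1:
            result ^= a      # add (XOR) the current multiple of a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B       # reduce modulo the field polynomial
        b >>= 1
    return result


def encode_pn(seed: int, pn: int) -> int:
    """Hide a packet number by multiplying it with a secret seed."""
    return gf256_mul(seed, pn)


def brute_force_seed(known_pn: int, observed: int) -> List[int]:
    """Try every nonzero seed; feasible because the field is small."""
    return [s for s in range(1, 256) if gf256_mul(s, known_pn) == observed]


secret_seed = 0x57                      # hypothetical secret
observed = encode_pn(secret_seed, 0x13)
assert brute_force_seed(0x13, observed) == [secret_seed]
```

With a 32-bit field the same search takes at most 2^32 multiplications
instead of 255, which is still small for an attacker.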

Mikkel

On 25 March 2018 at 14.25.07, Eric Rescorla (ekr@rtfm.com) wrote:



On Sat, Mar 24, 2018 at 9:41 PM, Subodh Iyengar <subodh@fb.com> wrote:

> When we were first discussing pne, we proposed that the tag be used as the
> IV for the ctr operation. The pr samples encrypted data in the packet. Did
> we change that for a reason?
>

I believe that's my alternative #1 and PR#1079.


> Would that help alleviate the buffering of the stream data? Because tag is
> always the last thing in the packet.
>

I will let Manasi answer this.


-Ekr


>
> Subodh
>
>
> On Mar 25, 2018, at 2:56 AM, Eric Rescorla <ekr@rtfm.com> wrote:
>
>
>
> On Sun, Mar 25, 2018 at 2:09 AM, Deval, Manasi <manasi.deval@intel.com>
> wrote:
>
>> From talking to several of the folks last week, I understand that
>> unlinkability is the goal of this protocol and there may be some
>> flexibility in how that can be achieved.
>>
>>
>>
>> Christian’s e-mail has a detailed list of options.  Here is the list of
>> favored options as I understand them.
>>
>>
>>
>> 1.      Packet number encrypted as current suggestion - The current
>> proposal for PR 1079, uses a two stage serialized approach such that the
>> stream header(s) and payload(s) need to be encrypted and the outcome of
>> encryption forms the nonce of the packet number encryption.
>>
>>
>>
>> 2.      Packet number encrypted alternative 1 - One of the ideas
>> suggested was to encrypt the stream header(s) and payload(s) with the
>> packet number as nonce, but have an additional nonce in the clear to
>> encrypt the packet number. A scheme like this can allow for these two
>> encryption operations to occur in parallel. This still has the issue of
>> serialization in decrypt.
>>
>>
>>
>> 3.      Packet number encrypted alternative 2 – Another option is to
>> generate 2 IVs – one for PN and the other for stream header(s) and
>> payload(s). The nonce can be a random value in the clear. This allows us to
>> encrypt and decrypt the two fields in parallel. The packet number is
>> encrypted so it also solves the ossification problem. Another variation of
>> this is to generate a single IV but use one part of it to encrypt the PN.
>>
> Neither of these alternatives seems ideal. Once you are carrying an
> explicit per-packet nonce, you might as well concatenate the payload and
> the PN and encrypt them together. This will require the least amount of
> nonce material.
>
> -Ekr
>
> 4.      PN in the clear – this is a complex scheme and in the discussion
>> with Ian, Jana and Praveen, they seemed to think this may be ok. If folks
>> think this is implementable, then we may need to find an alternate solution
>> for ossification.
>>
>>
>>
>> Thanks,
>>
>> Manasi
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Eric Rescorla [mailto:ekr@rtfm.com]
>> *Sent:* Saturday, March 24, 2018 3:18 PM
>> *To:* Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
>> *Cc:* Kazuho Oku <kazuhooku@gmail.com>; Deval, Manasi <
>> manasi.deval@intel.com>; Christian Huitema <huitema@huitema.net>;
>> IETF QUIC WG <quic@ietf.org>
>> *Subject:* Re: Hardware acceleration and packet number encryption
>>
>>
>>
>>
>>
>>
>>
>> On Sat, Mar 24, 2018 at 9:35 PM, Mikkel Fahnøe Jørgensen <
>> mikkelfj@gmail.com> wrote:
>>
>> AERO: I did not read all of it, but it does indeed sound esoteric.
>>
>> It can do two things of interest: reduce space used by packet numbers,
>> and presumably fix the encryption issue.
>>
>>
>>
>> However, it has a W parameter which is the limit of reordering which is
>> default 64 and recommended at most 255 for security reasons. This is way
>> way too low (I would assume) if packet clusters take multiple transatlantic
>> paths.
>>
>>
>>
>> That's just a function of how the packet numbers are encoded. It's not
>> difficult to come up with a design that tolerates more reordering.
>>
>>
>>
>> -Ekr
>>
>>
>>
>>
>>
>> If we accepted such a limit, I could very trivially come up with an
>> efficient solution to PN encryption. Since we cover at most 64 packets, we
>> only need a 5 bit packet number and reject false positives on AEAD tag. To
>> simplify, make it 8 bits. The algorithm is to AES encrypt a counter similar
>> to a typical AES based PRNG. Then, for each packet take one byte from the
>> stream and use it as packet number. The receiver creates the same stream
>> and maps the received byte to an index it has. It might occasionally have
>> to try multiple packet numbers since the mapping is not unique. Longer
>> packet numbers reduce this conflict ratio. To help with this detection some
>> short trial decryption might be included. The PN size can be extended as
>> needed.
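
A minimal Python sketch of the keystream idea in the quoted paragraph above.
The assumptions are loud ones: HMAC-SHA256 truncated to 16 octets stands in
for AES encryption of a counter (to avoid third-party libraries), the
reordering window is the 64 mentioned earlier in the thread, and the names
keystream_byte, wire_pn and candidates are hypothetical.

```python
import hashlib
import hmac
from typing import List


def keystream_byte(key: bytes, index: int) -> int:
    """Return octet `index` of a counter-based keystream.

    Block i is PRF(key, i), standing in for AES-encrypting counter i, so
    one block computation covers 16 consecutive packet numbers.
    """
    block_no, offset = divmod(index, 16)
    block = hmac.new(key, block_no.to_bytes(8, "big"),
                     hashlib.sha256).digest()[:16]
    return block[offset]


def wire_pn(key: bytes, pn: int) -> int:
    """Sender: the 8-bit value put on the wire for packet number pn."""
    return keystream_byte(key, pn)


def candidates(key: bytes, wire_byte: int, lowest: int,
               window: int = 64) -> List[int]:
    """Receiver: every packet number in the window that maps to wire_byte.

    The mapping is not unique, so more than one candidate may come back;
    the receiver then relies on the AEAD tag to reject the wrong ones.
    """
    return [pn for pn in range(lowest, lowest + window)
            if keystream_byte(key, pn) == wire_byte]


key = b"k" * 16                      # hypothetical key
pn = 1000
found = candidates(key, wire_pn(key, pn), lowest=990)
assert pn in found                   # the true packet number is a candidate
```

A real implementation would cache each 16-octet block instead of recomputing
it per octet, which is where the amortised cost in the text comes from.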
>>
>>
>>
>> The cost of doing this is much lower than direct encryption as
>> proposed in the PR because 1) a single encryption covers multiple packets, 2)
>> the encryption can be parallelised resulting in a 4-5 fold performance
>> increase. Combined this results in sub-nanosecond overhead for AES-NI.
>>
>>
>>
>> However, you have to deal with uncertainties which is why this isn’t a
>> very good idea unless you have some very good knowledge of the traffic
>> pattern. It also complicates HW offloading, but I don’t see why it couldn’t
>> be done efficiently.
>>
>>
>>
>>
>>
>> Mikkel
>>
>>
>>
>> On 24 March 2018 at 17.26.47, Eric Rescorla (ekr@rtfm.com) wrote:
>>
>> 3. A more exotic solution like AERO
>> (https://tools.ietf.org/html/draft-mcgrew-aero-00#ref-MF07).
>>
>>
>>
>>
>>
>
>