Re: Hardware acceleration and packet number encryption

Christian Huitema <huitema@huitema.net> Sun, 25 March 2018 16:09 UTC

From: Christian Huitema <huitema@huitema.net>
X-Mailer: iPhone Mail (15D100)
In-Reply-To: <CAN1APdcKxbd-WVKc1ksLPNG+OOLhC1T2AqSTOAOoCCiG0D_-xA@mail.gmail.com>
Date: Sun, 25 Mar 2018 09:09:26 -0700
Cc: Eric Rescorla <ekr@rtfm.com>, Subodh Iyengar <subodh@fb.com>, Kazuho Oku <kazuhooku@gmail.com>, IETF QUIC WG <quic@ietf.org>, "Deval, Manasi" <manasi.deval@intel.com>
Content-Transfer-Encoding: 7bit
Message-Id: <AA352A70-FF13-4EEC-AC61-447EB57FB16C@huitema.net>
References: <7fd34142-2e14-e383-1f65-bc3ca657576c@huitema.net> <F9FCC213-62B9-437C-ADF9-1277E6090317@gmail.com> <CABcZeBM3PfPkqVxPMcWM-Noyk=M2eCFWZw2Eq-XytbHM=0T9Uw@mail.gmail.com> <CAN1APdfjuvd1eBWCYedsbpi1mx9_+Xa6VvZ3aq_Bhhc+HN67ug@mail.gmail.com> <CABcZeBMtQBwsAF85i=xHmWN3PuGRkJEci+_PjS3LDXi7NgHyYg@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B5CCEFD@ORSMSX111.amr.corp.intel.com> <CABcZeBNfPsJtLErBn1=iGKuLjJMo=jEB5OLxDuU7FxjJv=+b=A@mail.gmail.com> <82369B21-CDED-4A6F-9B32-FF1D93816D80@fb.com> <CABcZeBNdxTuS-Nwi=KMofEezS0+BUgEoETh-+KM01XNKg4SzSQ@mail.gmail.com> <CAN1APdcKxbd-WVKc1ksLPNG+OOLhC1T2AqSTOAOoCCiG0D_-xA@mail.gmail.com>
To: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
Subject: Re: Hardware acceleration and packet number encryption
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/tAinV5FF_8b8wUUDKwTJpPWMuds>

If we are exploring research ideas, one possibility would be to use 64-bit sequence numbers and encrypt them using a modern 64-bit cipher like SPARX (https://www.cryptolux.org/index.php/SPARX). We can exclude the PN bits from the authenticated data, since the actual sequence number is part of the AEAD nonce. With that, 64-bit encryption of the PN and AEAD encryption of the payload can proceed in parallel. Decryption requires first decrypting the PN to initialize the AEAD nonce, but that can be done without double buffering.
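
A minimal sketch of that data flow, under loud assumptions: the 64-bit block cipher here is a toy Feistel permutation built from HMAC-SHA256, standing in for SPARX (it is not SPARX and should not be treated as secure); the nonce is formed by XORing the packet number into a 12-byte IV, as TLS 1.3 does for record sequence numbers; all function names are hypothetical. The AEAD step itself is left out.

```python
import hashlib
import hmac
import struct

ROUNDS = 8  # arbitrary round count for the toy cipher (assumption)


def _round(key: bytes, rnd: int, half: int) -> int:
    """Toy round function: 32 bits of HMAC-SHA256 over (round, half)."""
    msg = struct.pack(">BI", rnd, half)
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest()[:4], "big")


def encrypt_pn(key: bytes, pn: int) -> bytes:
    """Encrypt a 64-bit packet number; always yields 8 octets on the wire."""
    left, right = pn >> 32, pn & 0xFFFFFFFF
    for rnd in range(ROUNDS):
        left, right = right, left ^ _round(key, rnd, right)
    return struct.pack(">II", left, right)


def decrypt_pn(key: bytes, wire: bytes) -> int:
    """Invert encrypt_pn by running the Feistel rounds backwards."""
    left, right = struct.unpack(">II", wire)
    for rnd in reversed(range(ROUNDS)):
        left, right = right ^ _round(key, rnd, left), left
    return (left << 32) | right


def aead_nonce(iv: bytes, pn: int) -> bytes:
    """Per-packet AEAD nonce: IV XOR the left-padded packet number."""
    padded = pn.to_bytes(len(iv), "big")
    return bytes(a ^ b for a, b in zip(iv, padded))
```

On the send side, `encrypt_pn(pn_key, pn)` and the AEAD call using `aead_nonce(iv, pn)` both depend only on the cleartext packet number, so they can be issued in parallel. On the receive side, `decrypt_pn` has to finish before the nonce is known, but it only reads 8 octets of header and never rewrites the payload buffer.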

Of course, the cost of that is header overhead, since the PN always occupies 64 bits. So we are trading some overhead for hardware acceleration. And we have to have some faith in the 64-bit encryption algorithm. (SPARX was suggested to me by Jean-Philippe Aumasson, the author of IPCrypt.)

-- Christian Huitema 

> On Mar 25, 2018, at 8:48 AM, Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> wrote:
> 
> The tag used as IV in ECB mode for PN encryption will use a full block size, which is 16 octets. The proposal was to encrypt the tag and XOR the result over the packet number and what follows.
> 
> If this is what is meant by “tag as IV”, it is problematic for what I assume is meant by double buffering, i.e. the need to modify the packet buffer before decryption. This is because the packet number and what follows must be un-XORed before verification can take place.
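
The masking step and the receive-side ordering described in the two quoted paragraphs above can be sketched as follows. This is only an illustration under assumptions: `block_prf` uses truncated HMAC-SHA256 as a stand-in for a single AES block encryption (the Python standard library has no AES), the tag is taken to be the last 16 octets of the packet, and `pn_offset` and the function names are hypothetical.

```python
import hashlib
import hmac

TAG_LEN = 16   # AEAD tag length, equal to the AES block size
MASK_LEN = 16  # one full block of mask material


def block_prf(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES_encrypt(key, block): 16 octets of HMAC-SHA256."""
    return hmac.new(key, block, hashlib.sha256).digest()[:MASK_LEN]


def xor_tag_mask(packet: bytearray, pn_offset: int, key: bytes) -> None:
    """XOR E(tag) over the 16 octets starting at the packet number, in place.

    The same call masks on send and unmasks on receive, because XOR is its
    own inverse and the tag itself is never modified.
    """
    if pn_offset + MASK_LEN > len(packet) - TAG_LEN:
        raise ValueError("masked region would overlap the tag")
    mask = block_prf(key, bytes(packet[-TAG_LEN:]))
    for i, m in enumerate(mask):
        packet[pn_offset + i] ^= m
```

A receiver has to call `xor_tag_mask` on the buffer before it can run AEAD verification, since the masked octets include ciphertext that the tag covers. That in-place rewrite of the packet buffer is the modification the quoted text objects to.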
> 
> You could keep the packet number out of AEAD, but you cannot afford to waste the additional 16-4=12 octets or more that an AES block encryption uses, so you are stuck with modifying the buffer post AEAD.
> 
> Finding alternative nonces won’t fix this problem. If you encrypted the header completely separately from the body, you could do something, but then you waste space on extra header tags.
> 
> 
> My suggestion with GF(2^n) will not work because, even if it works in principle (finding an ideal in GF(2^32) and multiplying a seed with the packet number modulo the ideal), it is easy to brute force 2^32. Alternatively, you can do chained hashing similar to how GCM’s GHASH works, but then it is not a unique mapping, and that is no better than CTR-mode encryption used PRNG style, and likely slower. Why would you do this at all, if it worked? Because it allows you to stick to encrypting only the packet number, which can stay outside AEAD and thus avoid buffer modification. But I don’t see how it can work.
> 
> Mikkel
> 
>> On 25 March 2018 at 14.25.07, Eric Rescorla (ekr@rtfm.com) wrote:
>> 
>> 
>> 
>>> On Sat, Mar 24, 2018 at 9:41 PM, Subodh Iyengar <subodh@fb.com> wrote:
>>> When we were first discussing PNE, we proposed that the tag be used as the IV for the CTR operation. The PR samples encrypted data in the packet. Did we change that for a reason?
>> 
>> I believe that's my alternative #1 and PR#1079.
>> 
>> 
>>> Would that help alleviate the buffering of the stream data? Because tag is always the last thing in the packet.
>> 
>> I will let Manasi answer this.
>> 
>> 
>> -Ekr
>>  
>>> 
>>> Subodh
>>> 
>>> 
>>> On Mar 25, 2018, at 2:56 AM, Eric Rescorla <ekr@rtfm.com> wrote:
>>> 
>>>> 
>>>> 
>>>>> On Sun, Mar 25, 2018 at 2:09 AM, Deval, Manasi <manasi.deval@intel.com> wrote:
>>>>> From talking to several of the folks last week, I understand that unlinkability is the goal of this protocol and there may be some flexibility in how that can be achieved.
>>>>> 
>>>>>  
>>>>> 
>>>>> Christian’s e-mail has a detailed list of options.  Here is the list of favored options as I understand them.
>>>>> 
>>>>>  
>>>>> 
>>>>> 1.      Packet number encrypted as current suggestion - The current proposal, PR 1079, uses a two-stage serialized approach such that the stream header(s) and payload(s) need to be encrypted first, and the outcome of that encryption forms the nonce of the packet number encryption.
>>>>> 
>>>>>  
>>>>> 
>>>>> 2.      Packet number encrypted alternative 1 - One of the ideas suggested was to encrypt the stream header(s) and payload(s) with the packet number as nonce, but have an additional nonce in the clear to encrypt the packet number. A scheme like this can allow for these two encryption operations to occur in parallel. This still has the issue of serialization in decrypt.
>>>>> 
>>>>>  
>>>>> 
>>>>> 3.      Packet number encrypted alternative 2 – Another option is to generate 2 IVs – one for PN and the other for stream header(s) and payload(s). The nonce can be a random value in the clear. This allows us to encrypt and decrypt the two fields in parallel. The packet number is encrypted so it also solves the ossification problem. Another variation of this is to generate a single IV but use one part of it to encrypt the PN.
>>>>> 
>>>> Neither of these alternatives seems ideal. Once you are carrying an explicit per-packet nonce, you might as well concatenate the payload and the PN and encrypt them together. This will require the least amount of nonce material.
>>>> 
>>>> -Ekr
>>>> 
>>>>> 4.      PN in the clear – this is a complex scheme and in the discussion with Ian, Jana and Praveen, they seemed to think this may be ok. If folks think this is implementable, then we may need to find an alternate solution for ossification.
>>>>> 
>>>>>  
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Manasi
>>>>> 
>>>>> From: Eric Rescorla [mailto:ekr@rtfm.com]
>>>>> Sent: Saturday, March 24, 2018 3:18 PM
>>>>> To: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
>>>>> Cc: Kazuho Oku <kazuhooku@gmail.com>; Deval, Manasi <manasi.deval@intel.com>; Christian Huitema <huitema@huitema.net>; IETF QUIC WG <quic@ietf.org>
>>>>> Subject: Re: Hardware acceleration and packet number encryption
>>>>> 
>>>>> On Sat, Mar 24, 2018 at 9:35 PM, Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> wrote:
>>>>> 
>>>>> AERO: I did not read all of it, but it does indeed sound esoteric.
>>>>> 
>>>>> It can do two things of interest: reduce space used by packet numbers, and presumably fix the encryption issue.
>>>>> 
>>>>>  
>>>>> 
>>>>> However, it has a W parameter which is the limit of reordering which is default 64 and recommended at most 255 for security reasons. This is way way too low (I would assume) if packet clusters take multiple transatlantic paths.
>>>>> 
>>>>>  
>>>>> 
>>>>> That's just a function of how the packet numbers are encoded. It's not difficult to come up with a design that tolerates more reordering.
>>>>> 
>>>>>  
>>>>> 
>>>>> -Ekr
>>>>> 
>>>>>  
>>>>> 
>>>>>  
>>>>> 
>>>>> If we accepted such a limit, I could very trivially come up with an efficient solution to PN encryption. Since we cover at most 64 packets, we only need a 5 bit packet number and reject false positives on AEAD tag. To simplify, make it 8 bits. The algorithm is to AES encrypt a counter similar to a typical AES based PRNG. Then, for each packet take one byte from the stream and use it as packet number. The receiver creates the same stream and maps the received byte to an index it has. It might occasionally have to try multiple packet numbers since the mapping is not unique. Longer packet numbers reduce this conflict ratio. To help with this detection some short trial decryption might be included. The PN size can be extended as needed.
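
The scheme in the quoted paragraph above can be sketched roughly as follows. Assumptions are flagged in the comments: the keystream comes from HMAC-SHA256 over a block counter as a stand-in for AES-encrypting a counter, one octet is used per packet, the window is W = 64, and `pn_byte` and `candidates` are hypothetical names.

```python
import hashlib
import hmac

WINDOW = 64      # reordering limit W from the discussion
BLOCK_LEN = 16   # one AES-sized block supplies 16 one-octet packet numbers


def pn_byte(key: bytes, index: int) -> int:
    """Wire octet for packet `index`: byte `index` of the keystream.

    Stand-in PRNG: HMAC-SHA256 of the block counter, truncated to 16 octets.
    A real implementation would cache each block rather than recompute it.
    """
    block, offset = divmod(index, BLOCK_LEN)
    stream = hmac.new(key, block.to_bytes(8, "big"), hashlib.sha256).digest()
    return stream[offset]


def candidates(key: bytes, received: int, window_start: int) -> list:
    """Every packet index in the window whose keystream octet matches."""
    return [
        i
        for i in range(window_start, window_start + WINDOW)
        if pn_byte(key, i) == received
    ]
```

The receiver would try AEAD verification with each candidate index in turn and keep the one that authenticates. If the keystream octets are uniform, each of the 63 other slots in the window collides with probability 1/256, so about 0.25 spurious candidates are expected per packet; a longer wire packet number lowers that ratio, as the quoted text notes.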
>>>>> 
>>>>>  
>>>>> 
>>>>> The cost of doing this is much lower than direct encryption as proposed in the PR because 1) a single encryption covers multiple packets, and 2) the encryption can be parallelised, resulting in a 4-5 fold performance increase. Combined, this results in sub-nanosecond overhead for AES-NI.
>>>>> 
>>>>>  
>>>>> 
>>>>> However, you have to deal with uncertainties which is why this isn’t a very good idea unless you have some very good knowledge of the traffic pattern. It also complicates HW offloading, but I don’t see why it couldn’t be done efficiently.
>>>>> 
>>>>>  
>>>>> 
>>>>>  
>>>>> 
>>>>> Mikkel
>>>>> 
>>>>>  
>>>>> 
>>>>> On 24 March 2018 at 17.26.47, Eric Rescorla (ekr@rtfm.com) wrote:
>>>>> 
>>>>> 3. A more exotic solution like AERO (https://tools.ietf.org/html/draft-mcgrew-aero-00#ref-MF07).
>>>>> 
>>>>>  
>>>>> 
>>>>>  
>>>>> 
>>>> 
>>