RE: Hardware acceleration and packet number encryption

Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> Sat, 31 March 2018 17:56 UTC

Return-Path: <mikkelfj@gmail.com>
X-Original-To: quic@ietfa.amsl.com
Delivered-To: quic@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 9730012D945 for <quic@ietfa.amsl.com>; Sat, 31 Mar 2018 10:56:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ietf.org; s=ietf1; t=1522519000; bh=hLhla9n+IXPI4Di37uphUMox0iReZiv8qv0Kcv7V8XI=; h=From:In-Reply-To:References:Date:Subject:Cc:To:To:To; b=r3NBLOYT4KRZ8csT2JgznJv4EwEBc+wYl6GhSOaL+eQV3rLPLU8zFP3F3tH+BrRDk jM2tm/3nDW7aCZXFqWxqsXIyAtoia/MioH6NcgyJezW2avzRECTrwi/yXGIFfq69mE DZO6hqxaXvTx/59NmWqPYGjQRb1KiwswOY7ZjqM0=
X-Mailbox-Line: From mikkelfj@gmail.com Sat Mar 31 10:56:40 2018
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 578F7127076; Sat, 31 Mar 2018 10:56:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=ietf.org; s=ietf1; t=1522519000; bh=utRDw3/cVaIbBSH6aXNYK6+G2VmDNqrttdG4IrTD264=; h=From:In-Reply-To:References:Date:Subject:Cc:To:To:To; b=Mmu/oX7Xa37Loax5sSZNrYCTnorNbleTI/dL4AEFwSxS4RllD8tzLW1KZzQwINPaP dWiVr3scN15Ha6za4gmAO6cpvYXg4uEg9Uu1Pz37UyfDF3+boNc4X3qZ/QtgHNdToV FyQV+hZOeY028qztL+3aUjOj7SYbc5OZ2D2lM5k0=
X-Original-To: dmarc-reverse@ietfa.amsl.com
Delivered-To: dmarc-reverse@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 23FAD127077 for <dmarc-reverse@ietfa.amsl.com>; Sat, 31 Mar 2018 10:56:40 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.697
X-Spam-Level:
X-Spam-Status: No, score=-2.697 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, UNPARSEABLE_RELAY=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 59qaP64bHsZS for <dmarc-reverse@ietfa.amsl.com>; Sat, 31 Mar 2018 10:56:36 -0700 (PDT)
Received: from mail-it0-x22e.google.com (mail-it0-x22e.google.com [IPv6:2607:f8b0:4001:c0b::22e]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id ADA24127076 for <pravb=40microsoft.com@dmarc.ietf.org>; Sat, 31 Mar 2018 10:56:36 -0700 (PDT)
Received: by mail-it0-x22e.google.com with SMTP id b5-v6so4033263itj.1 for <pravb=40microsoft.com@dmarc.ietf.org>; Sat, 31 Mar 2018 10:56:36 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:in-reply-to:references:mime-version:date:message-id:subject:to :cc; bh=yjACi+RyH26oPgpjF4mX9BfYh+evrxuMxCOd2PG7Nxc=; b=bKVXlRGli0wyUt1Nq99h4gem0knSkJmfNr5kLwEsDul4e5N1lvPUp+l9n3MZn4opVu RLXr4BD5VMdBZUYW9IhH4saaBKh/tYuSySNRZnldRUKFYGqOE7E0Yr3UwRZ4yQMen/zm h+oZQ5CL8BlQpDVeN6rBYWUWIEoPRRclZDp0eloA+gTYsDOeSyQF0NAsGurqF2LF2Hpo pMvSiUMTkuTmWbffhzWR09Fd+Yat29y9owADOweRJMwVz7E0EZkxiyFQeaLgBXL6hlvp sHiMu8EQhjClEeq0UemTmUs4Y+8xi7ozEcrwMWbCO6B+mph5byvyGw5gV3/TO17njth6 Wl+A==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:in-reply-to:references:mime-version:date :message-id:subject:to:cc; bh=yjACi+RyH26oPgpjF4mX9BfYh+evrxuMxCOd2PG7Nxc=; b=TDR0ubhmYKm4aP7EnhQTUq8kg6EcyQyGTAfB5R4Dfp4yKOSLkMJ2f6X3QXrqnxhXrn HWQY305iqPhRKhnJMXVUz3OvEkEvD78kPa5MwW/6YPVk3MnDSHQQXq998HYrvZ14GkZz DUzr5xr+0I7MXGuudUhR93uizCXtUf94n1gd2x904jkQsYDTjQfi8Qc0mJpMDHusrLN7 qSYOf2GCAtJgppxIt8LSbY+4C0gLTehC/H26HnMr4YfhK7nGTgkgAO83I3+cIPo6QZgm 82Tlq0bxfuqFVPzq33bggD+RgHy6IOCWWBgkZlm584BMN+nMF+W3KtI7YXFC2/ozSD49 JEyA==
X-Gm-Message-State: AElRT7GNcVlIMZ9hNh7wg8f4aPGpHuejBHPyHnOnLRifCme8Kw+VSv8l CpTCzsYdjIpU/wGIYWNvw1HSUeRDQJIXGonJNRhDCA==
X-Google-Smtp-Source: AIpwx4/+3e36zKRvpyiBN/tnDvJs20UVwhWR/OT+btfyujH6qpLTWB86lw5zDL7Jz12ogfyEZyqCpanEAMmB7GHLDtI=
X-Received: by 2002:a24:e085:: with SMTP id c127-v6mr7426855ith.25.1522518995505; Sat, 31 Mar 2018 10:56:35 -0700 (PDT)
Received: from 1058052472880 named unknown by gmailapi.google.com with HTTPREST; Sat, 31 Mar 2018 10:56:34 -0700
From: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
In-Reply-To: <CY4PR21MB0630E45900B1465B86314330B6A00@CY4PR21MB0630.namprd21.prod.outlook.com>
References: <7fd34142-2e14-e383-1f65-bc3ca657576c@huitema.net> <F9FCC213-62B9-437C-ADF9-1277E6090317@gmail.com> <CABcZeBM3PfPkqVxPMcWM-Noyk=M2eCFWZw2Eq-XytbHM=0T9Uw@mail.gmail.com> <CAN1APdfjuvd1eBWCYedsbpi1mx9_+Xa6VvZ3aq_Bhhc+HN67ug@mail.gmail.com> <CABcZeBMtQBwsAF85i=xHmWN3PuGRkJEci+_PjS3LDXi7NgHyYg@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B5CCEFD@ORSMSX111.amr.corp.intel.com> <CABcZeBNfPsJtLErBn1=iGKuLjJMo=jEB5OLxDuU7FxjJv=+b=A@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B5CDAD4@ORSMSX111.amr.corp.intel.com> <BBB8D1DE-25F8-4F3D-B274-C317848DE872@akamai.com> <CAN1APdd=47b2eXkvMg+Q_+P254xo4vo-Tu-YQu6XoUGMByO_eQ@mail.gmail.com> <CAKcm_gMpz4MpdmrHLtC8MvTf5uO9LjD915jM-i2LfpKY384O2w@mail.gmail.com> <HE1PR0702MB3611A67E764EE1C7D1644FAD84AD0@HE1PR0702MB3611.eurprd07.prod.outlook.com> <d8e35569-e939-4064-9ec4-2cccfba2f341@huitema.net> <CACpbDccqKoF-Y1poHMN2cLOK9GOuvtMTPsF-QEen3b30kUo9bg@mail.gmail.com> <CAKcm_gNffwpraF-H2LQBF33vUhYFx0bi_UXJ3N14k4Xj4NmWUw@mail.gmail.com> <CACsn0ckbthsn6V+0ccqZG=PF6BY74rAg-+Wwa7h=4tavOzCs+A@mail.gmail.com> <CY4PR21MB063062DBFA99CA14C6A995F6B6A20@CY4PR21MB0630.namprd21.prod.outlook.com> <CY4PR21MB0630E45900B1465B86314330B6A00@CY4PR21MB0630.namprd21.prod.outlook.com>
X-Mailer: Airmail (420)
MIME-Version: 1.0
Date: Sat, 31 Mar 2018 10:56:34 -0700
Message-ID: <CAN1APddpR_TmV=eiGkbmhmerKjo5KtPnbcqyVKoVNLs3kt1JNA@mail.gmail.com>
Subject: RE: Hardware acceleration and packet number encryption
Cc: Jana Iyengar <jri.ietf@gmail.com>, huitema <huitema@huitema.net>, IETF QUIC WG <quic@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000e416020568b9131e"
To: Praveen Balasubramanian <pravb@microsoft.com>
To: Watson Ladd <watsonbladd@gmail.com>
To: Ian Swett <ianswett@google.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/WuylXAQaACUrVYLL4TB6ZLH9Vpg>
X-BeenThere: quic@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/quic>, <mailto:quic-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/quic/>
List-Post: <mailto:quic@ietf.org>
List-Help: <mailto:quic-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/quic>, <mailto:quic-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sat, 31 Mar 2018 17:56:41 -0000

I can live with packet number encryption in v1 if it opens up for
unencrypted multipath multi PN spaces in v2. This in turn, we solve the
issue with single pass hardware for v2 - since there is then no encryption.

However, I strongly suggest that it is done in a way that does not require
buffer modification for encryption. That means the the packet number cannot
be part of the AEAD data. It is also not difficult to avoid - see my
comments added to PR 1079.

As to 64-bit encryption - I seriously doubt this is worthwhile -
non-standard, takes up space, dubious security given enough power, and
again, it is not necessary because you can encrypt packet numbers down to a
single byte if you keep the PN out fo the AEAD data. Please provide a
counter argument if you think that this is not correct:
https://github.com/quicwg/base-drafts/pull/1079#issuecomment-376165638

The only issue I see is whether the PN should be placed first or last - and
if hardware acceleration is not the target for v1, it should be placed last..

- and again, I think it is futile to encrypt packet numbers for linkability
protection, but as I said, if it opens up for V2 without it, please by all
means, let's get this over with. Just don’t require in-place buffer updates
or buffer copies before verification and decryption. It is not necessary.

Kind Regards,
Mikkel Fahnøe Jørgensen


On 31 March 2018 at 18.21.48, Praveen Balasubramanian (
pravb=40microsoft.com@dmarc.ietf.org) wrote:

PNE is actually PNT (packet number transform) as I recall Jana had pointed
out in the early discussions. Encryption is just one possible transform to
apply to the actual PN before putting it in the clear portion of the
header. There are other possible transforms like increment, XOR, and no-op.
Preventing linkability seems to require encryption but I am not convinced
that it actually provides unlikability or is worth the extra computation
cost per packet. Can someone please clarify how it prevents a simple attack
like looking at timing of failovers to link IP addresses? Preventing
ossification requires a transform that does some form of greasing. If we go
down the path of negotiation for the transform, the client initial PN will
need to be random and treated specially just like the CID and there are
likely some other gotchas I haven’t thought of.

Thanks

-----Original Message-----
From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Praveen
Balasubramanian
Sent: Wednesday, March 28, 2018 7:47 PM
To: Watson Ladd <watsonbladd@gmail.com>; Ian Swett <ianswett@google.com>
Cc: Jana Iyengar <jri.ietf@gmail.com>; IETF QUIC WG <quic@ietf.org>;
huitema <huitema@huitema.net>
Subject: RE: Hardware acceleration and packet number encryption

Sorry for late response just catching up with this thread.

Re. applicability of hardware offload
This is much broader than the datacenter scenario. We rely on TCP offloads
today for performance of all web services even on the front end. TCP
offloads are present pretty much on every server and VM and on by default.
We cannot afford to make the tradeoff of 2x or more increase in CPU cost
for driving the same workload over QUIC as compared to TCP. If QUIC is
being looked at as a general purpose transport and a replacement of TCP,
then hardware offload is absolutely an important requirement. We already
have scenarios in progress that are non-HTTP. Over time the TCP offloads
have also been supported on client systems primarily on Ethernet, but we
have seen recent adoption in Wifi NICs and mobile broadband as well. The
CPU savings are large even at smaller data rates. I'd be happy to publish
the numbers we have around offloads like LSO and LRO.

Re. multiple PN spaces
I am not understanding why this has high implementation cost so please
explain more. For one, this seems to be needed primarily for connection
migration which to me looks like an optional feature of QUIC v1 (not sure
if this is stated as such in the draft but it should be). And
implementations that choose to build connection migration support might as
well do the right design so they are ready for multi-path.

My preference order for different proposed solutions:
1. Multiple PN spaces without PNE.
2. Negotiate PNE and allow implementations to skip it if they do not
support connection migration or multi-path. All datacenter scenarios will
qualify, apps that do not want to use multiple paths will qualify and all
systems that do not have multiple NICs (or paths) will qualify. This will
allow us to make incremental progress and work on better hardware support
over time. Option 2 holds irrespective of single or multiple PN spaces.
3. Any alternative form of PNE that doesn’t cause issues for offloads.
Still not ideal CPU cost wise (for when we cannot offload) but seems to
have other benefits so we can live with it.

Now I understand that leaving PN in the clear may cause ossification issue
but that seems solvable by greasing both the starting PN and subsequently
not always incrementing by 1 in the clear.

Thanks

-----Original Message-----
From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Watson Ladd
Sent: Wednesday, March 28, 2018 6:54 PM
To: Ian Swett <ianswett@google.com>
Cc: Jana Iyengar <jri.ietf@gmail.com>; IETF QUIC WG <quic@ietf.org>;
huitema <huitema@huitema.net>
Subject: Re: Hardware acceleration and packet number encryption

On Wed, Mar 28, 2018 at 5:39 PM, Ian Swett <ianswett=
40google.com@dmarc.ietf.org> wrote:
> Thanks for the nice summary Jana.
>
> As much as I'd love to have easier crypto HW acceleration, I've ended
> up arriving at the same conclusion. I don't want to bite off the work
> to do proper multipath in QUIC v1, which I think is the only other
> reasonable option of those Christian outlined.
>
> If someone comes up with a way to transform packet number to make it
> non-linkable, but doesn't have the downside of making hardware offload
> difficult, then I'm open to it. But we've been talking about this for
> 2 months without any notable improvements over Martin's PR.
>
> Given we never talk about any issue only once in QUIC, I'm sure this
> will come up again, but for the time being I think #1079 is the best
> option we have.

I am not so sure this is right. Some proposals I've seen upthread:
- Use a 64 bit blockcipher to encrypt the sequence number
- Various online modes that may or may not be a good idea

And another idea I just had:
-Put the encrypted packet number last in the buffer so it gets outputed at
the right time for transmitting hardware, and then have the receiving
hardware copy the bytes to the front before passing it through the
decryptor.

Admittedly I don't understand the constraints on hardware that might be a
problem for these approaches, but I don't think we are quite licked yet.

Sincerely,
Watson
>
>
>
> On Wed, Mar 28, 2018 at 8:03 PM Jana Iyengar <jri.ietf@gmail.com> wrote:
>>
>> A few quick thoughts as I catch up on this thread.
>>
>> I spent some time last week working through a design using multiple
>> PN spaces, and it is quite doable. I suspect we'll head towards
>> multiple PN spaces as we consider multipath in the future. That said,
>> there is complexity (as Christian notes). This complexity may be
>> warranted when doing multipath in v2 or later, but I'm not convinced
>> that this is necessary as a design primitive for QUICv1.
>>
>> We may want to creatively use the PN bits in v2, say to encode a path
>> ID and a PN, for multipath. We want to retain flexibility in these
>> bits going into v2. We've used encryption to ensure that we don't
>> lose flexibility elsewhere in the header, and it follows that we
>> should use PNE to retain flexibility in these bits as well.
>> (Simplicity of design is the other value in using PNE, since handling
>> migration linkability is non-trivial without
>> it.)
>>
>> This leaves the question of HW acceleration being at loggerheads with
>> the design in PR #1079. First, I expect that the primary benefit of
>> acceleration will be in DC environments. Yes, there are some gains to
>> be had in serving the public Internet as well, but I'm unconvinced
>> that this is the driving use case for hardware acceleration. I
>> understand that others may disagree with me here.
>>
>> AFAIK, QUIC has not been used in DC environments yet. I expect there
>> are other things in the protocol that we'd want to change as we gain
>> experience deploying QUIC in DCs. Spinning up a new version to try
>> QUIC within DCs is not only appropriate, I would recommend it. This
>> allows for rapid iterations internally, and the experience can drive
>> subsequent changes to QUIC. It's what *I* would do if I was to deploy
QUIC inside a DC.
>>
>> So, in short, I think we should go ahead with PR# 1079. This ensures
>> that future versions are guaranteed the flexibility to change the PN
>> bits for better support of HW acceleration or multipath or
what-have-you.
>>
>> - jana
>>
>> On Mar 26, 2018 9:41 AM, "Christian Huitema" <huitema@huitema.net>
wrote:
>>
>>
>> On 3/26/2018 8:20 AM, Swindells, Thomas (Nokia - GB/Cambridge) wrote:
>>
>> Looking at
>> https://na01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fen.w
>> ikipedia.org%2Fwiki%2FAES_instruction_set%23Intel_and_AMD_x86_archite
>> cture&data=02%7C01%7Cpravb%40microsoft.com%7C8124554015264408874708d5
>> 95180268%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C636578852732108
>> 907&sdata=4kf2m4KYai6Gd6j4Vc1nFOpddVaBKP%2FRXRDcBF57JRQ%3D&reserved=0
>> it seems to imply a large range of server, desktop and mobile chips
>> all have a CPU instruction set available to do AES acceleration and
>> other similar operations (other instruction sets are also available).
>>
>> If we are considering the AES instructions then it looks like it is
>> (or at least will be in the near future) a sizeable proportion of the
>> public internet have it to be used.
>>
>>
>> Certainly, but that's not the current debate. PR #1079 is fully
>> compatible with use of the AES instructions. The issue of the debate
>> is that the mechanism in PR #1079 required double buffering, first
>> encrypt the payload, then use the result of the encryption to encrypt
>> the PN. This is not an issue in a software implementation that can
>> readily access all bytes of the packet from memory, but it may be an
>> issue in some hardware implementations that are designed to do just one
pass over the data.
>>
>>
>> -- Christian Huitema
>>
>>
>>
>



-- 
"Man is born free, but everywhere he is in chains".
--Rousseau.