RE: Hardware acceleration and packet number encryption

Praveen Balasubramanian <pravb@microsoft.com> Thu, 29 March 2018 02:46 UTC

From: Praveen Balasubramanian <pravb@microsoft.com>
To: Watson Ladd <watsonbladd@gmail.com>, Ian Swett <ianswett@google.com>
CC: Jana Iyengar <jri.ietf@gmail.com>, IETF QUIC WG <quic@ietf.org>, huitema <huitema@huitema.net>
Subject: RE: Hardware acceleration and packet number encryption
Date: Thu, 29 Mar 2018 02:46:36 +0000
Message-ID: <CY4PR21MB063062DBFA99CA14C6A995F6B6A20@CY4PR21MB0630.namprd21.prod.outlook.com>
References: <7fd34142-2e14-e383-1f65-bc3ca657576c@huitema.net> <F9FCC213-62B9-437C-ADF9-1277E6090317@gmail.com> <CABcZeBM3PfPkqVxPMcWM-Noyk=M2eCFWZw2Eq-XytbHM=0T9Uw@mail.gmail.com> <CAN1APdfjuvd1eBWCYedsbpi1mx9_+Xa6VvZ3aq_Bhhc+HN67ug@mail.gmail.com> <CABcZeBMtQBwsAF85i=xHmWN3PuGRkJEci+_PjS3LDXi7NgHyYg@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B5CCEFD@ORSMSX111.amr.corp.intel.com> <CABcZeBNfPsJtLErBn1=iGKuLjJMo=jEB5OLxDuU7FxjJv=+b=A@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B5CDAD4@ORSMSX111.amr.corp.intel.com> <BBB8D1DE-25F8-4F3D-B274-C317848DE872@akamai.com> <CAN1APdd=47b2eXkvMg+Q_+P254xo4vo-Tu-YQu6XoUGMByO_eQ@mail.gmail.com> <CAKcm_gMpz4MpdmrHLtC8MvTf5uO9LjD915jM-i2LfpKY384O2w@mail.gmail.com> <HE1PR0702MB3611A67E764EE1C7D1644FAD84AD0@HE1PR0702MB3611.eurprd07.prod.outlook.com> <d8e35569-e939-4064-9ec4-2cccfba2f341@huitema.net> <CACpbDccqKoF-Y1poHMN2cLOK9GOuvtMTPsF-QEen3b30kUo9bg@mail.gmail.com> <CAKcm_gNffwpraF-H2LQBF33vUhYFx0bi_UXJ3N14k4Xj4NmWUw@mail.gmail.com> <CACsn0ckbthsn6V+0ccqZG=PF6BY74rAg-+Wwa7h=4tavOzCs+A@mail.gmail.com>
In-Reply-To: <CACsn0ckbthsn6V+0ccqZG=PF6BY74rAg-+Wwa7h=4tavOzCs+A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/vF8eehHvAer6bgt8hVQ6fDE240A>
List-Id: Main mailing list of the IETF QUIC working group <quic.ietf.org>

Sorry for the late response; I am just catching up with this thread.

Re. applicability of hardware offload
This is much broader than the datacenter scenario. We rely on TCP offloads today for the performance of all web services, even on the front end. TCP offloads are present on pretty much every server and VM and are on by default. We cannot afford a 2x or greater increase in CPU cost to drive the same workload over QUIC as compared to TCP. If QUIC is being looked at as a general-purpose transport and a replacement for TCP, then hardware offload is absolutely an important requirement. We already have non-HTTP scenarios in progress. Over time, TCP offloads have also come to be supported on client systems, primarily on Ethernet, but we have seen recent adoption in Wi-Fi NICs and mobile broadband as well. The CPU savings are large even at smaller data rates. I'd be happy to publish the numbers we have around offloads like LSO and LRO.

Re. multiple PN spaces
I do not understand why this has a high implementation cost, so please explain more. For one, this seems to be needed primarily for connection migration, which to me looks like an optional feature of QUIC v1 (I am not sure whether the draft states this, but it should). And implementations that choose to build connection migration support might as well do the right design so they are ready for multi-path.

My preference order for different proposed solutions:
1. Multiple PN spaces without PNE. 
2. Negotiate PNE and allow implementations to skip it if they do not support connection migration or multi-path. All datacenter scenarios will qualify, as will apps that do not want to use multiple paths and all systems that do not have multiple NICs (or paths). This will allow us to make incremental progress and work on better hardware support over time. Option 2 holds irrespective of single or multiple PN spaces.
3. Any alternative form of PNE that doesn’t cause issues for offloads. This is still not ideal in terms of CPU cost (when we cannot offload), but it seems to have other benefits, so we can live with it.

Now, I understand that leaving the PN in the clear may cause an ossification issue, but that seems solvable by greasing the starting PN and then not always incrementing by 1 in the clear.
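
As a rough illustration of the greasing idea above, the sketch below picks a random starting packet number and then advances by a random step instead of always by 1. It is only a sketch under assumptions: the function name, the `max_start` bound, and the `max_gap` parameter are hypothetical and do not come from any draft, and it says nothing about how the peer would distinguish a deliberately skipped number from a lost packet.

```python
import secrets


def greased_packet_numbers(count, max_start=2**31, max_gap=3):
    """Yield `count` strictly increasing packet numbers.

    Hypothetical helper: the start is random in [0, max_start) and each
    step is random in [1, 1 + max_gap], so an on-path observer cannot
    rely on "starts at a fixed value" or "always increments by 1".
    """
    pn = secrets.randbelow(max_start)
    for _ in range(count):
        yield pn
        # Advance by 1 plus a random number of skipped values (0..max_gap).
        pn += 1 + secrets.randbelow(max_gap + 1)


if __name__ == "__main__":
    print(list(greased_packet_numbers(5)))
```

The numbers stay monotonically increasing, which is the only property the sender-side sketch preserves; everything else about loss detection is left open here.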

Thanks

-----Original Message-----
From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Watson Ladd
Sent: Wednesday, March 28, 2018 6:54 PM
To: Ian Swett <ianswett@google.com>
Cc: Jana Iyengar <jri.ietf@gmail.com>; IETF QUIC WG <quic@ietf.org>; huitema <huitema@huitema.net>
Subject: Re: Hardware acceleration and packet number encryption

On Wed, Mar 28, 2018 at 5:39 PM, Ian Swett <ianswett=40google.com@dmarc.ietf.org> wrote:
> Thanks for the nice summary Jana.
>
> As much as I'd love to have easier crypto HW acceleration, I've ended 
> up arriving at the same conclusion.  I don't want to bite off the work 
> to do proper multipath in QUIC v1, which I think is the only other 
> reasonable option of those Christian outlined.
>
> If someone comes up with a way to transform packet number to make it 
> non-linkable, but doesn't have the downside of making hardware offload 
> difficult, then I'm open to it.  But we've been talking about this for 
> 2 months without any notable improvements over Martin's PR.
>
> Given we never talk about any issue only once in QUIC, I'm sure this 
> will come up again, but for the time being I think #1079 is the best 
> option we have.

I am not so sure this is right. Some proposals I've seen upthread:
- Use a 64-bit block cipher to encrypt the sequence number
- Various online modes that may or may not be a good idea

And another idea I just had:
- Put the encrypted packet number last in the buffer so it gets output at the right time for the transmitting hardware, and then have the receiving hardware copy the bytes to the front before passing the packet through the decryptor.

Admittedly I don't understand the constraints on hardware that might be a problem for these approaches, but I don't think we are quite licked yet.
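
The trailer idea above can be pictured as a pure byte-layout change. The following is a minimal sketch, assuming a fixed 4-byte encrypted packet number and a header length known to the receiver; `PN_LEN`, `to_wire`, and `from_wire` are hypothetical names used only for illustration, and no real hardware constraint is modeled.

```python
PN_LEN = 4  # assumption: fixed-size encrypted packet number


def to_wire(header: bytes, enc_pn: bytes, ciphertext: bytes) -> bytes:
    """Sender side: emit the encrypted PN after the payload ciphertext."""
    if len(enc_pn) != PN_LEN:
        raise ValueError("unexpected encrypted PN length")
    return header + ciphertext + enc_pn


def from_wire(wire: bytes, header_len: int) -> bytes:
    """Receiver side: move the trailing PN bytes back in front of the payload."""
    header = wire[:header_len]
    ciphertext = wire[header_len:-PN_LEN]
    enc_pn = wire[-PN_LEN:]
    return header + enc_pn + ciphertext


if __name__ == "__main__":
    wire = to_wire(b"HDR", b"\x01\x02\x03\x04", b"payload")
    print(from_wire(wire, header_len=3))
```

The sender can then stream the payload ciphertext as it is produced and append the packet number bytes last, at the cost of one copy on the receive side.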

Sincerely,
Watson
>
>
>
> On Wed, Mar 28, 2018 at 8:03 PM Jana Iyengar <jri.ietf@gmail.com> wrote:
>>
>> A few quick thoughts as I catch up on this thread.
>>
>> I spent some time last week working through a design using multiple 
>> PN spaces, and it is quite doable. I suspect we'll head towards 
>> multiple PN spaces as we consider multipath in the future. That said, 
>> there is complexity (as Christian notes). This complexity may be 
>> warranted when doing multipath in v2 or later, but I'm not convinced 
>> that this is necessary as a design primitive for QUICv1.
>>
>> We may want to creatively use the PN bits in v2, say to encode a path 
>> ID and a PN, for multipath. We want to retain flexibility in these 
>> bits going into v2. We've used encryption to ensure that we don't 
>> lose flexibility elsewhere in the header, and it follows that we 
>> should use PNE to retain flexibility in these bits as well. 
>> (Simplicity of design is the other value in using PNE, since handling 
>> migration linkability is non-trivial without it.)
>>
>> This leaves the question of HW acceleration being at loggerheads with 
>> the design in PR #1079. First, I expect that the primary benefit of 
>> acceleration will be in DC environments. Yes, there are some gains to 
>> be had in serving the public Internet as well, but I'm unconvinced 
>> that this is the driving use case for hardware acceleration. I 
>> understand that others may disagree with me here.
>>
>> AFAIK, QUIC has not been used in DC environments yet. I expect there 
>> are other things in the protocol that we'd want to change as we gain 
>> experience deploying QUIC in DCs. Spinning up a new version to try 
>> QUIC within DCs is not only appropriate, I would recommend it. This 
>> allows for rapid iterations internally, and the experience can drive 
>> subsequent changes to QUIC. It's what *I* would do if I was to deploy QUIC inside a DC.
>>
>> So, in short, I think we should go ahead with PR# 1079. This ensures 
>> that future versions are guaranteed the flexibility to change the PN 
>> bits for better support of HW acceleration or multipath or what-have-you.
>>
>> - jana
>>
>> On Mar 26, 2018 9:41 AM, "Christian Huitema" <huitema@huitema.net> wrote:
>>
>>
>> On 3/26/2018 8:20 AM, Swindells, Thomas (Nokia - GB/Cambridge) wrote:
>>
>> Looking at
>> https://en.wikipedia.org/wiki/AES_instruction_set#Intel_and_AMD_x86_architecture
>> it seems to imply a large range of server, desktop and mobile chips 
>> all have a CPU instruction set available to do AES acceleration and 
>> other similar operations (other instruction sets are also available).
>>
>> If we are considering the AES instructions, then it looks like a
>> sizeable proportion of the public internet has them available (or at
>> least will in the near future).
>>
>>
>> Certainly, but that's not the current debate. PR #1079 is fully 
>> compatible with use of the AES instructions. The issue under debate
>> is that the mechanism in PR #1079 requires double buffering: first
>> encrypt the payload, then use the result of the encryption to encrypt
>> the PN. This is not an issue in a software implementation that can 
>> readily access all bytes of the packet from memory, but it may be an 
>> issue in some hardware implementations that are designed to do just one pass over the data.
>>
>>
>> -- Christian Huitema
>>
>>
>>
>
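
The double-buffering concern Christian describes in the quoted text above comes down to a data dependency: the packet number sits at the front of the packet, but its encrypted form depends on payload ciphertext that is produced later. The toy sketch below only illustrates that ordering. It is not the construction in PR #1079: the SHA-256 keystream stands in for a real cipher and provides no real security, and the names `keystream`, `protect`, `unprotect`, the 4-byte packet number, and the 16-byte sample are all assumptions made for illustration.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream from SHA-256 (a stand-in for a real cipher; NOT secure)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def protect(payload_key: bytes, pn_key: bytes, pn: int, payload: bytes) -> bytes:
    pn_bytes = pn.to_bytes(4, "big")
    # Pass 1: encrypt the payload, using the packet number as the nonce.
    ciphertext = xor(payload, keystream(payload_key, pn_bytes, len(payload)))
    # Pass 2: only now can the PN be encrypted, because its mask is
    # derived from a sample of the ciphertext produced in pass 1.
    sample = ciphertext[:16]
    enc_pn = xor(pn_bytes, keystream(pn_key, sample, 4))
    # The encrypted PN goes in front, so nothing can be emitted until
    # the payload has already been processed once.
    return enc_pn + ciphertext


def unprotect(payload_key: bytes, pn_key: bytes, packet: bytes):
    enc_pn, ciphertext = packet[:4], packet[4:]
    pn_bytes = xor(enc_pn, keystream(pn_key, ciphertext[:16], 4))
    payload = xor(ciphertext, keystream(payload_key, pn_bytes, len(ciphertext)))
    return int.from_bytes(pn_bytes, "big"), payload


if __name__ == "__main__":
    pkt = protect(b"payload-key", b"pn-key", 7, b"hello over QUIC")
    print(unprotect(b"payload-key", b"pn-key", pkt))
```

A software implementation can run both passes over a packet held in memory; a pipeline that streams bytes out in order would have to hold back the first bytes until the ciphertext sample exists.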



--
"Man is born free, but everywhere he is in chains".
--Rousseau.