Re: Packet Number Encryption Performance
Ian Swett <ianswett@google.com> Fri, 22 June 2018 18:51 UTC
From: Ian Swett <ianswett@google.com>
Date: Fri, 22 Jun 2018 14:51:38 -0400
Message-ID: <CAKcm_gMc6y_2+KU3L+XpifNK4JESFA0V=OX4Nj51jTFfAm9M1A@mail.gmail.com>
Subject: Re: Packet Number Encryption Performance
To: nibanks=40microsoft.com@dmarc.ietf.org
Cc: Praveen Balasubramanian <pravb@microsoft.com>, Kazuho Oku <kazuhooku@gmail.com>, IETF QUIC WG <quic@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/X7NjpGrYH9WcdwoNGRVXzvydK6s>

I expect crypto to increase as a fraction of CPU, but I don't expect it
to go much higher than 10%. But who knows, maybe 2 years from now
everything else will be very optimized and crypto will be 15%?

On Fri, Jun 22, 2018 at 12:34 PM Nick Banks
<nibanks=40microsoft.com@dmarc.ietf.org> wrote:

> I just want to add that my implementation already uses ECB from bcrypt
> (and I do the XOR). Bcrypt doesn't expose CTR mode directly.
>
> ------------------------------
> From: Praveen Balasubramanian
> Sent: Friday, June 22, 2018 9:26:44 AM
> To: Ian Swett; Kazuho Oku
> Cc: Nick Banks; IETF QUIC WG
> Subject: RE: Packet Number Encryption Performance
>
> Ian, do you expect that fraction of overall cost to hold once the UDP
> stack is optimized? Is your measurement on top of the recent kernel
> improvements? I expect the crypto fraction of overall cost to keep
> increasing over time as the network stack bottlenecks are eliminated.
>
> Kazuho, should the draft describe the optimizations you are making? Or
> are these too OpenSSL specific?
>
> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Ian Swett
> Sent: Friday, June 22, 2018 4:45 AM
> To: Kazuho Oku <kazuhooku@gmail.com>
> Cc: Nick Banks <nibanks@microsoft.com>; IETF QUIC WG <quic@ietf.org>
> Subject: Re: Packet Number Encryption Performance
>
> Thanks for digging into the details of this, Kazuho. A <4% increase in
> crypto cost is a bit more than I originally expected (~2%), but crypto
> is less than 10% of my CPU usage, so it's still less than 0.5% total,
> which is acceptable to me.
>
> On Fri, Jun 22, 2018 at 2:45 AM Kazuho Oku <kazuhooku@gmail.com> wrote:
>
> 2018-06-22 12:22 GMT+09:00 Kazuho Oku <kazuhooku@gmail.com>:
>
> 2018-06-22 11:54 GMT+09:00 Nick Banks <nibanks@microsoft.com>:
>
> Hi Kazuho,
>
> Thanks for sharing your numbers as well!
> I'm a bit confused where you say you can reduce the 10% overhead to 2%
> to 4%. How do you plan on doing that?
>
> As stated in my previous mail, the 10% overhead consists of three
> parts, each consuming a comparable number of CPU cycles. Two of the
> three are related to the abstraction layer and how CTR is implemented,
> while the third is the core AES-ECB operation cost.
>
> It should be possible to remove the costly abstraction layer.
>
> It should also be possible to remove the overhead of CTR, since in PNE
> we need to XOR at most 4 octets (applying XOR is the only difference
> between CTR and ECB). That cost should be possible to nullify.
>
> Considering these aspects, and by looking at the numbers in the OpenSSL
> source code (as well as considering the overhead of GCM), my
> expectation is 2% to 4%.
>
> Just did some experiments and it seems that the expectation was
> correct.
>
> The benchmarks tell me that the overhead goes down from 10.0% to 3.8%
> by doing the following:
>
> * remove the overhead of the CTR abstraction (i.e. use the ECB backend
>   and do the XOR ourselves)
> * remove the overhead of the abstraction layer (i.e. call the method
>   returned by EVP_CIPHER_meth_get_do_cipher instead of calling
>   EVP_EncryptUpdate)
>
> Of course the changes are specific to OpenSSL, but I would expect
> similar numbers assuming that you have access to an optimized AES
> implementation.
>
> ------------------------------
> From: Kazuho Oku <kazuhooku@gmail.com>
> Sent: Thursday, June 21, 2018 7:21:17 PM
> To: Nick Banks
> Cc: quic@ietf.org
> Subject: Re: Packet Number Encryption Performance
>
> Hi Nick,
>
> Thank you for bringing the numbers to the list.
>
> I have just run a small benchmark using Quicly, and I see comparable
> numbers.
>
> To be precise, I see a 10.0% increase in CPU cycles when encrypting an
> Initial packet of 1,280 octets. I expect that we will see similar
> numbers on other QUIC stacks that also use picotls (with OpenSSL as a
> backend). Note that this number only compares the cost of encryption;
> the overhead ratio will be much smaller if we look at the total number
> of CPU cycles spent by a QUIC stack as a whole.
>
> Looking at the profile, the overhead consists of three operations that
> each consume comparable CPU cycles: the core AES operation (using
> AES-NI), CTR operation overhead, and CTR initialization. Note that
> picotls at the moment provides access to CTR crypto beneath the AEAD
> interface, which is to be used by the QUIC stacks.
>
> I would assume that we can cut down the overhead to somewhere between
> 2% and 4%, but it might be hard to go down to somewhere near 1%,
> because we cannot parallelize the AES operation of PNE with that of
> AEAD (see
> https://github.com/openssl/openssl/blob/OpenSSL_1_1_0h/crypto/aes/asm/aesni-x86_64.pl#L24-L39
> about the impact of parallelization).
>
> I do not think that 2% to 4% of additional overhead to the crypto is
> an issue for QUIC/HTTP, but the current overhead of 10% is something
> that we might want to decrease. I am glad to be able to learn that now.
>
> 2018-06-22 5:48 GMT+09:00 Nick Banks
> <nibanks=40microsoft.com@dmarc.ietf.org>:
>
> Hello QUIC WG,
>
> I recently implemented PNE for WinQuic (using bcrypt APIs) and I
> decided to get some performance numbers to see what the overhead of
> PNE was.
> I figured the rest of the WG might be interested.
>
> My test just encrypted the same buffer (size dependent on the test
> case) 10,000,000 times and measured the time it took. The test then
> did the same thing, but also encrypted the packet number. I ran all
> that 10 times in total. I then collected the best times for each
> category to produce the following graphs and tables (full excel doc
> attached):
>
> [image: cid:image003.png@01D40966.7655B6B0]
>
>          Time (ms)              PNE        Rate (Mbps)
> Bytes    No PNE      PNE        Overhead   No PNE     PNE
> 4        2284.671    3027.657   33%        140.064    105.692
> 16       2102.402    2828.204   35%        608.827    452.584
> 64       2198.883    2907.577   32%        2328.45    1760.92
> 256      2758.3      3490.28    27%        7424.86    5867.72
> 600      4669.283    5424.539   16%        10280      8848.68
> 1000     6130.139    6907.805   13%        13050.3    11581.1
> 1200     6458.679    7229.672   12%        14863.7    13278.6
> 1450     7876.312    8670.16    10%        14727.7    13379.2
>
> I used a server grade lab machine I had at my disposal, running the
> latest Windows 10 Server DataCenter build. Again, these numbers are for
> crypto only. No QUIC or UDP is included.
>
> Thanks,
> - Nick
>
> --
> Kazuho Oku
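
The figures quoted in this thread can be cross-checked with a little arithmetic. The Python sketch below recomputes the PNE overhead and throughput columns from the timings in Nick's table, and the whole-stack estimate from Ian's reply. The function and variable names are illustrative and do not come from any implementation discussed here. The sketch assumes that overhead is the relative increase in elapsed time, that the rate is payload bits divided by elapsed time over 10,000,000 iterations, and that whole-stack overhead is simply the crypto overhead multiplied by crypto's share of CPU.

```python
# Timings from Nick's table: payload bytes -> (ms without PNE, ms with PNE).
# Each timing covers 10,000,000 encryptions of a buffer of that size.
ITERATIONS = 10_000_000

TIMES_MS = {
    4: (2284.671, 3027.657),
    16: (2102.402, 2828.204),
    64: (2198.883, 2907.577),
    256: (2758.3, 3490.28),
    600: (4669.283, 5424.539),
    1000: (6130.139, 6907.805),
    1200: (6458.679, 7229.672),
    1450: (7876.312, 8670.16),
}


def pne_overhead(no_pne_ms, pne_ms):
    """Relative extra time when the packet number is also encrypted."""
    return (pne_ms - no_pne_ms) / no_pne_ms


def rate_mbps(payload_bytes, elapsed_ms):
    """Payload throughput in megabits per second (assumed definition)."""
    bits = payload_bytes * 8 * ITERATIONS
    return bits / (elapsed_ms / 1000.0) / 1e6


def total_cpu_overhead(crypto_overhead, crypto_share):
    """Whole-stack overhead if crypto is only crypto_share of total CPU."""
    return crypto_overhead * crypto_share


if __name__ == "__main__":
    for size, (base_ms, pne_ms) in TIMES_MS.items():
        print(
            f"{size:>5} bytes: overhead {pne_overhead(base_ms, pne_ms):.1%}, "
            f"{rate_mbps(size, base_ms):.1f} -> {rate_mbps(size, pne_ms):.1f} Mbps"
        )
    # Ian's estimate: under 4% extra crypto cost, crypto under 10% of CPU.
    print(f"whole-stack bound: {total_cpu_overhead(0.04, 0.10):.2%}")
```

With a 4% crypto overhead and a 10% crypto share, the product is 0.4%, which is consistent with Ian's "less than 0.5% total".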