Re: Proposal to replace ACK block count with ACK length
Eric Rescorla <ekr@rtfm.com> Fri, 22 June 2018 13:11 UTC
From: Eric Rescorla <ekr@rtfm.com>
Date: Fri, 22 Jun 2018 06:11:05 -0700
Message-ID: <CABcZeBPyMBKvY_K6nQSNbxvGXhF2o3hMKeTFgvmbWPgkEyFKaw@mail.gmail.com>
Subject: Re: Proposal to replace ACK block count with ACK length
To: Kazuho Oku <kazuhooku@gmail.com>
Cc: "Deval, Manasi" <manasi.deval@intel.com>, Marten Seemann <martenseemann@gmail.com>, Ian Swett <ianswett@google.com>, Jana Iyengar <jri.ietf@gmail.com>, Praveen Balasubramanian <pravb@microsoft.com>, IETF QUIC WG <quic@ietf.org>, Martin Thomson <martin.thomson@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/xU8k4xd-qeiFCdu_uCZx6jq_F6o>
It seems like there are two questions at hand here:

1. Would it be architecturally better to have frames have a consistent
   self-contained representation?
2. Is it enough better that we should do so now?

I agree with Kazuho that (a) we don't have that representation now and
(b) it would be a better design to do so. I'm perhaps somewhat more
positive on (2) than he is. I don't think it's critically important that
we make the change, but if we were to hold a consensus call, I think I
would be in favor. I'd certainly be interested in looking at a proposal
if someone else were to make one.

-Ekr

On Thu, Jun 21, 2018 at 8:45 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:

> 2018-06-22 8:08 GMT+09:00 Deval, Manasi <manasi.deval@intel.com>:
> > I feel that the requirement to have every value valid is somewhat
> > academic.
>
> I think that Marten is correct in pointing out that making the ACK
> frame self-contained (by having a field that represents the number of
> octets being consumed by the frame) would be an exception from the
> design pattern we have.
>
> Look at the STREAM frame. The frame is not self-contained. Instead, it
> has a Length field for the Stream Data, which is a leaf. The same goes
> for the NEW_CONNECTION_ID frame (which has a length field for the
> Connection ID field, also a leaf) and CONNECTION_CLOSE (length field
> for Reason Phrase).
>
> I agree with Marten, Mikkel (and possibly others as well) that having
> consistency is important.
>
> Therefore, I agree with Mikkel that we should consider making every
> frame self-contained or keeping every frame as-is (i.e. not
> self-contained).
>
> FWIW, as described in the latter half of
> https://www.ietf.org/mail-archive/web/quic/current/msg04287.html, it
> is possible to make every frame self-contained *and* also make ACK
> frames smaller than the current draft. Making every frame
> self-contained gives us the possibility to send new extension frames
> without negotiation.
>
> I do not think that I would push for making every frame self-contained
> by myself (because I do not think it meets the high bar to have a
> change at such a late moment of standardization), but as stated, my
> preference goes to seeing every frame made self-contained or none of
> them made as such.
>
> > The length value provides much more value than the block count, and
> > the fact that certain values can never be achieved is an inherent
> > property of the length.
> >
> > One interesting observation is that this property is not limited to
> > length. One can even make a similar argument about ACK block count.
> > The maximum number of ACK blocks that can be defined will not always
> > have a meaningful value. In our examples, 0, 1, 2, 3 are all valid.
> > If I set the two most significant bits of the ACK block count varint
> > to 11, the maximum value of ACK block count is 4611686018427387903.
> > This is the same as the maximum value of Largest Acknowledged, so if
> > the ACK block count was set to this value, it would still be
> > meaningless.
> >
> > Thanks,
> >
> > Manasi
> >
> > From: Marten Seemann [mailto:martenseemann@gmail.com]
> > Sent: Tuesday, June 19, 2018 6:44 PM
> > To: Deval, Manasi <manasi.deval@intel.com>
> > Cc: Ian Swett <ianswett@google.com>; Kazuho Oku <kazuhooku@gmail.com>;
> > Eric Rescorla <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>;
> > Praveen Balasubramanian <pravb@microsoft.com>; IETF QUIC WG
> > <quic@ietf.org>; Martin Thomson <martin.thomson@gmail.com>
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> > Hi Manasi,
> >
> >> The risk of disagreement between ack blocks and ack block count is
> >> the same as the risk of disagreement between ack blocks and ack
> >> length. Either way this needs to be counted up while creating the
> >> ACK and counted down while parsing it. The possibility of error is
> >> the same. Getting the ack block count wrong is as problematic as
> >> getting the ack length wrong. Do you agree?
> >
> > I disagree. Let's take an example of an ACK frame with one ACK range
> > that needs a 2 byte varint to represent the First ACK Block and
> > another 2 byte varint to represent the Gap.
> >
> > With your proposal:
> >
> > * The values 0 and 1 are invalid, since the length field itself is
> >   included in the length.
> > * The values 2, 3, ..., (2 + len(LargestAcknowledged) +
> >   len(AckDelay)) - 1 are invalid, since the length needs to include
> >   the Largest Acknowledged and the Ack Delay.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) would be the
> >   first valid value, and correspond to an ACK frame with no blocks.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 1 is
> >   invalid, since it would cut the varint for the First ACK Block.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 2 is
> >   invalid, since it would cut the frame after the First ACK Block
> >   (but every block must be followed by a gap length).
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 3 is
> >   invalid, since it would cut the varint for the Gap.
> > * Finally, the value 2 + len(LargestAcknowledged) + len(AckDelay) + 4
> >   is valid.
> >
> > There are *a lot* of invalid values that you can encode into the ACK
> > length field. More importantly, *none* of these error cases exists
> > with the current frame format.
> >
> > The *only* error case that can occur with our current format is that
> > the packet is too short for the number of ACK blocks that are
> > supposed to be contained in the frame. This can occur with your
> > proposal as well (in addition to the error cases listed above).
> >
> > My concern is not that it's impossible or even particularly hard to
> > catch these errors, but I dislike the property that some (in fact,
> > most) encodable values are invalid.
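To make the enumeration above concrete, here is a minimal Python sketch. It assumes the simplified layout used in the example (a fixed 2-byte ACK Length field that counts itself, then Largest Acknowledged, ACK Delay, and First ACK Block / Gap varint pairs) rather than the exact draft wire format, and the function names are made up for illustration.

```python
LENGTH_FIELD_SIZE = 2  # assumption: fixed 2-byte ACK Length that includes itself


def valid_ack_lengths(la_len, delay_len, pairs):
    """Return the ACK Length values that end exactly on a field boundary.

    la_len and delay_len are the encoded sizes (in bytes) of Largest
    Acknowledged and ACK Delay; pairs is a list of (block_len, gap_len)
    encoded sizes, following the simplified layout of the example.
    """
    offset = LENGTH_FIELD_SIZE + la_len + delay_len
    valid = [offset]  # a frame with no blocks
    for block_len, gap_len in pairs:
        offset += block_len + gap_len
        valid.append(offset)  # a frame ending after a complete pair
    return valid


def invalid_ack_lengths(la_len, delay_len, pairs):
    """Return every value up to the full frame size that is not valid."""
    valid = valid_ack_lengths(la_len, delay_len, pairs)
    return [n for n in range(valid[-1] + 1) if n not in valid]


# One ACK range with a 2-byte First ACK Block and a 2-byte Gap, assuming
# 1-byte Largest Acknowledged and ACK Delay fields.
print(valid_ack_lengths(1, 1, [(2, 2)]))
print(invalid_ack_lengths(1, 1, [(2, 2)]))
```

With those sizes only 4 and 8 are acceptable lengths; the other seven values from 0 to 8 are encodable but would have to be rejected by the parser.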
> > > > > > > > Best, > > > > Marten > > > > > > > > > > > > On Wed, Jun 20, 2018 at 3:21 AM Deval, Manasi <manasi.deval@intel.com> > > wrote: > > > > Hi Ian, > > > > > > > > Here is another attempt to solve the objections you raised: > > > > > > > >>I'm not a fan of this proposal, because I think it is impractical to drop > >> the number of ack blocks, because with the ECN proposal it becomes > >> impractically complex to parse. > > > > Is there a reason the proposal from Christian does not solve this > problem? > > > > > > > >>If we don't remove the number of ack blocks, then the ack frame is > larger, > >> but I don't think the extra size field is useful for most > implementations. > >> Also, it means the length can disagree with the actual length, which add > >> complexity and the possibility of writing error-prone code. The idea of > >> someone offloading ack processing and then proceeding to trust the > length > >> seems like someone could get wrong and cause some concerning issues. > > > > The risk of disagreement between ack blocks and ack block count is same > as > > the risk of disagreement between ack blocks and ack length. Either way > this > > needs to be counted up while creating the ACK and counted down while > parsing > > it. The possibility of error is the same. Getting the ack block count > wrong > > is as problematic as getting the ack length wrong. Do you agree? > > > > > > > >>My experience is multithreaded packet processing is more cost and work > than > >> it's worth. Sure you can't fill a 100G NIC with one connection, but > that > >> seems like an academic problem, not one for workloads I've seen. > Typically > >> the extra cost of multithreading outweighs its value. > > > > The value is two fold – pre-processing and multi-threading. 
If we > > pre-process the received packets such that ACKs and streams can be > > coalesced, the receive side can indicate a large chunk of information > though > > the kernel, reducing the cost of system call and protocol overhead. This > is > > the same concept as UDP segmentation taken a step further on receive > side. > > After this chunk is indicated into the QUIC protocol, the protocol may > > process stream and ACK in parallel. While folks may or may not utilize > this, > > there is an advantage here. > > > > > > > > Thanks, > > > > Manasi > > > > > > > > > > > > From: Ian Swett [mailto:ianswett@google.com] > > Sent: Tuesday, June 19, 2018 11:26 AM > > To: Deval, Manasi <manasi.deval@intel.com> > > Cc: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku > > <kazuhooku@gmail.com>; Eric Rescorla <ekr@rtfm.com>; Jana Iyengar > > <jri.ietf@gmail.com>; Praveen Balasubramanian <pravb@microsoft.com>; > IETF > > QUIC WG <quic@ietf.org>; Martin Thomson <martin.thomson@gmail.com> > > > > > > Subject: Re: Proposal to replace ACK block count with ACK length > > > > > > > > I'm still not interested in this change, for the reasons I stated above. > > > > > > > > On Tue, Jun 19, 2018 at 2:21 PM Deval, Manasi <manasi.deval@intel.com> > > wrote: > > > > Hi All, > > > > > > > > Do we have agreement here to create a new PR? 
> > > > > > > > Thanks, > > > > Manasi > > > > > > > > From: Deval, Manasi > > Sent: Sunday, June 17, 2018 2:25 PM > > To: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku > > <kazuhooku@gmail.com> > > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla > > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen > Balasubramanian > > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson > > <martin.thomson@gmail.com> > > Subject: RE: Proposal to replace ACK block count with ACK length > > > > > > > > Hi All, > > > > > > > > I have made a list of objections to the proposal and the solutions to > those > > objections discussed on this thread. > > > > > > > > a. Co-existence of length field with ECN field and ACK blocks. > > > > > > > > Christian suggested to move the ECN fields to precede the ACK blocks. > This > > is an elegant solution. Parsing entire list of ACK blocks to review ECN > bits > > would have been annoying, even though it can work. > > > > > > > > b. There are two cases to be parsed – entire ACK and parse ACK to > > identify length. There are some reservations when ACK parsing gets harder > > for the case where the entire header needs to be parsed. > > > > > > > > Agreement from several folks here. In the original ACK defined in draft > 12 > > of the slide, one would count down number of ACK blocks to get to the > end of > > the packet. In the proposal I made, one would count down the length to > > identify the end of the packet. The logic is very similar in cycle count > and > > complexity. Several folks also commented to this effect. > > > > > > > > c. Multi-threaded packet processing > > > > > > > > I would expect that there are 10s of 1000s of connections in use at any > time > > for a server with a high speed link. Multi-threading to handle each of > these > > flows / connections in parallel is necessity to be able to support large > > number of connections on a high speed link. 
Tx segmentation, Rx > coalescing > > are well known strategies to reduce the processing cost. In initial > stages, > > code is often written as a single-threaded and then re-factored to > > parallelize cycle intensive operations. In order to allow this protocol > to > > scale in future, I would suggest we do not preclude this case. > > > > > > > > d. Increase in ACK size by 1 byte. > > > > > > > > I do not see this as a serious issue but if folks but we can consider > making > > this a varint, if others have strong feelings about it. It’s a trade-off > : 2 > > reads to save 1 byte. > > > > > > > > e. Every encodable value should be valid > > > > Not every length will be valid. This is inherent to lengths. This same > issue > > ails the ‘payload length’ in QUIC header. Not only does the issue exist > for > > small values, it also applies to large values since data stream will be > sent > > after crypto negotiation. E.g. - how does one craft a payload with 62 > bit > > payload length in a large header? > > > > > > > > > > > > Thanks, > > > > Manasi > > > > > > > > > > > > From: Marten Seemann [mailto:martenseemann@gmail.com] > > Sent: Sunday, June 17, 2018 6:59 AM > > To: Kazuho Oku <kazuhooku@gmail.com> > > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla > > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen > Balasubramanian > > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson > > <martin.thomson@gmail.com>; Deval, Manasi <manasi.deval@intel.com> > > Subject: Re: Proposal to replace ACK block count with ACK length > > > > > > > > Maybe it's specific to Go, but I'm using a single io.Reader for the whole > > packet, so as long as the packet payload is long enough, the varint > parsing > > will not fail. > > > > I don't think that specifics of programming languages matter here though, > > and I'm sure both frame formats can be reasonably implemented in C as > well > > as in Go. 
The reasons I'm opposed to Manasi's proposal are that it moves > us > > away from the principle that only reasonable values should be encodable, > and > > that it increases the size of the ACK frame, for the questionable > benefit of > > being able to parallelise the frame parser. > > > > > > > > On Sun, Jun 17, 2018 at 8:48 PM Kazuho Oku <kazuhooku@gmail.com> wrote: > > > > 2018-06-17 22:36 GMT+09:00 Marten Seemann <martenseemann@gmail.com>: > >> At least for my implementation, parsing doesn't become easier, it > becomes > >> more complex with this proposal. My varint-parser always consumes as > many > >> bytes as the varint requires, so after parsing a varint, I'd have to > >> introduce an additional check that this didn't overflow the ACK length > >> (e.g. > >> consider that I parsed the ACK frame so far that only 2 bytes are > >> remaining > >> according to ACK length field, but the next varint is 4 bytes long). > > > > Isn't your varint parser checking that it has not (or will not) run > > across the end of the packet payload for every ACK block it parses? > > I'd assume that you would be doing that, because I think that is > > necessary to avoid buffer overrun. > > > > What I am saying that that check could be converted to a overrun check > > against the end of the "frame payload", and that checking the > > remaining block count becomes unnecessary, in case we replace ACK > > Block Count with ACK Frame Length. > > > >> > >> In general, we've been moving the wire image towards making every > >> encodable > >> value valid. This proposal moves us away from that principle: > >> * some small values are always invalid (the length can never be between > 0 > >> and 3) > >> * a lot of intermediate values are invalid (if the boundary falls > inside a > >> varint, as described above) > >> Both these cases can't occur with the current ACK frame format. 
> >>
> >> On Sun, Jun 17, 2018 at 7:54 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> >>>
> >>> 2018-06-17 8:34 GMT+09:00 Ian Swett
> >>> <ianswett=40google.com@dmarc.ietf.org>:
> >>> > I'm not a fan of this proposal, because I think it is impractical
> >>> > to drop the number of ack blocks, because with the ECN proposal it
> >>> > becomes impractically complex to parse.
> >>>
> >>> For the ECN proposal, as Christian has suggested, we can move the ECN
> >>> counters before the ACK blocks. Then, it would not be complex to
> >>> parse.
> >>>
> >>> And my view is that parsing becomes easier if we replace ACK Block
> >>> Count with ACK Frame Length.
> >>>
> >>> Now, with ACK Block Count, we need to check the remaining number of
> >>> blocks and the remaining space in the packet payload for every block
> >>> that we parse. Failing to check either leads to a bug or a security
> >>> issue.
> >>>
> >>> If we switch to ACK Frame Length, we need to only check the remaining
> >>> space in the frame.
> >>>
> >>> I think that this is the biggest benefit of replacing ACK Block Count
> >>> with ACK Frame Length. OTOH the downside is that you need extra one to
> >>> two bits (one if the size of block / gap is expected to be below 65,
> >>> two if they are expected to be above that) for encoding ACK Frame
> >>> Length compared to ACK Block Count.
> >>>
> >>> Having said that, I honestly wonder if all the frames could have
> >>> their length encoded (either explicitly or as a signal that says
> >>> "to the end of the packet"). Consider something like below:
> >>>
> >>> |0| frame-type (7) | frame-payload-length (i) | frame-payload (*) |
> >>> or
> >>> |1| frame-type (7) | frame-payload (*) |
> >>>
> >>> When the MSB of the first octet is set to zero, the length of the
> >>> frame payload is designated by the varint that immediately follows
> >>> the frame type.
> >>> When the MSB of the first octet is set to one, the frame payload
> >>> spans to the end of the packet.
> >>>
> >>> In this encoding, we can always omit the Length field of a STREAM
> >>> frame. So the overhead for carrying stream data will be no different
> >>> in practice.
> >>>
> >>> For the ACK frame, we can omit the ACK Block Count field. And the
> >>> overhead will be one to two bits if the ACK frame is sent in the
> >>> middle of the packet (thereby using the encoding with explicit frame
> >>> payload length), or one octet or more shorter if ACK is the last frame
> >>> of the packet.
> >>>
> >>> We are likely to see an increase of overhead for most of the other
> >>> types of frames, but I do not think that would be an issue
> >>> considering that they will be seen far less often than STREAMs and
> >>> ACKs.
> >>>
> >>> To summarize, my anticipation is that we can make all the frames
> >>> self-contained (i.e. the length can be determined without the
> >>> knowledge of how each frame is encoded) without any overhead, if we
> >>> agree on making the frame type space 1 bit smaller.
> >>>
> >>> Finally, the biggest benefit of using a self-contained encoding of
> >>> frames is that we would have the ability to introduce new optional
> >>> frames without negotiation. By making the frames self-contained, QUIC
> >>> endpoints will have the freedom of ignoring the frames that they do
> >>> not understand.
> >>>
> >>> Being able to send QUIC frames defined in extensions without
> >>> negotiating using Transport Parameters will be a win in terms of both
> >>> security (because clients' TP is sent in clear) and flexibility
> >>> (because it will be possible to send the extensions before we figure
> >>> out whether the peer supports that extension).
> >>>
> >>> > If we don't remove the number of ack blocks, then the ack frame is
> >>> > larger, but I don't think the extra size field is useful for most
> >>> > implementations.
> >>> > Also, it means the length can disagree with the actual length, which > >>> > add > >>> > complexity and the possibility of writing error-prone code. The idea > >>> > of > >>> > someone offloading ack processing and then proceeding to trust the > >>> > length > >>> > seems like someone could get wrong and cause some concerning issues. > >>> > > >>> > My experience is multithreaded packet processing is more cost and > work > >>> > than > >>> > it's worth. Sure you can't fill a 100G NIC with one connection, but > >>> > that > >>> > seems like an academic problem, not one for workloads I've seen. > >>> > Typically > >>> > the extra cost of multithreading outweighs its value. > >>> > > >>> > To be clear, I don't think this is an awful idea, but I also don't > see > >>> > the > >>> > value and it adds complexity. I read Manasi's email, but I don't > think > >>> > I > >>> > understand why any of those matter in practice. > >>> > > >>> > On Sat, Jun 16, 2018 at 4:13 PM Eric Rescorla <ekr@rtfm.com> wrote: > >>> >> > >>> >> On Fri, Jun 15, 2018 at 6:46 PM, Marten Seemann > >>> >> <martenseemann@gmail.com> > >>> >> wrote: > >>> >>> > >>> >>> This proposal increases the size of the ACK frame by 1 byte in the > >>> >>> common > >>> >>> case (less than 63 ACK ranges), since the ACK length field here > >>> >>> always > >>> >>> consumes 2 bytes, whereas the ACK Block Count is a variable-length > >>> >>> integer. > >>> >>> Considering how much work we put into minimising the size of the > >>> >>> frames, > >>> >>> this feels like a step in the wrong direction.. > >>> >>> > >>> >>> Regarding the processing cost, I agree with Dmitri. Handling an ACK > >>> >>> frame > >>> >>> requires looping over and making changes to a data structure that > >>> >>> keeps > >>> >>> track of sent packets. This is much more expensive than simply > >>> >>> parsing > >>> >>> a > >>> >>> bunch of varints in the ACK frame. 
It seems unlikely that a > >>> >>> multi-threaded > >>> >>> packet parser would offer any real-world performance benefits. > >>> >> > >>> >> > >>> >> I don't want to overstate the benefit here, but my point isn't that > >>> >> parsing is expensive but that if you want to have a multithreaded > >>> >> packet > >>> >> processing system, then it's nice to have a simpler data structure > >>> >> (the > >>> >> unparsed ACK block) to hand to the ACK processing thread. > >>> >> > >>> >> -Ekr > >>> >> > >>> >> > >>> >>> > >>> >>> On Sat, Jun 16, 2018 at 6:19 AM Martin Thomson > >>> >>> <martin.thomson@gmail.com> > >>> >>> wrote: > >>> >>>> > >>> >>>> When we discussed this before, some people observed that this > >>> >>>> creates > >>> >>>> a need to encode in two passes. That's the trade-off here. (Not > >>> >>>> expressing an opinion.) > >>> >>>> On Fri, Jun 15, 2018 at 3:51 PM Jana Iyengar <jri.ietf@gmail.com> > >>> >>>> wrote: > >>> >>>> > > >>> >>>> > I don't have a strong opinion on this. I'm certainly not opposed > >>> >>>> > to > >>> >>>> > it. > >>> >>>> > Does anyone have a strong opposition? > >>> >>>> > > >>> >>>> > On Fri, Jun 15, 2018 at 3:10 PM Praveen Balasubramanian > >>> >>>> > <pravb@microsoft.com> wrote: > >>> >>>> >> > >>> >>>> >> I agree as well since this can help reduce per packet > processing > >>> >>>> >> overhead. ACKs are going to be the second most common frame > type > >>> >>>> >> so no > >>> >>>> >> objections to special casing. > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Eric > >>> >>>> >> Rescorla > >>> >>>> >> Sent: Friday, June 15, 2018 9:11 AM > >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com> > >>> >>>> >> Cc: Jana Iyengar <jri.ietf@gmail.com>; QUIC WG <quic@ietf.org> > >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK > length > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> I agree with Manasi here. 
This change would allow ack frame > >>> >>>> >> parsing > >>> >>>> >> to be more self-contained, which is an advantage for the parser > >>> >>>> >> and also > >>> >>>> >> potentially for parallelism (because you can quickly find the > >>> >>>> >> frame and then > >>> >>>> >> process it in parallel). > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> -Ekr > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> On Mon, Jun 11, 2018 at 5:22 PM, Deval, Manasi > >>> >>>> >> <manasi.deval@intel.com> wrote: > >>> >>>> >> > >>> >>>> >> In general, varints require some specific logic for parsing. To > >>> >>>> >> skip > >>> >>>> >> over any header, I have to read every single varint. As the > code > >>> >>>> >> sees Stream > >>> >>>> >> and ACK headers most frequently, that is my focus. The Stream > >>> >>>> >> frame has a > >>> >>>> >> length in its third field. > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> ACK parsing, however, needs 6 + 2*num_blocks reads to identify > >>> >>>> >> length. There are two reads each for ‘largest acknowledged’, > ‘ACK > >>> >>>> >> delay’ and > >>> >>>> >> ‘ACK block count’. The pain point is the total number of cycles > >>> >>>> >> parse an > >>> >>>> >> ACK. If I am processing 10M pps, where 10% - 30% of the packets > >>> >>>> >> have a > >>> >>>> >> piggybacked ACK, these cycles becomes a significant bottleneck. 
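As a rough illustration of the skipping cost being discussed, the Python sketch below walks a count-based ACK frame only to find where it ends. The field order (Largest Acknowledged, ACK Delay, ACK Block Count, First ACK Block, then Gap / ACK Block pairs) is an assumption based on the draft under discussion, the helper names are hypothetical, and it counts varints decoded rather than individual memory reads.

```python
def read_varint(buf, pos):
    """Decode one QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("varint starts past end of buffer")
    first = buf[pos]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4 or 8 bytes
    if pos + length > len(buf):
        raise ValueError("varint runs past end of buffer")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def skip_ack_frame(buf, pos):
    """Find the end of an ACK frame whose body starts at pos.

    pos points just past the frame type byte. Returns (end_pos,
    varints_decoded). Every varint has to be decoded, because there is
    no length field to jump over.
    """
    _largest, pos = read_varint(buf, pos)
    _delay, pos = read_varint(buf, pos)
    block_count, pos = read_varint(buf, pos)
    decoded = 3
    # First ACK Block, then block_count (Gap, ACK Block) pairs.
    for _ in range(1 + 2 * block_count):
        _, pos = read_varint(buf, pos)
        decoded += 1
    return pos, decoded
```

Under this layout the number of varints decoded grows as 4 + 2 * block_count, whereas a frame carrying an explicit length could be skipped after decoding the length alone.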
> >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> Thanks, > >>> >>>> >> > >>> >>>> >> Manasi > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Jana > >>> >>>> >> Iyengar > >>> >>>> >> Sent: Monday, June 11, 2018 3:11 PM > >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com>; QUIC WG > >>> >>>> >> <quic@ietf.org> > >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK > length > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> You're right that we no longer have the ability to skip an ACK > >>> >>>> >> frame, > >>> >>>> >> and this crept in when we moved to varints. > >>> >>>> >> > >>> >>>> >> I believe your problem though is generally true of most frames > >>> >>>> >> not > >>> >>>> >> just ACKs, since ids, packet numbers, and numbers in all frames > >>> >>>> >> are now all > >>> >>>> >> varints. To skip any frame, you'll need to parse the varint > >>> >>>> >> fields > >>> >>>> >> in those > >>> >>>> >> frames. If you have logic to process and skip varints, then > >>> >>>> >> skipping the ack > >>> >>>> >> block section is merely repeating this operation > (2*num_block+1) > >>> >>>> >> times. Do > >>> >>>> >> you see specific value in skipping ACK frames over the other > >>> >>>> >> control frames? > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> > >>> >>>> >> On Mon, Jun 11, 2018 at 8:43 AM Dmitri Tikhonov > >>> >>>> >> <dtikhonov@litespeedtech..com> wrote: > >>> >>>> >> > >>> >>>> >> On Mon, Jun 11, 2018 at 03:33:35PM +0000, Deval, Manasi wrote: > >>> >>>> >> > - Moving the ACK length to the front of the ACK allows > >>> >>>> >> > the > >>> >>>> >> > flexibility of either reading the entire ACK or > >>> >>>> >> > reading > >>> >>>> >> > the > >>> >>>> >> > first 16 bits and skipping over the length. 
This is > a > >>> >>>> >> > useful > >>> >>>> >> > feature for the case where ACK processing is split > >>> >>>> >> > into > >>> >>>> >> > multiple layers. Depending on the processor this is > >>> >>>> >> > run > >>> >>>> >> > on, > >>> >>>> >> > there are different advantages - > >>> >>>> >> > >>> >>>> >> Just a note. In my experience, the cost of parsing an ACK > frame > >>> >>>> >> is > >>> >>>> >> negligible compared to the cost of processing an ACK frame: > that > >>> >>>> >> is, > >>> >>>> >> poking at various memory locations to discard newly ACKed > >>> >>>> >> packets. > >>> >>>> >> > >>> >>>> >> - Dmitri. > >>> >>>> >> > >>> >>>> >> > >>> >>>> > >>> >> > >>> > > >>> > >>> > >>> > >>> -- > >>> Kazuho Oku > > > > > > > > -- > > Kazuho Oku > > > > -- > Kazuho Oku >
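For reference, the self-contained framing that Kazuho sketches in the quoted text (MSB of the first octet selects between an explicit varint payload length and "to the end of the packet") could be walked roughly as follows. This is only a sketch of that idea, not any draft's wire format; `split_frames` and `read_varint` are hypothetical names.

```python
def read_varint(buf, pos):
    """Decode one QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("varint starts past end of buffer")
    first = buf[pos]
    length = 1 << (first >> 6)  # top two bits select 1, 2, 4 or 8 bytes
    if pos + length > len(buf):
        raise ValueError("varint runs past end of buffer")
    value = first & 0x3F
    for byte in buf[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def split_frames(payload):
    """Split a packet payload into (frame_type, frame_payload) tuples.

    Frame boundaries are found without knowing how any frame type is
    encoded internally, so unknown types can simply be ignored.
    """
    frames = []
    pos = 0
    while pos < len(payload):
        first = payload[pos]
        frame_type = first & 0x7F  # low 7 bits carry the type
        pos += 1
        if first & 0x80:
            # MSB set: the payload spans to the end of the packet.
            frames.append((frame_type, payload[pos:]))
            break
        # MSB clear: an explicit varint payload length follows the type.
        length, pos = read_varint(payload, pos)
        if pos + length > len(payload):
            raise ValueError("frame payload overruns the packet")
        frames.append((frame_type, payload[pos:pos + length]))
        pos += length
    return frames


KNOWN_TYPES = {0x01}  # hypothetical set of frame types this endpoint handles
packet = bytes([0x05, 0x02, 0xAA, 0xBB, 0x81, 0x01, 0x02, 0x03])
handled = [f for f in split_frames(packet) if f[0] in KNOWN_TYPES]
```

The only per-frame check needed is the overrun check against the end of the packet, and the frame of unknown type 0x05 is skipped without any knowledge of its contents.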