Re: Proposal to replace ACK block count with ACK length

Eric Rescorla <ekr@rtfm.com> Fri, 22 June 2018 13:11 UTC

In-Reply-To: <CANatvzzEV=BGJXFuOnDfhXJQV78aWFf84joMknRExY48vu8OYw@mail.gmail.com>
From: Eric Rescorla <ekr@rtfm.com>
Date: Fri, 22 Jun 2018 06:11:05 -0700
Message-ID: <CABcZeBPyMBKvY_K6nQSNbxvGXhF2o3hMKeTFgvmbWPgkEyFKaw@mail.gmail.com>
Subject: Re: Proposal to replace ACK block count with ACK length
To: Kazuho Oku <kazuhooku@gmail.com>
Cc: "Deval, Manasi" <manasi.deval@intel.com>, Marten Seemann <martenseemann@gmail.com>, Ian Swett <ianswett@google.com>, Jana Iyengar <jri.ietf@gmail.com>, Praveen Balasubramanian <pravb@microsoft.com>, IETF QUIC WG <quic@ietf.org>, Martin Thomson <martin.thomson@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/xU8k4xd-qeiFCdu_uCZx6jq_F6o>

It seems like there are two questions at hand here:

1. Would it be architecturally better to have frames have a consistent,
   self-contained representation?
2. Is it enough better that we should do so now?

I agree with Kazuho that (a) we don't have that representation now and (b)
it would be a better design to have one. I'm perhaps somewhat more positive
on (2) than he is. I don't think it's critically important that we make the
change, but if we were to hold a consensus call, I think I would be in favor.
I'd certainly be interested in looking at a proposal if someone else were to
make one.

-Ekr



On Thu, Jun 21, 2018 at 8:45 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:

> 2018-06-22 8:08 GMT+09:00 Deval, Manasi <manasi.deval@intel.com>:
> > I feel that the requirement to have every value valid is somewhat
> > academic.
>
> I think that Marten is correct in pointing out that making the ACK
> frame self-contained (by having a field that represents the number of
> octets consumed by the frame) would be an exception to the
> design pattern we have.
>
> Look at the STREAM frame. The frame is not self-contained. Instead, it has
> a Length field for the Stream Data, which is a leaf. The same goes for
> the NEW_CONNECTION_ID frame (which has a length field for the Connection ID
> field, which is also a leaf) and CONNECTION_CLOSE (a length field for
> the Reason Phrase).
>
> I agree with Marten, Mikkel (and possibly others as well) that having
> consistency is important.
>
> Therefore, I agree with Mikkel that we should consider making every
> frame self-contained or keeping every frame as-is (i.e. not
> self-contained).
>
> FWIW, as described in the latter half of
> https://www.ietf.org/mail-archive/web/quic/current/msg04287.html, it
> is possible to make every frame self-contained *and* also make ACK
> frames smaller than the current draft. Making every frame
> self-contained gives us the possibility to send new extension frames
> without negotiation.
>
> I do not think that I would push for making every frame self-contained
> by myself (because I do not think it meets the high bar to have a
> change at such a late moment of standardization), but as stated, my
> preference goes to seeing every frame made self-contained or none of
> them made as such.
>
> > The length value provides much more value than the block count and the
> > fact that certain values can never be achieved is an inherent property of
> > the length.
> >
> >
> >
> > One interesting observation is that this property is not limited to
> > length. One can even make a similar argument about ACK block count. The
> > maximum number of ACK blocks that can be encoded will not always be a
> > meaningful value. In our examples, 0, 1, 2, 3 are all valid. If I set the
> > two most significant bits of the ACK block count to 11, the maximum
> > number of ACK blocks is 4611686018427387903. This is the same as the
> > maximum value of largest acknowledged, so if the ACK block count were set
> > to this value, it would still be meaningless.
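
For reference, the figure quoted above is the largest value a QUIC variable-length integer can carry: a 2-bit length prefix selects a 1, 2, 4 or 8 byte encoding, which leaves at most 62 usable bits. A minimal Python sketch of that encoding follows. The function names are illustrative only and are not taken from any implementation discussed in this thread.

```python
from __future__ import annotations

MAX_VARINT = (1 << 62) - 1  # largest value that fits in 62 usable bits


def encode_varint(value: int) -> bytes:
    """Encode value using the shortest QUIC varint form (1, 2, 4 or 8 bytes)."""
    if value < 0:
        raise ValueError("varints are unsigned")
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            # The 2-bit prefix sits in the top two bits of the first byte.
            return (value | (prefix << usable_bits)).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")


def decode_varint(buf: bytes, offset: int = 0) -> tuple[int, int]:
    """Return (value, number of bytes consumed) for the varint at offset."""
    if offset >= len(buf):
        raise ValueError("no bytes left")
    length = 1 << (buf[offset] >> 6)
    if offset + length > len(buf):
        raise ValueError("truncated varint")
    raw = int.from_bytes(buf[offset:offset + length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), length
```

With this sketch, `encode_varint(MAX_VARINT)` yields eight `0xff` bytes, so an ACK block count carrying that value is encodable even though no real frame could contain that many blocks.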
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> >
> >
> > From: Marten Seemann [mailto:martenseemann@gmail.com]
> > Sent: Tuesday, June 19, 2018 6:44 PM
> > To: Deval, Manasi <manasi.deval@intel.com>
> > Cc: Ian Swett <ianswett@google.com>; Kazuho Oku <kazuhooku@gmail.com>;
> Eric
> > Rescorla <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> > Balasubramanian <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>;
> Martin
> > Thomson <martin.thomson@gmail.com>
> >
> >
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Hi Manasi,
> >
> >
> >
> >> The risk of disagreement between ack blocks and ack block count is the
> >> same as the risk of disagreement between ack blocks and ack length.
> >> Either way this needs to be counted up while creating the ACK and counted
> >> down while parsing it. The possibility of error is the same. Getting the
> >> ack block count wrong is as problematic as getting the ack length wrong.
> >> Do you agree?
> >
> >
> >
> > I disagree. Let's take an example of an ACK frame with one ACK range
> > that needs a 2-byte varint to represent the First ACK Block and another
> > 2-byte varint to represent the Gap.
> >
> > With your proposal:
> >
> > * The values 0 and 1 are invalid, since the length field itself is
> >   included in the length.
> > * The values 2, 3, ..., (2 + len(LargestAcknowledged) + len(AckDelay)) - 1
> >   are invalid, since the length needs to include the Largest Acknowledged
> >   and the Ack Delay.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) would be the
> >   first valid value, and corresponds to an ACK frame with no blocks.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 1 is invalid,
> >   since it would cut the varint for the First ACK Block.
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 2 is invalid,
> >   since it would cut the frame after the First ACK Block (but every block
> >   must be followed by a gap length).
> > * The value 2 + len(LargestAcknowledged) + len(AckDelay) + 3 is invalid,
> >   since it would cut the varint for the Gap.
> > * Finally, the value 2 + len(LargestAcknowledged) + len(AckDelay) + 4 is
> >   valid.
> >
> > There are *a lot* of invalid values that you can encode into the ACK
> > length field. More importantly, *none* of these error cases exists with
> > the current frame format.
> >
> > The *only* error case that can occur with our current format is that the
> > packet is too short for the number of ACK blocks that are supposed to be
> > contained in the frame. This can occur with your proposal as well (in
> > addition to the error cases listed above).
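
Marten's enumeration above can be reproduced mechanically. The Python sketch below is only an illustration under the assumptions of his example: a 2-byte length field that counts itself, and a frame that may end either right after the Ack Delay or after a complete block/gap pair. `valid_ack_lengths`, `classify_lengths` and their parameters are hypothetical names, not part of any draft or implementation.

```python
def valid_ack_lengths(len_largest_acked, len_ack_delay, pair_sizes,
                      len_field_size=2):
    """Return the length values at which the frame can legitimately end.

    pair_sizes is a list of (block_varint_size, gap_varint_size) tuples,
    following the model used in the example (an assumption of this sketch).
    """
    end = len_field_size + len_largest_acked + len_ack_delay
    valid = [end]  # frame with no blocks
    for block_size, gap_size in pair_sizes:
        end += block_size + gap_size
        valid.append(end)  # frame ending after a complete block/gap pair
    return valid


def classify_lengths(len_largest_acked, len_ack_delay, pair_sizes):
    """Map every encodable length up to the frame end to True (valid) or False."""
    valid = set(valid_ack_lengths(len_largest_acked, len_ack_delay, pair_sizes))
    return {n: n in valid for n in range(max(valid) + 1)}


# 1-byte Largest Acknowledged, 1-byte Ack Delay, one 2-byte block + 2-byte gap.
table = classify_lengths(1, 1, [(2, 2)])
invalid = [n for n, ok in table.items() if not ok]
```

For these field sizes only the lengths 4 and 8 are valid, and the seven other values from 0 through 8 are encodable but invalid, which matches the pattern in the list above.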
> >
> >
> >
> > My concern is not that it's impossible or even particularly hard to catch
> > these errors, but I dislike the property that some (in fact, most)
> > encodable values are invalid.
> >
> >
> >
> > Best,
> >
> > Marten
> >
> >
> >
> >
> >
> > On Wed, Jun 20, 2018 at 3:21 AM Deval, Manasi <manasi.deval@intel.com>
> > wrote:
> >
> > Hi Ian,
> >
> >
> >
> > Here is another attempt to solve the objections you raised:
> >
> >
> >
> >>I'm not a fan of this proposal, because I think it is impractical to drop
> >> the number of ack blocks, because with the ECN proposal it becomes
> >> impractically complex to parse.
> >
> > Is there a reason the proposal from Christian does not solve this
> > problem?
> >
> >
> >
> >>If we don't remove the number of ack blocks, then the ack frame is
> >> larger, but I don't think the extra size field is useful for most
> >> implementations. Also, it means the length can disagree with the actual
> >> length, which add complexity and the possibility of writing error-prone
> >> code.  The idea of someone offloading ack processing and then proceeding
> >> to trust the length seems like someone could get wrong and cause some
> >> concerning issues.
> >
> > The risk of disagreement between ack blocks and ack block count is the
> > same as the risk of disagreement between ack blocks and ack length. Either
> > way this needs to be counted up while creating the ACK and counted down
> > while parsing it. The possibility of error is the same. Getting the ack
> > block count wrong is as problematic as getting the ack length wrong. Do
> > you agree?
> >
> >
> >
> >>My experience is multithreaded packet processing is more cost and work
> >> than it's worth.  Sure you can't fill a 100G NIC with one connection, but
> >> that seems like an academic problem, not one for workloads I've seen.
> >> Typically the extra cost of multithreading outweighs its value.
> >
> > The value is twofold: pre-processing and multi-threading. If we
> > pre-process the received packets such that ACKs and streams can be
> > coalesced, the receive side can indicate a large chunk of information
> > through the kernel, reducing the cost of system calls and protocol
> > overhead. This is the same concept as UDP segmentation, taken a step
> > further on the receive side. After this chunk is indicated into the QUIC
> > protocol, the protocol may process stream and ACK in parallel. While folks
> > may or may not utilize this, there is an advantage here.
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> > From: Ian Swett [mailto:ianswett@google.com]
> > Sent: Tuesday, June 19, 2018 11:26 AM
> > To: Deval, Manasi <manasi.deval@intel.com>
> > Cc: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku
> > <kazuhooku@gmail.com>; Eric Rescorla <ekr@rtfm.com>; Jana Iyengar
> > <jri.ietf@gmail.com>; Praveen Balasubramanian <pravb@microsoft.com>;
> IETF
> > QUIC WG <quic@ietf.org>; Martin Thomson <martin.thomson@gmail.com>
> >
> >
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > I'm still not interested in this change, for the reasons I stated above.
> >
> >
> >
> > On Tue, Jun 19, 2018 at 2:21 PM Deval, Manasi <manasi.deval@intel.com>
> > wrote:
> >
> > Hi All,
> >
> >
> >
> > Do we have agreement here to create a new PR?
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> > From: Deval, Manasi
> > Sent: Sunday, June 17, 2018 2:25 PM
> > To: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku
> > <kazuhooku@gmail.com>
> > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla
> > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> Balasubramanian
> > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson
> > <martin.thomson@gmail.com>
> > Subject: RE: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Hi All,
> >
> >
> >
> > I have made a list of objections to the proposal and the solutions to
> > those objections discussed on this thread.
> >
> >
> >
> > a.      Co-existence of length field with ECN field and ACK blocks.
> >
> >
> >
> > Christian suggested moving the ECN fields to precede the ACK blocks.
> > This is an elegant solution. Parsing the entire list of ACK blocks to
> > review the ECN bits would have been annoying, even though it can work.
> >
> >
> >
> > b.      There are two cases to be parsed: parsing the entire ACK, and
> > parsing the ACK only to identify its length. There are some reservations
> > that ACK parsing gets harder for the case where the entire header needs to
> > be parsed.
> >
> >
> >
> > Agreement from several folks here. In the original ACK defined in draft
> > 12 of the slide, one would count down the number of ACK blocks to get to
> > the end of the packet. In the proposal I made, one would count down the
> > length to identify the end of the packet. The logic is very similar in
> > cycle count and complexity. Several folks also commented to this effect.
> >
> >
> >
> > c.      Multi-threaded packet processing
> >
> >
> >
> > I would expect that there are 10s of 1000s of connections in use at any
> > time for a server with a high speed link. Multi-threading to handle each
> > of these flows / connections in parallel is a necessity to be able to
> > support a large number of connections on a high speed link. Tx
> > segmentation and Rx coalescing are well known strategies to reduce the
> > processing cost. In initial stages, code is often written as
> > single-threaded and then re-factored to parallelize cycle intensive
> > operations. In order to allow this protocol to scale in future, I would
> > suggest we do not preclude this case.
> >
> >
> >
> > d.      Increase in ACK size by 1 byte.
> >
> >
> >
> > I do not see this as a serious issue, but we can consider making this a
> > varint if others have strong feelings about it. It’s a trade-off: 2
> > reads to save 1 byte.
> >
> >
> >
> > e.      Every encodable value should be valid
> >
> > Not every length will be valid. This is inherent to lengths. This same
> > issue ails the ‘payload length’ in the QUIC header. Not only does the
> > issue exist for small values, it also applies to large values, since the
> > data stream will be sent after crypto negotiation. E.g., how does one
> > craft a payload with a 62-bit payload length in a large header?
> >
> >
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> > From: Marten Seemann [mailto:martenseemann@gmail.com]
> > Sent: Sunday, June 17, 2018 6:59 AM
> > To: Kazuho Oku <kazuhooku@gmail.com>
> > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla
> > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> Balasubramanian
> > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson
> > <martin.thomson@gmail.com>; Deval, Manasi <manasi.deval@intel.com>
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Maybe it's specific to Go, but I'm using a single io.Reader for the whole
> > packet, so as long as the packet payload is long enough, the varint
> > parsing will not fail.
> >
> > I don't think that specifics of programming languages matter here though,
> > and I'm sure both frame formats can be reasonably implemented in C as
> > well as in Go. The reasons I'm opposed to Manasi's proposal are that it
> > moves us away from the principle that only reasonable values should be
> > encodable, and that it increases the size of the ACK frame, for the
> > questionable benefit of being able to parallelise the frame parser.
> >
> >
> >
> > On Sun, Jun 17, 2018 at 8:48 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> >
> > 2018-06-17 22:36 GMT+09:00 Marten Seemann <martenseemann@gmail.com>:
> >> At least for my implementation, parsing doesn't become easier, it
> >> becomes more complex with this proposal. My varint-parser always consumes
> >> as many bytes as the varint requires, so after parsing a varint, I'd have
> >> to introduce an additional check that this didn't overflow the ACK length
> >> (e.g. consider that I parsed the ACK frame so far that only 2 bytes are
> >> remaining according to ACK length field, but the next varint is 4 bytes
> >> long).
> >
> > Isn't your varint parser checking that it has not (or will not) run
> > across the end of the packet payload for every ACK block it parses?
> > I'd assume that you would be doing that, because I think that is
> > necessary to avoid buffer overrun.
> >
> > What I am saying is that that check could be converted to an overrun
> > check against the end of the "frame payload", and that checking the
> > remaining block count becomes unnecessary, in case we replace ACK
> > Block Count with ACK Frame Length.
> >
> >>
> >> In general, we've been moving the wire image towards making every
> >> encodable value valid. This proposal moves us away from that principle:
> >> * some small values are always invalid (the length can never be between
> >>   0 and 3)
> >> * a lot of intermediate values are invalid (if the boundary falls
> >>   inside a varint, as described above)
> >> Both these cases can't occur with the current ACK frame format.
> >>
> >> On Sun, Jun 17, 2018 at 7:54 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> >>>
> >>> 2018-06-17 8:34 GMT+09:00 Ian Swett
> >>> <ianswett=40google.com@dmarc.ietf.org>:
> >>> > I'm not a fan of this proposal, because I think it is impractical to
> >>> > drop
> >>> > the number of ack blocks, because with the ECN proposal it becomes
> >>> > impractically complex to parse.
> >>>
> >>> For the ECN proposal, as Christian has suggested, we can move the ECN
> >>> counters before the ACK blocks. Then, it would not be complex to
> >>> parse.
> >>>
> >>> And my view is that parsing becomes easier if we replace ACK Block
> >>> Count with ACK Frame Length.
> >>>
> >>> Now, with ACK Block Count, we need to check the remaining number of
> >>> blocks and the remaining space in the packet payload for every block
> >>> that we parse. Failing to check either leads to a bug or a security
> >>> issue.
> >>>
> >>> If we switch to ACK Frame Length, we need to only check the remaining
> >>> space in the frame.
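
The two parsing styles being compared can be sketched as follows. This is a simplified Python illustration, not any implementation from this thread: it treats the block section as a First ACK Block followed by Gap / ACK Block pairs, all encoded as varints, and the names `read_varint`, `parse_by_count` and `parse_by_length` are hypothetical.

```python
def read_varint(buf, pos, limit):
    """Read one QUIC varint starting at pos without reading past limit."""
    if pos >= limit:
        raise ValueError("no bytes left")
    length = 1 << (buf[pos] >> 6)  # 2-bit prefix selects 1, 2, 4 or 8 bytes
    if pos + length > limit:
        raise ValueError("varint crosses the limit")
    raw = int.from_bytes(buf[pos:pos + length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), pos + length


def parse_by_count(buf, pos, block_count):
    """ACK Block Count style: loop on the count, bound each read by the packet."""
    values = []
    for _ in range(2 * block_count + 1):
        value, pos = read_varint(buf, pos, len(buf))
        values.append(value)
    return values, pos


def parse_by_length(buf, pos, frame_end):
    """ACK Frame Length style: loop until frame_end, bound each read by it."""
    if frame_end > len(buf):
        raise ValueError("frame extends past the packet")
    values = []
    while pos < frame_end:
        value, pos = read_varint(buf, pos, frame_end)
        values.append(value)
    if len(values) % 2 == 0:
        # A valid block section is First ACK Block plus whole Gap/Block pairs.
        raise ValueError("incomplete Gap / ACK Block pair")
    return values, pos
```

In this sketch the length-based loop has a single bounds check per varint, but it also has to reject lengths that fall inside a varint or leave a dangling Gap, which is the class of invalid values raised elsewhere in the thread.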
> >>>
> >>> I think that this is the biggest benefit of replacing ACK Block Count
> >>> with ACK Frame Length. OTOH the downside is that you need extra one to
> >>> two bits (one if the size of block / gap is expected to be below 65,
> >>> two if they are expected to be above that) for encoding ACK Frame
> >>> Length compared to ACK Block Count.
> >>>
> >>>
> >>>
> >>> Having said that, I honestly wonder if all the frames could have their
> >>> length encoded (either explicitly or as a signal that
> >>> says "to the end of the packet"). Consider something like below:
> >>>
> >>> |0| frame-type (7) | frame-payload-length (i) | frame-payload (*) |
> >>>  or
> >>> |1| frame-type (7) | frame-payload (*) |
> >>>
> >>> When the MSB of the first octet is set to zero, the length of the frame
> >>> payload is designated by the varint that immediately follows the frame
> >>> type.
> >>> When the MSB of the first octet is set to one, the frame
> >>> payload spans to the end of the packet.
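
A receiver for the layout sketched above could split a packet payload into frames without understanding any frame type. The Python below is a minimal illustration of that idea only. The encoding is the hypothetical one from this message, not the draft's, and `split_frames` and `read_varint` are made-up names.

```python
def read_varint(buf, pos):
    """Read one QUIC varint at pos and return (value, new position)."""
    if pos >= len(buf):
        raise ValueError("no bytes left")
    length = 1 << (buf[pos] >> 6)  # 2-bit prefix selects 1, 2, 4 or 8 bytes
    if pos + length > len(buf):
        raise ValueError("truncated varint")
    raw = int.from_bytes(buf[pos:pos + length], "big")
    return raw & ((1 << (8 * length - 2)) - 1), pos + length


def split_frames(payload):
    """Split a packet payload into (frame_type, frame_payload) tuples."""
    frames = []
    pos = 0
    while pos < len(payload):
        first = payload[pos]
        frame_type = first & 0x7F  # low 7 bits carry the frame type
        pos += 1
        if first & 0x80:
            # MSB set: the frame payload runs to the end of the packet.
            frames.append((frame_type, payload[pos:]))
            break
        # MSB clear: an explicit varint length follows the type octet.
        length, pos = read_varint(payload, pos)
        end = pos + length
        if end > len(payload):
            raise ValueError("frame extends past the packet")
        frames.append((frame_type, payload[pos:end]))
        pos = end
    return frames


# One explicit-length frame of type 1, then a type 2 frame to end of packet.
frames = split_frames(bytes([0x01, 0x02, 0xAA, 0xBB, 0x82, 0xCC]))
```

Because frame boundaries are found before any type-specific parsing, an endpoint using such a splitter could skip a frame type it does not recognise.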
> >>>
> >>> In this encoding, we can always omit the Length field of a STREAM
> >>> frame. So the overhead for carrying stream data will be indifferent in
> >>> practice.
> >>>
> >>> For the ACK frame, we can omit the ACK Block Count field. And the
> >>> overhead will be one to two bits if the ACK frame is sent in the
> >>> middle of the packet (thereby using the encoding with explicit frame
> >>> payload length), or one octet or more shorter if ACK is the last frame
> >>> of the packet.
> >>>
> >>> We are likely to see an increase in overhead for most of the other types
> >>> of frames, but I do not think that would be an issue considering that
> >>> they will be seen far less often than STREAMs and ACKs.
> >>>
> >>> To summarize, my anticipation is that we can make all the frames
> >>> self-contained (i.e. the length can be determined without the
> >>> knowledge of how each frame is encoded) without any overhead, if we
> >>> agree on making the frame type space 1 bit smaller.
> >>>
> >>> Finally, the biggest benefit of using a self-contained encoding of
> >>> frames is that we would have the ability to introduce new optional
> >>> frames without negotiation. By making the frames self-contained, QUIC
> >>> endpoints will have the freedom of ignoring the frames that they do
> >>> not understand.
> >>>
> >>> Being able to send QUIC frames defined in extensions without
> >>> negotiating using Transport Parameters will be a win in terms of both
> >>> security (because clients' TP is sent in clear) and flexibility
> >>> (because we will be able to send the extensions before we figure
> >>> out whether the peer supports that extension).
> >>>
> >>> > If we don't remove the number of ack blocks, then the ack frame is
> >>> > larger,
> >>> > but I don't think the extra size field is useful for most
> >>> > implementations.
> >>> > Also, it means the length can disagree with the actual length, which
> >>> > add
> >>> > complexity and the possibility of writing error-prone code.  The idea
> >>> > of
> >>> > someone offloading ack processing and then proceeding to trust the
> >>> > length
> >>> > seems like someone could get wrong and cause some concerning issues.
> >>> >
> >>> > My experience is multithreaded packet processing is more cost and
> >>> > work than it's worth.  Sure you can't fill a 100G NIC with one
> >>> > connection, but that seems like an academic problem, not one for
> >>> > workloads I've seen. Typically the extra cost of multithreading
> >>> > outweighs its value.
> >>> >
> >>> > To be clear, I don't think this is an awful idea, but I also don't
> >>> > see the value and it adds complexity.  I read Manasi's email, but I
> >>> > don't think I understand why any of those matter in practice.
> >>> >
> >>> > On Sat, Jun 16, 2018 at 4:13 PM Eric Rescorla <ekr@rtfm.com> wrote:
> >>> >>
> >>> >> On Fri, Jun 15, 2018 at 6:46 PM, Marten Seemann
> >>> >> <martenseemann@gmail.com>
> >>> >> wrote:
> >>> >>>
> >>> >>> This proposal increases the size of the ACK frame by 1 byte in the
> >>> >>> common case (less than 63 ACK ranges), since the ACK length field
> >>> >>> here always consumes 2 bytes, whereas the ACK Block Count is a
> >>> >>> variable-length integer. Considering how much work we put into
> >>> >>> minimising the size of the frames, this feels like a step in the
> >>> >>> wrong direction.
> >>> >>>
> >>> >>> Regarding the processing cost, I agree with Dmitri. Handling an ACK
> >>> >>> frame
> >>> >>> requires looping over and making changes to a data structure that
> >>> >>> keeps
> >>> >>> track of sent packets. This is much more expensive than simply
> >>> >>> parsing
> >>> >>> a
> >>> >>> bunch of varints in the ACK frame. It seems unlikely that a
> >>> >>> multi-threaded
> >>> >>> packet parser would offer any real-world performance benefits.
> >>> >>
> >>> >>
> >>> >> I don't want to overstate the benefit here, but my point isn't that
> >>> >> parsing is expensive but that if you want to have a multithreaded
> >>> >> packet
> >>> >> processing system, then it's nice to have a simpler data structure
> >>> >> (the
> >>> >> unparsed ACK block) to hand to the ACK processing thread.
> >>> >>
> >>> >> -Ekr
> >>> >>
> >>> >>
> >>> >>>
> >>> >>> On Sat, Jun 16, 2018 at 6:19 AM Martin Thomson
> >>> >>> <martin.thomson@gmail.com>
> >>> >>> wrote:
> >>> >>>>
> >>> >>>> When we discussed this before, some people observed that this
> >>> >>>> creates
> >>> >>>> a need to encode in two passes.  That's the trade-off here.  (Not
> >>> >>>> expressing an opinion.)
> >>> >>>> On Fri, Jun 15, 2018 at 3:51 PM Jana Iyengar <jri.ietf@gmail.com>
> >>> >>>> wrote:
> >>> >>>> >
> >>> >>>> > I don't have a strong opinion on this. I'm certainly not opposed
> >>> >>>> > to
> >>> >>>> > it.
> >>> >>>> > Does anyone have a strong opposition?
> >>> >>>> >
> >>> >>>> > On Fri, Jun 15, 2018 at 3:10 PM Praveen Balasubramanian
> >>> >>>> > <pravb@microsoft.com> wrote:
> >>> >>>> >>
> >>> >>>> >> I agree as well since this can help reduce per packet
> >>> >>>> >> processing overhead. ACKs are going to be the second most
> >>> >>>> >> common frame type so no objections to special casing.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Eric
> >>> >>>> >> Rescorla
> >>> >>>> >> Sent: Friday, June 15, 2018 9:11 AM
> >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com>
> >>> >>>> >> Cc: Jana Iyengar <jri.ietf@gmail.com>; QUIC WG <quic@ietf.org>
> >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK
> length
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> I agree with Manasi here. This change would allow ack frame
> >>> >>>> >> parsing
> >>> >>>> >> to be more self-contained, which is an advantage for the parser
> >>> >>>> >> and also
> >>> >>>> >> potentially for parallelism (because you can quickly find the
> >>> >>>> >> frame and then
> >>> >>>> >> process it in parallel).
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> -Ekr
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 5:22 PM, Deval, Manasi
> >>> >>>> >> <manasi.deval@intel.com> wrote:
> >>> >>>> >>
> >>> >>>> >> In general, varints require some specific logic for parsing. To
> >>> >>>> >> skip over any header, I have to read every single varint. As the
> >>> >>>> >> code sees Stream and ACK headers most frequently, that is my
> >>> >>>> >> focus.  The Stream frame has a length in its third field.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> ACK parsing, however, needs 6 + 2*num_blocks reads to identify
> >>> >>>> >> length. There are two reads each for ‘largest acknowledged’,
> >>> >>>> >> ‘ACK delay’ and ‘ACK block count’. The pain point is the total
> >>> >>>> >> number of cycles to parse an ACK. If I am processing 10M pps,
> >>> >>>> >> where 10% - 30% of the packets have a piggybacked ACK, these
> >>> >>>> >> cycles become a significant bottleneck.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> Thanks,
> >>> >>>> >>
> >>> >>>> >> Manasi
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Jana
> >>> >>>> >> Iyengar
> >>> >>>> >> Sent: Monday, June 11, 2018 3:11 PM
> >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com>; QUIC WG
> >>> >>>> >> <quic@ietf.org>
> >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK
> length
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> You're right that we no longer have the ability to skip an ACK
> >>> >>>> >> frame, and this crept in when we moved to varints.
> >>> >>>> >>
> >>> >>>> >> I believe your problem though is generally true of most frames,
> >>> >>>> >> not just ACKs, since ids, packet numbers, and numbers in all
> >>> >>>> >> frames are now all varints. To skip any frame, you'll need to
> >>> >>>> >> parse the varint fields in those frames. If you have logic to
> >>> >>>> >> process and skip varints, then skipping the ack block section is
> >>> >>>> >> merely repeating this operation (2*num_block+1) times. Do you see
> >>> >>>> >> specific value in skipping ACK frames over the other control
> >>> >>>> >> frames?
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 8:43 AM Dmitri Tikhonov
> >>> >>>> >> <dtikhonov@litespeedtech.com> wrote:
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 03:33:35PM +0000, Deval, Manasi wrote:
> >>> >>>> >> > -        Moving the ACK length to the front of the ACK allows
> >>> >>>> >> > the
> >>> >>>> >> >          flexibility of either reading the entire ACK or
> >>> >>>> >> > reading
> >>> >>>> >> > the
> >>> >>>> >> >          first 16 bits and skipping over the length. This is
> a
> >>> >>>> >> > useful
> >>> >>>> >> >          feature for the case where ACK processing is split
> >>> >>>> >> > into
> >>> >>>> >> >          multiple layers. Depending on the processor this is
> >>> >>>> >> > run
> >>> >>>> >> > on,
> >>> >>>> >> >          there are different advantages -
> >>> >>>> >>
> >>> >>>> >> Just a note.  In my experience, the cost of parsing an ACK
> >>> >>>> >> frame is negligible compared to the cost of processing an ACK
> >>> >>>> >> frame: that is, poking at various memory locations to discard
> >>> >>>> >> newly ACKed packets.
> >>> >>>> >>
> >>> >>>> >>   - Dmitri.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>>
> >>> >>
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Kazuho Oku
> >
> >
> >
> > --
> > Kazuho Oku
>
>
>
> --
> Kazuho Oku
>