Re: Proposal to replace ACK block count with ACK length

Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com> Fri, 22 June 2018 15:01 UTC

From: Mikkel Fahnøe Jørgensen <mikkelfj@gmail.com>
In-Reply-To: <CABcZeBPyMBKvY_K6nQSNbxvGXhF2o3hMKeTFgvmbWPgkEyFKaw@mail.gmail.com>
References: <1F436ED13A22A246A59CA374CBC543998B832414@ORSMSX111.amr.corp.intel.com> <20180611154244.GA27622@ubuntu-dmitri> <CACpbDcdxzRxeiN93kKoj__vo2TERm4QZKqaesL=jr4wQUN1gXA@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B833B91@ORSMSX111.amr.corp.intel.com> <CABcZeBOjjRrX+AsXdgcUKpL=ciL8U_U1+WVAhQv-ZjwGxkQxYw@mail.gmail.com> <MWHPR21MB0638068EFA850328793E55F6B67C0@MWHPR21MB0638.namprd21.prod.outlook.com> <CACpbDcdbTKKEh8dcshWM6-7vq2hBFJC1myL1+H6etpMMjth+wg@mail.gmail.com> <CABkgnnV_thWcAi=AdwV+Za5rXywiUvtOYpsNNp1y7=RvL2MvWA@mail.gmail.com> <CAOYVs2qE=Tw_7eax9HwaESaQPMh7k3BSVV112d+pPeSfZ09EjQ@mail.gmail.com> <CABcZeBOCRHAuh44CrMH02UZ3Ar_2sa5M1c3LG_A-RPzXX+H+Yw@mail.gmail.com> <CAKcm_gOeZHR-BGJiqK=zQKqbgq=briQuH+fzHrkUYbhQx3B_sw@mail.gmail.com> <CANatvzyKv8EGVR-Z5WMDKbeuKHP791OynsTqX=+HriKBxFnafA@mail.gmail.com> <CAOYVs2oE6yawW04MVH1ApewSJ+0g9g2oMxCj+CU+butfiAe8kA@mail.gmail.com> <CANatvzxniU0AUEi5tuKzmX45uTUV6-y0JbqcdKTpu1J4WQR7JA@mail.gmail.com> <CAOYVs2p9vJrCVuXqGsR29rOGj=CNt1m7TcavGV9Kwk-9hA4sPQ@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B83AB21@ORSMSX111.amr.corp.intel.com> <1F436ED13A22A246A59CA374CBC543998B83EC27@ORSMSX111.amr.corp.intel.com> <CAKcm_gMV4vXXW5jKwAR-cOT6OYpi6FL-mO9K=0GWL6WULjWNKA@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B83EF15@ORSMSX111.amr.corp.intel.com> <CAOYVs2oynZuE43q1MVO3bBKTPCFg_T3pykS4e5p7DpSaSvmgtQ@mail.gmail.com> <1F436ED13A22A246A59CA374CBC543998B843873@ORSMSX111.amr.corp.intel.com> <CANatvzzEV=BGJXFuOnDfhXJQV78aWFf84joMknRExY48vu8OYw@mail.gmail.com> <CABcZeBPyMBKvY_K6nQSNbxvGXhF2o3hMKeTFgvmbWPgkEyFKaw@mail.gmail.com>
X-Mailer: Airmail (420)
MIME-Version: 1.0
Date: Fri, 22 Jun 2018 11:00:52 -0400
Message-ID: <CAN1APddCQ_H18QT+12zytagkBe5VKFUZN31wkMxOgQmHB2Xqug@mail.gmail.com>
Subject: Re: Proposal to replace ACK block count with ACK length
To: Eric Rescorla <ekr@rtfm.com>, Kazuho Oku <kazuhooku@gmail.com>
Cc: "Deval, Manasi" <manasi.deval@intel.com>, Jana Iyengar <jri.ietf@gmail.com>, IETF QUIC WG <quic@ietf.org>, Marten Seemann <martenseemann@gmail.com>, Martin Thomson <martin.thomson@gmail.com>, Ian Swett <ianswett@google.com>, Praveen Balasubramanian <pravb@microsoft.com>
Content-Type: multipart/alternative; boundary="0000000000005c0ce3056f3c4cd9"
Archived-At: <https://mailarchive.ietf.org/arch/msg/quic/WkAc81uTeZF4oRJ6TiJXEaGqojk>

As I said earlier, I also favour a length prefix. But, Ian does have a
point:

Writing data is generally more expensive than reading data. Especially if
you have to traverse long data structures to find the length before you can
start writing and/or you may have to conservatively reserve extra space for
a length field.
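To make the write-side cost concrete, here is a minimal Python sketch of the two options mentioned above. The function names are hypothetical and the frame is reduced to a list of integer fields; only the variable-length integer encoding follows QUIC. The first writer traverses the data twice to learn the length before writing; the second traverses once but conservatively reserves a 2-byte slot and back-patches it.

```python
def encode_varint(value: int) -> bytes:
    """Encode an integer as a QUIC variable-length integer (1, 2, 4 or 8 bytes)."""
    for prefix, size in ((0, 1), (1, 2), (2, 4), (3, 8)):
        if value < 1 << (8 * size - 2):
            # The two most significant bits carry the size prefix.
            return (value | prefix << (8 * size - 2)).to_bytes(size, "big")
    raise ValueError("value does not fit in 62 bits")


def write_frame_two_pass(fields: list[int]) -> bytes:
    """Hypothetical writer: pass 1 sizes the body, pass 2 writes it."""
    body_len = sum(len(encode_varint(f)) for f in fields)  # first traversal
    out = bytearray(encode_varint(body_len))               # minimal length prefix
    for f in fields:                                       # second traversal
        out += encode_varint(f)
    return bytes(out)


def write_frame_reserved(fields: list[int]) -> bytes:
    """Hypothetical writer: one traversal, fixed 2-byte length slot back-patched."""
    out = bytearray(2)                                     # conservative reservation
    for f in fields:
        out += encode_varint(f)
    body_len = len(out) - 2
    if body_len >= 1 << 14:
        raise ValueError("body too long for the reserved 2-byte slot")
    out[0:2] = (body_len | 1 << 14).to_bytes(2, "big")     # 2-byte varint form
    return bytes(out)
```

For a short body the reserved-slot writer spends one byte more than the two-pass writer, which is the space versus traversal trade-off in miniature.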

So before deciding one way or the other, the cost of writing needs to be
well understood, also in scenarios with large MTUs.

In the flatbuffers space that I’m involved with, streaming has turned out to
be an issue because data cannot be transmitted in parts without index data on
a separate channel to recombine fragments. A simple, but non-standard,
change in the format would fix this. JSON and CoAP allow streaming, but
many other formats do not, including later versions of protocol buffers, as
I understand.

An odd consequence of making both read and write efficient is that lengths
work better if stored at the end, with packets read backwards. This
requires a single total length prefix, but that is already stored in the
datagram header. This is probably too odd-ball, but still a consideration
worth noting.
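As an illustration only, the trailing-length idea could look like the Python sketch below. The 2-byte trailer and both function names are assumptions made for the example, not anything proposed for QUIC. The writer never needs to know a length in advance, and the reader starts from the datagram's total length and walks backwards.

```python
def append_frame(buf: bytearray, payload: bytes) -> None:
    """Write the payload first, then its length as a 2-byte trailer (assumed size)."""
    buf += payload
    buf += len(payload).to_bytes(2, "big")


def read_frames_backwards(datagram: bytes) -> list[bytes]:
    """Walk from the end of the datagram; return frames in their original order."""
    frames = []
    end = len(datagram)              # total length is known from the datagram itself
    while end > 0:
        if end < 2:
            raise ValueError("truncated length trailer")
        size = int.from_bytes(datagram[end - 2:end], "big")
        start = end - 2 - size
        if start < 0:
            raise ValueError("frame length exceeds datagram")
        frames.append(datagram[start:end - 2])
        end = start
    frames.reverse()
    return frames
```

The write path is a single forward pass with no reserved space; the price is that frames are discovered last to first.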

The second best alternative may therefore be to allow writes to be fast and
reads to be decent.

Yet, I still like length prefixes if they could be made write efficient
because high performance low-latency processes care zero about ACK and can
consume a packet directly while a background process handles all the
latency insensitive ACK and retransmission logic.

The question is also, where is the pressure: an IoT aggregator might be
massively read intensive while a web cache would be very write intensive.


Kind Regards,
Mikkel Fahnøe Jørgensen


On 22 June 2018 at 15.12.09, Eric Rescorla (ekr@rtfm.com) wrote:

It seems like there are two questions at hand here:

1. Would it be architecturally better to have frames have a consistent
   self-contained representation?
2. Is it enough better that we should do so now?

I agree with Kazuho that (a) we don't have that representation now and (b)
it would be a better design to do so. I'm perhaps somewhat more positive on
(2) than he is. I don't think it's critically important that we make the
change, but if we were to hold a consensus call, I think I would be in
favor. I'd certainly be interested in looking at a proposal if someone else
were to make one.

-Ekr



On Thu, Jun 21, 2018 at 8:45 PM, Kazuho Oku <kazuhooku@gmail.com> wrote:

> 2018-06-22 8:08 GMT+09:00 Deval, Manasi <manasi.deval@intel.com>:
> > I feel that the requirement to have every value valid is somewhat
> > academic.
>
> I think that Marten is correct in pointing out that making the ACK
> frame self-containing (by having a field that represents the number of
> octets being consumed by the frame) would be an exception from the
> design pattern we have.
>
> Look at STREAM frame. The frame is not self-contained. Instead, it has
> a Length field for the Stream Data, which is a leaf. The same goes for
> NEW_CONNECTION_ID frame (that has the length field for Connection ID
> field (which is also a leaf)), CONNECTION_CLOSE (length field for
> Reason Phrase).
>
> I agree with Marten, Mikkel (and possibly others as well) that having
> consistency is important.
>
> Therefore, I agree with Mikkel that we should consider making every
> frame self-contained or keeping every frame as-is (i.e. not
> self-contained).
>
> FWIW, as described in the latter half of
> https://www.ietf.org/mail-archive/web/quic/current/msg04287.html, it
> is possible to make every frame self-contained *and* also make ACK
> frames smaller than the current draft. Making every frame
> self-contained gives us the possibility to send new extension frames
> without negotiation.
>
> I do not think that I would push for making every frame self-contained
> by myself (because I do not think it meets the high bar to have a
> change at such a late moment of standardization), but as stated, my
> preference goes to seeing every frame made self-contained or none of
> them made as such.
>
> > The length value provides much more value than the block count and the
> > fact that certain values can never be achieved is an inherent property
> > of the length.
> >
> >
> >
> > One interesting observation is that this property is not limited to
> > length. One can even make a similar argument about ACK block count. The
> > maximum number of ACK blocks that can be defined will not always have a
> > meaningful value. In our examples, 0, 1, 2, 3 are all valid. If I set
> > the value of ACK block count to have lower two bits to be 11, the
> > maximum number of ACK blocks is 4611686018427387903. This is the same
> > as the maximum value of largest acknowledged, so if the ACK block count
> > was set to this value, it would still be meaningless.
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> >
> >
> > From: Marten Seemann [mailto:martenseemann@gmail.com]
> > Sent: Tuesday, June 19, 2018 6:44 PM
> > To: Deval, Manasi <manasi.deval@intel.com>
> > Cc: Ian Swett <ianswett@google.com>; Kazuho Oku <kazuhooku@gmail.com>;
> Eric
> > Rescorla <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> > Balasubramanian <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>;
> Martin
> > Thomson <martin.thomson@gmail.com>
> >
> >
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Hi Manasi,
> >
> >
> >
> >> The risk of disagreement between ack blocks and ack block count is same
> as
> >> the risk of disagreement between ack blocks and ack length. Either way
> this
> >> needs to be counted up while creating the ACK and counted down while
> parsing
> >> it. The possibility of error is the same. Getting the ack block count
> wrong
> >> is as problematic as getting the ack length wrong. Do you agree?
> >
> >
> >
> > I disagree. Let's take an example of an ACK frame with one ACK range,
> that
> > needs a 2 byte varint to represent the First ACK Block and another 2 byte
> > varint to represent the Gap.
> >
> > With your proposal:
> >
> > The values 0 and 1 are invalid, since the length field itself is
> > included in the length.
> > The values 2, 3, ..., (2 + len(LargestAcknowledged) + len(AckDelay)) - 1
> are
> > invalid, since the length needs to include the Largest Acknowledged and
> the
> > Ack Delay.
> > The value 2 + len(LargestAcknowledged) + len(AckDelay) would be the first
> > valid value, and correspond to an ACK frame with no blocks.
> > The value 2 + len(LargestAcknowledged) + len(AckDelay) + 1 is invalid,
> since
> > it would cut the varint for the First ACK Block
> > The value 2 + len(LargestAcknowledged) + len(AckDelay) + 2 is invalid,
> since
> > it would cut the frame after the First ACK Block (but every block must be
> > followed by a gap length)
> > The value 2 + len(LargestAcknowledged) + len(AckDelay) + 3 is invalid,
> since
> > it would cut the varint for the Gap
> > Finally, the value 2 + len(LargestAcknowledged) + len(AckDelay) + 4 is
> valid
> >
> > There are *a lot* of invalid values that you can encode into the ACK
> length
> > field. More importantly, *none* of these error cases exists with the
> current
> > frame format.
> >
> > The *only* error case that can occur with our current format is that the
> > packet is too short for the number of ACK blocks that are supposed to be
> > contained in the frame. This can occur with your proposal as well (in
> > addition to the error cases listed above).
> >
> >
> >
> > My concern is not that it's impossible or even particularly hard to catch
> > these errors, but I dislike the property that some (in fact, most)
> > encodable values are invalid.
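The enumeration quoted above can be reproduced with a small Python sketch. It assumes, as in Marten's example, a 2-byte length field that counts itself and 2-byte varints for every block and gap; the function name and the max_blocks cut-off are hypothetical.

```python
def valid_ack_lengths(largest_acked_len: int, ack_delay_len: int,
                      block_len: int = 2, gap_len: int = 2,
                      max_blocks: int = 3) -> list[int]:
    """List the ACK length values that end exactly on a (block, gap) boundary."""
    # 2 bytes for the length field itself, then the two fixed header varints.
    base = 2 + largest_acked_len + ack_delay_len
    step = block_len + gap_len
    return [base + k * step for k in range(max_blocks + 1)]


valid = valid_ack_lengths(largest_acked_len=2, ack_delay_len=2)
# Every other value up to the largest valid one would cut a field or a pair.
invalid = [n for n in range(valid[-1] + 1) if n not in valid]
```

Under these assumptions only 4 of the first 19 encodable values are valid, which is the "most encodable values are invalid" property being objected to. With mixed varint sizes the valid set depends on the frame contents, so a real parser would check boundaries while decoding rather than precompute them.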
> >
> >
> >
> > Best,
> >
> > Marten
> >
> >
> >
> >
> >
> > On Wed, Jun 20, 2018 at 3:21 AM Deval, Manasi <manasi.deval@intel.com>
> > wrote:
> >
> > Hi Ian,
> >
> >
> >
> > Here is another attempt to solve the objections you raised:
> >
> >
> >
> >>I'm not a fan of this proposal, because I think it is impractical to drop
> >> the number of ack blocks, because with the ECN proposal it becomes
> >> impractically complex to parse.
> >
> > Is there a reason the proposal from Christian does not solve this
> problem?
> >
> >
> >
> >>If we don't remove the number of ack blocks, then the ack frame is
> larger,
> >> but I don't think the extra size field is useful for most
> implementations.
> >> Also, it means the length can disagree with the actual length, which add
> >> complexity and the possibility of writing error-prone code.  The idea of
> >> someone offloading ack processing and then proceeding to trust the
> length
> >> seems like someone could get wrong and cause some concerning issues.
> >
> > The risk of disagreement between ack blocks and ack block count is same
> as
> > the risk of disagreement between ack blocks and ack length. Either way
> this
> > needs to be counted up while creating the ACK and counted down while
> parsing
> > it. The possibility of error is the same. Getting the ack block count
> wrong
> > is as problematic as getting the ack length wrong. Do you agree?
> >
> >
> >
> >>My experience is multithreaded packet processing is more cost and work
> than
> >> it's worth.  Sure you can't fill a 100G NIC with one connection, but
> that
> >> seems like an academic problem, not one for workloads I've seen.
> Typically
> >> the extra cost of multithreading outweighs its value.
> >
> > The value is two fold – pre-processing and multi-threading. If we
> > pre-process the received packets such that ACKs and streams can be
> > coalesced, the receive side can indicate a large chunk of information
> though
> > the kernel, reducing the cost of system call and protocol overhead. This
> is
> > the same concept as UDP segmentation taken a step further on receive
> side.
> > After this chunk is indicated into the QUIC protocol, the protocol may
> > process stream and ACK in parallel. While folks may or may not utilize
> this,
> > there is an advantage here.
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> > From: Ian Swett [mailto:ianswett@google.com]
> > Sent: Tuesday, June 19, 2018 11:26 AM
> > To: Deval, Manasi <manasi.deval@intel.com>
> > Cc: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku
> > <kazuhooku@gmail.com>; Eric Rescorla <ekr@rtfm.com>; Jana Iyengar
> > <jri.ietf@gmail.com>; Praveen Balasubramanian <pravb@microsoft.com>;
> IETF
> > QUIC WG <quic@ietf.org>; Martin Thomson <martin.thomson@gmail.com>
> >
> >
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > I'm still not interested in this change, for the reasons I stated above.
> >
> >
> >
> > On Tue, Jun 19, 2018 at 2:21 PM Deval, Manasi <manasi.deval@intel.com>
> > wrote:
> >
> > Hi All,
> >
> >
> >
> > Do we have agreement here to create a new PR?
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> > From: Deval, Manasi
> > Sent: Sunday, June 17, 2018 2:25 PM
> > To: Marten Seemann <martenseemann@gmail.com>; Kazuho Oku
> > <kazuhooku@gmail.com>
> > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla
> > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> Balasubramanian
> > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson
> > <martin.thomson@gmail.com>
> > Subject: RE: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Hi All,
> >
> >
> >
> > I have made a list of objections to the proposal and the solutions to
> those
> > objections discussed on this thread.
> >
> >
> >
> > a.      Co-existence of length field with ECN field and ACK blocks.
> >
> >
> >
> > Christian suggested to move the ECN fields to precede the ACK blocks.
> This
> > is an elegant solution. Parsing entire list of ACK blocks to review ECN
> bits
> > would have been annoying, even though it can work.
> >
> >
> >
> > b.      There are two cases to be parsed – entire ACK and parse ACK to
> > identify length. There are some reservations when ACK parsing gets harder
> > for the case where the entire header needs to be parsed.
> >
> >
> >
> > Agreement from several folks here. In the original ACK defined in draft
> 12
> > of the slide, one would count down number of ACK blocks to get to the
> end of
> > the packet. In the proposal I made, one would count down the length to
> > identify the end of the packet. The logic is very similar in cycle count
> and
> > complexity. Several folks also commented to this effect.
> >
> >
> >
> > c.      Multi-threaded packet processing
> >
> >
> >
> > I would expect that there are 10s of 1000s of connections in use at any
> time
> > for a server with a high speed link. Multi-threading to handle each of
> these
> > flows / connections in parallel is necessity to be able to support large
> > number of connections on a high speed link. Tx segmentation, Rx
> coalescing
> > are well known strategies to reduce the processing cost. In initial
> stages,
> > code is often written as a single-threaded and then re-factored to
> > parallelize cycle intensive operations. In order to allow this protocol
> to
> > scale in future, I would suggest we do not preclude this case.
> >
> >
> >
> > d.      Increase in ACK size by 1 byte.
> >
> >
> >
> > I do not see this as a serious issue, but we can consider making this a
> > varint if others have strong feelings about it. It’s a trade-off: 2
> > reads to save 1 byte.
> >
> >
> >
> > e.      Every encodable value should be valid
> >
> > Not every length will be valid. This is inherent to lengths. This same
> issue
> > ails the ‘payload length’ in QUIC header. Not only does the issue exist
> for
> > small values, it also applies to large values since data stream will be
> sent
> > after crypto negotiation.  E.g.  - how does one craft a payload with 62
> bit
> > payload length in a large header?
> >
> >
> >
> >
> >
> > Thanks,
> >
> > Manasi
> >
> >
> >
> >
> >
> > From: Marten Seemann [mailto:martenseemann@gmail.com]
> > Sent: Sunday, June 17, 2018 6:59 AM
> > To: Kazuho Oku <kazuhooku@gmail.com>
> > Cc: Ian Swett <ianswett=40google.com@dmarc.ietf.org>; Eric Rescorla
> > <ekr@rtfm.com>; Jana Iyengar <jri.ietf@gmail.com>; Praveen
> Balasubramanian
> > <pravb@microsoft.com>; IETF QUIC WG <quic@ietf.org>; Martin Thomson
> > <martin.thomson@gmail.com>; Deval, Manasi <manasi.deval@intel.com>
> > Subject: Re: Proposal to replace ACK block count with ACK length
> >
> >
> >
> > Maybe it's specific to Go, but I'm using a single io.Reader for the whole
> > packet, so as long as the packet payload is long enough, the varint
> parsing
> > will not fail.
> >
> > I don't think that specifics of programming languages matter here though,
> > and I'm sure both frame formats can be reasonably implemented in C as
> well
> > as in Go. The reasons I'm opposed to Manasi's proposal are that it moves
> us
> > away from the principle that only reasonable values should be encodable,
> and
> > that it increases the size of the ACK frame, for the questionable
> benefit of
> > being able to parallelise the frame parser.
> >
> >
> >
> > On Sun, Jun 17, 2018 at 8:48 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> >
> > 2018-06-17 22:36 GMT+09:00 Marten Seemann <martenseemann@gmail.com>:
> >> At least for my implementation, parsing doesn't become easier, it
> becomes
> >> more complex with this proposal. My varint-parser always consumes as
> many
> >> bytes as the varint requires, so after parsing a varint, I'd have to
> >> introduce an additional check that this didn't overflow the ACK length
> >> (e.g.
> >> consider that I parsed the ACK frame so far that only 2 bytes are
> >> remaining
> >> according to ACK length field, but the next varint is 4 bytes long).
> >
> > Isn't your varint parser checking that it has not (or will not) run
> > across the end of the packet payload for every ACK block it parses?
> > I'd assume that you would be doing that, because I think that is
> > necessary to avoid buffer overrun.
> >
> > What I am saying is that that check could be converted to an overrun
> > check against the end of the "frame payload", and that checking the
> > remaining block count becomes unnecessary, in case we replace ACK
> > Block Count with ACK Frame Length.
> >
> >>
> >> In general, we've been moving the wire image towards making every
> >> encodable
> >> value valid. This proposal moves us away from that principle:
> >> * some small values are always invalid (the length can never be between
> 0
> >> and 3)
> >> * a lot of intermediate values are invalid (if the boundary falls
> inside a
> >> varint, as described above)
> >> Both these cases can't occur with the current ACK frame format.
> >>
> >> On Sun, Jun 17, 2018 at 7:54 PM Kazuho Oku <kazuhooku@gmail.com> wrote:
> >>>
> >>> 2018-06-17 8:34 GMT+09:00 Ian Swett
> >>> <ianswett=40google.com@dmarc.ietf.org>:
> >>> > I'm not a fan of this proposal, because I think it is impractical to
> >>> > drop
> >>> > the number of ack blocks, because with the ECN proposal it becomes
> >>> > impractically complex to parse.
> >>>
> >>> For the ECN proposal, as Christian has suggested, we can move the ECN
> >>> counters before the ACK blocks. Then, it would not be complex to
> >>> parse.
> >>>
> >>> And my view is that parsing becomes easier if we replace ACK Block
> >>> Count with ACK Frame Length.
> >>>
> >>> Now, with ACK Block Count, we need to check the remaining number of
> >>> blocks and the remaining space in the packet payload for every block
> >>> that we parse. Failing to check either leads to a bug or a security
> >>> issue.
> >>>
> >>> If we switch to ACK Frame Length, we need to only check the remaining
> >>> space in the frame.
> >>>
> >>> I think that this is the biggest benefit of replacing ACK Block Count
> >>> with ACK Frame Length. OTOH the downside is that you need extra one to
> >>> two bits (one if the size of block / gap is expected to be below 65,
> >>> two if they are expected to be above that) for encoding ACK Frame
> >>> Length compared to ACK Block Count.
> >>>
> >>>
> >>>
> >>> Having said that, I honestly wonder if all the frames could have their
> >>> length encoded (either explicitly or as a signal that
> >>> says "to the end of the packet"). Consider something like below:
> >>>
> >>> |0| frame-type (7) | frame-payload-length (i) | frame-payload (*) |
> >>>  or
> >>> |1| frame-type (7) | frame-payload (*) |
> >>>
> >>> When the MSB of the first octet is set to zero, the length of the frame
> >>> payload is designated by the varint that immediately follows the frame
> >>> type.
> >>> When the MSB of the first octet is set to one, the frame
> >>> payload spans to the end of the packet.
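A rough Python sketch of a receiver for the envelope described above follows. It is only an illustration of the quoted layout: split_frames is a hypothetical name, and the frame type numbers used with it carry no meaning. The point it shows is that frame boundaries can be found without knowing how any frame body is encoded, so unknown types can simply be skipped.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one QUIC varint at pos; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)           # top two bits select 1, 2, 4 or 8 bytes
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    mask = (1 << (8 * size - 2)) - 1      # clear the two size bits
    return int.from_bytes(buf[pos:pos + size], "big") & mask, pos + size


def split_frames(payload: bytes) -> list[tuple[int, bytes]]:
    """Split a packet payload into (frame_type, frame_payload) pairs."""
    frames = []
    pos = 0
    while pos < len(payload):
        first = payload[pos]
        frame_type = first & 0x7F         # low 7 bits: frame type
        pos += 1
        if first & 0x80:                  # MSB set: runs to the end of the packet
            body, pos = payload[pos:], len(payload)
        else:                             # MSB clear: explicit varint length
            length, pos = decode_varint(payload, pos)
            if pos + length > len(payload):
                raise ValueError("frame length exceeds packet payload")
            body, pos = payload[pos:pos + length], pos + length
        frames.append((frame_type, body))
    return frames
```

The only bounds check needed per frame is against the end of the packet payload; anything inside a frame body is then checked against the end of that body.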
> >>>
> >>> In this encoding, we can always omit the Length field of a STREAM
> >>> frame. So the overhead for carrying stream data will be indifferent in
> >>> practice.
> >>>
> >>> For the ACK frame, we can omit the ACK Block Count field. And the
> >>> overhead will be one to two bits if the ACK frame is sent in the
> >>> middle of the packet (thereby using the encoding with explicit frame
> >>> payload length), or one octet or more shorter if ACK is the last frame
> >>> of the packet.
> >>>
> >>> We are likely to see increase of overhead for most of the other types
> >>> of frames, but I do not think that would be an issue considering that
> >>> they will be far seldom seen compared to STREAMs and ACKs.
> >>>
> >>> To summarize, my anticipation is that we can make all the frames
> >>> self-contained (i.e. the length can be determined without the
> >>> knowledge of how each frame is encoded) without any overhead, if we
> >>> agree on making the frame type space 1 bit smaller.
> >>>
> >>> Finally, the biggest benefit of using a self-contained encoding of
> >>> frames is that we would have the ability to introduce new optional
> >>> frames without negotiation. By making the frames self-contained, QUIC
> >>> endpoints will have the freedom of ignoring the frames that they do
> >>> not understand.
> >>>
> >>> Being able to send QUIC frames defined in extensions without
> >>> negotiating using Transport Parameters will be a win in terms of both
> >>> security (because clients' TP is sent in clear) and flexibility
> >>> (because it will be possible to send the extensions before we figure
> >>> out whether the peer supports that extension).
> >>>
> >>> > If we don't remove the number of ack blocks, then the ack frame is
> >>> > larger,
> >>> > but I don't think the extra size field is useful for most
> >>> > implementations.
> >>> > Also, it means the length can disagree with the actual length, which
> >>> > add
> >>> > complexity and the possibility of writing error-prone code.  The idea
> >>> > of
> >>> > someone offloading ack processing and then proceeding to trust the
> >>> > length
> >>> > seems like someone could get wrong and cause some concerning issues.
> >>> >
> >>> > My experience is multithreaded packet processing is more cost and
> work
> >>> > than
> >>> > it's worth.  Sure you can't fill a 100G NIC with one connection, but
> >>> > that
> >>> > seems like an academic problem, not one for workloads I've seen.
> >>> > Typically
> >>> > the extra cost of multithreading outweighs its value.
> >>> >
> >>> > To be clear, I don't think this is an awful idea, but I also don't
> see
> >>> > the
> >>> > value and it adds complexity.  I read Manasi's email, but I don't
> think
> >>> > I
> >>> > understand why any of those matter in practice.
> >>> >
> >>> > On Sat, Jun 16, 2018 at 4:13 PM Eric Rescorla <ekr@rtfm.com> wrote:
> >>> >>
> >>> >> On Fri, Jun 15, 2018 at 6:46 PM, Marten Seemann
> >>> >> <martenseemann@gmail.com>
> >>> >> wrote:
> >>> >>>
> >>> >>> This proposal increases the size of the ACK frame by 1 byte in the
> >>> >>> common
> >>> >>> case (less than 63 ACK ranges), since the ACK length field here
> >>> >>> always
> >>> >>> consumes 2 bytes, whereas the ACK Block Count is a variable-length
> >>> >>> integer.
> >>> >>> Considering how much work we put into minimising the size of the
> >>> >>> frames,
> >>> >>> this feels like a step in the wrong direction.
> >>> >>>
> >>> >>> Regarding the processing cost, I agree with Dmitri. Handling an ACK
> >>> >>> frame
> >>> >>> requires looping over and making changes to a data structure that
> >>> >>> keeps
> >>> >>> track of sent packets. This is much more expensive than simply
> >>> >>> parsing
> >>> >>> a
> >>> >>> bunch of varints in the ACK frame. It seems unlikely that a
> >>> >>> multi-threaded
> >>> >>> packet parser would offer any real-world performance benefits.
> >>> >>
> >>> >>
> >>> >> I don't want to overstate the benefit here, but my point isn't that
> >>> >> parsing is expensive but that if you want to have a multithreaded
> >>> >> packet
> >>> >> processing system, then it's nice to have a simpler data structure
> >>> >> (the
> >>> >> unparsed ACK block) to hand to the ACK processing thread.
> >>> >>
> >>> >> -Ekr
> >>> >>
> >>> >>
> >>> >>>
> >>> >>> On Sat, Jun 16, 2018 at 6:19 AM Martin Thomson
> >>> >>> <martin.thomson@gmail.com>
> >>> >>> wrote:
> >>> >>>>
> >>> >>>> When we discussed this before, some people observed that this
> >>> >>>> creates
> >>> >>>> a need to encode in two passes.  That's the trade-off here.  (Not
> >>> >>>> expressing an opinion.)
> >>> >>>> On Fri, Jun 15, 2018 at 3:51 PM Jana Iyengar <jri.ietf@gmail.com>
> >>> >>>> wrote:
> >>> >>>> >
> >>> >>>> > I don't have a strong opinion on this. I'm certainly not opposed
> >>> >>>> > to
> >>> >>>> > it.
> >>> >>>> > Does anyone have a strong opposition?
> >>> >>>> >
> >>> >>>> > On Fri, Jun 15, 2018 at 3:10 PM Praveen Balasubramanian
> >>> >>>> > <pravb@microsoft.com> wrote:
> >>> >>>> >>
> >>> >>>> >> I agree as well since this can help reduce per packet
> processing
> >>> >>>> >> overhead. ACKs are going to be the second most common frame
> type
> >>> >>>> >> so no
> >>> >>>> >> objections to special casing.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Eric
> >>> >>>> >> Rescorla
> >>> >>>> >> Sent: Friday, June 15, 2018 9:11 AM
> >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com>
> >>> >>>> >> Cc: Jana Iyengar <jri.ietf@gmail.com>; QUIC WG <quic@ietf.org>
> >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK
> length
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> I agree with Manasi here. This change would allow ack frame
> >>> >>>> >> parsing
> >>> >>>> >> to be more self-contained, which is an advantage for the parser
> >>> >>>> >> and also
> >>> >>>> >> potentially for parallelism (because you can quickly find the
> >>> >>>> >> frame and then
> >>> >>>> >> process it in parallel).
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> -Ekr
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 5:22 PM, Deval, Manasi
> >>> >>>> >> <manasi.deval@intel.com> wrote:
> >>> >>>> >>
> >>> >>>> >> In general, varints require some specific logic for parsing. To
> >>> >>>> >> skip
> >>> >>>> >> over any header, I have to read every single varint. As the
> code
> >>> >>>> >> sees Stream
> >>> >>>> >> and ACK headers most frequently, that is my focus.  The Stream
> >>> >>>> >> frame has a
> >>> >>>> >> length in its third field.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> ACK parsing, however, needs 6 + 2*num_blocks reads to identify
> >>> >>>> >> length. There are two reads each for ‘largest acknowledged’,
> >>> >>>> >> ‘ACK delay’ and ‘ACK block count’. The pain point is the total
> >>> >>>> >> number of cycles to parse an ACK. If I am processing 10M pps,
> >>> >>>> >> where 10% - 30% of the packets have a piggybacked ACK, these
> >>> >>>> >> cycles become a significant bottleneck.
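The difference in skipping cost can be sketched in Python as below. The block-count layout is assumed to be Largest Acknowledged, ACK Delay, ACK Block Count, First ACK Block, then (Gap, ACK Block) pairs; the length-prefixed variant assumes a 16-bit length that counts itself. Both function names are hypothetical, and the counter tallies varints visited, which is not the same unit as the "reads" in the quoted message.

```python
def decode_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Decode one QUIC varint at pos; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)           # top two bits select 1, 2, 4 or 8 bytes
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    mask = (1 << (8 * size - 2)) - 1      # clear the two size bits
    return int.from_bytes(buf[pos:pos + size], "big") & mask, pos + size


def skip_ack_by_block_count(buf: bytes, pos: int) -> tuple[int, int]:
    """Skip an ACK body that carries a block count.

    pos points at Largest Acknowledged. Returns (next position, varints visited).
    """
    _, pos = decode_varint(buf, pos)      # Largest Acknowledged
    _, pos = decode_varint(buf, pos)      # ACK Delay
    count, pos = decode_varint(buf, pos)  # ACK Block Count
    _, pos = decode_varint(buf, pos)      # First ACK Block
    visited = 4
    for i in range(count):
        _, pos = decode_varint(buf, pos)  # Gap
        _, pos = decode_varint(buf, pos)  # Additional ACK Block
        visited += 2
    return pos, visited


def skip_ack_by_length(buf: bytes, pos: int) -> int:
    """Skip an ACK that starts with a 16-bit length counting itself (assumed)."""
    if pos + 2 > len(buf):
        raise ValueError("truncated length")
    length = int.from_bytes(buf[pos:pos + 2], "big")
    if length < 2 or pos + length > len(buf):
        raise ValueError("invalid ACK length")
    return pos + length
```

With the block count, the work to find the end of the frame grows with the number of blocks; with the length, it is one fixed-size read plus a bounds check, whatever the frame contains.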
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> Thanks,
> >>> >>>> >>
> >>> >>>> >> Manasi
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> From: QUIC [mailto:quic-bounces@ietf.org] On Behalf Of Jana
> >>> >>>> >> Iyengar
> >>> >>>> >> Sent: Monday, June 11, 2018 3:11 PM
> >>> >>>> >> To: Deval, Manasi <manasi.deval@intel.com>; QUIC WG
> >>> >>>> >> <quic@ietf.org>
> >>> >>>> >> Subject: Re: Proposal to replace ACK block count with ACK
> length
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> You're right that we no longer have the ability to skip an ACK
> >>> >>>> >> frame,
> >>> >>>> >> and this crept in when we moved to varints.
> >>> >>>> >>
> >>> >>>> >> I believe your problem though is generally true of most frames
> >>> >>>> >> not
> >>> >>>> >> just ACKs, since ids, packet numbers, and numbers in all frames
> >>> >>>> >> are now all
> >>> >>>> >> varints. To skip any frame, you'll need to parse the varint
> >>> >>>> >> fields
> >>> >>>> >> in those
> >>> >>>> >> frames. If you have logic to process and skip varints, then
> >>> >>>> >> skipping the ack
> >>> >>>> >> block section is merely repeating this operation
> (2*num_block+1)
> >>> >>>> >> times. Do
> >>> >>>> >> you see specific value in skipping ACK frames over the other
> >>> >>>> >> control frames?
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 8:43 AM Dmitri Tikhonov
> >>> >>>> >> <dtikhonov@litespeedtech.com> wrote:
> >>> >>>> >>
> >>> >>>> >> On Mon, Jun 11, 2018 at 03:33:35PM +0000, Deval, Manasi wrote:
> >>> >>>> >> > -        Moving the ACK length to the front of the ACK allows
> >>> >>>> >> > the
> >>> >>>> >> >          flexibility of either reading the entire ACK or
> >>> >>>> >> > reading
> >>> >>>> >> > the
> >>> >>>> >> >          first 16 bits and skipping over the length. This is
> a
> >>> >>>> >> > useful
> >>> >>>> >> >          feature for the case where ACK processing is split
> >>> >>>> >> > into
> >>> >>>> >> >          multiple layers. Depending on the processor this is
> >>> >>>> >> > run
> >>> >>>> >> > on,
> >>> >>>> >> >          there are different advantages -
> >>> >>>> >>
> >>> >>>> >> Just a note.  In my experience, the cost of parsing an ACK
> frame
> >>> >>>> >> is
> >>> >>>> >> negligible compared to the cost of processing an ACK frame:
> that
> >>> >>>> >> is,
> >>> >>>> >> poking at various memory locations to discard newly ACKed
> >>> >>>> >> packets.
> >>> >>>> >>
> >>> >>>> >>   - Dmitri.
> >>> >>>> >>
> >>> >>>> >>
> >>> >>>>
> >>> >>
> >>> >
> >>>
> >>>
> >>>
> >>> --
> >>> Kazuho Oku
> >
> >
> >
> > --
> > Kazuho Oku
>
>
>
> --
> Kazuho Oku
>