Re: [aqm] TCP ACK Suppression

David Lang <david@lang.hm> Fri, 09 October 2015 00:05 UTC

Date: Thu, 08 Oct 2015 17:04:48 -0700
From: David Lang <david@lang.hm>
X-X-Sender: dlang@asgard.lang.hm
To: Joe Touch <touch@isi.edu>
In-Reply-To: <5616FAFA.5020707@isi.edu>
Message-ID: <alpine.DEB.2.02.1510081647590.3852@nftneq.ynat.uz>
References: <alpine.DEB.2.02.1510060748480.8750@uplift.swm.pp.se> <D2394BB6.548C5%g.white@cablelabs.com> <0A452E1DADEF254C9A7AC1969B8781284A7D9B66@FR712WXCHMBA13.zeu.alcatel-lucent.com> <5616DCD9.8@isi.edu> <alpine.DEB.2.02.1510081428470.3852@nftneq.ynat.uz> <5616E42D.5090402@isi.edu> <alpine.DEB.2.02.1510081517470.3852@nftneq.ynat.uz> <5616FAFA.5020707@isi.edu>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"; format="flowed"
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/PdXreEAwhpkPCj-u9jYkLh51KgU>
Cc: "LAUTENSCHLAEGER, Wolfram (Wolfram)" <wolfram.lautenschlaeger@alcatel-lucent.com>, Greg White <g.white@CableLabs.com>, "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] TCP ACK Suppression

On Thu, 8 Oct 2015, Joe Touch wrote:

> On 10/8/2015 3:29 PM, David Lang wrote:
>> On Thu, 8 Oct 2015, Joe Touch wrote:
>>
>>> On 10/8/2015 2:31 PM, David Lang wrote:
>>>> On Thu, 8 Oct 2015, Joe Touch wrote:
>>>>
>>>>> On 10/7/2015 12:42 AM, LAUTENSCHLAEGER, Wolfram (Wolfram) wrote:
>>>>> ...
>>>>>> Is this topic addressed in some RFC already?
>>>>>
>>>>> It's a direct violation of RFC793, which expects one ACK for every two
>>>>> segments:
>>>>>
>>>>> 4.2 Generating Acknowledgments
>>>>>
>>>>>   The delayed ACK algorithm specified in [Bra89] SHOULD be used by a
>>>>>   TCP receiver.  When used, a TCP receiver MUST NOT excessively delay
>>>>>   acknowledgments.  Specifically, an ACK SHOULD be generated for at
>>>>>   least every second full-sized segment, and MUST be generated within
>>>>>   500 ms of the arrival of the first unacknowledged packet.
>>>>
>>>> actually, this is only a violation of the SHOULD section, not the MUST
>>>> section.
>>>
>>> When you violate a SHOULD, you need to have a good reason that applies
>>> in a limited subset of cases.
>>>
>>> "it benefits me" isn't one of them, otherwise the SHOULD would *always*
>>> apply.
>>>
>>>> And if the Ack packets are going to arrive at wire-speed anyway (due to
>>>> other causes), is there really an advantage to having 32 ack packets
>>>> arriving one after the other instead of making it so that the first ack
>>>> packet (which arrives at the same time) can ack everything?
>>>
>>> If the first ACK confirms everything, you're giving the endpoint a false
>>> sense of how fast the data was received. This is valid only if the
>>> *last* ACK is the only one you retain, but then you'll increase delay.
>>
>> why does it give the server a false sense of how fast the data was
>> received? the packets don't have timestamps that the server can trust,
>> they are just packets arriving.
>
> Well, the only reason we can no longer trust them is that an
> intermediate device has tampered with them.

No, you could not trust any timestamps in the packets even if nothing changed
the packets between the endpoints.

> See, this is the problem - the DOCSIS modem wants to do what *it* wants,
> assuming everyone else plays by the rules, but it doesn't care whether
> it violates the assumptions other parties are making.
>
> That's an example of "tragedy of the commons".
>
>> And if the server concludes something
>> different from 32 packets arriving, each acking 2 packets, but all
>> arriving one after the other at its wire speed (let's say it's a slow
>> network, only Gig-E) compared to a single packet arriving that acks 64
>> packets of data at once, it's doing something very strange and making
>> assumptions about how the network works that are invalid.
>
> Says who? The RFCs say that this assumption SHOULD be reasonable.
>
>>> Unless you know that the endpoint supports ABC and pacing, yes, there's
>>> a very distinct advantage to getting 32 ACKs rather than 1. It also
>>> helps with better accuracy on the RTT calculation, which is based on
>>> sampling (and you've killed 97% of the samples).
>>
>> the 97% of the samples that I've killed would be producing invalid data
>> for your calculation because they were delayed in returning.
>
> Why do you think that is invalid data? That's an accurate measure of the
> return path of the ACK stream.

so how do you sanely conclude anything from 32 ack packets arriving at wire 
speed back-to-back?
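
To put a rough number on "wire speed back-to-back", here is a small sketch. The
frame size is my assumption, not something from this thread: a pure ACK in a
minimum-size Ethernet frame occupies 84 bytes on the wire. The function name is
only illustrative.

```python
# Assumed figures (not from the thread): a pure ACK travels in a minimum-size
# Ethernet frame, which occupies 84 bytes on the wire
# (64-byte frame + 8-byte preamble/SFD + 12-byte inter-frame gap).
WIRE_BYTES_PER_ACK = 64 + 8 + 12
LINK_BPS = 1_000_000_000  # Gig-E
ACKS = 32


def burst_duration_us(n_frames, wire_bytes=WIRE_BYTES_PER_ACK, link_bps=LINK_BPS):
    """Time in microseconds for n back-to-back frames to arrive at link rate."""
    return n_frames * wire_bytes * 8 / link_bps * 1e6


train = burst_duration_us(ACKS)  # the whole train of 32 ACKs
single = burst_duration_us(1)    # one ACK covering the same data
print(f"{ACKS} ACKs: {train:.3f} us, 1 ACK: {single:.3f} us")
```

Under those assumptions the whole train arrives within about 21.5 microseconds,
so all 32 arrival times fall in a window far smaller than a millisecond-scale
round trip.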

> ...
>>>> And if there is such an advantage, does it outweigh the disadvantages
>>>> that the extra ack packets cause by causing highly asymmetric links to
>>>> be overloaded and drop packets?
>>>
>>> Why is it so bad to drop packets?
>>
>> because forcing packets for other services to be dropped to make room
>> for acks degrades those other services.
>
> Sure, but remember that we're not here to support the cable company's
> business model. They deployed networks that had severely
> underprovisioned backchannels so they could use shared channels rather
> than routers one step lower in the hierarchy. Now they pull this stunt
> so they can fix what's broken with their provisioning model.
>
> The trouble is that it has effects for others in the network, not just
> the cable company.

It's not just cable companies. The same sort of thing will happen with
half-duplex wifi links, where acks will accumulate while data packets are
flowing in the other direction.

Stop trying to say that this is the fault of one subset of the industry and
recognize that there are lots of legitimate reasons for this.

Highly asymmetric links are not just 'cable companies underprovisioning their
networks'. DSL lines are highly asymmetric due to the difference in the cost of
the transmitters on each end of the link. So are satellite IP systems, etc.

>>> TCP isn't supposed to be the most efficient in EVERY corner case. It's
>>> supposed to *always work* in EVERY corner case.
>>
>> I don't see how it fails to work in this case. As people have pointed
>> out, some cable routers have been doing this for 15 years and the
>> Internet has not imploded from it yet, so the drawbacks of dropping
>> these already-delayed and redundant ack packets cannot be the
>> end-of-the-internet that you are painting it to be
>
> Oh, right. That argument. We haven't seen it break anything, so it
> *must* be safe.
>
> What would you see if it were broken? Maybe hosts that burst into the
> net and caused router buffers to overload? Hmmm.

that happens without this, so you can't blame it on the missing acks.

>> We are talking about only doing this in one specific case, the case
>> where other things have already caused some of the acks to be delayed to
>> the point where later acks have 'caught up' with them on the network and
>> both early and late acks are sitting in the same queue on the same
>> device waiting to be sent at the same time.
>
> They're in a queue. That means the early ones go out before the late
> ones. You have two choices if you coalesce their information:
>
> a) delete the early ACKs
>
> 	Oh, but you wouldn't do *that* because it would hit *your*
> 	customers with a higher delay.

But by not having to transmit the early acks, the later ack goes out faster, so
the customers get less of a delay in getting the data they have received acked.

If the only thing in the queue is acks, then the last ack in the queue goes out
as fast as the first ack would. By doing this you transmit less, which can speed
up the network overall, as the next station can transmit its data sooner
(thinking of wifi as an example).

If there are other packets in the queue, you can still delete all the acks
except the last one that will fit into the burst, with no degradation in how
fast data is acknowledged (and you increase the amount of usable data that is
sent in that timeslot instead of wasting it on redundant ack packets).

> b) delete the late ACKs and alter the early ones
>
> 	Giving your customers a false sense of how fast their
> 	data was getting there. Roadrunner pulled stunts like this
> 	in the early 90's too. It's not exactly news.

Unless there are other packets in the flow that the ack is jumping, I see no
problem with this. You aren't sending out an ack before the data has arrived,
you just aren't delaying the last ack unnecessarily.
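
As a rough illustration of the coalescing being argued about here, a minimal
sketch follows. It is not what any DOCSIS modem or wifi driver actually does.
The names Packet, flow and suppress_acks are hypothetical, and the sketch
assumes plain cumulative ACK numbers with no SACK blocks, window updates or
sequence-number wraparound. It drops a pure ack only when a later pure ack of
the same flow with a strictly higher ACK number is already queued, so duplicate
acks are left alone and nothing is reordered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    flow: str             # hypothetical flow id, stands in for the 4-tuple
    ack: int              # cumulative ACK number carried by the packet
    payload_len: int = 0  # 0 means a pure ACK with no data


def suppress_acks(queue):
    """Return the queue without pure ACKs that a later queued ACK supersedes."""
    highest_later = {}  # flow -> highest pure-ACK number seen closer to the tail
    kept = []
    for pkt in reversed(queue):  # walk from the tail towards the head
        if pkt.payload_len == 0:
            seen = highest_later.get(pkt.flow, -1)
            if seen > pkt.ack:
                continue  # a later ACK already covers everything this one acks
            highest_later[pkt.flow] = max(seen, pkt.ack)
        kept.append(pkt)  # data packets and surviving ACKs keep their order
    kept.reverse()
    return kept


# 32 queued ACKs, each acking two more 1460-byte segments of the same flow
acks_only = [Packet("a", 2920 * (i + 1)) for i in range(32)]
print(len(suppress_acks(acks_only)))
```

With an ack-only queue like the one at the end, only the newest ack survives and
it is the first thing sent in the next transmit opportunity. When other packets
are queued, they keep their positions and the surviving ack keeps its own.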

>> At this point there are three possibilities
>>
>> 1. all the acks get sent back-to-back, wasting bandwidth with their
>> redundancy
>
> That's not a waste; that's information.

very low value information at best.

>> 2. send only the newest ack, trashing all the ones that would be redundant
>
> If you wait to send it last, maybe... but then you're still encouraging
> the receiver to burst its next transmissions. We already know that sort
> of bursting causes problems (even if *we* don't see them, someone does).

How would the burst be any different if the server gets 32 acks back to back vs
1 ack that covers everything? The additional acks aren't going to get there any
faster than the single one would.

>> 3. the total of the acks that are queued exceeds the next transmit
>> window, so only some of the acks get sent, the newest one doesn't and
>> gets delayed further.
>>
>>
>> we know that #2 doesn't break the Internet,
>
> No, you really don't. What you know is that #2 is cheap and benefits
> you. Everyone continually doing that *will* break the Internet.

You keep stating that, but you are short on details about why a string of acks
at wire speed is better than a single ack covering the same data. 'Because I
say so' doesn't cut it.

>> it's within the range of
>> responses permitted by the RFC SHOULD.
>
> SHOULD means that breaking it needs to be done for a reason. I've long
> argued that the SHOULD should never be there in the first place without
> explaining why it isn't a MUST or a MAY and the conditions under which
> it might be appropriate to violate it. RFC793 doesn't have that context,
> unfortunately, but it doesn't mean that any - and every - SHOULD is
> intended to be willfully ignored at all times.
>
>> It decreases load on congested links.
>                       ^^^^^^^^^
> severely underprovisioned

No, merely congested for some reason. Any shared medium will have the same
situation: when another station is transmitting, a queue builds.

>> But you keep insisting that it's a horrible thing to consider doing.
>
> The tragedy of the commons is a horrible thing. Just because something
> doesn't hurt you or you can't see how it hurts others doesn't mean there
> isn't a problem.
>
> I've outlined the reasons why this is bad - basically it works only
> under the assumption that DOCSIS modems get to play by their own rules
> and every one else plays fair. If that's not a bad idea, I don't know
> what is.

You say that it breaks timing assumptions and calculations, but then don't
explain how the train of acks arriving at wire speed would make your
calculations any more accurate.

David Lang