Re: [aqm] TCP ACK Suppression

Joe Touch <touch@isi.edu> Thu, 08 October 2015 23:24 UTC

To: David Lang <david@lang.hm>
From: Joe Touch <touch@isi.edu>
Date: Thu, 08 Oct 2015 16:23:38 -0700
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/PHmZ8sB6pGNKkXy3jdswGYEsSDA>
Cc: "LAUTENSCHLAEGER, Wolfram (Wolfram)" <wolfram.lautenschlaeger@alcatel-lucent.com>, Greg White <g.white@CableLabs.com>, "aqm@ietf.org" <aqm@ietf.org>, touch@isi.edu
Subject: Re: [aqm] TCP ACK Suppression


On 10/8/2015 3:29 PM, David Lang wrote:
> On Thu, 8 Oct 2015, Joe Touch wrote:
> 
>> On 10/8/2015 2:31 PM, David Lang wrote:
>>> On Thu, 8 Oct 2015, Joe Touch wrote:
>>>
>>>> On 10/7/2015 12:42 AM, LAUTENSCHLAEGER, Wolfram (Wolfram) wrote:
>>>> ...
>>>>> Is this topic addressed in some RFC already?
>>>>
>>>> It's a direct violation of RFC 2581, which expects one ACK for every
>>>> two segments:
>>>>
>>>> 4.2 Generating Acknowledgments
>>>>
>>>>   The delayed ACK algorithm specified in [Bra89] SHOULD be used by a
>>>>   TCP receiver.  When used, a TCP receiver MUST NOT excessively delay
>>>>   acknowledgments.  Specifically, an ACK SHOULD be generated for at
>>>>   least every second full-sized segment, and MUST be generated within
>>>>   500 ms of the arrival of the first unacknowledged packet.
>>>
>>> actually, this is only a violation of the SHOULD section, not the MUST
>>> section.
>>
>> When you violate a SHOULD, you need to have a good reason that applies
>> in a limited subset of cases.
>>
>> "it benefits me" isn't one of them, otherwise the SHOULD would *always*
>> apply.
>>
>>> And if the Ack packets are going to arrive at wire-speed anyway (due to
>>> other causes), is there really an advantage to having 32 ack packets
>>> arriving one after the other instead of making it so that the first ack
>>> packet (which arrives at the same time) can ack everything?
>>
>> If the first ACK confirms everything, you're giving the endpoint a false
>> sense of how fast the data was received. This is valid only if the
>> *last* ACK is the only one you retain, but then you'll increase delay.
> 
> why does it give the server a false sense of how fast the data was
> received? the packets don't have timestamps that the server can trust,
> they are just packets arriving.

Well, the only reason we can no longer trust them is that an
intermediate device has tampered with them.

See, this is the problem - the DOCSIS modem wants to do what *it* wants,
assuming everyone else plays by the rules, but it doesn't care whether
it violates the assumptions other parties are making.

That's an example of "tragedy of the commons".

> And if the server concludes something
> different from 32 packets arriving, each acking 2 packets, but all
> arriving one after the other at its wire speed (let's say it's a slow
> network, only Gig-E) compared to a single packet arriving that acks 64
> packets of data at once, it's doing something very strange and making
> assumptions about how the network works that are invalid.

Says who? The RFCs say that this assumption SHOULD be reasonable.
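
For reference, the receiver-side rule quoted earlier in this thread can be
sketched roughly as below. This is only an illustration in Python: the class
and method names are made up, time is passed in by the caller, and every
segment is assumed to be full-sized.

```python
MAX_ACK_DELAY = 0.5  # seconds; the 500 ms MUST bound from the quoted text


class DelayedAckReceiver:
    """Toy delayed-ACK policy (hypothetical names, full-sized segments only)."""

    def __init__(self):
        self.unacked = 0              # segments received but not yet ACKed
        self.first_unacked_at = None  # arrival time of the oldest such segment

    def _reset(self):
        self.unacked = 0
        self.first_unacked_at = None

    def on_segment(self, now):
        """Return True if an ACK should be sent on this arrival."""
        if self.unacked == 0:
            self.first_unacked_at = now
        self.unacked += 1
        if self.unacked >= 2:         # SHOULD: ACK every second segment
            self._reset()
            return True
        return False

    def on_timer(self, now):
        """Return True if the delay bound forces an ACK."""
        if self.unacked and now - self.first_unacked_at >= MAX_ACK_DELAY:
            self._reset()             # MUST: never hold an ACK past 500 ms
            return True
        return False


receiver = DelayedAckReceiver()
acks = sum(receiver.on_segment(now=i * 0.001) for i in range(64))
print(acks)
```

Under these assumptions a 64-segment flight produces the 32 ACKs discussed
in this thread, which is what the sender is entitled to expect.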

>> Unless you know that the endpoint supports ABC and pacing, yes, there's
>> a very distinct advantage to getting 32 ACKs rather than 1. It also
>> helps with better accuracy on the RTT calculation, which is based on
>> sampling (and you've killed 97% of the samples).
> 
> the 97% of the samples that I've killed would be producing invalid data
> for your calculation because they were delayed in returning.

Why do you think that is invalid data? That's an accurate measure of the
return path of the ACK stream.
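
To put numbers on that, here is a rough sketch in Python. The 64-segment
flight and the two-segments-per-ACK figure come from the example in this
thread; the function name and the growth models are assumptions made for
illustration, and the byte-counting branch deliberately ignores any per-ACK
cap a real stack would apply.

```python
SEGMENTS = 64          # data segments in flight, as in the thread's example
SEGMENTS_PER_ACK = 2   # delayed ACKs: one ACK per two full-sized segments

acks_sent = SEGMENTS // SEGMENTS_PER_ACK   # ACKs the receiver generated
acks_after_coalescing = 1                  # the modem forwards only one

# Each ACK that reaches the sender is a potential RTT sample.
fraction_lost = 1 - acks_after_coalescing / acks_sent


def slow_start_growth(acks_received, segments_per_ack, byte_counting):
    """Window growth in segments for one flight (hypothetical, simplified)."""
    if byte_counting:
        # Simplified byte counting: grow by the amount of data acknowledged.
        return acks_received * segments_per_ack
    # Classic ACK counting: grow by one segment per ACK received.
    return acks_received


print(f"ACKs dropped: {fraction_lost:.1%}")
print("ACK counting, 32 ACKs:", slow_start_growth(32, 2, False))
print("ACK counting, 1 ACK:  ", slow_start_growth(1, 64, False))
print("byte counting, 1 ACK: ", slow_start_growth(1, 64, True))
```

A sender that counts ACKs sees 31 of its 32 samples vanish and grows its
window by one segment instead of 32; only a sender that counts bytes is
unaffected in this simplified model.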

...
>>> And if there is such an advantage, does it outweigh the disadvantages
>>> that the extra ack packets cause by causing highly asymmetric links to
>>> be overloaded and drop packets?
>>
>> Why is it so bad to drop packets?
> 
> because forcing packets for other services to be dropped to make room
> for acks degrades those other services.

Sure, but remember that we're not here to support the cable company's
business model. They deployed networks that had severely
underprovisioned backchannels so they could use shared channels rather
than routers one step lower in the hierarchy. Now they pull this stunt
so they can fix what's broken with their provisioning model.

The trouble is that it has effects for others in the network, not just
the cable company.

...
>> TCP isn't supposed to be the most efficient in EVERY corner case. It's
>> supposed to *always work* in EVERY corner case.
> 
> I don't see how it fails to work in this case. As people have pointed
> out, some cable routers have been doing this for 15 years and the
> Internet has not imploded from it yet, so the drawbacks of dropping
> these already-delayed and redundant ack packets cannot be the
> end-of-the-internet that you are painting it to be.

Oh, right. That argument. We haven't seen it break anything, so it
*must* be safe.

What would you see if it were broken? Maybe hosts that burst into the
net and caused router buffers to overload? Hmmm.

> We are talking about only doing this in one specific case, the case
> where other things have already caused some of the acks to be delayed to
> the point where later acks have 'caught up' with them on the network and
> both early and late acks are sitting in the same queue on the same
> device waiting to be sent at the same time.

They're in a queue. That means the early ones go out before the late
ones. You have two choices if you coalesce their information:

a) delete the early ACKs

	Oh, but you wouldn't do *that* because it would hit *your*
	customers with a higher delay.

b) delete the late ACKs and alter the early ones

	Giving your customers a false sense of how fast their
	data was getting there. Roadrunner pulled stunts like this
	in the early 90's too. It's not exactly news.
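
The two choices can be sketched roughly as follows. This is a toy model in
Python, not a description of what any DOCSIS modem actually does: the
QueuedAck type, the slot field, the segment size, and both function names
are made up for illustration, and all ACKs are assumed to belong to one flow.

```python
from dataclasses import dataclass


@dataclass
class QueuedAck:
    slot: int     # position in the upstream queue; slot 0 goes out first
    ack_no: int   # cumulative ACK number carried by the packet


def keep_last(queue):
    """Choice (a): delete the early ACKs; the newest leaves in its own slot."""
    return [max(queue, key=lambda a: a.ack_no)]


def rewrite_first(queue):
    """Choice (b): delete the late ACKs; the head carries the newest number."""
    head = min(queue, key=lambda a: a.slot)
    newest = max(a.ack_no for a in queue)
    return [QueuedAck(slot=head.slot, ack_no=newest)]


MSS = 1460  # assumed segment size in bytes
# 32 queued ACKs, each covering two more segments than the one before it.
queue = [QueuedAck(slot=i, ack_no=(i + 1) * 2 * MSS) for i in range(32)]

print(keep_last(queue))      # full information, but only at the last slot
print(rewrite_first(queue))  # full information at slot 0, ahead of schedule
```

Both variants deliver the same cumulative ACK number; they differ only in
when the sender learns it, which is exactly the delay-versus-false-signal
tradeoff in (a) and (b).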

> At this point there are three possibilities
> 
> 1. all the acks get sent back-to-back, wasting bandwidth with their
> redundancy

That's not a waste; that's information.

> 2. send only the newest ack, trashing all the ones that would be redundant

If you wait to send it last, maybe... but then you're still encouraging
the receiver to burst its next transmissions. We already know that sort
of bursting causes problems (even if *we* don't see them, someone does).

> 3. the total of the acks that are queued exceeds the next transmit
> window, so only some of the acks get sent, the newest one doesn't and
> gets delayed further.
> 
> 
> we know that #2 doesn't break the Internet, 

No, you really don't. What you know is that #2 is cheap and benefits
you. Everyone continually doing that *will* break the Internet.

> it's within the range of
> responses permitted by the RFC SHOULD.

SHOULD means that breaking it needs to be done for a reason. I've long
argued that the SHOULD should never be there in the first place without
explaining why it isn't a MUST or a MAY and the conditions under which
it might be appropriate to violate it. RFC793 doesn't have that context,
unfortunately, but it doesn't mean that any - and every - SHOULD is
intended to be willfully ignored at all times.

> It decreases load on congested links.
                       ^^^^^^^^^
severely underprovisioned

> But you keep insisting that it's a horrible thing to consider doing.

The tragedy of the commons is a horrible thing. Just because something
doesn't hurt you or you can't see how it hurts others doesn't mean there
isn't a problem.

I've outlined the reasons why this is bad - basically it works only
under the assumption that DOCSIS modems get to play by their own rules
and everyone else plays fair. If that's not a bad idea, I don't know
what is.

Joe