Re: [aqm] Gaming ECN (again) (was: think once to mark, think twice to drop: draft-ietf-aqm-ecn-benefits-02)

David Lang <david@lang.hm> Wed, 15 April 2015 20:16 UTC

Date: Wed, 15 Apr 2015 13:15:21 -0700 (PDT)
From: David Lang <david@lang.hm>
X-X-Sender: dlang@asgard.lang.hm
To: Bob Briscoe <bob.briscoe@bt.com>
In-Reply-To: <201504150923.t3F9N4Ns019432@bagheera.jungle.bt.co.uk>
Message-ID: <alpine.DEB.2.02.1504151259500.26320@nftneq.ynat.uz>
References: <23AFEFE3-4D93-4DD9-A22B-952C63DB9FE3@cisco.com> <BF6B00CC65FD2D45A326E74492B2C19FB75BAA82@FR711WXCHMBA05.zeu.alcatel-lucent.com> <72EE366B-05E6-454C-9E53-5054E6F9E3E3@ifi.uio.no> <55146DB9.7050501@rogers.com> <08C34E4A-DFB7-4816-92AE-2ED161799488@ifi.uio.no> <BF6B00CC65FD2D45A326E74492B2C19FB75BAFA0@FR711WXCHMBA05.zeu.alcatel-lucent.com> <alpine.DEB.2.02.1503271024550.2416@nftneq.ynat.uz> <5d58d2e21400449280173aa63069bf7a@hioexcmbx05-prd.hq.netapp.com> <20150327183659.GI39886@verdi> <72C12F6B-9DDE-4483-81F2-2D9A0F2D3A48@cs.columbia.edu> <alpine.DEB.2.02.1503271211200.19390@nftneq.ynat.uz> <D13AFCE7.46BC%kk@cs.ucr.edu> <alpine.DEB.2.02.1503271257230.19390@nftneq.ynat.uz> <AE342093-DE05-4D93-96DA-EB07E221F1D9@netapp.com> <alpine.DEB.2.02.1503291923300.26044@nftneq.ynat.uz> <201504131511.t3DFBG3R002270@bagheera.jungle.bt.co.uk> <alpine.DEB.2.02.1504131441190.11469@nftneq.ynat.uz> <201504150923.t3F9N4Ns019432@bagheera.jungle.bt.co.uk>
User-Agent: Alpine 2.02 (DEB 1266 2009-07-14)
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/GDFYUMXBGJZuJbZVNqQzccnSmYw>
Cc: "Scheffenegger, Richard" <rs@netapp.com>, Vishal Misra <misra@cs.columbia.edu>, KK <kk@cs.ucr.edu>, John Leslie <john@jlc.net>, "aqm@ietf.org" <aqm@ietf.org>
Subject: Re: [aqm] Gaming ECN (again) (was: think once to mark, think twice to drop: draft-ietf-aqm-ecn-benefits-02)

On Wed, 15 Apr 2015, Bob Briscoe wrote:

> 3) I will prove that it is as easy to game loss as it is to game ECN, first 
> considering sender cheating, then receiver cheating:

I think there is a key difference:

Cheating non-ECN requires changes to the TCP stack code.

Cheating ECN requires only an iptables rule to zero out the ECN bit on received 
packets before they hit the TCP stack.

I've seen enough people doing cargo-cult network configs to 'improve 
gaming' (MTU 576 as one example) that I expect that if there is any noticeable 
advantage in stripping the marks, a significant population will do it.

> 3a) Sender Cheating
> From the sender's point of view, the only difference between a loss and an 
> ECN mark is that it has to retransmit a loss. But that has nothing to do with 
> the rate it can go at. If it has been programmed to ignore congestion 
> feedback (and instead to go at a constant unresponsive rate{Note 2}), it is 
> as easy for it to ignore loss feedback as ECN feedback. See {Note 3} for an 
> example.
>
> 3b) Receiver Cheating
> * An ECN receiver can best fool an ECN-capable TCP sender into going faster 
> by only feeding back a small fraction of ECN marks.{Note 4}
> * A non-ECN receiver could fool a non-ECN TCP sender into going faster by 
> only revealing a small fraction of the losses. However, it would have to ACK 
> undelivered bytes, and most TCP-based apps won't work unless all bytes are 
> delivered.{Note 5}
>
> So it seems that it's easier for a receiver to game ECN than loss. However:
> * returning to the ECN case, the sender can validate the receiver by randomly 
> setting an ECN mark itself on a very small proportion of packets (probably 
> only on unusually high rate connections). Then if it doesn't see ECN feedback 
> on the ACK of any one of its self-inserted marks, it can close the 
> connection.

Do any TCP stacks do this?

Won't this cause the data rate to slow down, something that will seem 'bad' to 
the people who would need to code this (it will show up in benchmarks as random 
slowdowns, right?)
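
As a rough sketch of the self-marking check Bob describes, the following estimates how quickly a sender would catch a receiver that suppresses marks. It assumes the receiver echoes each mark independently with a fixed probability, which is an illustrative model and not the behaviour of any particular TCP stack; the function and parameter names are hypothetical.

```python
def detection_probability(echo_fraction: float, probes: int) -> float:
    """Chance that at least one of `probes` self-inserted marks goes un-echoed.

    Assumption: the receiver echoes each mark independently with
    probability `echo_fraction` (1.0 means a fully honest receiver).
    """
    if not 0.0 <= echo_fraction <= 1.0:
        raise ValueError("echo_fraction must be between 0 and 1")
    if probes < 0:
        raise ValueError("probes must be non-negative")
    # The cheat goes unnoticed only if every single probe happens to be echoed.
    return 1.0 - echo_fraction ** probes


# An honest receiver is never flagged, however many probes are sent.
print(detection_probability(1.0, 10))
# A receiver echoing only 1 in 8100 marks is almost surely caught by one probe.
print(detection_probability(1 / 8100, 1))
```

Under this model the sender needs very few self-inserted marks to expose an aggressive cheater, so the cost in throughput depends mainly on how often it chooses to probe.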

> In summary,
> * a sender can't game ECN any more easily than it can game loss.

I think it's more "there is a way to detect gaming ECN"

> * a receiver can only game ECN if the sender doesn't take measures to prevent 
> it.{Note 6}
>
>> If the packets are just marked, but not dropped, then the ECN-capable flows 
>> will occupy a disproportionate share of the available buffer space, since 
>> they just get marked instead of dropped.
>
> Nope.
>
> The arrival rates will be the same, whether or not ECN is used (see earlier). 
> And recall that TCP drives the marking or loss probability at very small 
> fractions in all normal conditions.
>
> Example: if there are 10 flows in a 100Mb/s link, 5 ECN and 5 non-ECN, they 
> will all arrive at the buffer at 10Mb/s (all other factors being equal). 
> Then, if the loss or marking probability is 0.5%, the AQM will be marking but 
> not dropping 1 in 200 packets in the ECN flows whereas it would drop 1 in 200 
> from the non-ECN flows.
>
> So, assuming tail drop, if there were 399 packets in this buffer, on average 
> 200 would be ECN-capable (40 in each flow) with one marked; and 199 would be 
> non-ECN-capable (40 in each flow except one with 39). And one of those 199 
> would be a retransmission from an earlier loss.
>
> [Of course, we would hope that there would be 4 packets in the buffer, not 
> 400. The proportions would still be the same on average. I merely used 399 to 
> avoid fractions of packets for the averages.]
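
The buffer arithmetic in the quoted example can be checked with a small sketch. It assumes equal arrival rates for ECN and non-ECN traffic and applies the same signalling probability to both, marking the former and dropping the latter; the function name and the returned keys are made up for illustration.

```python
def buffer_snapshot(arrivals: int, ecn_share: float, signal_prob: float) -> dict:
    """Average queue contents when ECN packets are marked and others dropped.

    Assumptions: `ecn_share` of the arriving packets are ECN-capable, and
    the AQM signals congestion on a fraction `signal_prob` of each class.
    """
    ecn_arrivals = arrivals * ecn_share
    non_ecn_arrivals = arrivals - ecn_arrivals
    marked = ecn_arrivals * signal_prob        # marked packets stay queued
    dropped = non_ecn_arrivals * signal_prob   # dropped packets leave the queue
    non_ecn_queued = non_ecn_arrivals - dropped
    return {
        "ecn_queued": ecn_arrivals,
        "ecn_marked": marked,
        "non_ecn_queued": non_ecn_queued,
        "total_queued": ecn_arrivals + non_ecn_queued,
    }


# 400 arrivals, half ECN-capable, 0.5% marking or loss probability.
print(buffer_snapshot(400, 0.5, 0.005))
```

With these inputs the ECN-capable traffic holds one more packet than the non-ECN traffic, which is the size of the effect being argued about.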
>
>
> ===Footnotes===
>
> {Note 3} Examples to show source cheating is as easy with loss as ECN:
> * An ECN source sends at a constant unresponsive 90Mb/s through a 100Mb/s 
> bottleneck. In parallel some other responsive flows (say 10 non-ECN TCP 
> flows) squeeze themselves into the remaining 10Mb/s. They will cause 
> themselves (say) 0.5% loss probability, while the unresponsive flow will 
> experience 0.5% marking and zero loss.
> * A non-ECN source can just as easily send unresponsively at 90.5Mb/s as 
> 90Mb/s. The other flows will still drive loss to about 0.5%, which the 
> unresponsive flow will now experience as well. Nonetheless, after it 
> retransmits the 0.5% loss it still achieves goodput of about 90Mb/s.

Well, sort of.

If the retransmissions are due to dropped packets, the sender may end up 
retransmitting some packets that actually got through (depending on the link 
latency and rate of transmission, those packets may not be acked before they 
also get retransmitted). This will waste some bandwidth.

If the sender slows down, it may drop significantly below the 90Mb/s rate and 
have to ramp up to that rate again.
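
To put numbers on the Note 3 goodput claim and on the wasted-bandwidth caveat, here is a minimal sketch. The `spurious_fraction` parameter is a hypothetical knob for the share of sent packets that are needless retransmissions of data that already arrived; neither message gives a value for it.

```python
def goodput_mbps(send_rate_mbps: float, loss_prob: float,
                 spurious_fraction: float = 0.0) -> float:
    """Useful throughput of a sender that never slows down.

    Assumptions: a fraction `loss_prob` of sent packets is dropped, and a
    further `spurious_fraction` of sent packets duplicates data the
    receiver already has. Both reduce goodput without reducing the
    sending rate.
    """
    useful = 1.0 - loss_prob - spurious_fraction
    if useful < 0.0:
        raise ValueError("loss_prob + spurious_fraction must not exceed 1")
    return send_rate_mbps * useful


# Note 3's case: 90.5Mb/s sent, 0.5% lost, no spurious retransmissions.
print(goodput_mbps(90.5, 0.005))
# Same sender if a hypothetical 1% of its packets are needless retransmissions.
print(goodput_mbps(90.5, 0.005, spurious_fraction=0.01))
```

The first case reproduces the roughly 90Mb/s goodput from Note 3; the second shows how even a small amount of spurious retransmission eats into it.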

> {Note 4} Again, feeding back no marks at all would be naive, because it would 
> drive the bottleneck into overload, causing it to turn off ECN (and driving 
> the loss-rate over a cliff). A better strategy is to feedback only a small 
> proportion. Because TCP's rate depends on the square root of the congestion 
> probability, to download N times faster, the receiver should feed back only 
> about 1 in N^2 of the marks or losses. E.g. to go 90 times faster, feed back 
> 1 in 8100 marks (or losses).
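
The square-root relationship in Note 4 can be written out as a short sketch. It assumes only what the note states, that TCP's rate is proportional to one over the square root of the congestion probability the sender sees; the function names are illustrative.

```python
import math


def feedback_fraction(speedup: float) -> float:
    """Fraction of marks (or losses) to feed back to go `speedup` times faster."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1")
    return 1.0 / (speedup ** 2)


def relative_rate(fraction: float) -> float:
    """Speedup obtained by feeding back only `fraction` of the marks."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return 1.0 / math.sqrt(fraction)


# Going 90 times faster means echoing about 1 mark in this many:
print(round(1 / feedback_fraction(90)))  # 8100, matching Note 4
```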

That assumes the raw sending rate is high enough to drive the link into 
overload by itself, rather than just high enough to demand an unfair share 
of the link.

You use the example of a 100Mb link with one app getting 90Mb and trying to get 
more.

Instead think of that same 100Mb link, but with 100 1Mb flows going over it. 
Would ignoring ECN feedback allow one app to get a 50Mb flow, squeezing the 
other apps to about 0.5Mb each?
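
One rough way to explore the 100-flow question is a weighted-share model. This is a sketch under strong assumptions that neither message states: every flow sees the same marking probability and round-trip time, the link stays fully used without tipping into overload, and a flow that is k times as aggressive as its peers gets k times their rate. The function name is hypothetical.

```python
def link_shares_mbps(link_mbps: float, flows: int, aggressiveness: float):
    """Split a link between one aggressive flow and `flows - 1` normal ones.

    Assumption: shares are proportional to weights, with weight
    `aggressiveness` for the one flow and weight 1 for each other flow.
    Returns (aggressive_flow_rate, each_other_flow_rate) in Mb/s.
    """
    if flows < 1 or aggressiveness <= 0:
        raise ValueError("need at least one flow and positive aggressiveness")
    total_weight = aggressiveness + (flows - 1)
    other = link_mbps / total_weight
    return aggressiveness * other, other


# 100Mb/s link, 100 flows, one of them 99 times as aggressive as the rest.
cheater, other = link_shares_mbps(100, 100, 99)
print(cheater, round(other, 3))
```

In this model the one flow reaches 50Mb/s only when it is 99 times as aggressive as the others, which by Note 4's rule would correspond to feeding back about 1 in 9801 marks. Whether a real bottleneck would tolerate that without turning ECN off is exactly the open question.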

> {Note 5} There are two classes of apps that use TCP but can get away without 
> reliable delivery:
> i) Some streaming media apps are designed with a loss-tolerant encoding, so 
> they can use TCP but play out the media even if some retransmissions haven't 
> arrived yet (e.g. using a raw socket at the receiver).
> ii) In the specific case of HTTP, a hacked receiver can open another 
> connection to the same server and download the byte-ranges it needs to repair 
> the holes in the other connection.

The TCP stack will demand reliable delivery, even if the app doesn't. There 
isn't any provision for an app to use TCP but allow 'holes' in the data.

David Lang