Re: [re-ECN] FW: ConEx BoF announcement text
"Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> Mon, 26 October 2009 09:58 UTC
Return-Path: <ilpo.jarvinen@helsinki.fi>
X-Original-To: re-ecn@core3.amsl.com
Delivered-To: re-ecn@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix)
with ESMTP id 07E8B28C1A5 for <re-ecn@core3.amsl.com>;
Mon, 26 Oct 2009 02:58:45 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -5.897
X-Spam-Level:
X-Spam-Status: No, score=-5.897 tagged_above=-999 required=5 tests=[AWL=0.402,
BAYES_00=-2.599, MIME_8BIT_HEADER=0.3, RCVD_IN_DNSWL_MED=-4]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com
[127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ikeniyQw4wjW for
<re-ecn@core3.amsl.com>; Mon, 26 Oct 2009 02:58:42 -0700 (PDT)
Received: from mail.cs.helsinki.fi (courier.cs.helsinki.fi [128.214.9.1]) by
core3.amsl.com (Postfix) with ESMTP id B1EDE28C118 for <re-ecn@ietf.org>;
Mon, 26 Oct 2009 02:58:41 -0700 (PDT)
Received: from wel-95.cs.helsinki.fi (wel-95.cs.helsinki.fi [128.214.10.211])
(TLS: TLSv1/SSLv3,256bits,AES256-SHA) by mail.cs.helsinki.fi with esmtp;
Mon, 26 Oct 2009 11:58:52 +0200 id 0006CAF2.4AE572DC.00007184
Date: Mon, 26 Oct 2009 11:58:52 +0200 (EET)
From: "=?ISO-8859-15?Q?Ilpo_J=E4rvinen?=" <ilpo.jarvinen@helsinki.fi>
X-X-Sender: ijjarvin@wel-95.cs.helsinki.fi
To: Fred Baker <fred@cisco.com>
In-Reply-To: <9423E875-E829-4F1A-ADBD-EAF7089FD615@cisco.com>
Message-ID: <Pine.LNX.4.64.0910230832430.23090@melkinkari.cs.Helsinki.FI>
References: <4A916DBC72536E419A0BD955EDECEDEC0636399B@E03MVB1-UKBR.domain1.systemhost.net>
<4ADD187E.6000200@thinkingcat.com>
<200910221807.n9MI7P2a002071@bagheera.jungle.bt.co.uk>
<F437BD07-581B-4542-ABDB-ABABEDC3B8DD@cisco.com>
<Pine.LNX.4.64.0910222140130.20686@melkinkari.cs.Helsinki.FI>
<BA5FCA1F-0F30-42D2-8C3A-006B28B7D0E1@cisco.com>
<Pine.LNX.4.64.0910222340490.20686@melkinkari.cs.Helsinki.FI>
<9423E875-E829-4F1A-ADBD-EAF7089FD615@cisco.com>
User-Agent: Alpine 2.00 (DEB 1167 2008-08-23)
MIME-Version: 1.0
Content-Type: MULTIPART/MIXED;
boundary="-696243445-1602740578-1256276444=:23090"
Content-ID: <alpine.DEB.2.00.0910261109580.19761@wel-95.cs.helsinki.fi>
Cc: re-ECN unIETF list <re-ecn@ietf.org>
Subject: Re: [re-ECN] FW: ConEx BoF announcement text
X-BeenThere: re-ecn@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: re-inserted explicit congestion notification <re-ecn.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/re-ecn>,
<mailto:re-ecn-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/re-ecn>
List-Post: <mailto:re-ecn@ietf.org>
List-Help: <mailto:re-ecn-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/re-ecn>,
<mailto:re-ecn-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 26 Oct 2009 09:58:45 -0000
On Thu, 22 Oct 2009, Fred Baker wrote:

> :-)

Me too, I want to put a :-).

> You started out by telling me that arbitrarily increasing the window
> doesn't hurt anything

No, I didn't say that. And after rereading what I said, I'm rather sure
you just wanted to see it that way. I said:

"As long as no unnecessary work is done at any bottleneck (and we don't
have a shared resource such as a wireless channel), throughput does
_not_ degrade (at all!), even though many keep asserting that without
much thought nowadays."

...And before that I explicitly said that I agree with you that the
delay increases. We were talking about _throughput_ reduction after the
"cliff", which was first claimed by you, and I responded to that :-).
...And then you followed by saying that Reno/Cubic aims at that cliff.
While you didn't directly claim that Reno/Cubic behavior implies
throughput reduction, I don't see why you even brought throughput up in
the first place if you didn't want to imply that the two are related.
If it was just about the delay increase with Reno/Cubic, I agree with
you.

> - you quoted Bob saying that congestion isn't a problem. I demonstrated
> that congestion out of control is a problem.

...Right, but not for Cubic/Reno, which you put among the bad ones in
your first mail. Cubic/Reno are not something I'd call "out of
control". E.g., if you replaced the two flows of your example with
them, no problem would occur despite the fact that they "tune to the
cliff" (and even you seemed to agree with this, as you mentioned ACK
clocking, which is in effect with both Cubic and Reno). Besides, it's
the loss-based feedback that forces Cubic/Reno to operate near the
cliff, not something built into Cubic/Reno behavior itself.
So with ECN your original statement about them tuning to the cliff
wouldn't be true for either, but I suppose you might have meant that in
your later paragraph about ECN (in fact, I think whether the tuning
happens at "the cliff" or not is totally independent of Cubic/Reno
behavior; only the buffer sizes and when the congestion feedback is
given matter).

> Then you complained that I didn't demonstrate that Cubic/Reno is a
> problem. OK, let's talk about Cubic/Reno, and why Bittorrent has a
> working group in the IETF (ledbat) about how to reduce the impact on
> ISPs and their customers when they run seven Cubic/Reno TCPs in
> parallel.

Here I agree with you: it is out of control how many parallel
Cubic/Renos one opens. But I don't think this has much to do with
"tuning to the cliff". One can take an extreme example: it is possible
to open so many of them that they all just limp along on occasional
RTOs (or, when lucky enough, briefly with a window of two). ...But
again, that is just stupid behavior, and it then becomes completely
irrelevant whether you had FAST, Cubic or Reno there in the first
place. So assuming the windows are in a region where RTOs due to small
windows are not the problem, I don't see why Cubics/Renos wouldn't
reduce their rate when experiencing losses and recover relatively fast
without introducing _unnecessary_ retransmissions, preventing entry
into a state where unnecessary work is done, since multiple bottlenecks
for a single flow can then only occur during a short transient. ...So I
can see how that extreme would prove your original point (copied from
Jain) about throughput decrease at the cliff (for Reno/Cubic). If not
pushed to such an extreme scenario using unresponsive traffic, like you
did in the example you provided first, the issues will mainly be in
_delay_ (or in response time, which is essentially the same thing),
because Cubic/Reno are, after all, responsive.
Throughput is only affected when a flow is about to complete (and
requires retransmissions due to the losses it suffered), but you never
seemed to even slightly hint at this (it is of course a relatively
significant case in the real world).

> Actually, could I get you to read the ledbat charter?
> http://www.ietf.org/dyn/wg/charter/ledbat-charter.html

One has to read between the lines, as it doesn't say anything about
throughput reduction. It talks only about delay and how it is maximized
by the standard TCP behavior. Contrary to your claims, it even admits
that such maximizing "works well" for some types of transfers, so where
do you see this throughput reduction at the cliff?!? The only paragraph
that even remotely discusses throughput-related issues is this: "When
an application uses several non-rate-limited transport connections to
transfer through a bottleneck, it may obtain a larger fraction of the
bottleneck than if it had used fewer connections" (note that
"non-rate-limited" here does not imply fully unresponsive, AFAICT from
reading the whole paragraph, just "n times more aggressive" than
standard TCP). To me that certainly doesn't say that throughput is
reduced, but that it is shared "unfairly" (if you let me use that
not-well-defined expression), which I certainly agree with.

To conclude, there is no equivalence between an increase in queuing
delay and a decrease in throughput in the real world, though it
certainly can happen, as you put it in one of your later mails:
"potentially". Of course some keep trying to make them fully equivalent
(making it less and less relevant to find out the actual causes of the
problems; just blame "the big bad congestion collapse" instead :-)):
http://www.postel.org/pipermail/end2end-interest/2009-September/007769.html

--
i.

> On Oct 22, 2009, at 3:30 PM, Ilpo Järvinen wrote:
>
> > On Thu, 22 Oct 2009, Fred Baker wrote:
> >
> > > well, we disagree on some points.
> >
> > I guess we agree quite a lot, yes.
> > However, I'm still pretty much missing the connection you made
> > between Cubic/Reno and throughput reduction because of "the cliff".
> > I oppose the universal statement you made earlier about the cliff
> > being there if we keep pushing the network. E.g., you earlier put
> > Cubic among the "bad ones" that push the network, but now your
> > scenario below has nothing that even remotely resembles Cubic/Reno,
> > only something totally unresponsive (I'm assuming SACK-based
> > recovery is part of this definition of Reno, and that router buffers
> > are large enough to accommodate MD-madness "recovery"). ...That's
> > the connection I want to fully tear down. Don't first put Cubic and
> > Reno among the bad ones and then show an example with something
> > completely different; it isn't very fair to blame something for the
> > faults of others like that and justify its "badness" that way. This
> > is very important because this mis-connection exists in many minds:
> > that with enough congestion, throughput automatically drops
> > dramatically (like in Jain's graph). As a result, we hear this
> > "congestion is the problem" and "more aggressive means more
> > congestion" type of attitude. Just to make sure: I didn't mean to
> > say that I think Cubic/Reno is perfect either.
> >
> > ...The rest is just some comments on your rather intimidating
> > scenario :-).
> >
> > > I don't know that perfect agreement is necessary, and certainly
> > > not on this thread. That said...
> > >
> > > Regarding behavior beyond the cliff, it might be instructive to
> > > take the most extreme possible case - a dumbbell network and a
> > > really obnoxious file transfer algorithm.
> > >
> > >    +-+  10*N  +-+  10*N  +-+   N   +-+  10*N  +-+
> > >    |A|--------|B|--------|C|-------|D|--------|E|
> > >    +-+        +-+        +-+       +-+        +-+
> > >         10*N   |          |  10*N
> > >              +--+        +--+
> > >              | F|        | G|
> > >              +--+        +--+
> > >
> > > Presume that the capacity from C to D is N, and all other
> > > capacities are 10*N.
> > > Presume that some application at A wants to send something really
> > > big to a peer at E, and some other application at F wants to send
> > > something else to a peer at G.
> > >
> > > Presume also that the way file transfers work is that the sender
> > > puts appropriate headers on all of the indicated packets, queues
> > > them up to its link layer chip, and spews them as fast as it can.
> > > If it's a 100 gigabyte file, it queues up 100 gigabytes and
> > > watches it heat the "wires". The peer receives what it receives
> > > and tells the sender what it didn't. The sender now repeats the
> > > exercise with anything his peer didn't receive.
> > >
> > > BTW, that is in fact how a file transfer done in a certain
> > > research environment on very noisy lines is proposed to work. I
> > > didn't just dream that up.
> > >
> > > Obviously, links A-B, B-C, F-B, and C-D will be fully utilized.
> > > Since B-C has less (half) capacity than A-B + F-B, each of those
> > > flows will drop about half of its traffic. The transfer rate F-G
> > > should be around 5*N. C-D will also drop 80% of the remaining
> > > traffic, resulting in an arrival rate at E of N. A will go through
> > > at least ten iterations of its transfer, and F at least two.
> > >
> > > The first down side is that while A-B and F-B are fully utilized,
> > > most of that capacity is wasted.
> >
> > Yes, that's the nature of the unresponsiveness you added yourself,
> > and now you say that it is a down side?!? ...Nobody (including the
> > network) can prevent an application from doing stupid things like
> > wasting the resources of the first link :-). I understand, though,
> > that you could construct a more complex topology where it wouldn't
> > be the first link.
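[Illustrative aside, not from the original mails: the drop arithmetic
in the dumbbell scenario can be checked with a short sketch. It assumes
N = 1 and that an oversubscribed link splits its capacity equally
between the competing unresponsive flows; the `share` helper is a
hypothetical name introduced here.]

```python
# Back-of-the-envelope check of the dumbbell numbers (assumes N = 1).
# A->E traverses B-C (capacity 10*N) and then C-D (capacity N);
# F->G traverses B-C only, since C-G has capacity 10*N to spare.
N = 1.0

def share(rates, capacity):
    """Equal split of a link's capacity among competing flows when oversubscribed."""
    if sum(rates) <= capacity:
        return rates
    fair = capacity / len(rates)
    return [min(r, fair) for r in rates]

a, f = 10 * N, 10 * N             # both senders spew at line rate
a, f = share([a, f], 10 * N)      # B-C: 20*N offered, 10*N carried -> 5*N each
(a,) = share([a], N)              # C-D: squeezes A's flow down to N
print(f"E receives {a}*N, G receives {f}*N")
```

This reproduces the rates in the scenario: N arriving at E (so A
retransmits roughly ten times over) and about 5*N for F-G.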
> > Instead of tackling that possibility, I'll take another extreme
> > example to stress my point: an application could cause the very
> > same symptoms as the classical congestion collapse (as per Nagle)
> > just by resending the very same segment over and over, even though
> > it got through (and an ACK was received); multiple copies are then
> > in the network and the throughput of that particular flow will be
> > dramatically affected. But that's just plain stupid behavior which
> > no network can prevent you from doing. Of course, nobody else has
> > to be part of that experiment, and they can operate with good
> > throughput. In a thought experiment we can put a TCP implementation
> > with SACK and plenty of receiver window into the age of Nagle and
> > show that it keeps operating with very good throughput regardless
> > of what the stupid senders do, as long as the transfer isn't
> > unlucky enough to run out of data to send while recovering.
> >
> > ...Secondly, I don't find this non-optimal use of excess capacity
> > to be the reason why one would need/want re-ECN.
> >
> > Another stupidity of the application/protocol in this example is
> > the feedback scheme, which exaggerates anomalies near the end of a
> > transfer.
> >
> > > This impacts F's ability to deliver traffic to G; it should have
> > > achieved a 90% throughput rate (the rate C-G less the rate C-D),
> > > and in fact achieved 50% or less. With the right mechanism (ack
> > > clocking comes to mind, and there are other approaches), the
> > > capacity used by the file transfer on A-B is only 10% and there
> > > is at most a 10% impact on the file transfer F-G.
> >
> > ...Right, with responsiveness everything changes. And I guess we
> > then pretty much agree.
> > > The second down-side is that the traffic arriving at G and E will
> > > be scattered all through the respective files, meaning that they
> > > have to store the bits and pieces and do a fair bit of searching
> > > and sorting to determine what they have not yet received and to
> > > reconstitute the file. This is a waste both of memory and CPU
> > > resources.
> >
> > This has nothing to do with throughput reduction anymore, don't you
> > think? Also, to me these "extras" again seem to be due to the
> > stupid behavior you introduced, rather than anything very close to
> > what I was arguing against.
> >
> > But in general I agree with you that loss recovery, even without
> > any throughput loss, introduces these space constraints (though far
> > less severely than in your extreme example), and one would want to
> > avoid that if at all possible. But IMHO that again has rather
> > little to do with "the cliff" and the throughput reduction that
> > supposedly happens because of it.
> >
> > > Voila, we come back pretty quickly to wanting the transmission
> > > rate of A to approximate the reception rate at E and the
> > > transmission rate at F to approximate the reception rate at G,
> > > and wanting the window to not be excessively large. An
> > > excessively large window wastes bandwidth before the bottleneck
> > > and potentially creates unnecessary bottlenecks, impacts other
> > > file transfers, imposes a memory/algorithm burden on the
> > > receiver, and for all of those costs accomplishes nothing that a
> > > smaller window would not have accomplished better.
> >
> > Agreed. ...Noted, though, that you have for some reason cleverly
> > softened "reduces throughput" into "_potentially_ creates
> > unnecessary bottlenecks" (emphasis mine) here, which I can agree
> > with much more ;-).
> > ...In fact, I wouldn't even choke on your example here, if given in
> > a less extreme form, as we do have CBR-type non-responsiveness
> > present for real, and thus dead packets wasting capacity (without
> > ECN) and so on.
> >
> > > A window that is too large is a bad thing, and the congestion it
> > > causes is a bad thing. Not that the presence of congestion is bad
> > > - it is, as Bob says, a natural side effect of what happens. But
> > > congestion that is poorly managed by the endpoint and/or the
> > > network is a very bad thing. re-ecn does in fact try to give the
> > > sender an incentive to approximate the knee (the smallest window
> > > that achieves the potential throughput rate), and increasing it
> > > to or beyond the cliff by definition doesn't increase its rate
> > > materially beyond that point.
> >
> > Agreed. Like you pointed out earlier, there's no increase in
> > throughput beyond that point no matter what trick one tries to
> > play, and I agree. My point was just that with responsive transfers
> > there isn't (always) such a magic cliff; the throughput would just
> > hold that constant line even if drops start to occur. And
> > especially mentioning Cubic/Reno, which are responsive, in this
> > "cliff" context seemed very misleading.
> >
> > --
> > i.
> >
> > > On Oct 22, 2009, at 12:36 PM, Ilpo Järvinen wrote:
> > >
> > > > On Thu, 22 Oct 2009, Fred Baker wrote:
> > > >
> > > > > I'll argue something slightly different. It comes to the same
> > > > > endpoint, but through a different observation. Some of us -
> > > > > you, Bob, especially - are very familiar with this; others are
> > > > > perhaps less so. Bear with me.
> > > > >
> > > > > Jain's 1994 patent on congestion control defines two terms:
> > > > > the "cliff" and the "knee" of a throughput curve.
> > > > >
> > > > >     |
> > > > >   T |           available capacity
> > > > >   h | ---------------------------------------
> > > > >   r |       +---+
> > > > >   u |  knee/     \cliff
> > > > >   p |     /       \
> > > > >   u |    /         \
> > > > >   t |   /           \
> > > > >     |  /             \
> > > > >     | /               \
> > > > >     |/                 ----------------
> > > > >     -----------------------------------------
> > > > >                  Window --->
> > > > >
> > > > > As the TCP/SCTP window increases from zero to some value,
> > > > > throughput also increases. That stops when the available
> > > > > capacity has been consumed; at that point, even if the window
> > > > > grows, throughput does not. What increases instead is RTT,
> > > > > because a queue grows.
> > > >
> > > > Up to this point I agree with this paragraph...
> > > >
> > > > > If window continues to increase, at some point the queue
> > > > > starts dropping traffic, and throughput degrades.
> > > >
> > > > ...However, this is not a true claim. As long as no unnecessary
> > > > work is done at any bottleneck (and we don't have a shared
> > > > resource such as a wireless channel), throughput does _not_
> > > > degrade (at all!), even though many keep asserting that without
> > > > much thought nowadays. Of course, non-infinitely-long flows
> > > > would introduce some anomalies into the calculations through
> > > > which one could show that this claim is "true", but I think
> > > > that's hardly what people usually mean when they claim that
> > > > dropping implies throughput degradation.
> > > >
> > > > Somewhat related, for some reason there's a very common
> > > > misconception that drops imply poor performance. That is often
> > > > stated as a fact without looking deeper into what the cause of
> > > > the sub-optimal performance really was; when looked at closely
> > > > enough, it often turns out to be something else. I'd say that
> > > > very often the losses were, in fact, innocent. Quite often the
> > > > blame would fall on the constant-factor MD (if one is honest
> > > > enough to admit it).
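[Illustrative aside, not from the original mails: the plateau being
argued about here - throughput holding constant while only the RTT
grows once the window passes the bandwidth-delay product - can be shown
with a toy single-flow model. Capacity, base RTT and window values
below are made-up numbers, and the queue is assumed infinite so nothing
is dropped.]

```python
# Toy model: one flow over a link of capacity C (pkts/s) with base RTT
# rtt0 (s), an infinite drop-tail queue, and a fixed window w (pkts).
C, rtt0 = 100.0, 0.1           # made-up numbers
bdp = C * rtt0                 # the "knee": window == bandwidth-delay product

for w in (2, 5, 10, 20, 40):
    standing = max(0.0, w - bdp)    # packets parked in the queue
    rtt = rtt0 + standing / C       # queueing delay inflates the RTT
    throughput = w / rtt            # never exceeds C
    print(f"w={w:2d}  throughput={throughput:5.1f}  rtt={rtt * 1000:5.0f} ms")
```

Past w = bdp = 10 the throughput stays pinned at C while the RTT climbs
linearly - the flat top of the curve above, with no drop-off.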
> > > > Btw, your graph is not accurate (wrt. Jain): he draws a much
> > > > sharper drop at his "cliff" than you do (so with ASCII you'd
> > > > have to use | first to match him). I have some doubts about the
> > > > validity of that sharp angle too, so I find your graph much more
> > > > sensible (that is, as long as the resource is shared).
> > > >
> > > > > By definition, the window at the "knee" is smaller than the
> > > > > window at the "cliff", but throughput is the same.
> > > > >
> > > > > Common TCP congestion control algorithms such as Reno and
> > > > > Cubic tune to the cliff. That has the upside of maximizing
> > > > > throughput; it has the downside that it is abusive to other
> > > > > applications such as voice/video (increased and variable RTT,
> > > > > and induced loss), and to other TCP sessions. #include
> > > > > <discussion of bittorrent and why other users are negatively
> > > > > impacted by it>
> > > >
> > > > I think Bob doesn't agree with you here; he isn't saying that
> > > > congestion is something bad but nearly the opposite. Bob's own
> > > > words:
> > > >
> > > > "It would contradict this to say congestion is a problem - it's
> > > > not - it's healthy and natural in a data network."
> > > >
> > > > ...It's just that if you keep doing that too much (beyond the
> > > > congestion volume allocated for you), you would "suffer" (if
> > > > I've understood him right). ...So it has very little to do with
> > > > operating at the "cliff" or "knee" on short timescales (unless
> > > > of course you've already run out of your quota).
> > > >
> > > > > Other TCP congestion control algorithms based on ECN, CalTech
> > > > > FAST, etc., try to tune to the knee.
> > > > > This gets them exactly the same throughput, including support
> > > > > of very high rate applications, but with a smaller window
> > > > > value and therefore lower queue depth and lower probability of
> > > > > loss - both for themselves and their competitors. It does so
> > > > > without negatively, or at least AS negatively, impacting
> > > > > applications they compete with, beyond seeking to share the
> > > > > available capacity with them.
> > > >
> > > > I certainly agree with you here that low queuing delay is a
> > > > desirable property too. ...However, I don't see how, e.g., ECN
> > > > alone could achieve low queuing delay and high throughput at the
> > > > same time; you'd need something more than that (mainly to remove
> > > > the 0.5-factored MD). And after removing that, we'd get
> > > > something close to what Matt Mathis is suggesting, which then no
> > > > longer depends on ECN to work (though I think one could
> > > > certainly avoid a lot of loss recovery by signalling early with
> > > > ECN).
> > > >
> > > > --
> > > > i.
> >
> > --
> > i.
>
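[Illustrative aside, not from the original mails: the point about the
0.5-factored multiplicative decrease can be made concrete with an
idealized AIMD sawtooth - additive increase of one segment per RTT and
a congestion signal at a fixed window w_max. The numbers and the
`avg_window` helper are made up for illustration, not a model of any
real stack.]

```python
# Idealized AIMD sawtooth: additive increase of 1 segment per RTT,
# multiplicative decrease by factor beta on each congestion signal.
def avg_window(beta, w_max=100, rtts=10000):
    """Mean window over many RTTs for a given MD factor beta."""
    w, total = w_max * beta, 0.0
    for _ in range(rtts):
        total += w
        w += 1.0                  # additive increase
        if w >= w_max:            # loss/ECN signal at the buffer limit
            w = w_max * beta      # multiplicative decrease
    return total / rtts

# beta = 0.5 (Reno-like) averages ~75% of the peak window; a gentler
# decrease keeps the average window much closer to the peak, which is
# what a knee-seeking scheme would want.
print(avg_window(0.5), avg_window(0.875))
```

With beta = 0.5 the window oscillates between w_max/2 and w_max, so a
large fraction of the achievable rate is given up after every signal;
that is the factor the mail says ECN alone does not remove.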