Re: [re-ECN] FW: ConEx BoF announcement text
Fred Baker <fred@cisco.com> Thu, 22 October 2009 22:51 UTC
Message-Id: <9423E875-E829-4F1A-ADBD-EAF7089FD615@cisco.com>
From: Fred Baker <fred@cisco.com>
To: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>
In-Reply-To: <Pine.LNX.4.64.0910222340490.20686@melkinkari.cs.Helsinki.FI>
Date: Thu, 22 Oct 2009 15:51:31 -0700
References: <4A916DBC72536E419A0BD955EDECEDEC0636399B@E03MVB1-UKBR.domain1.systemhost.net>
<4ADD187E.6000200@thinkingcat.com>
<200910221807.n9MI7P2a002071@bagheera.jungle.bt.co.uk>
<F437BD07-581B-4542-ABDB-ABABEDC3B8DD@cisco.com>
<Pine.LNX.4.64.0910222140130.20686@melkinkari.cs.Helsinki.FI>
<BA5FCA1F-0F30-42D2-8C3A-006B28B7D0E1@cisco.com>
<Pine.LNX.4.64.0910222340490.20686@melkinkari.cs.Helsinki.FI>
Cc: re-ECN unIETF list <re-ecn@ietf.org>
Subject: Re: [re-ECN] FW: ConEx BoF announcement text
:-) You started out by telling me that arbitrarily increasing the window
doesn't hurt anything - you quoted Bob saying that congestion isn't a
problem. I demonstrated that congestion out of control is a problem. Then
you complained that I didn't demonstrate that Cubic/Reno is a problem.
OK, let's talk about Cubic/Reno, and why BitTorrent has a working group
in the IETF (ledbat) about how to reduce the impact on ISPs and their
customers when they run seven Cubic/Reno TCPs in parallel. Actually,
could I get you to read the ledbat charter?
http://www.ietf.org/dyn/wg/charter/ledbat-charter.html

On Oct 22, 2009, at 3:30 PM, Ilpo Järvinen wrote:

> On Thu, 22 Oct 2009, Fred Baker wrote:
>
>> well, we disagree on some points.
>
> I guess we agree on quite a lot, yes. However, I'm still missing the
> connection you made between Cubic/Reno and throughput reduction
> because of "the cliff". I oppose the universal statement you made
> earlier that the cliff is there if we keep pushing the network. E.g.,
> you earlier put Cubic among the "bad ones" who push the network, but
> your scenario below doesn't have anything that even remotely
> resembles Cubic/Reno - only something totally unresponsive (I'm
> assuming SACK-based recovery comes with this definition of Reno, and
> that router buffers are large enough to accommodate MD-madness
> "recovery"). ...That's the connection I want to fully tear down.
> Don't first put Cubic and Reno among the bad ones and then show an
> example with something completely different; it isn't very fair to
> blame something for another's faults and justify its "badness" that
> way. This is very important because this mis-connection is in many
> minds: that with enough congestion, throughput automatically drops
> dramatically (as in Jain's graph). As a result, we hear this
> "congestion is the problem" and "more aggressive means more
> congestion" type of attitude.
> Just to make sure, I didn't mean to say here that I think Cubic/Reno
> is perfect either.
>
> ...The rest is just some comments on your rather intimidating
> scenario :-).
>
>> I don't know that perfect agreement is necessary, and certainly not
>> on this thread. That said...
>>
>> Regarding behavior beyond the cliff, it might be instructive to take
>> the most extreme possible case - a dumbbell network and a really
>> obnoxious file transfer algorithm.
>>
>>   +-+ 10*N +-+ 10*N +-+  N  +-+ 10*N +-+
>>   |A|------|B|------|C|-----|D|------|E|
>>   +-+      +-+      +-+     +-+      +-+
>>        10*N |        | 10*N
>>           +--+     +--+
>>           |F |     |G |
>>           +--+     +--+
>>
>> Presume that the capacity from C to D is N, and all other capacities
>> are 10*N. Presume that some application at A wants to send something
>> really big to a peer at E, and some other application at F wants to
>> send something else to a peer at G.
>>
>> Presume also that the way file transfers work is that the sender puts
>> appropriate headers on all of the indicated packets, queues them up
>> to its link layer chip, and spews them as fast as it can. If it's a
>> 100 gigabyte file, it queues up 100 gigabytes and watches it heat the
>> "wires". The peer receives what it receives and tells the sender what
>> it didn't. The sender now repeats the exercise with anything his peer
>> didn't receive.
>>
>> BTW, that is in fact how a file transfer done in a certain research
>> environment on very noisy lines is proposed to work. I didn't just
>> dream that up.
>>
>> Obviously, links A-B, B-C, F-B, and C-D will be fully utilized. Since
>> B-C has less (half) capacity than A-B + F-B, each of those flows will
>> drop about half of its traffic. The transfer rate F-G should be
>> around 5*N. C-D will also drop 80% of the remaining traffic,
>> resulting in an arrival rate at E of N. A will go through at least
>> ten iterations of its transfer, and F at least two.
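[The arithmetic in the scenario above can be sketched as a toy model in
Python. This is an illustrative sketch only: the `share` helper and the
assumption that each saturated link drops the excess proportionally
across the flows crossing it are editorial inventions, with rates in
multiples of N as in the diagram.]

```python
def share(offered, capacity):
    """Scale each flow's offered rate down proportionally if the link
    is over capacity; otherwise pass everything through unchanged."""
    total = sum(offered.values())
    if total <= capacity:
        return dict(offered)
    return {flow: rate * capacity / total for flow, rate in offered.items()}

# A->E and F->G both enter link B-C (capacity 10*N) at 10*N each,
# so each flow loses about half its traffic there.
after_bc = share({"A-E": 10, "F-G": 10}, capacity=10)

# F-G exits via C-G (10*N, uncongested); A-E must cross the C-D
# bottleneck (capacity N), which drops 80% of what remains.
after_cd = share({"A-E": after_bc["A-E"]}, capacity=1)

print(after_bc)   # {'A-E': 5.0, 'F-G': 5.0}
print(after_cd)   # {'A-E': 1.0}
# E receives N of the 10*N that A transmits, so A needs ~10 passes;
# G receives 5*N of 10*N, so F needs ~2 - matching the text above.
```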
>>
>> The first down side is that while A-B and F-B are fully utilized,
>> most of that capacity is wasted.
>
> Yes, that's the nature of the unresponsiveness you added yourself,
> and now you say that it is a down side?!? ...Nobody (including the
> network) can prevent an application from doing stupid things like
> wasting the resources of the first link :-). I understand, though,
> that you could construct a more complex topology where it wouldn't be
> the first link.
>
> Instead of tackling that possibility, I take another extreme example
> to stress my point: an application could cause the very same symptoms
> as the classical congestion collapse (as per Nagle) just by resending
> the very same segment even though it made it through (and an ACK was
> received); multiple copies are then in the network and the throughput
> of that particular flow will be dramatically affected. But that is
> just plain stupid behavior which no network can prevent you from
> doing. Of course, nobody else has to be part of that experiment, and
> they can operate with good throughputs. In a thought experiment we
> could put a TCP implementation with SACK and lots of receiver window
> into the age of Nagle and show that it keeps operating with very good
> throughput regardless of what the stupid senders do, as long as the
> transfer isn't unlucky enough to run out of data to send while
> recovering.
>
> ...Secondly, I don't find this non-optimal use of excess capacity to
> be the reason why one would need/want re-ECN.
>
> Another stupidity of the application/protocol in this example is the
> feedback scheme, which exaggerates anomalies near the end of a
> transfer.
>
>> This impacts F's ability to deliver traffic to G; it should have
>> achieved a 90% throughput rate (the rate C-G less the rate C-D), and
>> in fact achieved 50% or less.
>> With the right mechanism (ack clocking comes to mind, and there are
>> other approaches), the capacity used by the file transfer on A-B is
>> only 10% used and there is at most a 10% impact on the file transfer
>> F-G.
>
> ...Right, with responsiveness everything changes. And I guess we then
> pretty much agree.
>
>> The second down-side is that the traffic arriving at G and E will be
>> scattered all through the respective files, meaning that they have
>> to store the bits and pieces and do a fair bit of searching and
>> sorting to determine what they have not yet received and to
>> reconstitute the file. This is a waste both of memory and CPU
>> resources.
>
> This has nothing to do with the throughput reduction anymore, don't
> you think? Also, to me these "extras" again seem to result from the
> stupid behavior you introduced rather than from anything very close
> to what I was arguing against.
>
> But in general I agree with you that loss recovery, even without any
> throughput loss, introduces these space constraints (though far less
> severely than in your extreme example), and one would want to avoid
> that if at all possible. But IMHO that again has rather little to do
> with "the cliff" and the throughput reduction (that supposedly
> happens because of it).
>
>> Voila, we come back pretty quickly to wanting the transmission rate
>> of A to approximate the reception rate at E and the transmission
>> rate at F to approximate the reception rate at G, and wanting the
>> window to not be excessively large. An excessively large window
>> wastes bandwidth before the bottleneck and potentially creates
>> unnecessary bottlenecks, impacts other file transfers, imposes a
>> memory/algorithm burden on the receiver, and for all of those costs
>> accomplishes nothing that a smaller window would not have
>> accomplished better.
>
> Agreed.
> ...Noted, though, that you have for some reason cleverly softened
> "reduces throughput" into "_potentially_ creates unnecessary
> bottlenecks" (emphasis mine) here, which I can certainly agree with
> much more ;-).
>
> ...In fact, I wouldn't even choke on your example here if it were
> given in a less extreme form, as we have CBR-type non-responsiveness
> present for real too, and thus dead packets wasting capacity (without
> ECN) and so on.
>
>> A window that is too large is a bad thing, and the congestion it
>> causes is a bad thing. Not that the presence of congestion is bad -
>> it is, as Bob says, a natural side effect of what happens. But
>> congestion that is poorly managed by the endpoint and/or the network
>> is a very bad thing. re-ECN does in fact try to give the sender an
>> incentive to approximate the knee (the smallest window that achieves
>> the potential throughput rate), and increasing the window to or
>> beyond the cliff by definition doesn't increase its rate materially
>> beyond that point.
>
> Agreed. Like you pointed out earlier, there's no increase in
> throughput beyond that point no matter what trick one tries to play,
> and I agree. My point was just to say that with responsive transfers
> there isn't (always) such a magic cliff; the throughput would just
> keep to that constant line even if drops start to occur. And
> especially mentioning Cubic/Reno, which are responsive, in this
> context of "cliff" seemed very misleading.
>
> --
> i.
>
>
>> On Oct 22, 2009, at 12:36 PM, Ilpo Järvinen wrote:
>>
>>> On Thu, 22 Oct 2009, Fred Baker wrote:
>>>
>>>> I'll argue something slightly different. It comes to the same
>>>> endpoint, but through a different observation. Some of us - you,
>>>> Bob, especially - are very familiar with this; others are perhaps
>>>> less so. Bear with me.
>>>>
>>>> Jain's 1994 patent on congestion control defines two terms: the
>>>> "cliff" and the "knee" of a throughput curve.
>>>>
>>>>    |
>>>>  T |             available capacity
>>>>  h |  ----------------------------------------
>>>>  r |          +---+
>>>>  u |     knee/     \cliff
>>>>  p |        /       \
>>>>  u |       /         \
>>>>  t |      /           \
>>>>    |     /             \
>>>>    |    /               \
>>>>    |   /                 \
>>>>    |  /                   ----------------
>>>>    +----------------------------------------
>>>>                  Window --->
>>>>
>>>> As the TCP/SCTP window increases from zero to some value,
>>>> throughput also increases. That stops when the available capacity
>>>> has been consumed; at that point, even if the window grows,
>>>> throughput does not. What increases instead is RTT, because a
>>>> queue grows.
>>>
>>> Up to this point I agree with this paragraph...
>>>
>>>> If the window continues to increase, at some point the queue
>>>> starts dropping traffic, and throughput degrades.
>>>
>>> ...However, this is not a true claim. As long as no unnecessary
>>> work is done at any bottleneck (and we don't have a shared resource
>>> such as a wireless channel), throughput does _not_ degrade (at
>>> all!), even though many keep asserting that without much of a
>>> thought nowadays. Of course, non-infinitely-long flows would
>>> introduce some anomalies to the calculations, which one could use
>>> to show that this claim is "true", but I think that's hardly what
>>> people usually mean when they claim that dropping implies
>>> throughput degradation.
>>>
>>> Somewhat related, for some reason there's a very common
>>> misconception that drops imply poor performance. That is often
>>> stated as a fact without looking deeper into what the real cause of
>>> the sub-optimal performance was; often it turns out to be something
>>> else when examined closely enough. I'd say that very often the
>>> losses were, in fact, innocent. Quite often the blame would fall on
>>> the constant-factor MD (if one is honest enough to admit it).
>>>
>>> Btw, your graph is not accurate (wrt.
>>> Jain); he draws a much sharper drop than you do at his "cliff" (so
>>> with ASCII you'd have to use | at first to match him). I have some
>>> doubts about the validity of that sharp angle too, so I find your
>>> graph much more sensible (that is, as long as the resource is
>>> shared).
>>>
>>>> By definition, the window at the "knee" is smaller than the
>>>> window at the "cliff", but throughput is the same.
>>>>
>>>> Common TCP congestion control algorithms such as Reno and Cubic
>>>> tune to the cliff. That has the upside of maximizing throughput;
>>>> it has the downside that it is abusive to other applications such
>>>> as voice/video (increased and variable RTT, and induced loss),
>>>> and to other TCP sessions. #include <discussion of bittorent and
>>>> why other users are negatively impacted by it>
>>>
>>> I think Bob doesn't agree with you here; he isn't saying that
>>> congestion is something bad but nearly the opposite. Bob's own
>>> words:
>>>
>>> "It would contradict this to say congestion is a problem - it's
>>> not - it's healthy and natural in a data network."
>>>
>>> ...It's just that if you keep doing that too much (beyond the
>>> congestion volume allocated for you), you would "suffer" (if I've
>>> understood him right). ...So it has very little to do with
>>> operating at the "cliff" or "knee" on short timescales (unless of
>>> course you've already run out of your quota).
>>>
>>>> Other TCP congestion control algorithms based on ECN, CalTech
>>>> FAST, etc., try to tune to the knee. This gets them exactly the
>>>> same throughput, including support of very high rate
>>>> applications, but with a smaller window value and therefore lower
>>>> queue depth and lower probability of loss - both for themselves
>>>> and their competitors.
>>>> It does so without negatively, or at least AS negatively,
>>>> impacting the applications they compete with, beyond seeking to
>>>> share the available capacity with them.
>>>
>>> I certainly agree with you here that low queuing delay is a
>>> desirable property too. ...However, I don't see how e.g. ECN alone
>>> could achieve low queuing delay and high throughput at the same
>>> time; you'd need something more than that (mainly to remove the
>>> 0.5-factor MD). And after removing that, we'd get something close
>>> to what Matt Mathis is suggesting, and that no longer depends on
>>> ECN to work (though I think one could certainly avoid a lot of
>>> loss recovery by signalling early with ECN).
>>>
>>> --
>>> i.
>>
>>
>
> --
> i.
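[The knee/cliff curve being argued over can be reproduced with a toy
single-bottleneck model. Whether throughput actually falls off a cliff
past the drop point depends entirely on whether dropped packets waste
bottleneck capacity on retransmissions, which is exactly the point in
dispute above. The function, its parameters, and the linear "waste"
model are illustrative assumptions, not anything from the thread.]

```python
def goodput(window, capacity=10.0, base_rtt=1.0, queue_limit=20.0,
            waste_per_drop=0.0):
    """Toy goodput-vs-window curve for one bottleneck link."""
    bdp = capacity * base_rtt            # bandwidth-delay product (knee)
    if window <= bdp:                    # below the knee: window-limited
        return window / base_rtt
    excess = window - bdp                # packets parked in the queue
    if excess <= queue_limit:            # knee..cliff: throughput is flat,
        return capacity                  # only RTT grows
    drops = excess - queue_limit         # queue overflow: capacity may be
    return max(0.0, capacity - waste_per_drop * drops)  # wasted on retx

# With no wasted work, throughput stays flat past the drop point
# (Ilpo's argument); with wasted work, it degrades (the classic cliff).
flat  = [goodput(w) for w in (5, 10, 25, 40)]
cliff = [goodput(w, waste_per_drop=0.5) for w in (5, 10, 25, 40)]
print(flat)    # [5.0, 10.0, 10.0, 10.0]
print(cliff)   # [5.0, 10.0, 10.0, 5.0]
```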