Re: [re-ECN] implementations
Bob Briscoe <rbriscoe@jungle.bt.co.uk> Mon, 26 October 2009 14:36 UTC
Message-Id: <200910261436.n9QEalKk001268@bagheera.jungle.bt.co.uk>
X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9
Date: Mon, 26 Oct 2009 14:36:47 +0000
To: "Don Bowman" <don@sandvine.com>
From: Bob Briscoe <rbriscoe@jungle.bt.co.uk>
In-Reply-To: <EB618291F3454E4DA10D152B9045C0170215EBA0@exchange-2.sandvine.com>
References: <4AD7A078.8000100@thinkingcat.com>
<EB618291F3454E4DA10D152B9045C0170215E753@exchange-2.sandvine.com>
<fc0ff13d0910231201kb611d4es2059713e3a5ebe3@mail.gmail.com>
<EB618291F3454E4DA10D152B9045C0170215EB31@exchange-2.sandvine.com>
<200910260916.n9Q9G6Et026065@bagheera.jungle.bt.co.uk>
<EB618291F3454E4DA10D152B9045C0170215EBA0@exchange-2.sandvine.com>
Cc: re-ecn@ietf.org
Subject: Re: [re-ECN] implementations

Don,

At 13:21 26/10/2009, Don Bowman wrote:
>From: Bob Briscoe [mailto:rbriscoe@jungle.bt.co.uk]
> > At 19:41 23/10/2009, Don Bowman wrote:
> > >From: Matt Mathis [mailto:matt.mathis@gmail.com]
> > ... snip ...
> > >
> > >Sandvine would therefore define congestion on a per-application
> > >class basis as the variability in delay or packet loss beyond what
> > >the application can withstand without the user noticing. The
> > >Wikipedia article @ http://en.wikipedia.org/wiki/Congestion_collapse
> > >provides a good description based on the assumption that all
> > >applications perceive congestion the same.
> >
> > [BB] We need to unpick two separate aspects: an app both i) suffers
> > from the effects of congestion and ii) contributes to that congestion:
> >
> > == i) Sensitivity to congestion ==
> > The different sensitivities of different apps to a common congestion
> > metric are a property of the app, not the network - they are not
> > reasons to re-define congestion itself. Once you have an objective
> > congestion measure, the response to it can be down to each app (see
> > below for the special case of an app-specific middlebox).
> >
> > == ii) Contribution to congestion ==
> > Different apps cause different amounts of congestion per data sent,
> > by responding more sensitively (LEDBAT) or less (unresponsive UDP),
> > or somewhere in the middle (TCP). And apps on shorter RTTs cause less
> > congestion for the same data, because they can get out the way more
> > quickly.
> > All these differences are choices of the app, not anyone else. So
> > it's right to define an objective measure of congestion independent
> > of these things. If an app chooses to use a long RTT path thereby
> > causing more congestion, making it accountable for its contribution
> > to congestion is still the right thing to do.
>
>I'm not sure i would define UDP as an application. An application
>is e.g. video streaming. Some applications have a natural pacing
>mechanism (e.g. video, voip, IM, email), others do not (e.g. file
>transfer).
>
>Applications have a user-expectation of them, and an ability to deal
>with congestion. Some (e.g. the paced ones) deal with the jitter
>aspect of congestion by increasing their pre-buffer. The effect
>on the user experience (longer channel change time) is better than
>the alternative (glitches or freezes). So although congestion
>is application-independent, the effect of it is not.

[BB] I deliberately said 'unresponsive UDP' cos I wouldn't define UDP
as an application either. I was just giving an example at the far end
of the spectrum of responsiveness (actually I've seen apps that are
inversely responsive - the more loss the more redundancy is sent).

> > >
> > >Now, for use of TCP packet loss.
> > >Since there is more than one TCP stack implementation in use, and
> > >they vary in aggressiveness of ramp up/down, and since the network
> > >is shared by so many sessions, and since congestion comes at many
> > >hops along the way but TCP packet loss @ one location doesn't show
> > >where it occurred, you have a blended mix. The loss may be in the
> > >wifi in the home, in the home gateway, in the dsl modem, in the
> > >dslam, in the bras, atm aggregation, aggregation routers, core
> > >routers, peering, transit, etc. It may be due to the user's own
> > >traffic, due to unreliable RF signals, due to other users' traffic,
> > >due to network switchover, etc.
> >
> > [BB] Yup.
> >
> > >So when we modelled TCP packet loss and correlated it against
> > >congestion measured as % of link utilisation @ each choke point, and
> > >eyeballed the quality of some key applications, we found the
> > >correlation to be not that great.
> >
> > [BB] I'm not exactly sure what the second correlation you mention is
> > between. I'm assuming quality:loss is somehow compared to
> > quality:utilisation. This might mean something if ':' means
> > division, but I'm not sure what it means if ':' is the function 'to
> > eyeball'.
>
>well, eyeball may have been a somewhat imprecise term here :)
>
>Our equipment is able to measure VoIP QoE (MOS). We do this by locking
>a local clock to the RTP stream clock, and also looking @ the RTP
>sequence counter for detecting loss. In this fashion we can
>see latency/loss/jitter (latency comes from RTCP-XR from the endpoints).
>As a correlation, we looked @ the correlation between TCP packet loss
>and MOS for VoIP. The correlation was low (~0.1). This is in a heavily
>mixed network w/ hundreds of thousands of consumer connections.
>We inferred there was congestion @ some times because the MOS varied
>as a function of the hour. We assumed the congestion was in the network
>we were in because some of the VoIP providers were directly peered,
>and we assumed they were uncongested (no proof of the assumption).
>
>In a similar fashion, we correlated access round-trip-time by
>looking @ the delta between SYN-ACK towards the subscriber, and ACK
>returning. We used this time as a proxy for congestion since it
>correlated better (~0.7).

[BB] Understood. But the point I was making is that we're not looking
for some metric that correlates with each application's QoS. We're
looking for a metric that represents potential harm by each user U on
'the most sensitive but unknown' application X that might be sharing
with U. If different apps are better at working round that harm, all
power to them.

Contribution to congestion (volume of congested bytes) might look like
a somewhat arbitrary choice, but it's very robust because it's
proportionate to the two factors involved, taken over a brief period
of time:
- Your volume: if you're sending twice as much as someone else, you
  get twice as much pressure to get out;
- Everyone else's volume: if the queue is being pushed twice as hard,
  it also doubles the pressure for you to get out of the way.

> > >
> > >A related question relates to the use of policing. If i put a 10Mbps
> > >policer on a network that is capable of 100Mbps, I do not see 90Mbps
> > >of 'drops' from the policer, i typically see 0. Therefore i cannot
> > >infer what the bandwidth need would be if i remove the policer (it
> > >might still be 10, it might go to 100). Thus the packet loss cannot
> > >be used to infer the bandwidth desired.
> >
> > [BB] There's two separate issues I have with this:
> >
> > 1/ Whether loss is due to policy or to congestion, bandwidth desired
> > isn't relevant to how much harm one user contributes to others.
> > That's the bandwidth they actually make do with, weighted by the
> > congestion when they use it (ie their contribution to congestion).
>
>But it does relate to today's problem of 'how much capacity is truly
>desired'. And it remains one of the common myths in the minds of
>our customers, and of the end users of networks. It's the Heisenberg
>uncertainty principle applied to internet bandwidth :) There are some
>marked non-linearities. Anecdotal evidence that i have... networks
>that are too congested to support streaming have no streaming. When you
>start to fix that congestion, the bandwidth demanded actually goes
>up sharply as people start to be able to use apps they didn't before.
>The most interesting cases are small island nations (malta, Caribbean,
>...): no local content, no peering, disproportionate cost of transit to
>access network, ... great case studies, often fun to visit too :)

[BB] The proposed mechanism allows each user to push for the bandwidth
they want, if they are prepared to push others out of the way, drawing
down their congestion allowance in the process. That can't signal
demand for a new app that requires more bandwidth than physically
exists. But in more typical incremental cases it signals to the
operator that there is real demand (ie backed by cash).

I'm not claiming it's a panacea tho - it's not something that would
substitute for sending out marketing people to find out if there's
some cool new app that people would use and be willing to pay for 3x
as much bandwidth.

> >
> > 2/ If a policer is basing its decisions on contribution to
> > congestion, it should have two separate internal processes:
> > - measuring the congestion a user's traffic causes
> > - dropping traffic when policy is exceeded
>
>i agree.

[BB] Cheers

Bob

________________________________________________________________
Bob Briscoe, BT Innovate & Design
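
As a rough illustration of the policer with two separate internal processes discussed above, here is a minimal Python sketch. It is not taken from any re-ECN draft or implementation: the class name, the token-bucket allowance, and the idea of passing in a congestion level (e.g. the fraction of marked or dropped packets on the path) are all assumptions made for the example. Congestion volume is taken as bytes sent times the congestion level, so it scales with both the user's own volume and how hard everyone else is pushing the queue.

```python
from dataclasses import dataclass


@dataclass
class CongestionPolicer:
    """Hypothetical per-user policer keyed on congestion volume."""

    fill_rate: float          # allowance refill, congested bytes per second (assumed)
    bucket_depth: float       # largest allowance a user can save up (assumed)
    tokens: float = 0.0       # remaining allowance, set in __post_init__
    last_time: float = 0.0    # time of the previous enforcement decision
    congestion_volume: float = 0.0  # running total of congested bytes

    def __post_init__(self) -> None:
        self.tokens = self.bucket_depth  # start with a full allowance

    def measure(self, size_bytes: int, congestion_level: float) -> float:
        """Process 1: meter the congestion this packet contributes.

        congestion_level is assumed to be a fraction in [0, 1].
        All offered traffic is metered, whether or not it is later dropped.
        """
        congested = size_bytes * congestion_level
        self.congestion_volume += congested
        return congested

    def enforce(self, now: float, congested: float) -> bool:
        """Process 2: forward (True) or drop (False) against the allowance."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = now
        # Refill the allowance for the elapsed time, capped at the depth.
        self.tokens = min(self.bucket_depth,
                          self.tokens + elapsed * self.fill_rate)
        if congested <= self.tokens:
            self.tokens -= congested  # draw down the congestion allowance
            return True
        return False
```

A caller would invoke `measure()` for every packet and then hand the result to `enforce()`. Keeping the two apart means the meter still reports how much congestion a user's traffic causes even when no policy limit is configured.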