Re: [re-ECN] implementations
"Don Bowman" <don@sandvine.com> Mon, 26 October 2009 13:21 UTC
Date: Mon, 26 Oct 2009 09:21:43 -0400
From: "Don Bowman" <don@sandvine.com>
To: "Bob Briscoe" <rbriscoe@jungle.bt.co.uk>
Cc: Matt Mathis <matt.mathis@gmail.com>, re-ecn@ietf.org
Subject: Re: [re-ECN] implementations

From: Bob Briscoe [mailto:rbriscoe@jungle.bt.co.uk]
> At 19:41 23/10/2009, Don Bowman wrote:
> >From: Matt Mathis [mailto:matt.mathis@gmail.com]

... snip ...

> >
> >Sandvine would therefore define congestion on a per-application
> >class basis as the variability in delay or packet loss beyond what
> >the application can withstand without the user noticing. The
> >Wikipedia article @ http://en.wikipedia.org/wiki/Congestion_collapse
> >provides a good description based on the assumption that all
> >applications perceive congestion the same.
>
> [BB] We need to unpick two separate aspects: an app both i) suffers
> from the effects of congestion and ii) contributes to that congestion:
>
> == i) Sensitivity to congestion ==
> The different sensitivities of different apps to a common congestion
> metric are a property of the app, not the network - they are not
> reasons to re-define congestion itself. Once you have an objective
> congestion measure, the response to it can be down to each app (see
> below for the special case of an app-specific middlebox).
>
> == ii) Contribution to congestion ==
> Different apps cause different amounts of congestion per data sent,
> by responding more sensitively (LEDBAT) or less (unresponsive UDP),
> or somewhere in the middle (TCP). And apps on shorter RTTs cause less
> congestion for the same data, because they can get out the way more
> quickly.
> All these differences are choices of the app, not anyone else. So
> it's right to define an objective measure of congestion independent
> of these things. If an app chooses to use a long RTT path thereby
> causing more congestion, making it accountable for its contribution
> to congestion is still the right thing to do.

I'm not sure I would define UDP as an application. An application is
e.g. video streaming. Some applications have a natural pacing
mechanism (e.g. video, VoIP, IM, email), others do not (e.g. file
transfer).
Applications carry a user expectation, and an ability to deal with
congestion. Some (e.g. the paced ones) deal with the jitter aspect of
congestion by increasing their pre-buffer. The effect on the user
experience (longer channel change time) is better than the alternative
(glitches or freezes). So although congestion is
application-independent, the effect of it is not.

> >
> >Now, for use of TCP packet loss.
> >Since there is more than one TCP stack implementation in use, and
> >they vary in aggressiveness of ramp up/down, and since the network
> >is shared by so many sessions, and since congestion comes at many
> >hops along the way but TCP packet loss @ one location doesn't show
> >where it occurred, you have a blended mix. The loss may be in the
> >wifi in the home, in the home gateway, in the dsl modem, in the
> >dslam, in the bras, atm aggregation, aggregation routers, core
> >routers, peering, transit, etc. It may be due to the user's own
> >traffic, due to unreliable RF signals, due to other users traffic,
> >due to network switchover, etc.
>
> [BB] Yup.
>
> >So when we modelled TCP packet loss and correlated it against
> >congestion measured as % of link utilisation @ each choke point, and
> >eyeballed the quality of some key applications, we found the
> >correlation to be not that great.
>
> [BB] I'm not exactly sure what the second correlation you mention is
> between. I'm assuming quality:loss is somehow compared to
> quality:utilisation. This might mean something if ':' means
> division, but I'm not sure what it means if ':' is the function 'to
> eyeball'.

Well, eyeball may have been a somewhat imprecise term here :)

Our equipment is able to measure VoIP QoE (MOS). We do this by locking
a local clock to the RTP stream clock, and also looking @ the RTP
sequence counter for detecting loss. In this fashion we can see
latency/loss/jitter (latency comes from RTCP-XR from the endpoints).
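
As an illustration of the measurement just described, here is a minimal
sketch of detecting loss from the RTP sequence counter and estimating
jitter against the RTP stream clock. It is not Sandvine's
implementation: the function name, the input tuple layout and the
default 8000 Hz clock rate are assumptions made for the example, the
jitter estimator is the interarrival-jitter filter from RFC 3550, and
32-bit RTP timestamp wrap is ignored for brevity.

```python
def rtp_loss_and_jitter(packets, clock_rate=8000):
    """Estimate loss and jitter for one RTP stream.

    packets: iterable of (seq, rtp_timestamp, arrival_seconds) tuples in
             arrival order (hypothetical input format for this sketch).
    clock_rate: RTP timestamp units per second (assumed 8000 by default).
    """
    received = 0
    base = None        # first sequence number seen
    ext_high = None    # highest sequence number seen, extended past 16 bits
    jitter = 0.0       # running jitter estimate, in RTP timestamp units
    last_transit = None

    for seq, rtp_ts, arrival in packets:
        received += 1
        if ext_high is None:
            base = ext_high = seq
        else:
            # 16-bit modular distance: small values mean "ahead of highest",
            # large values mean a reordered or duplicate packet.
            delta = (seq - ext_high) & 0xFFFF
            if delta < 0x8000:
                ext_high += delta

        # Transit time: local arrival clock minus the sender's RTP clock.
        # Only differences of transit matter, so the clock offset cancels.
        transit = arrival * clock_rate - rtp_ts
        if last_transit is not None:
            d = abs(transit - last_transit)
            jitter += (d - jitter) / 16.0   # RFC 3550 smoothing filter
        last_transit = transit

    if ext_high is None:
        return {"expected": 0, "received": 0, "lost": 0,
                "loss_fraction": 0.0, "jitter_s": 0.0}

    expected = ext_high - base + 1
    lost = expected - received
    return {"expected": expected, "received": received, "lost": lost,
            "loss_fraction": lost / expected,
            "jitter_s": jitter / clock_rate}
```

Sequence gaps give the loss count even across the 16-bit wrap, and the
jitter figure needs no synchronised clocks because it depends only on
changes in transit time.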
We looked @ the correlation between TCP packet loss and MOS for VoIP.
The correlation was low (~0.1). This is in a heavily mixed network w/
hundreds of thousands of consumer connections. We inferred there was
congestion @ some times because the MOS varied as a function of the
hour. We assumed the congestion was in the network we were in because
some of the VoIP providers were directly peered, and we assumed they
were uncongested (no proof of the assumption).

In a similar fashion, we correlated access round-trip time, measured
as the delta between the SYN-ACK sent towards the subscriber and the
returning ACK. We used this time as a proxy for congestion since it
correlated better (~0.7).

> >
> >A related question relates to the use of policing. If i put a 10Mbps
> >policer on a network that is capable of 100Mbps, I do not see 90Mbps
> >of 'drops' from the policer, i typically see 0. Therefore i cannot
> >infer what the bandwidth need would be if i remove the policer (it
> >might still be 10, it might go to 100). Thus the packet loss cannot
> >be used to infer the bandwidth desired.
>
> [BB] There's two separate issues I have with this:
>
> 1/ Whether loss is due to policy or to congestion, bandwidth desired
> isn't relevant to how much harm one user contributes to others.
> That's the bandwidth they actually make do with, weighted by the
> congestion when they use it (ie their contribution to congestion).

But it does relate to today's problem of 'how much capacity is truly
desired'. And it remains one of the common myths in the minds of our
customers, and of the end users of networks. It's the Heisenberg
uncertainty principle applied to internet bandwidth :)

There are some marked non-linearities. Anecdotal evidence that I have:
networks that are too congested to support streaming have no
streaming. When you start to fix that congestion, the bandwidth
demanded actually goes up sharply as people start to be able to use
apps they didn't before.
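
For reference, the access round-trip-time proxy and the correlation
figures mentioned earlier in this message can be sketched as below.
This is an illustration only: the function names are hypothetical, the
message does not say which correlation statistic was used (Pearson's r
is assumed here), and no measured data is reproduced.

```python
import math


def access_rtt(synack_time_s, ack_time_s):
    """Access-side RTT: time from the SYN-ACK passing towards the
    subscriber until the subscriber's ACK returns (both in seconds)."""
    return ack_time_s - synack_time_s


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)
```

One would feed `pearson` paired per-interval samples, for example the
mean access RTT and the mean VoIP MOS in each hour, and compare the
magnitude of the result with that obtained for TCP loss against MOS.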

The most interesting cases are small island nations (Malta, Caribbean,
...): no local content, no peering, disproportionate cost of transit
to access network, ... great case studies, often fun to visit too :)

>
> 2/ If a policer is basing its decisions on contribution to
> congestion, it should have two separate internal processes:
> - measuring the congestion a user's traffic causes
> - dropping traffic when policy is exceeded

I agree.
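
The two-process split in point 2/ might be sketched as follows. This
is one possible reading, not a description of any deployed policer:
the class name, the token-bucket form of the policy, the byte-based
congestion cost and the `congestion_marked` flag are all assumptions
made for the example.

```python
class CongestionPolicer:
    """Per-user policer with separate metering and enforcement steps."""

    def __init__(self, allowance_bytes_per_s, burst_bytes):
        self.rate = allowance_bytes_per_s   # congestion allowance fill rate
        self.burst = burst_bytes            # maximum stored allowance
        self.tokens = float(burst_bytes)
        self.last_time = None
        self.congestion_bytes = 0           # running meter (process 1)

    def measure(self, size_bytes, congestion_marked):
        """Process 1: account for the congestion this packet caused.
        Here a packet counts its full size if it carries a congestion
        mark, and nothing otherwise (an assumed cost model)."""
        cost = size_bytes if congestion_marked else 0
        self.congestion_bytes += cost
        return cost

    def enforce(self, now_s, cost):
        """Process 2: refill the allowance, debit the cost, and report
        whether the user is still within policy."""
        if self.last_time is not None:
            refill = (now_s - self.last_time) * self.rate
            self.tokens = min(self.burst, self.tokens + refill)
        self.last_time = now_s
        # Debt is allowed but bounded, so a user recovers in finite time.
        self.tokens = max(-self.burst, self.tokens - cost)
        return self.tokens >= 0

    def on_packet(self, now_s, size_bytes, congestion_marked):
        """Return True to forward the packet, False to drop it."""
        cost = self.measure(size_bytes, congestion_marked)
        return self.enforce(now_s, cost)
```

Keeping `measure` separate from `enforce` means the congestion meter
stays meaningful even if the dropping policy is changed or disabled.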