Re: [re-ECN] What do we mean by "Congestion"

"Don Bowman" <don@sandvine.com> Mon, 26 October 2009 17:45 UTC

From: "Don Bowman" <don@sandvine.com>
To: "John Leslie" <john@jlc.net>
Cc: re-ecn@ietf.org
Subject: Re: [re-ECN] What do we mean by "Congestion"

From: John Leslie [mailto:john@jlc.net]
> Don Bowman <don@sandvine.com> wrote:
> > From: Bob Briscoe [mailto:rbriscoe@jungle.bt.co.uk]
> >
> >> [BB] We need to unpick two separate aspects: an app both i) suffers
> >> from the effects of congestion and ii) contributes to that
> >> congestion:
> >>...
> >> All these differences are choices of the app, not anyone else. So
> >> it's right to define an objective measure of congestion independent
> >> of these things. If an app chooses to use a long RTT path thereby
> >> causing more congestion, making it accountable for its contribution
> >> to congestion is still the right thing to do.
> 
>    I'm afraid we do not all mean the same thing by "congestion". :^(
> 
>    While I've long since abandoned hope of getting everyone on the
> same page, I believe it would help if we could keep it clear what
> individual posters mean when they use that term.
> 
>    "Congestion" to historic TCP is packets lost, period.
> 
>    "Congestion" to file-transfer is anything that slows the transfer.
> 
>    "Congestion" to video-streaming is how much delay you need to
> compensate for jitter and retransmissions.
> 
>    "Congestion" to VoIP is how much degradation the listener gets.
> 
>    "Congestion" to Bob is -- well, I don't know exactly: perhaps Bob
> could enlighten us...

I would define congestion as the point @ which trying harder achieves no
more: diminishing returns.
I would then define the effect of that in terms of the user's expectation
of experience.


 ...

> 
> > Our equipment is able to measure VoIP QoE (MOS).
> 
>    Acronym-Alert!
VoIP == voice over IP
QoE == quality of experience
MOS == mean opinion score, an ITU standard
(ITU == International Telecommunication Union)

> 
> > We do this by locking a local clock to the RTP stream clock, and
> > also looking @ the RTP sequence counter for detecting loss. In
> > this fashion we can see latency/loss/jitter (latency comes from
> > RTCP-XR from the endpoints).
> 
>    I'll guess Don is talking about video streaming (which IMHO is
> insensitive to latency below several seconds). Thus increases in
> latency cover jitter nicely.

I was actually talking about voice. Video streaming today has largely
moved to RTMP transport, and does not use RTP.

Latency in video streaming affects channel change time. Several seconds
is not great to add on top of the video decode buffers (VBV). If the
latency is at all variable, some decoders do not cope well when the
initial latency differs from the long term latency.

> 
> > As a correlation, we looked @ the correlation between TCP packet
> > loss and MOS for VoIP.
> 
>    VoIP, of course, _is_ sensitive to latency. In my experience,
> latency over half a second borders on unacceptable.

The 3GPP standard for voice is 600ms of end to end delay as an absolute
maximum (cell to cell call). In practice, 300ms is a significant
irritant to the average person, since people tend to step on each other
when talking.

> 
> > The correlation was low (~0.1).
> 
>    Alas, I don't understand how Don measures "MOS". In VoIP, packet
> loss is perceived as noise-like, and much of VoIP is pretty noise
> tolerant.
> 
> > This is in a heavily mixed network w/ hundreds of thousands of
> > consumer connections.
> 
>    I'm left to guess what Don refers to here...

MOS is a standard (mean opinion score) used by the voice industry. There
are methods that model the effect of loss/latency/jitter/codec fidelity
on the end user. This is crunched into a number from 1 to 5. Some
operators express an SLA (service level agreement) in terms of MOS
(e.g. 95% of calls will have a MOS of 4.0 or better).
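
As an illustration of how such a model crunches impairments into a 1 to 5
number, here is a minimal sketch based on the ITU-T G.107 E-model. The
R-to-MOS mapping is the one published in G.107. The delay term is a commonly
used simplification rather than the full G.107 computation, and the defaults
(R0 = 93.2, Ie = 0 and Bpl = 25.1, roughly G.711 with loss concealment) are
assumptions chosen for the example. Jitter is not modelled separately here;
it would normally show up as extra jitter-buffer delay or late-packet loss.
This is not necessarily the method any particular product uses.

```python
def r_to_mos(r):
    """Map an E-model R-factor to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def r_factor(one_way_delay_ms, loss_pct, ie=0.0, bpl=25.1, r0=93.2):
    """Simplified R-factor: default R minus delay and loss impairments.

    ie and bpl are codec-dependent; the defaults are assumed values for
    G.711 with packet loss concealment and random (non-bursty) loss.
    """
    d = one_way_delay_ms
    # Simplified delay impairment: gentle slope, steeper past ~177 ms.
    delay_impairment = 0.024 * d
    if d > 177.3:
        delay_impairment += 0.11 * (d - 177.3)
    # Effective equipment impairment under random packet loss.
    loss_impairment = ie + (95.0 - ie) * loss_pct / (loss_pct + bpl)
    return r0 - delay_impairment - loss_impairment


def estimate_mos(one_way_delay_ms, loss_pct):
    """Convenience wrapper: impairments in, estimated MOS out."""
    return r_to_mos(r_factor(one_way_delay_ms, loss_pct))
```

With these assumptions, estimate_mos(300, 0) comes out lower than
estimate_mos(100, 0), matching the observation that delay alone degrades
the score even when no packets are lost.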

> 
> > We inferred there was congestion @ some times because the MOS
> > varied as a function of the hour.
> 
>    To be expected...
> 
> > We assumed the congestion was in the network we were in because
> > some of the VoIP providers were directly peered, and we assumed
> > they were uncongested (no proof of the assumption).
> 
>    Alas, such assurances, unless tested, may not be dependable...
> (And I'm not the least bit sure what Don means by "congestion".)
> 
> > In a similar fashion, we correlated access round-trip-time by
> > looking @ the delta between SYN-ACK towards the subscriber, and
> > ACK returning. We used this time as a proxy for congestion since
> > it correlated better (~0.7).
> 
>    A correlation of 0.7 may be the best we can hope for, but it's
> really not good enough.

I dunno, I thought it was sufficiently close to 1.0 to say 'mission
accomplished' :)

> 
>    Round-trip-time variations are probably correlated with some
> definitions of "congestion", and absolute RTTs are probably
> correlated with a number of things which can degrade the experience.
> Equating RTT with congestion sounds really funny, though...

We correlated jitter in RTT, not absolute RTT.
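
A rough sketch of that kind of calculation, for anyone who wants to try it
on their own data. The function names, the per-interval grouping, and the
choice of "mean absolute difference between consecutive samples" as the
jitter measure are all assumptions made for the example, not a description
of our implementation. Each RTT sample is the time between a SYN-ACK
heading towards the subscriber and the ACK coming back.

```python
from math import sqrt


def handshake_rtt(synack_ts, ack_ts):
    """Access RTT sample: ACK arrival time minus SYN-ACK departure time."""
    return ack_ts - synack_ts


def rtt_jitter(rtts):
    """Mean absolute difference between consecutive RTT samples."""
    if len(rtts) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(rtts, rtts[1:])]
    return sum(diffs) / len(diffs)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

In use, one would compute rtt_jitter() per time interval (say per hour),
pair each value with the MOS measured over the same interval, and pass the
two lists to pearson(). Note that with raw MOS the sign would be expected
to come out negative (more jitter, lower score), so a figure like 0.7
should be read as a magnitude.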

--don