Re: [tsvwg] Review of draft-carlberg-tsvwg-ecn-reactions-03

Piers O'Hanlon <p.ohanlon@gmail.com> Tue, 23 October 2012 15:19 UTC

To: Bob Briscoe <bob.briscoe@bt.com>
Cc: Ken Carlberg <ken.carlberg@gmail.com>, tsvwg IETF list <tsvwg@ietf.org>

Hi Bob,

Thanks for the review. Comments inline.

On 19 Oct 2012, at 18:29, Bob Briscoe wrote:

> Ken & Piers,
> 
> I finally got round to reviewing this draft.
> 
> 1/ Where you refer to RFC6679 as more fine-grained information than TCP, you might want to refer to this one too, which the Vancouver TCPM meeting agreed to include in the TCPM charter:
> <http://tools.ietf.org/html/draft-kuehlewind-tcpm-accecn-reqs-00.txt>
> 
> There are currently two possible proposals to satisfy these requirements:
> <http://tools.ietf.org/html/draft-kuehlewind-tcpm-accurate-ecn-01.txt>
> <http://tools.ietf.org/html/draft-kuehlewind-tcpm-accurate-ecn-option-01.txt>
> 
Sure.

> S.3.1
> "  TFRC ... is seeing ... deployment in .... Empathy/Farsight, and GoogleTalk [googl].
> 
>   However it should be noted that TFRC is only recommended for real-time
>   media use with ECN response. TFRC is not recommended for non-ECN paths
>   due to its loss based operation which leads to full queues with
>   maximised latencies.
> "
> 
> Are you saying Empathy/Farsight and GoogleTalk only recommend TFRC if with ECN, or is this document doing the recommending?
> 
We're just saying that TFRC has seen some [partial] deployment in Empathy/Farsight and GoogleTalk. We suggested in the draft that if you're going to use TFRC then you should use TFRC with ECN.


> "It is assumed that ECN markings will usually occur
>   with lower queue occupancy and thus lower latency. "
> 
> That can't be right. RFC3168 insists that an AQM signals ECN where it would otherwise have signalled a drop for a non-ECN packet. There is only one queue for them both, so the occupancy has to be the same for both to satisfy RFC3168.
> 
> The whole point of the RFC3168 requirement is to ensure that non-ECN cannot starve ECN traffic. If an AQM were to mark ECN traffic at a shorter queue length, then non-ECN traffic would drive up the (same) queue length to the larger non-ECN operating point, and the algo would be marking ECN traffic with a lot higher probability than non-ECN. ECN would then starve itself given RFC3168 insists an ECN transport is meant to react to a mark as if it were a drop.
> 
> This assumption seems to be the justification for TFRC being appropriate for ECN, where it might not be for non-ECN. If the assumption isn't correct, where does that leave the main message of the document?
> 
Sure - that is the case when comparing flows with and without ECN at an ECN-enabled, RFC3168-compliant queue. However, it depends on whether the queue complies with RFC3168 - there are a number of alternative approaches that use ECN but are not compliant with RFC3168, such as DCTCP and others. Also, there is quite a bit of leeway in the definition of the AQM behaviour in RFC3168. But if one is comparing an RFC3168 queue using ECN against a drop-tail queue, then the ECN/RFC3168 queue is generally more favourable to real-time flows due to the probabilistic nature of the marking/dropping.

But to be fair, you have a point in the case of a direct comparison of RFC3168 ECN or drop. I think that the more important aspect of using ECN is that it provides for back-off before loss occurs - allowing flows to adapt earlier and minimise loss.
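
To make the RFC3168 point concrete, here is a minimal sketch of a single shared queue that signals congestion with one probability and only varies the form of the signal. It assumes a RED-style linear ramp; the thresholds, max_p and the function names are illustrative assumptions and do not come from the draft or from RFC3168. It also simplifies by marking rather than dropping ECN-capable packets above the upper threshold.

```python
import random

def congestion_probability(avg_queue, min_th=5.0, max_th=15.0, max_p=0.1):
    """RED-style linear ramp on average queue length (illustrative values)."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)

def aqm_action(avg_queue, ecn_capable, rng=random):
    """Return 'forward', 'mark' or 'drop' for one arriving packet."""
    p = congestion_probability(avg_queue)
    if rng.random() >= p:
        return "forward"
    # One queue, one probability: ECN-capable packets are marked exactly
    # where non-ECN packets would have been dropped.
    return "mark" if ecn_capable else "drop"
```

With the same queue length, both kinds of packet are signalled equally often; a queue that marked ECN traffic at a lower threshold would be outside this model.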

> S.5.2 Fault Tolerance
> I don't see why either redundant (duplicate) transmission or FEC are relevant to a document about using ECN, as opposed to loss, for congestion signalling.
> 
Well one could take a variety of actions on the reception of ECN (or loss) - the 3GPP specs suggest some interesting things... We've mentioned them as they're one of many possibilities.

> S.5.3
> I wholly agree that flows that consider themselves important could use a substantially less aggressive back-off. But complete exemption is of great concern to me. Therein lies a congestion collapse. There is no need for complete exemption, if a flow can respond very little to congestion. At least then if congestion gets really bad, it will still respond a lot. But if congestion is not so bad, it will seem pretty much unresponsive.
> 
True - complete exemption probably should not be used in general.

> I know you report experiments where ECN exemption has limited effect on normal flows, but the experiments assume low numbers of exempt flows relative to non-exempt. Each individual flow cannot know whether that assumption holds, so making this assumption is a recipe for boiling a network during an emergency - just when we need it most.
> 
The simulations are of flows that have finite rate limits - so the exempt flows can take all the bandwidth they need, leaving the rest to be shared amongst the non-exempt flows.

> S.5.3
> "  This applies to all
>   flows that utilize some form of the rate control that is inversely
>   proportional to the loss rate, which includes TCP-like algorithms or
>   equation-based approaches.
> "
> s/inversely proportional to/inverse in some degree to/
> 
Sure - we should add some 'approximate' qualification to that statement.

> "Inversely proportional to" means rate = k/p,
> where p is loss probability and k is a constant
> TCP New Reno    rate = k/p^0.5
> TFRC            rate = k/p^0.5
> TCP Cubic       rate = k/p^0.75
> Compound TCP    rate = k/p^0.8
> None have       rate = k/p
> 
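
As a rough illustration of the exponents listed above, the sketch below evaluates rate = k / p^b for each entry and shows how much the rate falls when p quadruples. The exponents are the ones quoted in the review; the function names, k = 1 and the sample probabilities 0.01 and 0.04 are assumptions chosen for illustration.

```python
# Exponents b in rate = k / p**b, as listed in the review comment above.
EXPONENTS = {
    "inversely proportional": 1.0,
    "TCP New Reno": 0.5,
    "TFRC": 0.5,
    "TCP Cubic": 0.75,
    "Compound TCP": 0.8,
}

def rate(p, b, k=1.0):
    """Sending rate for loss/mark probability p under rate = k / p**b."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    return k / p ** b

def slowdown(p_low, p_high, b):
    """Factor by which the rate falls when p rises from p_low to p_high."""
    return rate(p_low, b) / rate(p_high, b)

if __name__ == "__main__":
    for name, b in EXPONENTS.items():
        factor = slowdown(0.01, 0.04, b)
        print(f"{name:24s} b={b:<5} rate falls by x{factor:.2f}")
```

Only the strictly inverse-proportional case cuts the rate by the full factor of 4 when p quadruples; with b = 0.5 the rate only halves, which is why "inversely proportional" overstates the response.
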
> S.5.3. penultimate para:
> I don't see how the normal flows can have a different latency from the exempt flows. If they all share the same queue, they all see the same latency at any one time. Certainly the average latency may be different for long-running flows that are present during times when shorter flows are there and when they are not. But the _instantaneous_ latency will be the same for any flows sharing a single queue.
> 
Yes, the delay incurred on the shared bottleneck queue is the same for all flows. However, the relatively 'higher delay' seen by the exempt flows is actually mostly in the sender node's transmit queue (which is the normal situation in this general scenario). That is where queuing should be occurring most of the time, as the access link (1Mb/s) is slower than the backhaul link (100Mb/s) - it is in effect the bottleneck queue. So higher delay is actually the norm, and only when ECN marking kicks in on the backhaul link does it lead to reduced delays for those flows that respond to the marking - they reduce their rate, and thus their delay. But the exempt flows that don't react to the marking continue with their normal higher delay. The queue on the backhaul link does build once the combined rate of the nodes exceeds the backhaul capacity, but as the other nodes are ECN enabled they reduce their rate so that the queue doesn't build beyond the [RED] marking thresholds.
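
As a back-of-envelope check on why the sender-side queue dominates, the sketch below computes the time needed to drain the same backlog at the two link rates mentioned above. The 1500-byte packet size and the 10-packet backlog are assumptions for illustration only and are not taken from the simulations.

```python
ACCESS_BPS = 1_000_000      # 1 Mb/s access link, as in the scenario above
BACKHAUL_BPS = 100_000_000  # 100 Mb/s backhaul link, as in the scenario above

def queuing_delay_ms(backlog_packets, link_bps, packet_bytes=1500):
    """Time in milliseconds to drain a backlog of equal-sized packets."""
    if link_bps <= 0:
        raise ValueError("link_bps must be positive")
    return backlog_packets * packet_bytes * 8 / link_bps * 1000.0

if __name__ == "__main__":
    backlog = 10  # hypothetical backlog, in packets
    print("access  :", queuing_delay_ms(backlog, ACCESS_BPS), "ms")
    print("backhaul:", queuing_delay_ms(backlog, BACKHAUL_BPS), "ms")
```

Under these assumptions the same backlog costs roughly 120 ms on the access link and roughly 1.2 ms on the backhaul, so a flow that keeps its transmit queue full sees most of its delay at the sender.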


> 
> Nits:
> S.3. s/to quickly/too quickly/
> 
Thanks,

Piers.

> 
> Bob
> 
> 
> ________________________________________________________________
> Bob Briscoe,                                BT Innovate & Design