Re: [re-ECN] Two questions about CONEX

Bob Briscoe <> Tue, 18 May 2010 09:25 UTC

Date: Tue, 18 May 2010 10:21:38 +0100
To: Fred Baker <>
From: Bob Briscoe <>


At 06:59 18/05/2010, Fred Baker wrote:

>On May 7, 2010, at 11:24 PM, Stewart Bryant wrote:
> > That causes me to wonder where CONEX fits in relative to IPFIX
> > which is a mechanism that is designed to monitor the flows in a
> > network and report this information to the network operator.
>That is a fairly complicated question.
>First, IPFIX is not at all lightweight. At the high end, to counter 
>that, it is often statistical - collecting data on all flows is 
>considered a lot of data and, if a reliable transport is used to 
>deliver it, potentially a large amount of memory dedicated to the 
>function. Setting or not setting a two bit field in the IP header is 
>something that any device that can decrement a TTL or hop limit can 
>easily do with effectively no additional overhead. If you're trying 
>to consistently manage high end users, which is what conex is 
>ultimately about, which can be implemented anywhere and gives you 
>full information?
>Second, IPFIX doesn't really measure the same thing as ECN does. 
>What ECN does is note that at the point in time that a datagram is 
>traversing a node, the node is or is not experiencing "congestion" 
>as defined locally. The definition for that class of traffic at that 
>point could be "integrated throughput rate exceeds <threshold>", 
>"instantaneous queue depth exceeds <threshold>", or "mean queue 
>depth exceeds <threshold>", and it might mean "marked with some 
>probability" or "marked if the case is true". Whatever its 
>definition, you can determine that the node in question saw 
>something it called "congestion". IPFIX tells you what went through 
>a system - between <time1> and <time2>, data flow <selector> passed 
>some number of bytes and packets. You can now feed that into a 
>simulator of some type, but at best it will give you an 
>approximation of the behavior of the link conditioned by your 
>assumptions, not "the behavior of the link".


As John Leslie said in his talk to the transport area in LA, network 
layer people use the word congestion as synonymous with utilisation. 
Transport layer people use congestion to mean queue growth.

Average utilisation isn't sufficient to determine how much each user 
contributes to congestion (queue growth) when some are running 
unresponsive transports and others are running highly yielding 
transports (e.g. LEDBAT).

* In ConEx, we're trying to find the product of bit-rate and queue 
congestion over time.
* IPFIX finds bit-rate over time.
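
To make the difference concrete, here is a minimal sketch. It assumes 
we already have, per user, a series of intervals each described by a 
bit-rate, a duration and the fraction of that user's traffic that was 
congestion-marked. The function names and the sample numbers are 
invented for illustration; they don't come from any ConEx or IPFIX 
spec.

```python
# Each sample is (bit_rate_bps, duration_s, marking_fraction).
# All names and numbers are illustrative assumptions.

def volume(samples):
    """Bit-rate integrated over time: roughly what a flow record gives you."""
    return sum(rate * dur for rate, dur, _ in samples)

def congestion_volume(samples):
    """Bit-rate times congestion level, integrated over time."""
    return sum(rate * dur * marked for rate, dur, marked in samples)

# An unresponsive sender keeps its rate up whatever the congestion level.
unresponsive = [(1_000_000, 1, 0.0), (1_000_000, 1, 0.05), (1_000_000, 1, 0.05)]
# A yielding sender (LEDBAT-like) sends fast when idle and backs off otherwise.
yielding = [(1_800_000, 1, 0.0), (600_000, 1, 0.05), (600_000, 1, 0.05)]

for name, s in (("unresponsive", unresponsive), ("yielding", yielding)):
    print(name, volume(s), congestion_volume(s))
```

Both users move the same number of bits, so a volume measurement 
can't tell them apart, but the yielding one contributes noticeably 
less to congestion.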

>Where I scratch my head with re-ecn is the impact of the 
>re-introduction of the flag stream. It looks to me like a control 
>system, which is presumed to be somewhere at a network edge, is 
>supposed to be scoreboarding marks against IP addresses or IP 
>prefixes, which sounds to me like the kind of overheads one sees in 
>XCP, RCP, or the like, but with storage and discrimination in some 
>form of IP address database. That worries me from a scaling perspective.

It's not intended to be like XCP/RCP. The fine control is intended to 
still be at the end-system not in the network equipment. The 
re-ECN/ConEx info merely allows the network to encourage all the 
end-system controls to balance their aggressiveness, to achieve a 
'fair' amount of harm to others over time.

The minimum a network needs to store is one number per directly 
connected customer (whether end-customer or neighbouring network). 
And there's no need for each router to do this. One traffic control 
device per connected 'customer' needs this info, and the packets 
bring this info to it.

That's surely the opposite of a scaling problem.
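
As a rough illustration of how little state that is, here is a 
minimal sketch of a per-customer policer. It assumes a token bucket 
that fills at a fixed congestion allowance and is drained only by 
packets that declare congestion. The class name, parameters and 
units are hypothetical, and the bucket level is the 'one number' 
(a simple implementation also keeps the time of the last update).

```python
class CongestionPolicer:
    """Hypothetical per-customer policer: one bucket level per customer."""

    def __init__(self, fill_rate, depth):
        self.fill_rate = fill_rate  # allowed congestion-declaring bytes per second
        self.depth = depth          # maximum allowance that can be saved up
        self.tokens = depth         # the one number stored per customer
        self.last = 0.0             # time of the last update, in seconds

    def on_packet(self, now, size_bytes, declares_congestion):
        """Update the bucket; return False once the customer is over allowance."""
        elapsed = now - self.last
        self.tokens = min(self.depth, self.tokens + elapsed * self.fill_rate)
        self.last = now
        if declares_congestion:
            # Only packets carrying congestion information drain the bucket.
            self.tokens -= size_bytes
        return self.tokens >= 0


policer = CongestionPolicer(fill_rate=100.0, depth=3000.0)
print(policer.on_packet(0.0, 1500, False))  # unmarked traffic costs nothing
print(policer.on_packet(1.0, 1500, True))
```

Note there is no per-flow or per-address table here: traffic sent 
while there is no congestion costs nothing, however much of it there 
is.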

This 6-page paper is the best we could do to succinctly describe how 
the most minimalist control system could work:
   Policing Freedom to Use the Internet Resource Pool
Or I can provide much more detail on request.

>I'm in favor of having ISPs be able to limit congestion in a 
>reasonable way, but I'm interested in approaches that scale. I'm 
>also concerned about what looks to me like a persistent under-run. 
>If Alice sends Bob a stream of traffic, Bob replies noting her mark 
>count, and she re-replies re-inserting that mark count, there is 
>always some probability that her re-inserted reply is itself marked 
>and Bob no longer cares. If the network is drawing broad conclusions 
>from aggregated marks, it seems it would need to account for the 
>under-run.

Ah, this is probably a misconception because we often describe re-ECN 
as simply reinsertion of black packets in response to feedback 
notifying red packets. Then, yes, black would lag red.

But one level of detail further down in the re-ecn-tcp I-D, the 
source has to keep the balance positive by sending sufficient green 
packets from the start. Green and black count the same in the 
network, but they can be distinguished if you need to separate actual 
congestion from just being cautious. This is a subtle, but crucial 
part of the protocol (IMHO), which allows the anti-cheating 
mechanisms to be much simpler.
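
A toy sketch of that balance, under simplifying assumptions: it 
counts whole packets rather than bytes, treats green and black as +1 
and red as -1, and uses "grey" as a made-up label for packets that 
carry no congestion information either way. The function name is 
invented for illustration and isn't taken from the I-D.

```python
def running_balance(colours):
    """Return the balance after each packet: green/black +1, red -1, others 0."""
    total = 0
    history = []
    for colour in colours:
        if colour in ("green", "black"):
            total += 1
        elif colour == "red":
            total -= 1
        history.append(total)
    return history


# Pure reinsertion: black only appears after feedback about red, so it lags.
lagging = ["grey", "red", "grey", "black"]
# Cautious start: a green packet up front covers the later red one.
cautious = ["green", "red", "grey", "black"]

print(running_balance(lagging))
print(running_balance(cautious))
```

In the first sequence the balance dips below zero until the black 
packet catches up; in the second it never goes negative, which is the 
property the source is meant to maintain.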

For TCP, we've calculated that the SYN and the first & third data 
packets should be green, although Mirja has pointed out an error in 
our maths on this, so we have to correct that.

>Another way in which I scratch my head is what one actually does. If 
>we're OK with maintaining per-address or per-prefix state of some 
>sort, I would find myself wondering about a variation on fair 
>queuing that uses the later of "a time or sequence number dictated 
>by the source address" and "a time or sequence number dictated by 
>the destination address", and only keeps track of addresses/prefixes 
>that have been recently used.
>That allows the ISP to be in control of its own congestion without 
>the compliance of neighboring networks or the hosts in them. But 
>that's just me.

Don't understand. Do you want to say more?

BTW, we designed re-ECN so that you don't have to trust that the 
source address has any meaning - it can just be used as an opaque 
label, not a locator. So a receiving network doesn't have to care 
about source addresses in upstream networks - it's independent of 
whether mechanisms like src addr validation (SAVI) are widely 
deployed or not. And it's OK with NATs & tunnels.



Bob Briscoe,                                BT Innovate & Design