Re: [re-ECN] Two questions about CONEX

Fred Baker <fred@cisco.com> Tue, 18 May 2010 06:00 UTC

From: Fred Baker <fred@cisco.com>
In-Reply-To: <4BE5039A.5040003@cisco.com>
Date: Mon, 17 May 2010 22:59:43 -0700
Message-Id: <CD605E30-0F84-4A2D-ABBC-F3C73742E837@cisco.com>
References: <4BE5039A.5040003@cisco.com>
To: stbryant@cisco.com
Cc: "re-ecn@ietf.org" <re-ecn@ietf.org>
Subject: Re: [re-ECN] Two questions about CONEX

On May 7, 2010, at 11:24 PM, Stewart Bryant wrote:

> That causes me to wonder where CONEX fits in relative to IPFIX which is a mechanism that is designed to monitor the flows in a network and report this information to the network operator.

That is a fairly complicated question.

First, IPFIX is not at all lightweight. At the high end, to counter that, it is often statistical - collecting data on all flows generates a lot of data and, if a reliable transport is used to deliver it, can tie up a large amount of memory dedicated to the function. Setting or not setting a two-bit field in the IP header is something that any device that can decrement a TTL or hop limit can easily do with effectively no additional overhead. If you're trying to consistently manage high-end users, which is what CONEX is ultimately about, which of the two can be implemented anywhere and gives you full information?
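To make the "two-bit field" point concrete, here is a minimal sketch of setting the ECN Congestion Experienced codepoint in the low two bits of the IPv4 TOS / IPv6 traffic class byte, using the codepoint values from RFC 3168. The function name and the choice to leave non-ECN-capable packets untouched are assumptions for illustration only; a real forwarding path would also have to update the IPv4 header checksum, just as it does when decrementing the TTL.

```python
# ECN codepoints occupy the low two bits of the TOS / traffic class byte (RFC 3168).
ECN_MASK = 0x03
NOT_ECT = 0x00  # sender is not ECN-capable
ECT_1 = 0x01    # ECN-capable transport, codepoint 1
ECT_0 = 0x02    # ECN-capable transport, codepoint 0
CE = 0x03       # congestion experienced


def mark_congestion(tos: int) -> int:
    """Return the TOS byte with CE set if the packet is ECN-capable.

    Hypothetical helper: packets that are Not-ECT are returned unchanged
    here; an actual queue would typically drop them instead of marking.
    """
    if tos & ECN_MASK == NOT_ECT:
        return tos
    return tos | CE
```

The per-packet work is one mask, one compare, and one OR, with no per-flow state.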

Second, IPFIX doesn't really measure the same thing as ECN does. What ECN does is note that at the point in time that a datagram is traversing a node, the node is or is not experiencing "congestion" as defined locally. The definition for that class of traffic at that point could be "integrated throughput rate exceeds <threshold>", "instantaneous queue depth exceeds <threshold>", or "mean queue depth exceeds <threshold>", and it might mean "marked with some probability" or "marked if the case is true". Whatever its definition, you can determine that the node in question saw something it called "congestion". IPFIX tells you what went through a system - between <time1> and <time2>, data flow <selector> passed some number of bytes and packets. You can now feed that into a simulator of some type, but at best it will give you an approximation of the behavior of the link conditioned by your assumptions, not "the behavior of the link".
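The locally defined notions of "congestion" listed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the EWMA weight, and the thresholds are hypothetical and not taken from any particular router implementation.

```python
import random


def update_mean_depth(prev_mean: float, sample: float, weight: float = 0.002) -> float:
    """Exponentially weighted moving average of queue depth (weight is an assumed value)."""
    return (1.0 - weight) * prev_mean + weight * sample


def congested_instantaneous(queue_depth: float, threshold: float) -> bool:
    """'Instantaneous queue depth exceeds <threshold>'."""
    return queue_depth > threshold


def congested_mean(mean_depth: float, threshold: float) -> bool:
    """'Mean queue depth exceeds <threshold>'."""
    return mean_depth > threshold


def should_mark(condition: bool, probability: float = 1.0, rng=random.random) -> bool:
    """Mark if the condition holds, either always (probability=1.0,
    'marked if the case is true') or 'with some probability'."""
    return condition and rng() < probability
```

Whichever definition is configured, the output is a per-packet yes/no observed at the node itself, which is the piece of information a flow record of bytes and packets between two timestamps does not carry.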

Where I scratch my head with re-ECN is the impact of the re-introduction of the flag stream. It looks to me like a control system, which is presumed to be somewhere at a network edge, is supposed to be scoreboarding marks against IP addresses or IP prefixes, which sounds to me like the kind of overhead one sees in XCP, RCP, or the like, but with storage and discrimination in some form of IP address database. That worries me from a scaling perspective. I'm in favor of having ISPs be able to limit congestion in a reasonable way, but I'm interested in approaches that scale. I'm also concerned about what looks to me like a persistent under-run. If Alice sends Bob a stream of traffic, Bob replies noting her mark count, and she replies in turn, re-inserting that mark count, there is always some probability that her re-inserted reply is itself marked and Bob no longer cares. If the network is drawing broad conclusions from aggregated marks, it seems it would need to account for the under-run.
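A rough sketch of what "scoreboarding marks against IP prefixes" might look like, to show where the state comes from. The class, its name, and the fixed IPv4 /24 aggregation are assumptions made for illustration; nothing here is taken from the re-ECN drafts.

```python
import ipaddress
from collections import Counter


class MarkScoreboard:
    """Hypothetical edge-device table counting congestion marks per source prefix."""

    def __init__(self, prefix_len: int = 24):
        self.prefix_len = prefix_len  # assumed IPv4 aggregation length
        self.marks = Counter()        # prefix -> number of marked packets seen

    def record(self, src_addr: str, marked: bool) -> None:
        """Count one packet from src_addr if it carried a mark."""
        if not marked:
            return
        prefix = ipaddress.ip_network(f"{src_addr}/{self.prefix_len}", strict=False)
        self.marks[prefix] += 1

    def state_entries(self) -> int:
        """Number of prefixes currently held; this is what grows with the user base."""
        return len(self.marks)
```

The table gains one entry per distinct prefix that has ever sent a marked packet, so memory and lookup cost track the number of active sources rather than the number of links being managed.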

Another way in which I scratch my head is what one actually does. If we're OK with maintaining per-address or per-prefix state of some sort, I would find myself wondering about a variation on fair queuing that uses the later of "a time or sequence number dictated by the source address" and "a time or sequence number dictated by the destination address", and only keeps track of addresses/prefixes that have been recently used. That allows the ISP to be in control of its own congestion without the compliance of neighboring networks or the hosts in them. But that's just me.
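One possible reading of the fair queuing variation described above, sketched under loud assumptions: each address carries a "next eligible time", a packet is scheduled at the later of the values for its source and destination, and addresses whose time has passed are forgotten. The class name, the single shared rate, and the expiry rule are all hypothetical choices made to get a runnable example.

```python
import heapq
from itertools import count


class LaterOfScheduler:
    """Hypothetical scheduler: order packets by the later of a per-source
    and a per-destination timestamp, keeping state only for recent addresses."""

    def __init__(self, rate_bytes_per_s: float):
        self.rate = rate_bytes_per_s
        self.next_time = {}   # address -> time its next packet may start
        self.heap = []        # (finish_time, tie_breaker, packet)
        self._seq = count()   # keeps heap ordering stable for equal times

    def enqueue(self, now: float, src: str, dst: str, size: int, packet) -> float:
        """Schedule a packet and return its finish time."""
        start = max(now, self.next_time.get(src, now), self.next_time.get(dst, now))
        finish = start + size / self.rate
        self.next_time[src] = finish
        self.next_time[dst] = finish
        heapq.heappush(self.heap, (finish, next(self._seq), packet))
        return finish

    def dequeue(self):
        """Return the packet with the earliest finish time, or None if empty."""
        return heapq.heappop(self.heap)[2] if self.heap else None

    def expire(self, now: float) -> None:
        """Forget addresses that have not been used recently."""
        for addr in [a for a, t in self.next_time.items() if t <= now]:
            del self.next_time[addr]
```

In this sketch a source that sends two packets back to back sees its second packet pushed behind a first packet from a different source, and the decision uses only state local to the scheduling node.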