Re: [PCN] traffic matrix scenario

Lars Eggert <lars.eggert@nokia.com> Thu, 01 November 2007 11:15 UTC

In-Reply-To: <66C55C26FA491C42A9C9BB62A376DAFF01631876@E03MVB1-UKBR.domain1.systemhost.net>
References: <66C55C26FA491C42A9C9BB62A376DAFF01631876@E03MVB1-UKBR.domain1.systemhost.net>
Mime-Version: 1.0 (Apple Message framework v752.3)
Message-Id: <0BC591DC-4D67-49CD-B70C-321511F59913@nokia.com>
From: Lars Eggert <lars.eggert@nokia.com>
Subject: Re: [PCN] traffic matrix scenario
Date: Thu, 01 Nov 2007 13:14:31 +0200
To: "ext ben.strulo@bt.com" <ben.strulo@bt.com>
Cc: pcn@ietf.org

Hi,

On 2007-10-31, at 15:51, ext ben.strulo@bt.com wrote:
> Broadly speaking I think our objective is that flow termination should
> only be necessary in the case of an unexpected decrease in core
> capacity.  We generally do not expect an Admission Control system to
> admit calls and then later terminate them in the absence of internal
> network problems such as link failure.

I think this (how "OK" flow termination is) is the root cause of our  
argument about PCN.

I see PCN as a "best-effort" sort of QoS. A PCN has a load range in  
which it operates well. You increase load by admitting flows, you get  
signals about the load from the markings, and then you react to it.  
If you over-admitted a bit, you stop admission until load abates. If  
you over-admitted a lot, you need to terminate flows to get the  
domain back into a stable state. Flow admission is how you ramp up  
the load, stopping admission is the "light" mechanism to reduce load  
over time, flow termination is the "heavy" mechanism to reduce load  
quickly. You can be optimistic about admitting flows, because you can  
always recover from wrong admission decisions.
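
To make the three regimes concrete, here is a rough sketch in Python.
The thresholds, names and the single "load" number are made up purely
for illustration; they are assumptions, not taken from any PCN draft,
and a real domain would derive its reaction from the packet markings.

```python
from enum import Enum


class Action(Enum):
    ADMIT = "admit new flows"
    STOP_ADMISSION = "stop admitting, wait for load to abate"
    TERMINATE = "terminate some flows"


# Hypothetical thresholds, as fractions of the load the domain can sustain.
ADMISSION_LIMIT = 0.8    # above this, the markings say "over-admitted a bit"
TERMINATION_LIMIT = 1.0  # above this, the domain has over-admitted a lot


def react(load: float) -> Action:
    """Pick the reaction to the load level signalled by the markings."""
    if load > TERMINATION_LIMIT:
        return Action.TERMINATE       # "heavy" mechanism: reduce load quickly
    if load > ADMISSION_LIMIT:
        return Action.STOP_ADMISSION  # "light" mechanism: reduce load over time
    return Action.ADMIT               # normal operating range: keep ramping up


for load in (0.5, 0.9, 1.2):
    print(f"load={load:.1f}: {react(load).value}")
```

The point of the sketch is only that optimistic admission is safe
because the TERMINATE branch exists as a fallback.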

If flow termination is to be avoided at all costs, the properties of  
the architecture fundamentally change. All of a sudden, you need to  
be a *lot* more careful about when and what you admit, because all  
you have left to react to overload is the "light" stop-admission  
mechanism. You must not over-admit, so other mechanisms are needed to  
make sure you don't over-admit, i.e., probing.

> My scenario mentioned the failure of an exchange really only as an
> example of the sort of extreme scenarios we consider.  The real examples
> I had in mind were simply flash crowds: widespread and unexpected
> increases in call request rates with an unusual (e.g. regionally
> focussed) traffic matrix.
>
> The sort of scenario that might be particularly testing for this
> particular probing issue, would be an initial anomalous traffic pattern
> consisting of very heavy traffic on a few aggregates causing
> pre-congestion on just a few links, followed by a subsequent widespread
> increase in demand which also focuses on those links.  It is easy to
> construct reasonable (though not necessarily probable) sequences of
> external events that could cause this sort of traffic pattern: for
> example, news reporting initially being local and progressing to
> national coverage.

I believe that it will be very difficult to design a simple  
architecture that can handle such cases with just stop-admission as a  
permitted reaction mechanism.

How likely are these scenarios?

> We would expect an Admission Control system to do a good job of
> rejecting requests in this scenario.  Though this is not completely cut
> and dried: it's possible a very small amount of flow termination might
> be acceptable.
>
>> If we aren't in agreement, then I wonder under what
>> circumstances you'd consider flow termination to be appropriate?
>
> As I say, only really when there is a sudden and significant decrease in
> core capacity.  Even then, we would normally expect to be provisioned to
> deal with all but the most serious failures.

After reading through your scenario, I believe that PCN (as I  
understand it) is the wrong tool for the job. You really want firm  
guarantees that the domain can sustain a flow before admitting it,  
and you never want to terminate flows unless a catastrophic event  
occurs. In other words, you want path-coupled QoS reservations,  
maybe using RSVP or NSIS. We have protocols and an architecture for  
path-coupled QoS, and I see no need and no benefit in PCN attempting  
to solve the same problem in a different way.

(And if there is no other problem for PCN to solve, maybe we're done...)

Lars