RE: [PCN] Architecture draft - probing section & general updates.

"Anna Charny (acharny)" <acharny@cisco.com> Tue, 16 October 2007 18:55 UTC

Subject: RE: [PCN] Architecture draft - probing section & general updates.
Date: Tue, 16 Oct 2007 14:55:27 -0400
From: "Anna Charny (acharny)" <acharny@cisco.com>
To: "Hancock, Robert" <robert.hancock@roke.co.uk>, philip.eardley@bt.com, pcn@ietf.org

Hi Robert, Phil,
 
Yes, Robert's is a fair concern with no obvious solution in sight.
Different equipment might use different algorithms, and different
fields, for ECMP load-balancing under different circumstances.
IMHO this is a killer argument for why the use of probing to discover
the state of ECMP paths should not be considered within the scope of the
PCN WG.
 
There remains the question of whether probing can/should be considered
for testing the path regardless of the ECMP issue.  I see most of its
value in flash crowd situations in combination with low ingress-egress
aggregation.
 
Anna 
________________________________

From: Hancock, Robert [mailto:robert.hancock@roke.co.uk] 
Sent: Tuesday, October 16, 2007 1:49 PM
To: philip.eardley@bt.com; pcn@ietf.org
Subject: RE: [PCN] Architecture draft - probing section & general
updates.



	phil,
	 
	your text
	"Hence somehow the PCN-egress-node has to be able to
disambiguate a probe packet from a data packet, via the characteristic
setting of particular bit(s) in the packet's header or body - but these
bit(s) mustn't be used by the PCN-node's ECMP algorithm."
	is nicely written (especially the "somehow"). 
	 
	Presumably the last phrase should read "but these bit(s) mustn't
be used by any interior PCN-node's ECMP algorithm." And there lies the
rub: I can't see any robust technique that the ingress can use to decide
what bits to set and how to set them, without the specification making
explicit assumptions that go beyond what the ECMP "specifications" say,
and which may not be compatible with filtering capabilities on the
interior and egress nodes. 
	 
	robert h.


________________________________

		From: philip.eardley@bt.com
[mailto:philip.eardley@bt.com] 
		Sent: 16 October 2007 18:01
		To: pcn@ietf.org
		Subject: [PCN] Architecture draft - probing section &
general updates.
		
		

		Hi,

		 

		There was quite a discussion about the probing section
of the Architecture draft recently, draft-ietf-pcn-architecture-00. This
showed to me that it needed a re-write, which I've attempted. 

		 

		One thing that struck me reading the discussion was that
there really seem to be 2 uses of probing. The current draft tries to
talk about them as though they're two aspects of the same thing; in the
text below I've tried instead to talk about them separately - because I
think they're really quite different.

		 

		The first uses probing to test the
ingress-egress-aggregate, the second uses probing to test a
particular ECMP path.

	


		The first uses probe pkts addressed to the
PCN-egress-node, the second uses probe pkts addressed to the destination
end node [so they follow the same ECMP path as the data pkts would do]

		 

		The first has no problem stopping probe pkts leaking out
of the PCN-domain [because they're addressed to the PCN-egress-node]
whilst the second has to think carefully about this [how the
PCN-egress-node can easily identify pkts, but the ECMP algos are
unaffected by the "flag = probe pkt"]

		 

		The first would probably be used by people who envisage
probing rarely being used (most ingress-egress-aggregates always have
traffic), the second is probably favoured by people who would actually
probe on every call admission attempt [esp people who have it in mind to
have PCN operating in parts of network with lower capacity links ie the
multiple flows (aggregation) assumption doesn't hold]. (There are plenty
of other scenarios which I'm not commenting on!)

		 

		The first has a less clear benefit to me (it doesn't
seem that much better than the easy alternative - just admit the 'first'
flow into an 'empty' ingress-egress-aggregate, and take the chance it
causes flows to be terminated or pkts to be dropped). The second has a
clearer benefit to me, in some scenarios (you test the actual ECMP path
the flow would take.) 

	


		The first probing approach (ie tests the
ingress-egress-aggregate) seems to me quite simple to define, but the
second much harder [need to work out how to stop probes leaking out of
PCN-domain and how to flag a probe to minimise interactions with ECMP]. 

		 

		Personally I think that in the short term (say until
Christmas) our objective is to start converging on the router's
PCN-marking behaviour (algorithm & encoding) - so it isn't necessary to
talk about the details of probing at the moment. However, as part of the
algorithm discussion it is relevant to note (as Michael has already
pointed out) that an excess-rate-marking algorithm is less suitable than a
threshold-marking algo from a probing perspective (at least you have to
send a lot more probe pkts to get an accurate picture of the
pre-congestion level). (Of course it's only one factor when we choose
the algo.)

		 

		Anyway, here's the proposed revised sub-section 5.5
about probing - please comment /shout if you don't like it:-

		 

		****

		Probing functions are optional, and can be used for
admission control.  

		 

		PCN's admission control, as described so far, is
essentially a reactive mechanism where the PCN-egress-node monitors the
pre-congestion level for traffic from each PCN-ingress-node; if the
level rises then it blocks new flows on that ingress-egress-aggregate.
However, it's possible that an ingress-egress-aggregate carries no
traffic, and so the PCN-egress-node can't make an admission decision
using the usual method described earlier. 
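
The reactive behaviour described above could be sketched roughly as follows. This is only an illustration: the class and function names, the use of a marked-packet fraction as the "pre-congestion level", and the 5% threshold are all assumptions, not anything the draft specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AggregateState:
    """Counters a hypothetical PCN-egress-node keeps per ingress-egress-aggregate."""
    marked_packets: int = 0
    total_packets: int = 0

    def observe(self, pcn_marked: bool) -> None:
        # Count every PCN-packet received from this PCN-ingress-node.
        self.total_packets += 1
        if pcn_marked:
            self.marked_packets += 1

    def pre_congestion_level(self) -> Optional[float]:
        # With no traffic there is nothing to measure.
        if self.total_packets == 0:
            return None
        return self.marked_packets / self.total_packets


def decide(state: AggregateState, threshold: float = 0.05) -> str:
    """Return 'admit', 'block', or 'unknown' for a prospective new flow.

    The threshold value is an assumption made for this sketch.
    """
    level = state.pre_congestion_level()
    if level is None:
        return "unknown"  # empty aggregate: the usual method cannot decide
    return "admit" if level <= threshold else "block"
```

The "unknown" result is the empty-aggregate case that the rest of this section is about.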

		 

		One approach is to be "optimistic" and simply admit the
new flow. However it's possible to envisage a scenario where the traffic
levels on other ingress-egress-aggregates are already so high that
they're blocking new PCN-flows and admitting a new flow onto this
'empty' ingress-egress-aggregate would add extra traffic onto the link
that's already pre-congested - which may 'tip the balance' so that PCN's
flow termination mechanism is activated or some packets are dropped.
This risk could be lessened by configuring on each link sufficient
'safety margin' above the PCN-lower-rate.   

		 

		An alternative approach is to make PCN a more proactive
mechanism. The PCN-ingress-node explicitly determines, before admitting
the prospective new flow, whether the ingress-egress-aggregate can
support it. This can be seen as a "pessimistic" approach, in contrast to
the "optimism" of the approach above. It involves probing: a
PCN-ingress-node generates and sends probe packets in order to test the
pre-congestion level that the flow would experience. A probe packet is
just a dummy data packet, generated by the PCN-ingress-node and
addressed to the PCN-egress-node. A downside of probing is that it adds
delay to the admission control process. Also note that in the scenario
described in the previous paragraph (where traffic levels on other
ingress-egress-aggregates are already very high), the probe packets may
also 'tip the balance'. However, the risk should be reduced because it
should be possible to send probe packets for a shorter time and at a
lower rate than a typical data flow. 
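
To make the contrast between the "optimistic" and "pessimistic" approaches concrete, here is a minimal sketch. The function names, the `send_probe` callback (a stand-in for sending one probe packet to the PCN-egress-node and learning whether it arrived PCN-marked), the default of 20 probes and the 5% threshold are all hypothetical.

```python
from typing import Callable, Optional


def probe_marked_fraction(send_probe: Callable[[], bool], num_probes: int) -> float:
    """Send num_probes probe packets and return the fraction that came back marked."""
    if num_probes <= 0:
        raise ValueError("num_probes must be positive")
    marked = sum(1 for _ in range(num_probes) if send_probe())
    return marked / num_probes


def admit_new_flow(
    measured_level: Optional[float],
    send_probe: Callable[[], bool],
    policy: str = "pessimistic",
    num_probes: int = 20,      # assumed value, for illustration only
    threshold: float = 0.05,   # assumed value, for illustration only
) -> bool:
    """Decide whether to admit a flow onto an ingress-egress-aggregate.

    measured_level is None when the aggregate currently carries no traffic.
    """
    if measured_level is not None:
        # Normal reactive case: use the existing measurement.
        return measured_level <= threshold
    if policy == "optimistic":
        # Empty aggregate: admit and accept the risk.
        return True
    # Pessimistic: probe first, at the cost of extra admission delay.
    return probe_marked_fraction(send_probe, num_probes) <= threshold
```

For example, `admit_new_flow(None, lambda: True)` rejects the flow because every probe comes back marked, whereas the same call with `policy="optimistic"` admits it without probing.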

		 

		The situation is more complicated if there is multipath
routing (ECMP) in the PCN-domain. It is then possible for some paths to
be pre-congested whilst other paths within the same
ingress-egress-aggregate aren't pre-congested.

		 

		One approach essentially ignores ECMP: as usual, admit
or block a new flow depending on the "measurements of PCN-traffic" on
the ingress-egress-aggregate. This is rather similar to the "optimistic"
approach above. 

		 

		An alternative ("pessimistic" or "proactive") approach
is to probe the ECMP path. The PCN-ingress-node generates and sends
probe packets (dummy data) that follow the specific ECMP path that the
new flow would take, in order to test the pre-congestion level along it.
An ECMP algorithm typically examines: the source and destination IP
addresses and port numbers, the protocol ID and the DSCP. Hence these
fields must have the same values in the probe packets as the future data
packets would have. On the other hand, the PCN-egress-node needs to
consume the probe packets to ensure that they don't travel beyond the
PCN-domain (eg they might confuse the destination end node). Hence
somehow the PCN-egress-node has to be able to disambiguate a probe
packet from a data packet, via the characteristic setting of particular
bit(s) in the packet's header or body - but these bit(s) mustn't be used
by the PCN-node's ECMP algorithm. 
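
The constraint in the paragraph above can be illustrated with a toy ECMP hash. Everything here is an assumption made for the sketch: the dictionary field names, the hypothetical `probe_flag` field, and the use of SHA-256. Real routers use their own, often vendor-specific, hash functions and field selections.

```python
import hashlib

# Fields the draft text says an ECMP algorithm typically examines.
ECMP_FIELDS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol", "dscp")


def ecmp_path(packet: dict, num_paths: int, fields=ECMP_FIELDS) -> int:
    """Pick one of num_paths equal-cost paths from a hash of the chosen fields."""
    key = "|".join(str(packet.get(f, 0)) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths


def make_probe(data_packet: dict) -> dict:
    """Copy the future data packet's header fields and set a hypothetical probe flag."""
    probe = dict(data_packet)
    probe["probe_flag"] = 1
    return probe


def is_probe(packet: dict) -> bool:
    """What a PCN-egress-node would check before consuming the packet."""
    return packet.get("probe_flag", 0) == 1


data = {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7",
        "src_port": 5004, "dst_port": 5006, "protocol": 17, "dscp": 46}
probe = make_probe(data)

# Same hashed fields, so the probe and the data flow take the same path.
assert ecmp_path(probe, 4) == ecmp_path(data, 4)
assert is_probe(probe) and not is_probe(data)
```

If some interior node also included the flag in its hash, for instance `ecmp_path(probe, 4, fields=ECMP_FIELDS + ("probe_flag",))`, the probe would no longer be guaranteed to follow the data path, which is the difficulty raised earlier in this thread.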

		 

		The probing functions are:

		 

		   o  Make decision that probing is needed. As described
above, this is when the ingress-egress-aggregate or the ECMP path
carries no PCN-traffic. An alternative is always to probe, ie probe
before admitting every PCN-flow.

		 

		   o  (if required) Communicate the request that probing
is needed - the PCN-egress-node signals to the PCN-ingress-node that
probing is needed

		 

		   o  Generate probe traffic - the PCN-ingress-node
generates the probe traffic.  The appropriate number (or rate) of probe
packets will depend on the PCN-marking algorithm; for example an
excess-rate-marking algorithm generates fewer PCN-marks than a
threshold-marking algorithm, and so will need more probe packets.

		 

		   o  Forward probe packets - as far as
PCN-interior-nodes are concerned, probe packets must be handled the same
as (ordinary data) PCN-packets, in terms of routing, scheduling and
PCN-marking.

		 

		   o  Consume probe packets - the PCN-egress-node
consumes probe packets to ensure that they don't travel beyond the
PCN-domain.
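
As a rough feel for why the number of probe packets depends on the marking algorithm, the sketch below estimates how many probes are needed to see at least one PCN-mark with a given confidence. It assumes each probe is marked independently with probability equal to the marked fraction, which is a simplification, and the function name and example fractions are purely illustrative.

```python
import math


def probes_needed(mark_fraction: float, confidence: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - mark_fraction)**n >= confidence."""
    if not 0 < mark_fraction <= 1:
        raise ValueError("mark_fraction must be in (0, 1]")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be in (0, 1)")
    if mark_fraction == 1:
        return 1  # every packet is marked, so one probe suffices
    return math.ceil(math.log(1 - confidence) / math.log(1 - mark_fraction))


# A path where half the packets are marked versus one where only 5% are.
print(probes_needed(0.5))   # 5
print(probes_needed(0.05))  # 59
```

Under these assumptions, an algorithm that marks a smaller fraction of packets at the same pre-congestion level needs many more probes to give a reliable answer.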

		 

		****

		 

		Incidentally, I have edited the draft to include all the
comments /discussion that there's been on the list since Vancouver. I'll
add the above probing section [unless there are cries of unhappiness]. I
also have some comments on paper from bob [will summarise to list where
more than typos etc]. So will send revised version out next week or end of this
week.

		 

		Best wishes

		phil
