[rsvp-dir] Redundant aggregate reservations: draft-ietf-tsvwg-rsvp-pcn-03

Bob Briscoe <bob.briscoe@bt.com> Thu, 15 November 2012 13:07 UTC

Date: Thu, 15 Nov 2012 13:07:31 +0000
To: karagian@cs.utwente.nl, anuragb@cisco.com
From: Bob Briscoe <bob.briscoe@bt.com>
Cc: tsvwg@ietf.org, philip.eardley@bt.com, anuragb@cisco.com, carlberg@g11.org.uk, rsvp-dir@ietfa.amsl.com, PCN IETF list <pcn@ietf.org>
Subject: [rsvp-dir] Redundant aggregate reservations: draft-ietf-tsvwg-rsvp-pcn-03

Georgios, Anurag,

Below is the main point of my review, arguing that aggregate 
reservations are redundant. I'm reviewing as:
- a member of the RSVP directorate
- one of the early PCN design team
- a co-author of draft-lefaucheur-rsvp-ecn-01, on which this draft is based.

I must have missed any decision to use aggregate reservations. Please 
point me to the relevant discussion (e.g. Subject line / date).

I admit I tuned out of much of the later PCN signalling discussion. I 
found the whole exercise of abstracting PCN away from specific 
signalling protocols highly tedious; it meant we couldn't sensibly 
address important issues like message reliability, timeliness etc.

Nonetheless, here's why I believe RSVP aggregation is redundant:

PCN edge-nodes support the concept of an ingress-egress aggregate in 
their own internal tables, but they don't need to refer to an 
aggregate on the wire{Note 1}. PCN-ingress and PCN-egress nodes 
intrinsically know which e2e reservations belong to which aggregate 
by grouping together those e2e reservations with the same next hop or 
previous hop respectively.

{Note 1: except in one case described later - but it doesn't require 
all the other baggage of aggregate reservations}
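
To illustrate the grouping, here is a minimal sketch of how a PCN-ingress might derive its ingress-egress aggregates purely from local state. The record type, field names and identifiers such as "egress-1" are hypothetical, chosen only for the example; nothing here is taken from the draft.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class E2EReservation:
    session: str   # hypothetical identifier for an e2e SESSION
    next_hop: str  # RSVP next hop as seen by the PCN-ingress (assumed to be the PCN-egress)


def group_by_next_hop(reservations):
    """Group e2e sessions by RSVP next hop; each key is one ingress-egress aggregate."""
    aggregates = defaultdict(list)
    for r in reservations:
        aggregates[r.next_hop].append(r.session)
    return dict(aggregates)


reservations = [
    E2EReservation("flow-a", "egress-1"),
    E2EReservation("flow-b", "egress-2"),
    E2EReservation("flow-c", "egress-1"),
]
table = group_by_next_hop(reservations)
```

A PCN-egress could do the same keyed on the previous hop instead. No aggregate identifier ever has to appear on the wire for this table to be built.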

==Background==
Aggregate reservations [RFC3175, RFC4860] are designed to reduce the 
state required on interior nodes. Interior nodes still require state 
per aggregate reservation, but only reservation state, not 
classification and scheduling state [RFC3175, Section 1.4.1 last para].

In contrast, as you correctly point out (in Section 2.1.7), PCN 
requires absolutely no reservation-related state on interior nodes.

==Disadvantages==
Requiring PCN to use aggregate reservations has the following three 
disadvantages and no advantages:

1) Redundant Processing
The PATH message between aggregator and deaggregator in rsvp-pcn-03 
(triggered by an E2E PathErr message from deaggregator to aggregator) 
is redundant, and just doubles the processing required at the 
PCN-edge-nodes (if this isn't obvious, I spell it out separately for 
PATH & RESV messages below).

2) Reduced Resilience
Not only is an aggregate PATH redundant, it actually reduces 
resilience, because an aggregate PATH is pinned to interior routers. 
When routing changes, it is therefore more complex and slower to 
move to the new route. By not pinning to interior routers, PCN was 
designed to 'just work' over interior routing changes - with no need 
for any changes to the RSVP PATHs. (But it would still detect 
overload after a re-route and terminate or rate-reduce flows if necessary.)

3) Extra Latency
A further disadvantage is the extra latency required for the first 
reservation that sets up an aggregate. This is two ingress-egress 
round trips minus the round trip time from egress to destination (or 
one ingress-egress round trip if it is greater). This will rarely add 
to latency on heavily used ingress-egress aggregates, but it will 
occur frequently on all the 'long-tail' (lightly used) ingress-egress 
aggregates.
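
As a rough worked example of the latency figure above, the following sketch encodes my reading of that sentence: the extra delay is the larger of (two ingress-egress round trips minus the egress-to-destination round trip) and (one ingress-egress round trip). The function and parameter names are made up for illustration, and the millisecond values are arbitrary.

```python
def extra_setup_latency(rtt_ingress_egress, rtt_egress_destination):
    """Extra latency for the first reservation that sets up an aggregate.

    Assumes both arguments are round trip times in the same unit.
    """
    return max(2 * rtt_ingress_egress - rtt_egress_destination,
               rtt_ingress_egress)


# Destination close to the egress: most of the two round trips is exposed.
near = extra_setup_latency(40, 10)
# Destination far from the egress: at least one ingress-egress round trip remains.
far = extra_setup_latency(40, 60)
```

With a 40 ms ingress-egress round trip, this gives 70 ms of extra delay when the egress-to-destination round trip is 10 ms, and 40 ms when it is 60 ms.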

==PATH==

With RFC 3175 or 4860 aggregate paths, the aggregator forwards the 
e2e PATH messages with IP protocol number RSVP-E2E-IGNORE and the 
deaggregator changes them back to RSVP before forwarding onward. Also 
the aggregator sends an aggregate PATH message, which is processed by 
each interior node and by the deaggregator.

On a path across a PCN region, given interior nodes ignore aggregate 
PATH messages as well, the only PCN nodes that handle aggregate 
messages are the aggregator and the deaggregator. The aggregator and 
deaggregator process all the e2e PATH messages anyway, so if we 
require the aggregator to add up all the e2e PATH messages and form 
them into an aggregate PATH message, this is just extra redundant 
work for both PCN-edge-nodes.

==RESV==

The deaggregator unicasts e2e RESV messages to the previous RSVP hop, 
which is the aggregator. Therefore, if we require the deaggregator to 
add up all the RESV messages and form them into an aggregate RESV 
message, this is just redundant work for both PCN-edge-nodes, because 
they both already process all the e2e RESV messages anyway, and no 
other node uses the aggregate RESV messages.

==PCN object==

This raises the question of how the PCN-egress communicates the 
various marking rates (the PCN object) to the PCN-ingress. There are 
two possibilities:
i) the PCN-egress includes a current PCN object in each e2e RESV that 
it returns to the PCN-ingress. The PCN-ingress strips the PCN object 
out before forwarding the RESV back to the previous RSVP hop.
ii) the PCN-egress attaches a PCN object to an aggregate reservation, 
as in rsvp-pcn-03.

Either is possible, because a PCN object carries information about 
marking probabilities, and PCN works on the assumption that the 
marking probability of an ingress-egress aggregate is the same as the 
marking probability of the flows within the aggregate. A PCN object 
can be contained either in an e2e RESV or an aggregate RESV as long 
as the PCN-ingress can associate an e2e RESV with the correct 
aggregate (which it can, because it maintains an internal table of 
mappings between e2e reservations and their aggregates).
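
A minimal sketch of option i) at the PCN-ingress, assuming messages are modelled as plain dictionaries with hypothetical "session" and "pcn_object" keys (this is not the RSVP wire format, and the field names inside the PCN object are invented):

```python
def ingress_handle_e2e_resv(resv, session_to_aggregate, marking_by_aggregate):
    """Strip the PCN object from an e2e RESV and record it against its aggregate.

    Returns the message to forward to the previous RSVP hop, without the PCN object.
    """
    forwarded = dict(resv)  # copy, so the caller's message is left untouched
    pcn_object = forwarded.pop("pcn_object", None)
    if pcn_object is not None:
        aggregate = session_to_aggregate[forwarded["session"]]
        marking_by_aggregate[aggregate] = pcn_object
    return forwarded


session_to_aggregate = {"flow-a": "iea-1"}
marking_by_aggregate = {}
resv = {"session": "flow-a", "pcn_object": {"marked_rate": 5, "unmarked_rate": 95}}
out = ingress_handle_e2e_resv(resv, session_to_aggregate, marking_by_aggregate)
```

The only state involved is the internal mapping table the PCN-ingress already keeps; the forwarded RESV looks like an ordinary e2e RESV to the previous hop.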

Which of the two is best is a question of message timing...

* For e2e admission decisions, the PCN object is only needed at the 
time each e2e RESV is sent, so option i) makes sense.

* For flow rate reduction or flow termination decisions, the 
deaggregator needs to regularly send PCN objects to the ingress.

The PCN-egress is sending regular e2e RESV refresh messages to the 
PCN-ingress, so a PCN object can be included in each of these. To 
ensure that PCN objects are sent often enough, I suggest the 
PCN-egress also maintains a timer per ingress-egress aggregate which 
it resets every time it sends a PCN object for that IEA. If the timer 
expires, the PCN-egress sends a PCN object to the PCN-ingress even 
though it was not triggered by an e2e RESV refresh. We could require 
the SESSION object in this message to refer to either of:
a) any one of the e2e SESSIONs in the aggregate,
b) the aggregate.

In case (a), the message would need to somehow tell the ingress not 
to forward this RESV refresh to the RSVP previous hop.

In case (b), the PCN-ingress table of mappings between e2e SESSIONs 
and aggregate SESSIONs would include an entry for the aggregate 
that maps to itself. If the result of the look-up is the same as the 
input, the PCN-ingress knows not to forward the RESV refresh further.

The wire protocol doesn't need to identify whether the SESSION is an 
aggregate or not. This is the one case I mentioned at the start {Note 
1} where an aggregate is referred to on the wire.
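
To make the timer and case (b) concrete, here is a rough sketch under stated assumptions: time is passed in explicitly as a number, the refresh interval of 30 is arbitrary, and all class, function and session names are hypothetical rather than anything defined by the draft.

```python
class EgressPcnTimer:
    """Per-IEA record of when the PCN-egress last sent a PCN object."""

    def __init__(self, interval):
        self.interval = interval
        self.last_sent = {}  # aggregate -> time the last PCN object was sent

    def note_sent(self, aggregate, now):
        # Called whenever a PCN object goes out for this IEA, e.g. in an e2e RESV refresh.
        self.last_sent[aggregate] = now

    def expired(self, now):
        # IEAs for which a stand-alone PCN object is now due.
        return [a for a, t in self.last_sent.items() if now - t >= self.interval]


def ingress_should_forward(session, session_to_aggregate):
    """Case (b): an entry that maps to itself is the aggregate, so do not forward."""
    return session_to_aggregate[session] != session


timer = EgressPcnTimer(interval=30)
timer.note_sent("iea-1", now=0)
timer.note_sent("iea-2", now=20)
due = timer.expired(now=35)

mapping = {"flow-a": "iea-1", "flow-c": "iea-1", "iea-1": "iea-1"}
forward_flow = ingress_should_forward("flow-a", mapping)
forward_aggregate = ingress_should_forward("iea-1", mapping)
```

Here only "iea-1" is due at time 35, and the PCN-ingress forwards the RESV for "flow-a" but not the one whose SESSION is the aggregate itself. The look-up alone decides this, which is why the wire protocol needs no aggregate flag.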



In summary, PCN already reduces reservation state and processing to 
nothing on interior nodes. Adding aggregate reservations to PCN 
requires more processing and state, unnecessarily pins routes to 
interior nodes, and adds unnecessary latency.


Bob

At 18:23 14/11/2012, karagian@cs.utwente.nl wrote:
>Hi Bob,
>
>Regarding the generic aggregated RSVP selection, actually the PCN WG 
>agreed with this selection!
>This was actually the first step that was needed for this work, and 
>the PCN WG had no main objections on this selection!
>
>So I do not understand your remark that your comment will have major 
>implications!
>
>Please note that the generic aggregated RSVP is selected since the 
>PCN IEA are associated with flows that are aggregated at the edges. 
>So a signalling protocol that supports aggregation of flows at the 
>edges is very suitable for this purpose! The generic aggregated RSVP 
>is such a signalling protocol!
>
>
>Best regards,
>Georgios
>
>
>
>
>From: Bob Briscoe [mailto:bob.briscoe@bt.com]
>Sent: Wednesday 14 November 2012 12:57
>To: Anurag Bhargava (anuragb)
>Cc: Karagiannis, G. (EWI)
>Subject: Re: [tsvwg] draft-ietf-tsvwg-rsvp-pcn-03
>
>Anurag,
>
>I have my comments half-written up. I will try to finish them by the 
>end of today.
>
>They should be orthogonal to the PBAC comment below, so if you 
>wanted to start altering that area, I don't think it would waste too 
>much time.
>
>However, my main comments will concern the use of aggregated 
>reservations (as I said at the mic), so that could have major implications.
>
>
>Bob
>
>At 20:14 13/11/2012, Anurag Bhargava (anuragb) wrote:
>
>Hi Bob,
>  Thanks for the comments. If you have some text, that will be great; 
> otherwise, I have also started putting some text together on the topic 
> you brought up.
>  Maybe we can conference after US Thanksgiving week, 
> collaborate on the text and try to move forward.
>
>  Please let us know what might be a good time and I can schedule a 
> Webex conf.
>
>Thx
>-Anurag
>
>From: "karagian@cs.utwente.nl" <karagian@cs.utwente.nl>
>Date: Saturday, November 10, 2012 8:10 AM
>To: "bob.briscoe@bt.com" <bob.briscoe@bt.com>
>Cc: "carlberg@g11.org.uk" <carlberg@g11.org.uk>, Anurag Bhargava 
><anuragb@cisco.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>, 
>"philip.eardley@bt.com" <philip.eardley@bt.com>
>Subject: RE: [tsvwg] draft-ietf-tsvwg-rsvp-pcn-03
>
>Hi Bob,
>
>
>
>Thanks very much for the comments! I think that they are very useful!
>
>
>
>It will be very beneficial for the fast progress of this draft if 
>you would like to contribute as a co-author to this draft and write 
>this additional section that describes "that the PCN-ingress can 
>refer flow admission and
>termination decisions to a central decision point (using e.g. COPS), 
>which will respond to the PCN-ingress as per RFC2753. (Alternatively 
>the PCN-ingress could itself be the policy decision point.)"
>
>
>
>Best regards,
>
>Georgios
>
>
>
>----------
>From: Bob Briscoe [bob.briscoe@bt.com]
>Sent: Friday 9 November 2012 16:33
>To: Karagiannis, G. (EWI)
>Cc: carlberg@g11.org.uk; anuragb@cisco.com; tsvwg@ietf.org; EARDLEY, Phil
>Subject: Re: [tsvwg] draft-ietf-tsvwg-rsvp-pcn-03
>
>Georgios,
>
>I shall post my full review of this draft in the next few days (needs
>typing up - currently scribbled on a paper copy). This email is
>solely in response to your answer about on-path vs off-path policy.
>
>At 18:55 04/11/2012, karagian@cs.utwente.nl wrote:
> >So in this case an additional signaling protocol will be
> >needed to be specified that covers the signaling between the
> >PCN-egress-node and the centralized node
> >and between PCN-ingress-node and the centralized node.
 >In PCN we decided to only focus on the specification of the
> >signaling protocol that completes the
> >feedback loop from PCN-egress-node to PCN-ingress-node and to focus
> >on the signaling protocol
> >used between the edge nodes and the centralized node.
>
>When I/we originally designed CL-PCN over RSVP (2005), the idea was
>that it would fit with the policy-based admission control (PBAC)
>architecture of RFC2753. In this architecture, an Intserv node at the
>ingress to a domain is the policy enforcement point (PEP), and it
>refers to a logically centralised 'policy decision point' (PDP) for
>decisions on which flows to block/terminate, typically using COPS.
>
>To make this doc fit the PBAC framework, all we have to do is:
>* Describe the PCN-ingress only as the PCN-ingress and not as the
>decision point (find 'decision' throughout doc and fix).
>* Add a section saying the PCN-ingress can refer flow admission and
>termination decisions to a central decision point (using e.g. COPS),
>which will respond to the PCN-ingress as per RFC2753. (Alternatively
>the PCN-ingress could itself be the policy decision point.)
>* Refer to this new PBAC section from Section 3.11 giving the
>admission decision procedure.
>
>* Some people might think this means COPS will need new protocol
>elements to carry PCN marking rates to the policy decision point. But
>PCN marking rates are irrelevant to the policy decision: the
>PCN-ingress just uses PCN to determine whether it needs to block or
>terminate, and it refers to the policy decision point for which flows
>to block/terminate.
>
>* Unfortunately, neither of the two PCN system descriptions [RFC6661,
>RFC6662] describe a PBAC-based case. The architecture [RFC5559]
>refers to the PBAC framework [RFC2753] but unfortunately doesn't
>spell out how it fits. Originally, I referenced PBAC from RFC5559,
>but just as the PCN w-g was closing I realised that (some?) others in
>the PCN w-g were working under the assumption that the only way to
>talk to a centralised policy node was from the egress, possibly
>without being aware of the contents of RFC2753.
>
>I think it's OK to introduce a new architectural arrangement in this
>RSVP doc, given RFC2753 is specific to the way RSVP works.
>
>
>Bob
>
>
>
>At 12:28 05/11/2012, karagian@cs.utwente.nl wrote:
> >Hi Ken,
> >
> >Thank you very much!
> >We will try to catch Francois and discuss the last (in line) issue with him!
> >
> >Best regards,
> >Georgios
> >
> >________________________________________
 >From: ken carlberg [carlberg@g11.org.uk]
 >Sent: Monday 5 November 2012 13:23
 >To: Karagiannis, G. (EWI)
 >Cc: tsvwg@ietf.org; anuragb@cisco.com
 >Subject: Re: draft-ietf-tsvwg-rsvp-pcn-03
> >
> >Georgios,
> >
 > > > Georgios: We will try to explain the rationale of why we consider
 > > > that RSVP can only be used for the situation that the
 > > > decision point is collocated with the PCN-ingress-node. The main
 > > > reason of this is that in the case that the
 > > > decision point is collocated with the PCN-ingress-node, the
 > > > required signaling protocol used to complete a
 > > > feedback loop from egress to ingress can be an entirely on-path
 > > > protocol, like what RSVP is.
 > > > In the situation that the decision point is a centralized
 > > > node, then the required signaling protocol
 > > > can be a combination of an on-path and off-path protocol. This is
 > > > because the
 > > > decision point might not be located on the data path! So in this
 > > > case an additional signaling protocol will be
 > > > needed to be specified that covers the signaling between the
 > > > PCN-egress-node and the centralized node
 > > > and between PCN-ingress-node and the centralized node.
 > > > In PCN we decided to only focus on the specification of the
 > > > signaling protocol that completes the
 > > > feedback loop from PCN-egress-node to PCN-ingress-node and to
 > > > focus on the signaling protocol
 > > > used between the edge nodes and the centralized node.
 > > > This is also the reason of why PCN WG decided to only focus on
 > > > the situation that the decision point is
 > > > collocated with the PCN-ingress-node.
> >
> >Great, this is helpful, and this is the information that needs to be
> >in the draft.
> >
 > > >> 6) This comment is just for you to contemplate -- I'm not
 > > >> expecting any changes.
 > > >> I noticed that you have a fair number of SHOULD, and some SHOULD NOTs.
 > > >> And it seems a lot of this is a carry over from rfc-4860, so in
 > > >> a sense you are inheriting an approach that
 > > >> was agreed to from an earlier effort.  But I wonder in the back
 > > >> of my mind, what impact occurs if
 > > >> an implementor doesn't follow the SHOULD?  Does the design break
 > > >> in supporting PCN?
 > > >> Again, I want to stress that this isn't a show stopper, but I
 > > >> would appreciate it if you gave it some thought.
 > > >
 > > > Georgios: Yes, in several cases the design might break in supporting PCN.
 > > > This is also the reason of using SHOULD instead of MAY. Do you
 > > > want us to explain this in more detail in the draft?
> >
> >well, actually, I was more curious as to why a number of these cases
> >are SHOULD instead of MUST.  Again, the SHOULD's in your document
> >seem to be a carry-over from rfc-4860 (which set the precedent), so
> >its a bit unfair for you to explain what was done in an earlier
> >effort.  I just wanted to make sure you gave some thought to the
> >subject.  And if things will break if SHOULD is not followed by an
> >implementer/configuration, then maybe you should be more stringent
> >and change things to MUST.  Perhaps a brief private conversation
> >with Francois Le Faucheur will be helpful.
> >
> >cheers,
> >
> >-ken
>
>________________________________________________________________
>Bob Briscoe,                                BT Innovate & Design

________________________________________________________________
Bob Briscoe,                                BT Innovate & Design