Re: Updated draft response to OIF on 1:n protection
"Adrian Farrel" <adrian@olddog.co.uk> Mon, 12 June 2006 17:04 UTC
Message-ID: <00d601c68e41$80532970$0a23fea9@your029b8cecfe>
Reply-To: Adrian Farrel <adrian@olddog.co.uk>
From: Adrian Farrel <adrian@olddog.co.uk>
To: Lou Berger <lberger@labn.net>
Cc: ccamp@ops.ietf.org
References: <005b01c68b3b$191d9be0$c2849ed9@your029b8cecfe> <7.0.1.0.2.20060612081409.06ee4e28@labn.net>
Subject: Re: Updated draft response to OIF on 1:n protection
Date: Mon, 12 Jun 2006 17:57:37 +0100
Organization: Old Dog Consulting
Sender: owner-ccamp@ops.ietf.org
Thanks Lou,

Yes, you are right. I will fold that into the reply.

Adrian

----- Original Message -----
From: "Lou Berger" <lberger@labn.net>
To: "Adrian Farrel" <adrian@olddog.co.uk>
Cc: <ccamp@ops.ietf.org>
Sent: Monday, June 12, 2006 1:42 PM
Subject: Re: Updated draft response to OIF on 1:n protection

> Adrian,
> I like what you have so far, but there is one additional item to add:
>
> Bulk notifications *are* supported in the recovery drafts using the Notify message. Notify messages could be used to provide a "single signaling interaction" and "a bulk notification ... procedure". Protection/recovery could then be provided on a per LSP basis (per segment protection) or use the hierarchy approach you mentioned below. (Bulk) reversion is also supported via Notify messages, see section 12 of draft-ietf-ccamp-gmpls-recovery-e2e-signaling. Note that these procedures also apply to segment recovery.
>
> Lou
>
> At 03:43 PM 6/8/2006, Adrian Farrel wrote:
>
>>Hi,
>>
>>Looking at the latest input from the OIF, I think we have to perform some type of meld on the two incoming communications and respond to them together.
>>
>>Here is my attempt building on what we had before. Please comment on or off the list.
>>
>>Thanks,
>>
>>Adrian
>>
>>==
>>
>>Dear Jim,
>>
>>Thank you for your communication to CCAMP on the use of GMPLS to provide 1:n protection at the OIF UNI and the OIF E-NNI dated 20th May 2006 and for your updates received on 2nd June 2006.
>>
>>We are grateful for this opportunity to comment, but we note that this type of communication requesting clarifications is better suited to a mailing list discussion than to official communications that, by their nature, have a slow turn-around. This opinion is considerably reinforced by the process we have gone through here with a revision to the OIF communication being generated while CCAMP is trying to draft its response.
>>It seems to us that if official lines of communication are to be followed then they have to be adhered to, but if iterative discussions are needed (as has proved to be the case here) then it would be possible to respond far more dynamically using mailing lists.
>>
>>The appropriate place for discussions of GMPLS protocols is the CCAMP working group mailing list. Details of how to subscribe to the mailing list can be found at http://www.ietf.org/html.charters/ccamp-charter.html
>>
>>Anyway, the CCAMP chairs are keen to ensure smooth communications with the OIF and have consulted as widely as they could in the short time available in order to update the response that we had already drafted to your original enquiries.
>>
>>We hope that our answers are satisfactory.
>>
>>In the remainder of our response we have quoted extracts from your two communications as:
>>
>>>1> For a quote from the first communication dated 20th May 2006
>>
>>>2> For a quote from your second communication dated 2nd June 2006
>>
>>>1> Future updates to OIF UNI and E-NNI signaling may include a feature for 1:N connection protection. The attached document presents requirements for these features. Recently a review was completed of RFCs 4426, 4427 and 4428 and IETF drafts that may be able to implement this function (including draft-ietf-ccamp-gmpls-recovery-e2e-signaling-03 and draft-ietf-ccamp-gmpls-segment-recovery-02). It appears that the abstract messages from RFC 4426 provide much of this functionality, however several questions resulted from this review. OIF would appreciate review and comments from IETF CCAMP on the following items.
>>>1>
>>>1> 1.) OIF would appreciate knowing if there are protocol features in other IETF documents relevant to 1:N protection.
>>
>>We would like to suggest that, in order to utilize advanced features of the GMPLS control plane protocol, engineers should be familiar with the full set of GMPLS RFCs and Internet-Drafts. These are listed on the CCAMP charter page and can be downloaded free of charge by clicking on the links.
>>
>>Although not all of this work is directly related to protection and restoration, it should be noted that any protocol aspect present for a working path may also be required for a protection path. Protocol engineers must, therefore, be familiar with the details of the protocol before attempting to provide advanced functions like protection.
>>
>>>2> 1) OIF would appreciate CCAMP's guidance as to whether CCAMP has defined standards for any similar form of restoration, i.e., one that protects a group of LSPs at once over a local span, by shifting these LSPs from their original link within the span over to a backup link. It should be noted that
>>>2>
>>>2> - the backup link may be a different type than the original (e.g., OC192 rather than OC48), so that GMPLS signaling rather than underlying SONET/SDH link protection is used to perform the switchover; and
>>>2>
>>>2> - it is intended that the affected LSPs be shifted using a single signaling interaction rather than separate interactions per individual LSP in order to reduce the signaling overhead required.
>>>2>
>>>2> We believe that some of the existing work, especially for segment recovery, may be helpful, but may not meet the exact requirements of the service that has been proposed within OIF. Any pointers to existing drafts or RFCs, however, would be greatly appreciated.
>>
>>There are two principal ways in which the objectives you cite can be met, and both of these techniques are, to our certain knowledge, implemented and successfully deployed.
>>
>>a) Link-level protection.
>> This technique relies on the protection of the underlying link outside the scope of GMPLS. Thus, the TE link over which one or more LSPs are provisioned is actually supported by more than one underlying link. When one link fails, the traffic that it was carrying (the LSPs) is transferred to another link. This type of protection is transparent to GMPLS although it could leverage GMPLS fault notification procedures.
>>
>> You can learn some more about link-level protection by reading RFC3945 and RFC4426 (where it is referred to as Span Protection).
>>
>> Please note that the links used in this mode do not need to be of the same type. This is not link bundling.
>>
>>b) LSP Hierarchies. This technique relies on nesting multiple LSPs within another LSP. Most familiar in packet technologies, this process is also applicable to non-packet technologies where appropriate adaptation is available.
>>
>> By nesting multiple LSPs within another LSP, it is possible to reroute them all simply by rerouting the nesting LSP. Thus any protection scheme that can be applied to the nesting LSP can be applied to the nested LSPs in a single stage. Such procedures are, therefore, fully available for GMPLS control.
>>
>> You can read more about LSP hierarchies in RFC4206.
>>
>>Excellent though the procedures documented in draft-ietf-ccamp-gmpls-segment-recovery are, we are unsure as to the "exact requirements of the service that has been proposed within OIF" and so cannot be sure which procedures to advise for the problem as you have described it.
>>
>>>2> 2) Reviewing some of the existing RFC text, we note that RFC 4426 section 2.5.2 states "it MAY be possible for the LSPs on the working link to be mapped to the protection link without re-signaling each individual LSP" and "it MAY be possible to change the component links without needing to re-signal each individual LSP".
>>>2> This text appears to refer to the use of SONET/SDH link protection in such a way that the labels for each LSP remain the same. Does this imply, however, that an action that changes the local labels for the affected LSPs then requires re-signaling of each individual LSP, or is there a "bulk" mechanism to change labels for a group of LSPs simultaneously?
>>
>>Your question is confusing in the light of the referenced section. The section describes the messages required to achieve span protection. Clearly, if a span is protected, then all LSPs carried over that span may be transparently protected. This is how normal link protection operates and there is nothing clever going on.
>>
>>Obviously (hopefully this is obvious) if you change the label in use on a link for a particular LSP then the NEs at each end of the link need to know that information since both the sender and the receiver need to use the correct label. This applies for each LSP whose label you change. The accepted mechanism in the control plane for exchanging labels is the signaling protocol, so it follows that, if you wish to change the label in use for an LSP on a link, you must engage signaling.
>>
>>You should observe that an NE may change the label in use on a link at any time using the RSVP-TE protocol. All that is required (assuming a unidirectional LSP) is a trigger Resv message carrying a new label. Considerations of the impact to user traffic are left as an exercise for the reader.
>>
>>It is unclear how the "bulk" mechanism you propose could operate unless it was well-known that all labels are going to change in the same way. So perhaps you are suggesting that a single signaling message might itemise all of the LSPs and show each new label. If this is really a significant issue (i.e., you feel it is absolutely imperative to reduce the number of signaling messages) then you should consider RSVP message bundling.
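As an editorial aside for readers unfamiliar with bundling: the following minimal Python sketch illustrates the idea behind RSVP message bundling (RFC 2961). Each LSP still gets its own per-LSP message (here standing in for a trigger Resv carrying a new label), but many such messages travel in one bundle, reducing the count of messages on the wire. All class and function names below (`ResvMessage`, `BundleMessage`, `bundle_label_changes`) are invented for illustration and are not protocol-defined elements.

```python
# Illustrative sketch only -- not a real RSVP-TE implementation.
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ResvMessage:
    """Stand-in for a trigger Resv message carrying a new label for one LSP."""
    lsp_id: int
    new_label: int


@dataclass
class BundleMessage:
    """Stand-in for an RSVP Bundle message wrapping several sub-messages."""
    submessages: List[ResvMessage]


def bundle_label_changes(changes: Dict[int, int]) -> BundleMessage:
    """Itemise every LSP and its new label: one sub-message per LSP,
    but a single message on the wire."""
    return BundleMessage(
        [ResvMessage(lsp, label) for lsp, label in sorted(changes.items())]
    )


bundle = bundle_label_changes({101: 17, 102: 23, 103: 42})
print(len(bundle.submessages))  # 3 per-LSP label changes in one bundle
```

Note the design point this makes concrete: bundling reduces message overhead, but it does not remove the need to signal each LSP's new label individually, which matches the answer above.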
>>
>>>2> 3) RFC 4426 describes the sending of the Failure Indication Message upon detection of failure by a slave device. It is our belief that the same mechanism could also be used when the slave device is triggered to send an indication due to management system intervention (cases are mentioned in RFC 4427 but not in 4426), and we would like to know if CCAMP concurs with this.
>>>2> An example of where this might occur is where the master and slave devices are in different management domains.
>>
>>As you correctly observe, RFC4427 section 4.13 describes exactly this case where management plane intervention causes a Failure Indication, and it is useful for forced or controlled switch-over.
>>
>>You should note that RFC4426 section 2.5.1 says of the Failure Indication message...
>> This message is sent from the slave to the master to indicate the identities of one or more failed working links. This message MAY not be necessary when the transport plane technology itself provides for such a notification.
>>
>>It could also be the case that the message MAY not be necessary in the case where the failure indication is conveyed to the master node by the management plane. That is to say, there is no specific requirement (in the case of management plane intervention) for the intervention to be at the slave, causing a Failure Indication message to be sent to the master - the management plane intervention could consist of a notification sent to both the slave and the master from the management plane.
>>
>>The absence of this discussion within the GMPLS RFCs owes much to the fact that they are largely control plane specifications with some notes about the management plane for additional helpfulness.
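As an informal illustration of the point that a single Failure Indication can identify more than one failed working link (per the RFC 4426 section 2.5.1 text quoted above), here is a small Python sketch. RFC 4426 defines only abstract messages, not an encoding, so the class and field names here are assumptions made purely for illustration.

```python
# Illustrative sketch only: an abstract Failure Indication message in the
# spirit of RFC 4426 section 2.5.1, sent from the slave to the master and
# able to carry the identities of one or more failed working links.
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureIndication:
    slave_id: str
    # One message may report several failed working links at once.
    failed_working_links: List[str] = field(default_factory=list)


# The trigger could be fault detection by the slave or, as discussed
# above, a management plane intervention at the slave.
msg = FailureIndication(slave_id="node-B",
                        failed_working_links=["link-i1", "link-i2"])
print(len(msg.failed_working_links))  # 2 failures reported in one message
```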
>>
>>Your final example about the use of this technique where the master and slave are in different management domains is interesting, but the use of a control plane means that you should consider the control plane domains, not just the management domains.
>>
>>>1> 4.) A goal of the 1:N protection is to use a bulk notification and recovery procedure, based on RFC 4427 section 4.15. However, that RFC states the corresponding recovery switching actions are performed at the LSP level. It would be useful to know if bulk processing could be applied to recovery of individual connection segments on the failed span, not entire LSPs.
>>
>>>2> 4) RFC 4427, section 4.15 discusses bulk recovery for a failed span, and suggests that the recovery switching message to recovered LSP ratio may be 1 or greater. OIF would like to know if it is possible to define procedures such that the ratio is much less than 1, i.e., a message that causes bulk recovery actions on a number of LSPs.
>>
>>We believe that you have missed the point of section 4.15 of RFC4427. This section is describing the case where all or only some of the LSPs carried on a span are protected by a single recovery message exchange (full or partial span protection). In the case of partial span protection it is possible that not all LSPs on the span will be protected. Thus, the discussion of message to LSP ratios refers to the number of recovery messages needed to protect the LSPs on a span.
>>
>>The expression of the ratios is probably unclear, but the subsequent text explains the situation.
>>
>>Let us assume that there are S LSPs on the span, and s LSPs protected by a protection message. Consider the ratio S/s.
>>
>>If S/s = 1, one message has been used to protect all LSPs on the span. (Full recovery)
>>
>>If S/s > 1, more than one message is used to protect all of the LSPs on the span OR not all LSPs on the span are protected.
>>(Partial recovery)
>>
>>Clearly a ratio of less than one would be particularly odd!
>>
>>It should be obvious from wider reading of the RFCs (4426, 4427, and 3473) that the whole point of the Failure Indication is to be able to report on more than one LSP failure at a time.
>>
>>>2> 5) RFC 4426 defines a "master" and "slave" role for dedicated 1+1 and 1:1 span protection and a "source" and "destination" role for control of end-to-end restoration and for reversion. We believe that "source" and "destination" mean the initiator and receiver of the LSP (as opposed to the source and destination of data in-band).
>>
>>The terms "source" and "destination" are standard.
>>
>>For unidirectional LSPs, the "source" is the source of data on the LSP, also known as the ingress. Where a control plane is used, signaling progresses from the source (also known as the head-end).
>>
>>Similarly, the "destination" is the destination of data on the LSP, also known as the egress. Where a control plane is used, signaling progresses from the source to the destination (also known as the tail-end).
>>
>>By common convention, for bidirectional LSPs set up by the control plane, the "source" remains the signaling source (ingress) and the "destination" is the signaling destination (egress). Traffic flowing in the reverse direction is referred to as reverse direction traffic and flows from destination to source.
>>
>>Very probably there is an ITU-T architectural term for these end points of LSPs.
>>
>>Note that RFC 4426 is very careful to state:
>> The end-to-end recovery models discussed in this document apply to segment protection where the source and destination refer to the protected segment rather than the entire LSP.
>>
>>Should this still be unclear to you, RFC4426 section 1 states
>> Consider the control plane message flow during the establishment of an LSP.
>> This message flow proceeds from an initiating (or source) node to a terminating (or destination) node, via a sequence of intermediate nodes. A node along the LSP is said to be "upstream" from another node if the former occurs first in the sequence. The latter node is said to be "downstream" from the former node. That is, an "upstream" node is closer to the initiating node than a node further "downstream". Unless otherwise stated, all references to "upstream" and "downstream" are in terms of the control plane message flow.
>>
>>The terms "master" and "slave" are introduced to describe the trigger points for protection activity and are defined clearly in section 2.3 of RFC4426.
>> Consider two adjacent nodes, A and B. Under 1:1 protection, a dedicated link j between A and B is pre-assigned to protect working link i. Link j may be carrying (pre-emptable) Extra Traffic. A failure affecting link i results in the corresponding LSP(s) being restored to link j. Extra Traffic being routed over link j may need to be pre-empted to accommodate the LSPs that have to be restored.
>>
>> Once a fault is isolated/localized, the affected LSP(s) must be moved to the protection link. The process of moving an LSP from a failed (working) link to a protection link must be initiated by one of the nodes, A or B. This node is referred to as the "master". The other node is called the "slave". The determination of the master and the slave may be based on configured information or protocol specific requirements.
>>
>>Thus, the "master" is responsible for initiating the switchover, and the slave is responsible for keeping up with the state changes.
>>
>>>1> Further, it would be helpful to understand why the actions are performed by source and destination nodes rather than master and slave nodes. It may be appropriate to reuse the master/slave roles in the reversion process just as is done in the switchover process.
>>
>>>2> We are not clear on the rationale for when control plane roles are based on master/slave vs. source/destination: it appears that local span actions are controlled using master/slave while remote actions are controlled using source/destination, however the reasoning for control of reversion is less clear to us. Any clarification of the rationale for using master/slave vs. source/destination control would be appreciated.
>>
>>As explained by the definitions of the terms, there is a distinction between the node that invokes a switchover process (the master) and a node that performs the process. For example, a Bridge and Switch Request message is sent by the source node after it has bridged traffic back to both working and protection links simply because the source node has performed the bridging and is the only node that can know this fact.
>>
>>In other words, whether the source is master or slave depends on the protection scheme in use and the nature of the operation. It should be a simple matter when considering a protection scheme and the necessary protocol exchanges and switchover actions to determine which of the source and destination must play the master or slave role.
>>
>>>1> In addition, RFC 4426 does not include an abstract message similar to the Failure Indication Message to request the beginning of the reversion procedure. It may be beneficial to include a message from the slave device to initiate reversion, just as there is a Failure Indication Message to initiate switchover. (RFC 4426 states that the Failure Indication Message may not be needed when the transport plane technology itself provides such a notification. The same may apply when a failure is cleared; however, there should still be an optional message to trigger the reversion process.)
>>
>>>2> 6) We believe that it may be useful in some cases of reversion to allow a "slave" device to request reversion using an abstract message similar to the Failure Indication Message. An example case is (again) when the "master" and "slave" devices are in different management domains, such that reversion is initiated from the management domain of the "slave" device. We request CCAMP comment on this suggestion.
>>
>>Reversion is described as an administrative procedure in RFC4426 and RFC4427 quite deliberately. In our view it should not be a rapid response to a specific situation triggered through the control plane by the 'master', but should be a considered operation under the control of administrative policy. The trigger is, therefore, outside the scope of the control plane. This discussion can be seen in section 4.13 of RFC4427.
>>
>>We believe that your suggestion does not change this view, but that you are proposing that the control plane be used as a transport for a management plane request. You are suggesting that a management station in the management domain that contains the slave sends the request to the slave; the slave would then deliver the request through the control plane to the master. In the absence of any specific control plane requirement for this message, we believe that the correct architectural approach is for management plane messages to be delivered in the management plane. Thus, if there is a need for management plane coordination between separate management plane domains, this should be arranged through an appropriate management plane peering point where the correct policies can be applied.
>>
>>We hope this answers your questions, and we would be happy to enter into further dialog on these topics.
>>
>>In conclusion, it may be helpful to the OIF to know the status of two CCAMP drafts related to recovery.
>>draft-ietf-ccamp-gmpls-recovery-e2e-signaling-03 and draft-ietf-ccamp-gmpls-segment-recovery-02 both completed CCAMP working group last call in early 2005. Since then they have been implemented and tested. The drafts are stable and complete, and are queued in the IETF process waiting to become RFCs.
>>
>>Best regards,
>>Adrian Farrel and Deborah Brungard
>>CCAMP co-chairs