8.  High Availability Support (CE Failover Support)

   The ForCES protocol provides mechanisms for CE redundancy and
   failover in order to support High Availability as per Reqs[].  FE
   redundancy and FE-to-FE interaction are currently out of scope of
   this draft.  There can be multiple redundant CEs and FEs in a ForCES
   NE.  However, at any time only one primary CE controls the FEs;
   there can be multiple secondary CEs.  The FE PL and the CE PL are
   aware of the primary and secondary CEs.  This information is
   configured in the FE PL and the CE PL during pre-association by the
   FEM and the CEM, respectively.  Only the primary CE sends Control
   messages to the FEs.  The FE may send its event reports and
   redirected packets to the primary CE only (Report Primary mode), or
   it may send them to both the primary and secondary CEs (Report All
   mode).  (The latter helps keep state synchronized between the CEs,
   although it does not guarantee synchronization.)  These HA modes
   are configured during the association setup phase but can be
   changed by the CE at any time during protocol operation.  A
   CE-to-CE synchronization protocol will be needed in most cases to
   support fast failover; however, this will not be defined by the
   ForCES protocol.
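
   The two reporting modes can be sketched as follows.  This is an
   illustrative sketch only, not part of the protocol: the names
   HAMode, report_targets and may_send_control are hypothetical, and
   CEs are represented by plain integer identifiers.

```python
from enum import Enum


class HAMode(Enum):
    # Hypothetical names for the two modes described in the text.
    REPORT_PRIMARY = "report-primary"  # reports go to the primary CE only
    REPORT_ALL = "report-all"          # reports go to primary and secondary CEs


def report_targets(mode, primary, secondaries):
    """Return the CE IDs that receive an event report or redirected packet."""
    if mode is HAMode.REPORT_ALL:
        return [primary, *secondaries]
    return [primary]


def may_send_control(ce_id, primary):
    """Only the primary CE sends Control messages to the FEs."""
    return ce_id == primary
```

   Because the mode is an ordinary value consulted on each report, a
   CE changing it during protocol operation takes effect from the next
   report onward.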

   When communication between the FE and the CE fails for a reason
   that is not FE related (that is, because of a CE or link failure),
   the TML on the FE notifies the FE PL of the failure.  The failure
   can also be detected using the HB messages between FEs and CEs.
   Either the FE PL sends a message (Event Report) to the secondary CEs
   to indicate the failure, or the CE PL detects it, and one of the
   secondary CEs takes over as the primary CE for the FE.  If the
   original primary CE comes alive during this phase and starts sending
   commands to the FE, the FE should ignore those messages and send an
   Event to all CEs indicating the change in its primary CE.  Thus the
   FE has only one primary CE at a time.
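
   A minimal sketch of the FE-side behavior described above is given
   here.  The class FEHaState, its method names, and the message
   strings are hypothetical and chosen for illustration; in
   particular, keeping the former primary CE in the secondary list
   after takeover is an assumption of this sketch, not something the
   protocol mandates.

```python
class FEHaState:
    """Hypothetical FE-side view of its CEs; not a ForCES-defined structure."""

    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)
        self.sent = []  # (destination CE, message) pairs, recorded for illustration

    def on_primary_unreachable(self):
        """TML trigger or missed heartbeats: report the failure to each secondary CE."""
        for ce in self.secondaries:
            self.sent.append((ce, "EventReport: primary CE down"))

    def on_takeover(self, new_primary):
        """A secondary CE takes over; the old primary is demoted (assumption)."""
        old = self.primary
        self.secondaries = [c for c in self.secondaries if c != new_primary]
        self.secondaries.append(old)
        self.primary = new_primary

    def on_command(self, sender):
        """Accept commands from the current primary CE only."""
        if sender == self.primary:
            return True
        # Stale primary: ignore the command and tell all CEs who the primary is.
        for ce in [self.primary, *self.secondaries]:
            self.sent.append((ce, f"Event: primary CE is {self.primary}"))
        return False
```

   Recording outgoing messages in a list rather than sending them
   keeps the sketch independent of any particular TML.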

   An explicit message (Config message - Move command) from the primary
   CE can also be used to change the primary CE for an FE during normal
   protocol operation.  In order to support fast failover, the FE
   establishes an association (setup message) and completes the
   capability exchange with the primary CE as well as with all the
   secondary CEs, in all scenarios and modes.
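
   The explicit change of primary CE might be handled along the
   following lines.  The function names and the dictionary layout are
   hypothetical; the check that the target CE has already completed
   capability exchange is an assumption of this sketch that follows
   from the FE associating with every CE in advance.

```python
def associate_all(ce_ids):
    """Set up an association and complete capability exchange with every CE."""
    return {ce: {"associated": True, "caps_exchanged": True} for ce in ce_ids}


def handle_move(primary, associations, sender, new_primary):
    """Return the primary CE in effect after a Config - Move request."""
    if sender != primary:
        return primary  # only the current primary CE may move the role
    if not associations.get(new_primary, {}).get("caps_exchanged"):
        return primary  # assumption: target must already be fully associated
    return new_primary
```

   Since the associations already exist, the move itself reduces to
   updating which CE the FE treats as primary.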

   The two scenarios (Report All and Report Primary) are illustrated
   in the figures below.


                        FE                   CE Primary        CE Secondary
                        |                       |                    |
                        | Asso Estb,Caps exchg  |                    |
                      1 |<--------------------->|                    |
                        |                       |                    |
                        |         Asso Estb,Caps|exchange            |
                      2 |<----------------------|------------------->|
                        |                       |                    |
                        |     All msgs          |                    |
                      3 |<--------------------->|                    |
                        |                       |                    |
                        |    packet redirection,|events, HBs         |
                      4 |-----------------------|------------------->|
                        |                       |                    |
                        |                   FAILURE                  |
                        |                                            |
                        |             Event Report (pri CE down)     |
                      5 |------------------------------------------->|
                        |                                            |
                        |                  All Msgs                  |
                      6 |------------------------------------------->|


               Figure 30: CE Failover for Report All mode


                        FE                   CE Primary        CE Secondary
                        |                       |                    |
                        |  Asso Estb,Caps exchg |                    |
                      1 |<--------------------->|                    |
                        |                       |                    |
                        |         Asso Estb,Caps|exchange            |
                      2 |<----------------------|------------------->|
                        |                       |                    |
                        |       All msgs        |                    |
                      3 |<--------------------->|                    |
                        |                       |                    |
                        |            (HeartBeats| only)              |
                      4 |-----------------------|------------------->|
                        |                       |                    |
                        |                   FAILURE                  |
                        |                                            |
                        |              Event Report (pri CE down)    |
                      5 |------------------------------------------->|
                        |                                            |
                        |                   All Msgs                 |
                      6 |------------------------------------------->|


             Figure 31: CE Failover for Report Primary Mode


8.1  Responsibilities for HA

   TML level - Transport level:
   1.  The TML controls logical connection availability and failover.
   2.  The TML also controls peer HA management.

   At this level, control of all lower layers, for example the
   transport level (such as IP addresses, MAC addresses, etc.), and
   handling of associated links going down are the role of the TML.

   PL level:
   All other functionality is the responsibility of the PL.  This
   includes configuring the HA behavior during setup, the CEIDs used to
   identify primary and secondary CEs, the protocol messages used to
   report CE failure (Event Report), the Heartbeat messages used to
   detect association failure, the messages used to change the primary
   CE (Config - move), and the other HA-related operations described
   before.

   To put the two together: if a path to a primary CE is down, the TML
   takes care of failing over to a backup path, if one is available.
   If the CE is totally unreachable, the PL is informed and takes the
   appropriate actions described before.
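
   The division of labor between the two layers can be sketched as
   follows.  The function tml_on_path_down and the representation of
   paths as a name-to-status dictionary are hypothetical and serve
   only to illustrate the escalation order.

```python
def tml_on_path_down(paths, failed_path):
    """TML handling of a failed path to the primary CE.

    `paths` maps a path name to True (up) or False (down).  Returns a
    pair (active_path, notify_pl): the backup path now in use, if any,
    and whether the PL must be informed that the CE is unreachable.
    """
    paths[failed_path] = False
    for name, up in paths.items():
        if up:
            return name, False  # TML fails over; the PL is not involved
    return None, True  # no path left: the PL takes the HA actions
```

   For example, with two paths the first failure is absorbed by the
   TML, and only the second failure is escalated to the PL.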