RE: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt

Hello Paul and all,
See my comments inline

Sharon Galtzur

-----Original Message-----
From: Paul W. Howard [mailto:phoward@juniper.net]
Sent: Wednesday, January 14, 2004 9:25 PM
To: Sharon Galtzur
Cc: W. Mark Townsley; l2tpext@ietf.org
Subject: Re: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt

Sharon,

Thanx for your responses. Please see further comments inline ([pwh]).

Thanx,

Paul Howard

Sharon Galtzur wrote:

Hello all,

See my response in the mail body.

Sharon Galtzur

-----Original Message-----
From: Paul W. Howard [mailto:phoward@juniper.net]
Sent: Friday, January 09, 2004 4:59 PM
To: W. Mark Townsley
Cc: l2tpext@ietf.org
Subject: Re: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt

I have a few questions/concerns in regards to this draft:

1) This proposal presents what appears to be an entirely different
mechanism than that proposed in the v2 draft
(draft-vipin-l2tpext-failover-02.txt). It would seem to be important
that we attempt to have a single mechanism for both v2 and v3, or at
least to minimize the differences to those that are unavoidable.

In LDP there is exactly the same occurrence:

RFC 3478: Graceful Restart Mechanism for Label Distribution Protocol
RFC 3479: Fault Tolerance for the Label Distribution Protocol (LDP)

This draft follows the model presented in RFC 3478, with adaptation to
L2TPv3.

[pwh] My understanding is that 3478 and 3479 present significantly different
approaches to the problem of dealing with a failed endpoint: 3478 recovers
by learning from the non-failed endpoint (graceful restart), whereas 3479
recovers by state replication on the failed endpoint. Both your draft and
Vipin's would seem to fall into the category of graceful restart (i.e.
learning from the non-failed endpoint). Both of the l2tp drafts deal with
trying to recover key pieces of the control plane that are not normally
obtainable from the forwarding database: the sequence numbers of the
control connection and the state (established or non-established) of each of
the sessions. It would thus seem reasonable to resolve the differences
between these drafts and produce as common an approach as possible to
graceful restart.

[pwh] btw - I misquoted the current version of the other draft - it's
draft-ietf-l2tpext-failover-02.txt

2) I don't understand the basic reset of the sequence number space for
the control connection. Is this setting up a replacement tunnel or
attempting to reset the sequence numbers of the extant tunnel?

The tunnel is restarted almost exactly as in
draft-ietf-l2tpext-l2tp-base-11.txt.
One technical difference is the added AVP to advertise the graceful restart
capability.
Another difference is that, while restarting the sessions, certain
information needs to endure the restart (i.e. Session ID, Cookie value, etc.)
so that the data path can continue without interruption.

[pwh] I'm not entirely clear on your answer. There would seem to be two
basic approaches to recovery of the control connection: one is to establish
an entirely new control connection and eventually transfer the sessions from
the old control connection to the new control connection; the other is to
reset the extant control connection. It's not clear to me from your answer
or the draft which direction you are proposing. So, when I attempt to
restart a control connection by sending an SCCRQ with the GR AVP and a
non-zero recovery time, what is the value of the Assigned Control Connection
ID AVP? Is it the same as the one from the initial establishment (implies
reset of the extant control connection) or a new value (implies a recovery
control connection and transfer of the sessions from the old to the new)?

[Sharon Galtzur] I agree this is indeed missing from the draft.
The pros and cons of each approach are as follows:
1. If we remember the CC ID, we gain a bit more confidence when looking
   up the CC in the LCCE that wasn't restarted, because we have more
   information to compare against when recovering the CC. The drawback is
   that the restarted LCCE must have the means to remember this information
   (since this information is NOT a part of the data path and hence is not
   transferred to the data path parts of the system).
2. If we don't remember the CC ID, we lose some confidence when looking
   up the CC (but really no more than when doing a normal restart). The gain
   is that there is no need for this information to be persistent.

Currently I would think that not remembering the CC ID from restart to
restart is preferred, because it is less restrictive and is closer to the
non-graceful restart requirements.
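
The trade-off between the two options can be sketched in Python as below.
This is only an illustration under stated assumptions: the
`ControlConnection` record, the `find_cc` helper, and matching on a peer
address are hypothetical and are not taken from the draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ControlConnection:
    peer: str    # hypothetical peer identity, e.g. an IP address
    cc_id: int   # control connection ID previously assigned by that peer


def find_cc(table: List[ControlConnection], peer: str,
            remembered_cc_id: Optional[int] = None) -> Optional[ControlConnection]:
    """Find the pre-restart control connection on the LCCE that did not restart.

    Option 1: the restarted peer supplies its remembered CC ID, so the match
    is checked against both the peer identity and the ID.
    Option 2: no CC ID is remembered, so only the peer identity is compared.
    """
    candidates = [cc for cc in table if cc.peer == peer]
    if remembered_cc_id is not None:
        candidates = [cc for cc in candidates if cc.cc_id == remembered_cc_id]
    # An ambiguous or empty match is treated as "not found".
    return candidates[0] if len(candidates) == 1 else None
```

In this sketch, option 2 cannot distinguish two control connections to the
same peer, which is one way to read the "loss of confidence" in item 2.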

I would like to hear from others what they think before I update the draft.

3) It would appear that there is a linear relationship between the
number of packets required for the fail over and the number of
established sessions. This would seem to present issues with scalability.

The behavior of the graceful restart proposal is no worse than a normal start
of L2TPv3 (since the mechanisms of the graceful and non-graceful restart are
practically the same).
If there is a scalability issue in this proposal, it is inherent to the whole
L2TP model and not restricted to the proposed graceful restart.

[pwh] IMHO, there is a scalability problem that needs to be addressed. When
a failover occurs, the failed endpoint is in the process of trying to
recover its state not only for L2TP, but for all other applications running
in the failed endpoint (e.g. IP, BGP, PPP, AAA, etc.). It is a time of
maximum stress on the failed endpoint coupled with restrictive time limits
(e.g. BGP neighbor notifications). Failover processes need to be designed
with minimal resource consumption. This draft proposes 'n' packets to be
sent from the failed endpoint to the non-failed endpoint, where 'n' is the
number of sessions between the 2 endpoints. This results in tens of
thousands if not hundreds of thousands of packets to complete the recovery.
In order to avoid this issue, it would seem that the session recovery needs
to be able to pack multiple session recoveries per packet, thus reducing
to n/m packets where m is the number of recoveries per packet.
As the primary recovery requirement for the session is to confirm that both
endpoints believe the session to be in established state, this could reduce
to sending the session IDs of the established sessions from the failed
endpoint to the non-failed endpoint, followed by the non-failed endpoint
sending back the session IDs of the established sessions based on the
intersection of what it received from the failed endpoint and its local
database. Assuming say 256 session IDs packed in a single packet (1K of
data), then we end up with 2 * n / 256 packets to recover. Thus with
100,000 sessions only 782 packets are required to recover instead of
100,000+ packets.
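
As a rough illustration of the arithmetic and the two-step exchange described
in the paragraph above, here is a minimal Python sketch. The names
`packets_needed`, `chunk` and `recover_sessions`, and the use of plain
integer sets for session IDs, are assumptions made for illustration; neither
draft defines such a message format. The 256-IDs-per-packet figure is the
one assumed in the text.

```python
import math


def packets_needed(num_sessions, ids_per_packet=256):
    """Packets for one direction when session IDs are packed per packet."""
    return math.ceil(num_sessions / ids_per_packet)


def chunk(ids, ids_per_packet=256):
    """Split session IDs into per-packet batches (hypothetical framing)."""
    ids = sorted(ids)
    return [ids[i:i + ids_per_packet]
            for i in range(0, len(ids), ids_per_packet)]


def recover_sessions(failed_ids, surviving_ids):
    """Session IDs that both endpoints believe to be established."""
    return set(failed_ids) & set(surviving_ids)


n = 100_000
# One pass of packed IDs in each direction: failed -> non-failed, then back.
round_trip = 2 * packets_needed(n)
print(round_trip)  # 782, matching the figure quoted above
```

The reply direction can need fewer packets than the request if the
intersection is smaller than the list sent by the failed endpoint, so
2 * n / 256 is an upper bound in this sketch.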

[Sharon Galtzur] The same can be said for the current L2TP non-graceful
restart. It is probably even more pressing there, since there is no data flow
until the connections are restarted. The big advantage I see in this approach
is that it is so similar to the normal L2TP model. This will allow
implementers of L2TP stacks to implement one model of establishing sessions
instead of 2.
I would suggest that if there is a real concern about connecting 100k
connections between 2 LCCEs, we should re-examine the possibility of
establishing multiple session connections using 1 packet instead of
requiring 1 packet per session in the non-graceful restart, and then adapt it
to the graceful restart mode.

Thanx,

Paul Howard

W. Mark Townsley wrote:

I would like to ask that interested parties review the following draft
"Layer Two Tunneling Protocol (Version 3) Graceful Restart" and provide
comments to the list. This draft was originally presented in Vienna, and
updated based on comments during the meeting.

http://www.ietf.org/internet-drafts/draft-galtzur-l2tpext-gr-01.txt

The authors would like to make this a WG document.

Thanks,

- Mark

_______________________________________________
L2tpext mailing list
L2tpext@ietf.org
https://www1.ietf.org/mailman/listinfo/l2tpext
