Re: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt

"Paul W. Howard" <phoward@juniper.net> Wed, 14 January 2004 19:27 UTC

Message-ID: <40059772.60607@juniper.net>
Date: Wed, 14 Jan 2004 14:24:34 -0500
From: "Paul W. Howard" <phoward@juniper.net>
To: Sharon Galtzur <sharon@AXERRA.com>
CC: "W. Mark Townsley" <townsley@cisco.com>, l2tpext@ietf.org
Subject: Re: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt
References: <AF5018AC03D1D411ABB70002A509132601085FEA@TLV1>
In-Reply-To: <AF5018AC03D1D411ABB70002A509132601085FEA@TLV1>

Sharon,

Thanx for your responses. Please see further comments inline ([pwh]).

Thanx,

Paul Howard

Sharon Galtzur wrote:

>Hello all,
>See my response in the mail body.
>
>Sharon Galtzur
>
>  
>
>>-----Original Message-----
>>From: Paul W. Howard [mailto:phoward@juniper.net]
>>Sent: Friday, January 09, 2004 4:59 PM
>>To: W. Mark Townsley
>>Cc: l2tpext@ietf.org
>>Subject: Re: [L2tpext] Please comment on draft-galtzur-l2tpext-gr-01.txt
>>
>>
>>I have a few questions/concerns in regards to this draft:
>>
>>1) This proposal presents what appears to be an entirely different
>>mechanism than that proposed in the v2 draft
>>(draft-vipin-l2tpext-failover-02.txt). It would seem to be important
>>that we attempt to have a single mechanism for both v2 and v3 or to at
>>least minimize the differences to those that are unavoidable.
>>    
>>
>
>In LDP there is exactly the same occurrence - 
>RFC 3478: Graceful Restart Mechanism for Label Distribution Protocol 
>RFC 3479: Fault Tolerance for the Label Distribution Protocol (LDP) 
>
>This draft follows the model presented in RFC 3478 with adaptation to L2TPv3
>  
>
[pwh] My understanding is that 3478 and 3479 present significantly
different approaches to the problem of dealing with a failed endpoint:
3478 recovers by learning from the non-failed endpoint (graceful
restart), whereas 3479 recovers by state replication on the failed
endpoint. Both your draft and Vipin's would seem to fall into the
category of graceful restart (i.e., learning from the non-failed
endpoint). Both of the l2tp drafts deal with trying to recover key
pieces of the control plane that are not normally obtainable from the
forwarding database: the sequence numbers of the control connection and
the state (established or non-established) of each of the sessions. It
would thus seem reasonable to resolve the differences between these
drafts and produce as common an approach as possible to graceful restart.

[pwh] btw - I misquoted the name of the other draft in my earlier mail;
the current version is draft-ietf-l2tpext-failover-02.txt

>  
>
>>2) I don't understand the basic reset of the sequence number space for
>>the control connection. Is this setting up a replacement tunnel or
>>attempting to reset the sequence numbers of the extant tunnel?
>>    
>>
>
>The tunnel is restarted almost exactly as in
>draft-ietf-l2tpext-l2tp-base-11.txt.
>One technical difference is the added AVP to advertise the graceful restart
>capability.
>Another difference is that, while restarting the sessions, certain
>information needs to endure the restart (i.e. Session ID, Cookie value, etc.)
>to allow the data path to continue without interruption.
>
[pwh] I'm not entirely clear on your answer. There would seem to be
two basic approaches to recovery of the control connection: one is to
establish an entirely new control connection and eventually transfer the
sessions from the old control connection to the new control connection;
the other is to reset the extant control connection. It's not clear to
me from your answer or the draft which direction you are proposing.
So, when I attempt to restart a control connection by sending an SCCRQ
with the GR AVP and a non-zero recovery time, what is the value of the
Assigned Control Connection ID AVP? Is it the same as the one from the
initial establishment (implies reset of the extant control connection)
or a new value (implies a recovery control connection and transfer of
the sessions from the old to the new)?
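
To make the two readings concrete, here is a minimal sketch of how the
non-failed endpoint might tell them apart when the restart SCCRQ arrives.
It only illustrates the question above and is not taken from the draft;
the names RecoveryMode, classify_restart and known_peer_ccids are
hypothetical.

```python
from enum import Enum


class RecoveryMode(Enum):
    # The peer reused its original ID: reset the extant control connection.
    RESET_EXTANT = "reset"
    # The peer chose a new ID: a recovery control connection is being set
    # up and the sessions must later be transferred to it.
    REPLACEMENT = "replacement"


def classify_restart(known_peer_ccids, assigned_ccid):
    """Classify a restart SCCRQ by its Assigned Control Connection ID.

    known_peer_ccids: IDs the peer assigned during earlier establishments
                      (hypothetical local bookkeeping, not from the draft).
    assigned_ccid:    value of the Assigned Control Connection ID AVP
                      carried in the restart SCCRQ.
    """
    if assigned_ccid in known_peer_ccids:
        return RecoveryMode.RESET_EXTANT
    return RecoveryMode.REPLACEMENT


if __name__ == "__main__":
    known = {0x1001}
    print(classify_restart(known, 0x1001).value)
    print(classify_restart(known, 0x2002).value)
```

Either answer is workable; the point is that the draft needs to say
which one applies so that both endpoints classify the SCCRQ the same way.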

>  
>
>>3) It would appear that there is a linear relationship between the
>>number of packets required for the fail over and the number of
>>established sessions. This would seem to present issues with scalability.
>>    
>>
>
>The behavior of the graceful restart proposal is no worse than a normal start
>of L2TPv3 (since the mechanisms of graceful and non-graceful restart are
>practically the same).
>If there is a scalability issue in this proposal, it is inherent in the whole
>L2TP model and not restricted to the proposed graceful restart.
>
>  
>
[pwh] IMHO, there is a scalability problem that needs to be addressed.
When a failover occurs, the failed endpoint is in the process of trying
to recover its state not only for L2TP, but for all other applications
running in the failed endpoint (e.g. IP, BGP, PPP, AAA). It is a
time of maximum stress on the failed endpoint coupled with restrictive
time limits (e.g. BGP neighbor notifications). Failover processes
need to be designed for minimal resource consumption. This draft
proposes 'n' packets to be sent from the failed endpoint to the
non-failed endpoint, where 'n' is the number of sessions between the two
endpoints. This results in tens of thousands, if not hundreds of
thousands, of packets to complete the recovery. To avoid this issue, it
would seem that session recovery needs to be able to pack multiple
session recoveries per packet, thus reducing the count to n/m packets,
where m is the number of recoveries per packet. As the primary
recovery requirement for the session is to confirm that both endpoints
believe the session to be in established state, this could reduce to
sending the session IDs of the established sessions from the failed
endpoint to the non-failed endpoint, followed by the non-failed endpoint
sending back the session IDs of the established sessions based on the
intersection of what it received from the failed endpoint and its local
database. Assuming, say, 256 session IDs packed in a single packet (1K
of data), we end up with 2 * n / 256 packets to recover. Thus
with 100,000 sessions only 782 packets are required to recover instead
of 100,000+ packets.
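
The arithmetic and the intersection step above can be sketched as
follows. This is only an illustration under the stated assumption of 256
session IDs per packet; the function names are hypothetical, and nothing
here reflects an actual AVP or message format.

```python
import math


def recovery_packets(sessions, ids_per_packet):
    """Total packets for both directions when session IDs are packed."""
    per_direction = math.ceil(sessions / ids_per_packet)
    return 2 * per_direction


def pack_session_ids(session_ids, ids_per_packet):
    """Split the failed endpoint's established session IDs into packets."""
    return [
        session_ids[i:i + ids_per_packet]
        for i in range(0, len(session_ids), ids_per_packet)
    ]


def confirm_established(received_ids, local_established):
    """Reply of the non-failed endpoint: IDs that both sides hold as
    established, i.e. the intersection with its local database."""
    return sorted(set(received_ids) & set(local_established))


if __name__ == "__main__":
    # 100,000 sessions at 256 IDs per packet: 391 packets each way.
    print(recovery_packets(100_000, 256))  # 782

    # Session 4 is unknown to the non-failed endpoint, so it is dropped.
    print(confirm_established([1, 2, 3, 4], {2, 3, 5}))  # [2, 3]
```

Rounding up per direction is what gives 782 rather than 781.25: each
direction needs 391 packets, the last of which is only partly filled.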

>  
>
>>Thanx,
>>
>>Paul Howard
>>
>>W. Mark Townsley wrote:
>>
>>    
>>
>>>I would like to ask that interested parties review the following draft
>>>"Layer Two Tunneling Protocol (Version 3) Graceful Restart" and
>>>provide comments to the list. This draft was originally presented in
>>>Vienna, and updated based on comments during the meeting.
>>>
>>>http://www.ietf.org/internet-drafts/draft-galtzur-l2tpext-gr-01.txt
>>>
>>>The authors would like to make this a WG document.
>>>
>>>Thanks,
>>>
>>>- Mark
>>>
>>>
>>>_______________________________________________
>>>L2tpext mailing list
>>>L2tpext@ietf.org
>>>https://www1.ietf.org/mailman/listinfo/l2tpext
>>>
>>>      
>>>