Re: quick failover in SCTP
Yoshifumi Nishida <nishida@sfc.wide.ad.jp> Tue, 12 October 2010 10:41 UTC
To: Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
Cc: tsvwg@ietf.org, Preethi Natarajan <prenatar@cisco.com>
Hello Michael,

Thanks for the clarification.

2010/10/9 Michael Tüxen <Michael.Tuexen@lurchi.franken.de>:
> On Oct 9, 2010, at 12:04 PM, Yoshifumi Nishida wrote:
>
>> Hello Michael,
>>
>> Thanks for your response.
>>
>> 2010/10/8 Michael Tüxen <Michael.Tuexen@lurchi.franken.de>:
>>> Hello Nishida-san,
>>>
>>> OK, I thought about this some time and think it would be good
>>> to specify a way for a quick failover which can be implemented
>>> at the sender only.
>>
>> Great!
>> Please check my comments below.
>>
>>> I would like to see some extensions to your suggestion:
>>> * Introduce a threshold (call it PFMR for now).
>>>   Then use
>>>   if (error counter > PMR)
>>>      the path is inactive.
>>>   if (error counter <= PMR) && (error counter > PFMR)
>>>      the path is potentially failed.
>>>   Using PFMR = 0 is what you suggest, PFMR=PMR gives
>>>   the old behavior.
>>
>> I thought of a similar thing and I agree with providing a way to disable PF.
>> I tend to agree with this idea, but one thing I'm not very sure about is how
>> PFMR != 0 && PFMR < PMR can be useful.
> I could imagine someone wanting to call a path potentially failed
> after 2 consecutive timer based retransmissions instead of 1.
> Just being a bit more conservative. This might help deploying
> such a feature in SIGTRAN networks where CMT is not deployed.
> For CMT traffic, PFMR==0 is the right choice, I guess.
> But I think PF is also very helpful in non-CMT scenarios.

OK. I think it would be good to have a tunable parameter for users.
I agree with this.

>> If we just want a switch to disable PF, having USE_PF parameter might be enough?
>> Maybe Preethi has an opinion on this?
>>
>>> * Specify that you start sending HBs when the path is
>>>   PF. Explicitly allow PFHB.interval=0, which
>>>   I think is a good choice. Maybe we can just remove PFHB.interval.
>>
>> Hmm. Sorry. I might not quite follow this point.
>> Does PFHB.interval=0 mean suppress sending HB during PF state?
> No. I mean just to send them every RTO.
> Having a specific interval allows this by setting PFHB.interval=0.
> I was just thinking whether one can remove the parameter and send
> the HB every RTO (and doubling it)... The same as using PFHB.interval=0.

I see. My main motivation here is not to use HB.interval, since it's
usually a rather big value. So, using RTO is fine for me.
However, I think there might be a useful case for PFHB.interval
(e.g. all paths have mostly the same bandwidth and priority).
I prefer to keep a tunable parameter here, while the recommended behavior
is using RTO.

Thanks,
--
Yoshifumi Nishida
nishida@sfc.wide.ad.jp

> Best regards
> Michael
>>
>>
>> I agree with all of the following points. I think these are very good points.
>>
>>> * Make sure that the following works: The application disabled HBs.
>>>   When a path enters PF (or failed) HB are sent to get the path
>>>   active again. If it is active, no HB should be sent (since
>>>   the application disables them).
>>>
>>> * Provide a way that applications are not bothered with
>>>   state change notification related to PF when not explicitly
>>>   subscribed.
>>>
>>> * Make clear what to do when all paths are PF.
>>>
>>> * Make clear what to do when all paths are failed.
>>>
>>> What do you think?
>>>
>>> Best regards
>>> Michael
>>>
>>> On Aug 24, 2010, at 12:04 PM, Yoshifumi Nishida wrote:
>>>
>>>> Hello Michael,
>>>>
>>>> Thanks for your reply. In a nutshell, the differences between PF and PMR=0 are:
>>>> PF allows SCTP to use another path while SCTP is checking whether the
>>>> primary path is active or inactive, but doesn't change the behavior of
>>>> marking the path as inactive.
>>>> PMR=0 allows SCTP to mark the primary as inactive quickly.
>>>>
>>>> In my feeling, this will create several differences although it looks similar.
>>>>
>>>> For example, if we use PMR=0, we will need to modify at least the
>>>> following points in RFC4960:
>>>> 1) recommended value for PMR
>>>> 2) behavior in dormant state
>>>> 3) relationship between PMR and AMR
>>>>    RFC4960 states users should avoid having the value of
>>>>    'Association.Max.Retrans' larger than the
>>>>    summation of the 'Path.Max.Retrans'. We'll need to change this part.
>>>>
>>>> Also, I think we'll need to think about the following points:
>>>> a) Some of the current applications or OS local configurations might
>>>>    already have specified PMR on their own. If they're not using PMR=0,
>>>>    their benefit might be reduced.
>>>> b) When you have 100Mbps and 1Mbps links and you set 100Mbps as
>>>>    primary, every time packet loss happens on the 100Mbps link, it will
>>>>    switch over to the 1Mbps link and have to wait for HeartBeat, which is
>>>>    likely less frequent (30secs or so). Or, you'll need to add special
>>>>    logic here in the spec.
>>>>
>>>> If we use PF,
>>>> PF allows SCTP to keep PMR and AMR unchanged. Hence, we don't have
>>>> to modify 1) and 3),
>>>> and issue 2) will be a minor point since we don't change PMR and AMR.
>>>> Also, PF already has a solution for b) as described in the draft.
>>>>
>>>> In my view, PMR=0 will require several "modifications" to the spec
>>>> which might be a bit tricky to understand for implementers, while PF
>>>> will require explicit "addition".
>>>>
>>>> Thanks,
>>>> --
>>>> Yoshifumi Nishida
>>>> nishida@sfc.wide.ad.jp
>>>>
>>>>
>>>> 2010/8/23 Michael Tüxen <Michael.Tuexen@lurchi.franken.de>
>>>>>
>>>>> On Aug 17, 2010, at 12:16 PM, Yoshifumi Nishida wrote:
>>>>>
>>>>>> Hi folks,
>>>>>> This is a follow-up from the Maastricht meeting.
>>>>>> Preethi and I are proposing a quick failover algorithm in SCTP and gave a presentation about it.
>>>>>>
>>>>>> In my feeling, the community seems to be positive about enhancing the SCTP standard to some extent to address this issue.
>>>>>> Also, in my understanding, Michael and Randy suggested that minor updates in the current spec can have similar effects to the PF approach.
>>>>>> So, we're going to start with investigating the alternatives to the PF approach and would like to know about the details of the suggestion.
>>>>>> If Michael and Randy could give us some info about this, we would be very grateful.
>>>>> Sure.
>>>>> One point is that RFC 4960 does not specify what to do when all paths
>>>>> are INACTIVE. If that happens, the association is called dormant.
>>>>> I think it should be clarified that you still send HB to get a path
>>>>> to ACTIVE again until the association finally fails. This could be
>>>>> handled as an errata, I think.
>>>>>
>>>>> Now assume that handling of the dormant state. If I understand PF correctly,
>>>>> you could simply set Path.Max.Retrans to 0 to get the same behavior on the
>>>>> wire as when using PF, or am I missing something? The only difference I see
>>>>> is the path state change notification sent locally to the user.
>>>>>
>>>>> But maybe I have overlooked something...
>>>>>
>>>>> Best regards
>>>>> Michael
>>>>>> Also, if anyone has comments or feedback on this, please let us know.
>>>>>>
>>>>>> Thank you so much.
>>>>>> --
>>>>>> Yoshifumi Nishida
>>>>>> nishida@sfc.wide.ad.jp
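
For reference, the threshold rule and the PF heartbeat timing discussed in this thread can be sketched in Python as below. This is only an illustration under stated assumptions, not text from the draft or from RFC 4960. The names PathState, classify_path and pf_heartbeat_delays are hypothetical. The sketch assumes 0 <= PFMR <= PMR, assumes the heartbeat delay is RTO plus PFHB.interval with the RTO doubling after each unanswered HB up to a cap, and ignores the timer jitter a real stack would apply.

```python
from enum import Enum
from typing import List


class PathState(Enum):
    ACTIVE = "active"
    POTENTIALLY_FAILED = "potentially failed"
    INACTIVE = "inactive"


def classify_path(error_counter: int, pmr: int, pfmr: int) -> PathState:
    """Apply the two-threshold rule quoted in the thread.

    pmr  -- Path.Max.Retrans
    pfmr -- the proposed PF threshold ("PFMR" in the thread);
            this sketch assumes 0 <= pfmr <= pmr
    """
    if not 0 <= pfmr <= pmr:
        raise ValueError("expected 0 <= PFMR <= PMR")
    if error_counter > pmr:
        return PathState.INACTIVE
    if error_counter > pfmr:
        return PathState.POTENTIALLY_FAILED
    return PathState.ACTIVE


def pf_heartbeat_delays(rto: float, pfhb_interval: float, count: int,
                        rto_max: float = 60.0) -> List[float]:
    """Delays before each of `count` consecutive unanswered HBs on a PF path.

    Assumption: delay = RTO + PFHB.interval, and the RTO doubles after
    every unanswered HB, capped at rto_max. Jitter is ignored.
    """
    delays = []
    for _ in range(count):
        delays.append(rto + pfhb_interval)
        rto = min(rto * 2, rto_max)
    return delays


if __name__ == "__main__":
    # PFMR = 0: the first timeout already marks the path potentially failed.
    print(classify_path(1, pmr=5, pfmr=0))
    # PFMR = PMR: the PF state is never entered (the old behavior).
    print(classify_path(5, pmr=5, pfmr=5))
    # PFHB.interval = 0: HBs follow the RTO and its doubling.
    print(pf_heartbeat_delays(1.0, 0.0, 4))
```

With PFMR=1 the same function reports a path as potentially failed only after the second consecutive timeout, which is the more conservative setting Michael describes. A non-zero PFHB.interval simply adds a fixed offset to every delay.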