[dhcwg] Re: Changes to create draft-ietf-dhc-failover-12.txt

Kim Kinnear <kkinnear@cisco.com> Mon, 03 March 2003 21:22 UTC

Message-Id: <4.3.2.7.2.20030303161952.025645b8@goblet.cisco.com>
X-Sender: kkinnear@goblet.cisco.com
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Mon, 03 Mar 2003 16:22:10 -0500
To: Kim Kinnear <kkinnear@cisco.com>, dhcwg@ietf.org
From: Kim Kinnear <kkinnear@cisco.com>
In-Reply-To: <4.3.2.7.2.20030303135757.0255f9b0@goblet.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Subject: [dhcwg] Re: Changes to create draft-ietf-dhc-failover-12.txt

Well, the submission deadline *used* to be 5pm, but I guess that
was only when it was on Friday.  Sigh, I should read the fine
print next time.  So, the -12 version of the draft will not be
available until I find someplace else to stash it so you can read
it.  Ditto for the leasequery draft that I was about to submit.

More later,

Kim


At 02:22 PM 3/3/2003, Kim Kinnear wrote:

>Folks,
>
>Here is mail detailing the changes made to
>draft-ietf-dhc-failover-11.txt to yield
>draft-ietf-dhc-failover-12.txt.  In most cases considerable
>changes were made to the indicated sections, so that it was not
>possible to say "this word changed to that word", but rather the
>entire section needs to be re-read to grasp the essence of the
>change.
>
>-------------------------------------------------------------------
>
>During the last IETF in Atlanta, a discussion was held with several
>folks about problems in the DHCP failover protocol and its description.
>In attendance were:
>
>Scanner Luce    scanner@nominum.com
>Bernie Volz     Bernie.Volz@am1.ericsson.se
>Mark Stapp      mjs@cisco.com
>Kim Kinnear     kkinnear@cisco.com
>
>We primarily discussed issues encountered by Scanner during his
>implementation of the failover protocol, and first raised at a
>meeting with several people during the Summer 2002 IETF.
>
>Thanks to Bernie Volz for comments and additions to an earlier
>version of these notes.  Some of his additions I've simply placed
>into the text, but I've left one comment explicit since we may
>want to discuss it further.
>
>The action plan for failover at this point is:
>
> (x)    a. Circulate these notes to the DHCP list. 
>
> (x)    b. Accept comments.
>
> (x)    c. Update the failover draft by the end of February.
>
>-->     c-1. Circulate email concerning changes made.
>
>        d. Consider whether we need another WG last call based
>        on these changes.
>
>This email is step c-1.
>
>Changes made to the failover draft:
>-----------------------------------
>
>While the discussion ranged over several topics, the action items
>boiled down to the following:
>
>1.  Connection establishment changes:
>
>        a.  There MUST be one endpoint failover relationship
>        (i.e., between two servers).
>
>        b.  There SHOULD be one relationship per partner, but
>        this is not a requirement.
>
>        We determined there was little value in having the same
>        two servers be involved in two relationships (in
>        general).  So, having one server be primary for some
>        pools and secondary for others where the partner has
>        opposite roles is really not necessary and makes little
>        sense.  Especially now that load balancing exists and the
>        primary and secondary are almost equal (the primary just
>        breaks ties).
>
>        c.  There SHOULD be only one port in use for failover
>        traffic.
>
>        d.  The TCP connection from the secondary server to the
>        primary server is dropped by the secondary server
>        or is dropped by the primary server in the event that both
>        servers end up connecting at the same time.
>
>        We need a strategy to handle the case where two
>        connections are done at the same time (primary ->
>        secondary; secondary -> primary).  The rule is simply
>        that the connection that the primary initiated is the one
>        that is kept.  So, the primary drops the connection it
>        ACCEPTed from the secondary and the secondary drops the
>        connection on which it CONNECTed to the primary.
>
>  Modifications were made to implement these changes to:
>
>        Definition of failover endpoint.
>
>        Section 5.1.1 Failover endpoints
>
>        Section 8. Connection Management
>        
>        Section 8.1 Connection granularity
>
>        Section 8.2 Creating the TCP connection
>
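
The tie-break rule in item 1d can be sketched in a few lines.  This is
only an illustration, not text from the draft: the Role and Connection
types and the connection_to_drop() helper are hypothetical names, and
the sketch assumes each server sees at most one connection it initiated
and one it accepted for the relationship.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


@dataclass(frozen=True)
class Connection:
    """One TCP connection as seen by the local server (hypothetical type)."""
    initiated_locally: bool  # True if we CONNECTed, False if we ACCEPTed


def connection_to_drop(local_role, connections):
    """Return the connection the local server should close, or None.

    Rule from item 1d: the connection that the primary initiated is kept.
    """
    if len(connections) < 2:
        return None  # no simultaneous connect, nothing to drop
    for conn in connections:
        # A connection was initiated by the primary if we are the primary
        # and CONNECTed, or we are the secondary and ACCEPTed.
        initiated_by_primary = (
            conn.initiated_locally == (local_role is Role.PRIMARY)
        )
        if not initiated_by_primary:
            return conn
    return None
```

Run on both servers, the same function makes the primary drop the
connection it ACCEPTed and the secondary drop the one it CONNECTed, so
both ends agree on the surviving connection without extra messages.
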
>2.  Remove paragraph 4 from Section 7.1.3 (from the BNDUPD
>conflict section).  This paragraph turns out to add confusion
>rather than remove it.
>
>  Modifications were made to implement this change by:
>
>        Removing paragraph 4 from Section 7.1.3.
>
>3.  Consider adding pseudo-code for the MCLT logic (which Scanner
>has offered to contribute, since he felt this would be helpful.)
>I'll include anything I get in this regard.
>
>  No modifications were made in this case, because I received
>  no pseudo-code to add.
>
>4.  Review sequence diagrams for accuracy.
>
>  These were all reviewed, but no changes were made as no problems
>  were found.  If someone has problems with these sequence diagrams,
>  please send me specific information about the problem, and I'll
>  be glad to fix it.  
>
>5.  The failover partners can run in two different modes -- time
>sync mode, or time skew mode.  In time sync mode, still send the
>time, and the receiving server can itself be in one of two
>modes -- time correction mode (to handle time drift) or time
>rejection mode (which will reject packets with a time in them
>that is "too wrong").
>
>Bernie added:
>
>        One point is that whichever mode one is in, one must
>        allow some small drift in time when doing time based
>        comparisons.  For example, lease expirations may easily
>        be off by 1 second just because of the time that elapsed
>        between when a packet was sent and received (a very small
>        fraction of time must elapse for this to potentially
>        happen).  I think that was more the issue that we need to
>        make clear - time checks need to be somewhat soft rather
>        than absolute?  But I agree that servers may choose to
>        require the time to be "in sync" or can choose to do time
>        corrections.  
>
>        Note that we have found problems in not accommodating
>        time skew; if the servers require the time to be in sync
>        and continue running (but refuse to communicate) bad
>        things sometimes happen (especially if the servers assume
>        partner-down state).  This can happen because step 6 in
>        section 9.3.2 allows two actions.  We may want to
>        consider exactly what it means for the partner to be
>        "down"; for example if the partner is up but a connection
>        cannot be established because the time-skew is out of
>        range, then it may be better to have one server wait
>        (typically the one that was down the longest?).
>
>  Modifications were made to implement this change by changing:
>
>        Section 5.10 Time synchronization between servers
>
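
To make the two receiver modes in item 5 and Bernie's "soft" time checks
concrete, here is a rough sketch.  It is not taken from the draft: the
names, the 2 second tolerance, and the 60 second rejection threshold are
all assumptions chosen for illustration.

```python
from enum import Enum


class TimeMode(Enum):
    CORRECTION = "correction"  # adjust partner times by the measured skew
    REJECTION = "rejection"    # refuse packets whose time is "too wrong"


TOLERANCE_SECONDS = 2   # assumed slack for soft comparisons
MAX_SKEW_SECONDS = 60   # assumed threshold for "too wrong"


def measured_skew(partner_send_time, local_receive_time):
    """Estimate the skew from the time the partner put in its packet."""
    return local_receive_time - partner_send_time


def to_local_time(partner_time, skew, mode):
    """Interpret a time received from the partner on the local clock."""
    if mode is TimeMode.REJECTION:
        if abs(skew) > MAX_SKEW_SECONDS:
            raise ValueError("partner time too far from local time")
        return partner_time          # clocks are trusted to be in sync
    return partner_time + skew       # correction mode: shift by the skew


def times_roughly_equal(a, b, tolerance=TOLERANCE_SECONDS):
    """Soft comparison: allow a small drift instead of exact equality."""
    return abs(a - b) <= tolerance
```

For example, to_local_time(1000, 30, TimeMode.CORRECTION) gives 1030,
while the same call in REJECTION mode returns 1000 unchanged and only
raises once the skew exceeds the threshold.
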
>6.  The client-last-transaction-time should not be remembered if
>the packet is dropped due to the server being in the wrong
>failover state to respond to DHCP client packets.  This was
>implicit in the previous versions of the draft, but not clearly
>stated.
>
>  Modifications were made to implement this change by changing:
>
>        Section 9.2 Server State Transitions
>
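
The rule in item 6 amounts to checking the failover state before
touching the stored time.  A minimal sketch follows, with assumptions
labeled: the state list is abbreviated, the set of states treated as
responsive is my assumption rather than a quote from the draft, and
handle_client_packet() and the last_transaction dictionary are
hypothetical.

```python
from enum import Enum


class FailoverState(Enum):
    NORMAL = "normal"
    COMMUNICATIONS_INTERRUPTED = "communications-interrupted"
    PARTNER_DOWN = "partner-down"
    RECOVER = "recover"


# Assumption for this sketch: states in which the server answers clients.
RESPONSIVE_STATES = {
    FailoverState.NORMAL,
    FailoverState.COMMUNICATIONS_INTERRUPTED,
    FailoverState.PARTNER_DOWN,
}


def handle_client_packet(state, last_transaction, client_id, receive_time):
    """Process a client packet; return True if it was handled.

    client-last-transaction-time is recorded only when the packet is
    actually processed, never when it is dropped because of the state.
    """
    if state not in RESPONSIVE_STATES:
        return False  # dropped: stored time is left untouched
    last_transaction[client_id] = receive_time
    return True
```

A packet that arrives while the server is in RECOVER leaves the stored
value as it was; the same packet in NORMAL updates it.
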
>7. Contact information was changed, and some dates were updated
>to 2003.
