Re: [dhcwg] Load Balancing for DHCPv6

"Bernie Volz (volz)" <volz@cisco.com> Tue, 18 September 2012 16:24 UTC

From: "Bernie Volz (volz)" <volz@cisco.com>
To: Andre Kostur <akostur@incognito.com>
Date: Tue, 18 Sep 2012 16:24:06 +0000
Cc: "dhcwg@ietf.org" <dhcwg@ietf.org>, Ted Lemon <Ted.Lemon@nominum.com>
Subject: Re: [dhcwg] Load Balancing for DHCPv6

First, I think this would be a rather bad change to DHCPv6 operation.
Clients may be checking the server-identifier option they get back, so
you would have to lie about that in the response, which just seems like a
very bad thing to do. (I don't believe RFC 3315 actually ever states that
clients should be checking the server-identifier, but I could be wrong --
section 15 does say they check for the presence of the option, but do not
check the contents). One could also think of situations where a relay
agent might forward the packet to explicit destinations based on the
server identifier (though again, I doubt anyone does this) -- much like
switches limit traffic based on mac-addresses.

Second, the following assumes load balancing is being used for failover ...

There is one 'trick' that I had been thinking about for a long time and is
perhaps still an open issue for failover for v6 ... and that is: why not have
both failover partners use the same server-identifier value (you can think
of the server-identifier as belonging to that failover relationship)? Then
the servers can always use load balancing or other 'policy' to decide
which one responds, and you don't violate the concept that the
server-identifier is the one that the client expects.

It also means that Renew packets can be handled by either partner.

However, it does complicate Request packets because failover doesn't
usually exchange tentative bindings from a Solicit, and so you really would
only want the server that answered the Solicit to respond to the Request
(this could be done by checking whether you have a tentative binding, but if
neither server does, the client has to suffer a Request timeout to get back
to the Solicit phase).

--> That is the main reason why I think this fails to work and I have not
proposed it.
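
As a rough illustration of the Request problem described above, here is a
minimal Python sketch of two partners sharing one server-identifier. The
Partner class, the tentative set, and the hash-based owns() policy are all
hypothetical names invented for this example; they do not come from RFC 3315
or from any failover draft, and a real server would track far more state.

```python
import hashlib


class Partner:
    """One of two failover partners sharing a server-identifier (hypothetical)."""

    def __init__(self, index):
        self.index = index      # 0 or 1: which partner this is
        # Client DUIDs this partner answered a Solicit for. Assumed NOT to be
        # exchanged with the other partner, which is the source of the problem.
        self.tentative = set()

    def owns(self, client_duid):
        # Stand-in load-balancing policy: hash the DUID into one of two buckets.
        return hashlib.sha256(client_duid).digest()[0] % 2 == self.index

    def handle(self, msg_type, client_duid):
        """Return True if this partner would answer the message."""
        if msg_type == "SOLICIT":
            if self.owns(client_duid):
                self.tentative.add(client_duid)
                return True
            return False
        if msg_type == "REQUEST":
            # Only the partner holding the tentative binding should answer.
            return client_duid in self.tentative
        if msg_type == "RENEW":
            # Committed bindings are assumed to be synchronized by failover,
            # so either partner could answer; policy picks one.
            return self.owns(client_duid)
        return False


a, b = Partner(0), Partner(1)
duid = b"example-client-duid"

answered_solicit = [p for p in (a, b) if p.handle("SOLICIT", duid)]
answered_request = [p for p in (a, b) if p.handle("REQUEST", duid)]

# Simulate the tentative state being lost (for example, a restart).
for p in (a, b):
    p.tentative.clear()
answered_after_loss = [p for p in (a, b) if p.handle("REQUEST", duid)]
```

In this sketch the Request is answered only by the partner that answered the
Solicit, and once the tentative state is gone neither partner answers, so the
client would have to time out and start over with a Solicit.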

Also, note that you would potentially have this same issue for your
proposal when 


For v4, the current operation has not really caused operational issues. We
have thought about how to 'force' traffic back to the original server, but
as it hasn't caused any issues, we haven't bothered. New v4 clients (i.e.
those that do a discover) will get back to their server, and most clients
seem to do so over a reasonable period of time.


Anyway, I think this needs to be thought through much more carefully as to
the potential consequences, and I think the best approach is to keep it
simple: once a client has 'bound' to a server, it stays with that server
until it issues a request message which does not contain a
server-identifier option.
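
The "keep it simple" rule can be sketched as a single decision function. This
is only an illustration under stated assumptions: should_process and its
parameters are hypothetical names, and lb_selects_me stands in for whatever
load-balancing result the server has already computed for the client.

```python
from typing import Optional


def should_process(msg_server_id: Optional[bytes],
                   my_server_id: bytes,
                   lb_selects_me: bool) -> bool:
    """Decide whether this server should process a client message (hypothetical)."""
    if msg_server_id is not None:
        # The client named a server: only that server responds, and load
        # balancing is not consulted at all.
        return msg_server_id == my_server_id
    # No server-identifier in the message (e.g. Solicit, Rebind, Confirm):
    # load balancing decides which server responds.
    return lb_selects_me


# A Renew addressed to server B is processed by B even if load balancing
# would have assigned the client to A.
renew_at_b = should_process(b"server-B", b"server-B", lb_selects_me=False)
renew_at_a = should_process(b"server-B", b"server-A", lb_selects_me=True)
```

Under this rule a client moves to another server only when it sends a message
without a server-identifier option.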

- Bernie


On 9/18/12 12:04 PM, "Andre Kostur" <akostur@incognito.com> wrote:

>On Wed, Sep 12, 2012 at 8:15 PM, Bernie Volz (volz) <volz@cisco.com>
>wrote:
>>
>> One comment on the draft is that I think it needs to be clear that load
>> balancing is ONLY used for DHCPv6 client messages that do not include a
>> server-identifier option (this is Solicit, Information-Request (if no
>> server-identifier option), Rebind, and Confirm). Load balancing MUST NOT
>> be used if a server-identifier option exists in the client's message, as
>> then only that server should respond.
>
>I've been mulling this over for a while in the context of the following
>problem:
>
>Let us assume that there are two cooperating (through some additional
>mechanism external to load balancing) DHCPv6 servers A and B.  At some
>point, server A goes away for a while and the entire population
>eventually binds to server B (whether rebooting, or Rebinding to it).
> Now server A returns to service.  Until the members do a reboot or
>Rebind, server A will not be handling any traffic (other than about
>1/2 of the new clients) as server B is receiving and answering all of
>the Renews.  What I'd like is that over time, the population will
>migrate around to splitting between server A and server B again.
>
>I think for this reason we shouldn't force the server to answer (well,
>consider) every request aimed at it.
>
>--
>Andre Kostur