Re: [dhcwg] draft-ietf-dhc-server-override-03.txt returned to dhc WG for review (2nd call for review)

Kim Kinnear <kkinnear@cisco.com> Tue, 14 March 2006 17:56 UTC

Message-Id: <4.3.2.7.2.20060314120158.0316cc88@email.cisco.com>
X-Sender: kkinnear@email.cisco.com
X-Mailer: QUALCOMM Windows Eudora Version 4.3.2
Date: Tue, 14 Mar 2006 12:56:21 -0500
To: Stig Venaas <stig.venaas@uninett.no>, "David W. Hankins" <David_Hankins@isc.org>
From: Kim Kinnear <kkinnear@cisco.com>
Subject: Re: [dhcwg] draft-ietf-dhc-server-override-03.txt returned to dhc WG for review (2nd call for review)
In-Reply-To: <4416E226.3030606@uninett.no>
References: <20060309193834.GF27300@isc.org> <C032E42D.11018%rdroms@cisco.com> <20060309193834.GF27300@isc.org>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Cc: Margaret Wasserman <margaret@thingmagic.com>, dhcwg <dhcwg@ietf.org>, Ralph Droms <rdroms@cisco.com>, kkinnear@cisco.com, "Mark Townsley (townsley)" <townsley@cisco.com>

Stig, David,

I'm responding to Stig's response to David's message, but I have
left out the considerable quoted text to (hopefully) enhance clarity.

I will discuss the problems that David discusses, and at the end,
I will propose a different approach to solving them which also
solves additional related problems.

I hear three issues:

  1.  A DHCP server can't tell if a DHCP renewal was broadcast or
  unicast by examining the giaddr if all renewals are unicast to
  the relay agent and then unicast on to the DHCP server with a
  giaddr inserted.

  2.  In a load balancing situation both servers will get all
  renewals.

  3.  In a load balancing situation a client will get two ACK's,
  instead of one as is the usual case.
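
To make issue 1 concrete, here is a minimal sketch (not taken from
any server's code) of a giaddr-only guess at how a request reached the
server. The function name and the 192.0.2.1 relay address are
hypothetical, chosen only for illustration.

```python
import ipaddress

ZERO = ipaddress.IPv4Address("0.0.0.0")


def guess_from_giaddr(giaddr: str) -> str:
    """Guess how the client sent the request, using giaddr alone.

    Classic assumption: a renewal is unicast straight to the server, so
    giaddr is zero; a rebind is broadcast and relayed, so giaddr is set.
    """
    return "unicast" if ipaddress.IPv4Address(giaddr) == ZERO else "broadcast"


# Renewal unicast directly to the server: the guess is right.
direct_renew = guess_from_giaddr("0.0.0.0")

# Renewal unicast to the relay (hypothetical address), which inserts its
# giaddr and forwards it: the guess says "broadcast", which is wrong.
relayed_renew = guess_from_giaddr("192.0.2.1")

# Rebind broadcast on the client's link and relayed: same giaddr, same guess.
relayed_rebind = guess_from_giaddr("192.0.2.1")
```

Once renewals travel through the relay, the renewal and the rebind
look the same to this check, which is the ambiguity described in
issue 1.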

In detail:

Issue #1:  Server can't tell broadcast from unicast.

        David says:

>        The message is clear to me, in the above text snippets:
>        DHCPNAK while the client is perceptively in RENEWING
>        state is illegal (despite its presence on the state
>        engine in 4.4, this process is never codified in text).

        Stig says: under what circumstances would you *want* to
        NAK a renewal that you received through the relay anyway?

Kim responds:

Ignoring for the moment Stig's point, there are many times that
renewals are NAK'ed in practice, and all clients known to us will
honor that per the state transition diagram and move back to INIT
state on a NAK.  The common case is that the IP address the
client has is deemed to be "incorrect" for some reason other than
network connectivity, and the client is given a NAK to cause them
to DISCOVER for a different IP address.

This comes up in several different situations in our support for
client-class or in cases of client registration.  Sometimes a
client is given an IP address which is "restricted" in some way,
and then after registration, is given a "better" IP address.  In
most implementations of which I am aware (including ours), when
it is time to switch from a "restricted" to "better" IP address,
a NAK is returned to the renew for the "restricted" IP address.
We've been doing this with countless installations for at least 7
years, and have heard of no problems.

Probably this is in part because of the state transition diagram
and probably this is in part because the code in most clients
that processes renewal responses is closely related to the code
that processes rebind responses, which the text in RFC2131 makes
more clear can be NAKed.

We certainly don't have the only server which NAK's renewals, and
I thought that this was old news indeed -- about the state
transition diagram allowing NAK's on renews.

Now, I've never discussed NAKing renews with anyone based on
network attachment, and indeed, our server has code that is
functionally similar to that described by David -- we don't NAK a
renew based on network connectivity, only a rebind.  But that
doesn't mean that in general we can't NAK a renew -- far from it.

So I think the issue at bottom is: what kind of network design
could you have that would allow renews (because you can't NAK
them for network connectivity) and yet would cause NAK's on
rebinds (due to network connectivity)?  This certainly sounds
like a network on which any kind of failover or load balancing
would not work at all well, since rebinds are an integral part of
the failover process.  I can't believe that this is a robust
network design.

Issue #2: Load balancing generates multiple ACK's.

This certainly is an issue, and does make load balancing largely
uninteresting in a situation where you would use the
server-id-override sub-option.  On the other hand, failover would
still work -- you just don't get any effective load balancing.

We've viewed this as the price you pay to talk to a DHCP client
that you wouldn't otherwise be able to talk to, and given the
circumstances, a small price to pay for that capability.

It would be nice to be able to tell a unicast from a broadcast in
the packet forwarded from the relay agent.  That would make load
balancing work and would also allow a solution to #1 and #3.

Issue #3:  Clients can get multiple ACK's to a renew, not just a
rebind.

At present a client can get multiple ACK's to rebind, but not a
renew.  This would go away if issue #2 was solved somehow, since
that is how this situation occurs.

Given that clients expect multiple responses to a rebind,
we've never seen issues with clients getting multiple ACK's to
renews, for what that is worth.

----------------------------------------------------------

Solution?

If we have the relay agent tell the DHCP server if a packet it
forwards was broadcast or unicast, then the DHCP server could
handle each issue in a way that was not only correct but also
compatible with its current approach.  I think issue #2 is the
only really substantive one, but if we solve that we can solve
them all, so why argue about which are important and which not?

The question comes down to how to have the relay agent tell the
server whether the packet was unicast or broadcast.  The obvious
answer is to add information to the server-id-override sub-option,
a flags byte perhaps.

But there is a larger problem related to this one that deserves
some discussion:

One major issue with implementing advanced capabilities in the
DHCP server is the lack of option-82 information in the unicast
renew packets.  One solution to this is to never answer renews,
but just answer rebinds, which works but is less than optimal.

A viable but flawed solution to this lack of option-82
information is to have the DHCP server put the giaddr into the
server-identifier option even if it *didn't* get a
server-id-override sub-option from the relay agent.  This will
cause all packets to be sent to the relay-agent.  All relay
agents known to us will accept packets unicast to them and put
option-82 information in the packets and forward them on to the
DHCP server.

But of course this approach is flawed by the same three issues
above.  
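
As a rough sketch of the server-side choice just described (an
illustration under assumptions, not our server's actual code): the
function name, the fallback_to_giaddr switch, and the documentation
addresses are all hypothetical.

```python
from typing import Optional


def choose_server_identifier(
    server_addr: str,
    giaddr: str,
    override: Optional[str] = None,
    fallback_to_giaddr: bool = False,
) -> str:
    """Pick the address the client will unicast its renewals to.

    override:            value of the server-id-override sub-option, if the
                         relay agent supplied one.
    fallback_to_giaddr:  the "viable but flawed" behaviour: use giaddr even
                         when no override sub-option was received.
    """
    if override is not None:
        return override
    if fallback_to_giaddr and giaddr != "0.0.0.0":
        return giaddr
    return server_addr


# Relay supplied an override: renewals go to the relay.
with_override = choose_server_identifier(
    "198.51.100.10", "192.0.2.1", override="192.0.2.1")

# No override, fallback enabled: renewals still go to the relay (giaddr).
with_fallback = choose_server_identifier(
    "198.51.100.10", "192.0.2.1", fallback_to_giaddr=True)

# No override, no fallback: renewals go straight to the server.
plain = choose_server_identifier("198.51.100.10", "192.0.2.1")
```

In the first two cases every renewal arrives at the server with a
giaddr set, so the three issues above apply equally to both.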

However, if we get the relay agent to tell the server if a packet
was broadcast or unicast, then this could be a viable way to
ensure option-82 information in all renewals.  But we can't
extend the server-id-override sub-option to do it.  We would need
a new sub-option to signal that a packet with option-82
information was unicast to the relay agent.

Perhaps a "unicast-to-relay" option-82 sub-option could be defined
that would be used whenever a relay agent forwarded on a renewal
that was unicast to it.  This would be a zero-length sub-option.
It would solve the general problem of telling apart unicast and
broadcast packets that came to the relay.

This would be in addition to the server-id-override sub-option
(if any) that might be in the packet.
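
To show what a zero-length sub-option would look like on the wire,
here is a minimal sketch. Option-82 sub-options use the code, length,
value layout from RFC 3046, and Circuit ID is sub-option 1. The
"unicast-to-relay" code below (250) is a placeholder I made up for
illustration; no code has been assigned, and the function names are
hypothetical.

```python
from typing import Dict, List, Tuple

CIRCUIT_ID = 1           # Agent Circuit ID sub-option (RFC 3046)
UNICAST_TO_RELAY = 250   # hypothetical placeholder, NOT an assigned code


def encode_suboptions(subopts: List[Tuple[int, bytes]]) -> bytes:
    """Encode option-82 sub-options as code, length, value triples."""
    out = bytearray()
    for code, value in subopts:
        if len(value) > 255:
            raise ValueError("sub-option value too long")
        out += bytes([code, len(value)]) + value
    return bytes(out)


def decode_suboptions(payload: bytes) -> Dict[int, bytes]:
    """Decode an option-82 payload into {code: value}."""
    result: Dict[int, bytes] = {}
    i = 0
    while i < len(payload):
        if i + 2 > len(payload):
            raise ValueError("truncated sub-option header")
        code, length = payload[i], payload[i + 1]
        value = payload[i + 2:i + 2 + length]
        if len(value) != length:
            raise ValueError("truncated sub-option value")
        result[code] = value
        i += 2 + length
    return result


def was_unicast_to_relay(payload: bytes) -> bool:
    """True if the relay marked the packet as having been unicast to it."""
    return UNICAST_TO_RELAY in decode_suboptions(payload)


# Relay forwarding a renewal that was unicast to it: adds the marker.
renew_payload = encode_suboptions([(CIRCUIT_ID, b"eth0"), (UNICAST_TO_RELAY, b"")])

# Relay forwarding a broadcast rebind: no marker.
rebind_payload = encode_suboptions([(CIRCUIT_ID, b"eth0")])
```

The marker costs two bytes in the relay agent information option, and
its presence alone carries the signal, so the server needs nothing
beyond a membership check.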

Something to think about...

Cheers -- Kim
