[dhcwg] Proposed Resolution to DNAv4 Issue 30

The text of DNAv4 issue 30 is enclosed below.  This and other DNAv4 issues
are tracked on the DNAv4 Issue page, located at:
http://www.drizzle.com/~aboba/DNA/

A version of the document with the proposed changes applied is available
here:
http://www.drizzle.com/~aboba/DNA/draft-ietf-dhc-dna-14.txt

The proposed resolution is as follows:

In Section 1.2, change the definition of MLN to:

"Most Likely Networks (MLNs)
The attached network(s) determined by the host to be most likely."

Change Sections 2, 2.1, 2.2 and 2.3 to the following:

"2. Overview

DNAv4 consists of three phases: determination of the Most Likely
Networks (MLNs), reachability testing, and IPv4 address acquisition.

On connecting to a new point of attachment, the host responds to a
"Link Up" indication from the link layer by carrying out the DNAv4
procedure. Based on the networks that the host has most recently
connected to, as well as hints available from the link and Internet
layers, the host determines the "Most Likely Network(s)" (MLNs) and
determines whether it has an operable IPv4 configuration associated
with each of them.

If the host believes that it has an operable IPv4 configuration on a
MLN, it performs a reachability test in order to confirm that
configuration. The reachability test is designed to verify bi-
directional connectivity to the default gateway(s) on the MLN. If
the reachability test is successful, the host SHOULD continue to use
an operable routable IPv4 address, without needing to re-acquire it,
thereby allowing the host to bypass DHCPv4 as well as Duplicate
Address Detection (DAD). If the host believes that it has attached
to a network on which it has no operable IPv4 configuration, or if
the reachability test fails, then the host attempts to obtain an IPv4
configuration using DHCPv4.

Since DNAv4 represents a performance optimization, it is important to
avoid compromising robustness. In some circumstances, DNAv4 may
result in a host successfully verifying an existing IPv4
configuration where attempting to obtain configuration via DHCPv4
would fail (such as when the DHCPv4 server is down).

To improve robustness, this document suggests that hosts behave
conservatively with respect to assignment of IPv4 Link-Local
addresses [RFC3927], configuring them only in situations in which
they can do no harm. Experience has shown that IPv4 Link-Local
addresses are often assigned inappropriately, compromising both
performance and connectivity.

Where the host tests reachability only to a single MLN, the
performance of DNAv4 is to some extent dependent on the reliability
of the hints provided to the client. However, the host will
ultimately determine the correct IPv4 configuration even in the
presence of misleading hints. Where reachability test(s) fail, a
timeout will occur, after which the host will eventually obtain the
correct configuration using DHCPv4, albeit with a performance
penalty.

Where there is more than one MLN, the host can test reachability to
the MLN(s) in serial or in parallel. An implementation can also
attempt to obtain IPv4 configuration via DHCPv4 in parallel with one
or more reachability tests, with the host using the first answer
returned. These optimizations reduce the reliance on link and
Internet layer hints, which may not be present or may be misleading.
Attempting to obtain IPv4 configuration via DHCPv4 in parallel is
particularly valuable in implementations that only test reachability
of a single MLN. Since confirming failure of a reachability test
requires a timeout, mistakes are costly; sending a DHCPREQUEST from
the INIT-REBOOT state, as described in [RFC2131] Sections 3.2 and
4.3.2, may therefore complete more quickly than the reachability
test.
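
As a rough illustration of "using the first answer returned", here is
a minimal Python sketch. It is not taken from the draft: the function
names, the simulated delays and the result tuples are all assumptions
made for the example, and no real ARP or DHCPv4 packets are sent.

```python
import asyncio

async def probe_gateway(network: str, delay: float, answers: bool):
    """Simulated reachability test for one MLN (hypothetical helper)."""
    await asyncio.sleep(delay)
    if not answers:
        # A real test would only learn this once its timeout expires.
        raise TimeoutError(network)
    return ("reachability", network)

async def dhcp_init_reboot(delay: float):
    """Simulated DHCPREQUEST sent from the INIT-REBOOT state."""
    await asyncio.sleep(delay)
    return ("dhcp", "INIT-REBOOT")

async def first_answer(coros):
    """Run all attempts in parallel and return the first success, or None."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    try:
        for finished in asyncio.as_completed(tasks):
            try:
                return await finished
            except TimeoutError:
                continue  # this attempt failed; wait for the next one
        return None
    finally:
        # Stop whatever is still running once an answer is in hand.
        for task in tasks:
            task.cancel()
        await asyncio.gather(*tasks, return_exceptions=True)
```

With several attempts started together, a failed probe of one MLN no
longer delays the result, since another probe or the DHCPv4 exchange
can finish first.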

DNAv4 does not increase the likelihood of an address conflict. The
DNAv4 procedure is only carried out when the host has an operable
IPv4 configuration on one or more MLNs, implying that duplicate
address detection has previously been completed. Restrictions on
sending ARP Requests and Responses are described in Section 2.2.1.

2.1. Most Likely Networks (MLNs)

In order to determine the MLN(s), it is assumed that the host saves
to stable storage parameters relating to the networks it connects to:

[1] The IPv4 and MAC address of the default gateway(s) on
each network.

[2] The link type, such as whether the link utilizes
Ethernet, or 802.11 adhoc or infrastructure mode.

[3] Link and Internet layer hints associated with each
network. For details, see Appendix A.

Appendix A discusses hints useful for the determination of the
MLN(s). By matching received hints against network parameters
previously stored, an implementation testing reachability to a single
MLN can make an educated guess as to which network it has attached
to. Alternatively, an implementation that simultaneously tests
reachability to multiple MLNs can select them solely based on the
networks it has most recently connected to, in which case it may not
be necessary to consult hints.
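
One possible shape for the stored parameters and the MLN selection is
sketched below in Python. The record layout, the "ssid" hint key, the
last_seen timestamp and the default limit of three are assumptions for
illustration only; the draft does not define a storage format or a
maximum number of MLNs.

```python
from dataclasses import dataclass, field

@dataclass
class KnownNetwork:
    """Parameters saved to stable storage for one network (hypothetical layout)."""
    gateways: list                              # [1] (IPv4 address, MAC address) pairs
    link_type: str                              # [2] e.g. "ethernet", "802.11-infrastructure"
    hints: dict = field(default_factory=dict)   # [3] link and Internet layer hints
    last_seen: float = 0.0                      # assumed timestamp of the last attachment

def select_mlns(known, received_hints=None, limit=3):
    """Return up to `limit` candidate networks, most recently seen first.

    If hints were received and some stored networks match them, only those
    are considered; otherwise selection falls back to recency alone.
    """
    candidates = list(known)
    if received_hints:
        matching = [
            n for n in candidates
            if any(n.hints.get(k) == v for k, v in received_hints.items())
        ]
        if matching:
            candidates = matching
    candidates.sort(key=lambda n: n.last_seen, reverse=True)
    return candidates[:limit]
```

Calling select_mlns(known, limit=1) without hints reduces to "the
network to which the host was most recently attached", while a larger
limit gives the multiple-MLN behaviour described above.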

2.2. Reachability Test

If the host has an operable routable IPv4 address on a MLN, a host
conforming to this specification SHOULD perform a reachability test,
in order to confirm that it is connected to a network on which it has
an operable routable IPv4 address.

The host skips the reachability test for a MLN if any of the
following conditions are true:

[a] The host does not have an operable routable IPv4
address on a MLN. In this case, the reachability
test cannot confirm that the host has an operable
routable IPv4 address, so completing the
reachability test would serve no purpose.
A host MUST NOT use the reachability test to
confirm configuration of an IPv4 Link-Local
address.

[b] The host does not have information on the default
gateway(s) on a MLN. In this case, insufficient
information is available to carry out the reachability
test.

[c] If secure detection of network attachment is required.
The reachability test utilizes ARP, which is insecure,
whereas DHCPv4 can be secured via DHCPv4 authentication,
described in [RFC3118]. See Section 5 for details.

[d] If the default gateway address is an IPv4 Link-Local
address. In this case, it is possible that the
reachability test could be misinterpreted as
indication of an address conflict. See [RFC3927]
Section 2.2.1 for details.

For a particular MLN, the host MAY test the reachability of the
primary default gateway, or it MAY test reachability of the primary
and secondary default gateways in series or in parallel. In order to
ensure configuration validity, the host SHOULD only configure
default gateway(s) which pass the reachability test.
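
Conditions [a] through [d] can be read as a single predicate. The
Python sketch below is one such reading, not text from the draft: the
function and parameter names are hypothetical, a host address of None
stands for "no operable routable address", and skipping the whole test
when any stored gateway is Link-Local is an assumption about how [d]
applies to multiple gateways.

```python
import ipaddress

def should_skip_reachability_test(host_address, gateways, secure_detection_required):
    """Return True if the reachability test should be skipped for an MLN.

    host_address: operable IPv4 address on the MLN as a string, or None.
    gateways: list of (IPv4 address, MAC address) pairs stored for the MLN.
    """
    # [a] No operable routable address; a Link-Local (169.254/16) address
    #     must not be confirmed with the reachability test.
    if host_address is None or ipaddress.ip_address(host_address).is_link_local:
        return True
    # [b] No stored information on the default gateway(s).
    if not gateways:
        return True
    # [c] Secure detection of network attachment is required.
    if secure_detection_required:
        return True
    # [d] A default gateway address is itself an IPv4 Link-Local address.
    if any(ipaddress.ip_address(ip).is_link_local for ip, _mac in gateways):
        return True
    return False
```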

2.3. IPv4 Address Acquisition

If the host has an operable routable IPv4 address on one or more
MLNs, but the reachability test(s) fail, the host SHOULD attempt to
revalidate the configuration by entering the INIT-REBOOT state, and
sending a DHCPREQUEST to the broadcast address as specified in
[RFC2131] Section 4.4.2. As noted in Section 2, it is also possible
for IPv4 address acquisition to occur in parallel with the
reachability test.

If the host does not have an operable routable IPv4 address on any
MLN, the host enters the INIT state and sends a DHCPDISCOVER packet
to the broadcast address, as described in [RFC2131] Section 4.4.1.
If the host supports the Rapid Commit Option [RFC4039], it is
possible that the exchange can be shortened from a 4-message exchange
to a 2-message exchange.

If the host does not receive a response to a DHCPREQUEST or
DHCPDISCOVER, then it retransmits as specified in [RFC2131] Section
4.1.
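
The choice between the two DHCPv4 entry points can be summarized in a
few lines of Python. This is only a sketch of the decision described
in this section; the Action names and the two boolean inputs are
assumptions made for the example.

```python
from enum import Enum, auto

class Action(Enum):
    KEEP_ADDRESS = auto()      # reachability confirmed: keep the existing address
    DHCP_INIT_REBOOT = auto()  # send DHCPREQUEST to broadcast from INIT-REBOOT
    DHCP_INIT = auto()         # send DHCPDISCOVER to broadcast from INIT

def next_action(has_operable_address: bool, reachability_confirmed: bool) -> Action:
    """Pick the next step once the reachability test(s), if any, are done."""
    if not has_operable_address:
        # No operable routable address on any MLN: start from scratch.
        return Action.DHCP_INIT
    if reachability_confirmed:
        return Action.KEEP_ADDRESS
    # An address exists but could not be confirmed: try to revalidate it.
    return Action.DHCP_INIT_REBOOT
```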

As discussed in [RFC2131], Section 4.4.4, a host in INIT or REBOOTING
state that knows the address of a DHCP server may use that address in
the DHCPDISCOVER or DHCPREQUEST rather than the IPv4 broadcast
address. In the INIT-REBOOT state a DHCPREQUEST is sent to the
broadcast address so that the host will receive a response regardless
of whether the previously configured IPv4 address is correct for the
network to which it has connected.

Sending a DHCPREQUEST to the unicast address in INIT-REBOOT state is
not appropriate: if the DHCP client has moved to another subnet, a
DHCP server response cannot be routed back to the client, because the
DHCPREQUEST will bypass the DHCP relay and will contain an invalid
source address."

In Appendix A.1, add the following sentences to the fourth paragraph:

" In order to examine the tradeoffs in implementations that only test
reachability to a single MLN..."

Add the following sentence at the end of the section:

" If instead in the above example IPv4 address acquisition were carried
out simultaneously with the reachability test, then performance would
not suffer, even where hints are unreliable."

---------------------------------------------------------------------------
Issue 30: Review of DNAv-13
Submitter name: Stuart Cheshire
Submitter email address: cheshire@apple.com
Date first submitted: July 12, 2005
Reference:
http://www1.ietf.org/mail-archive/web/dhcwg/current/msg05188.html
Document: DNA-13
Comment type: T
Priority: S
Section: Various
Rationale/Explanation of issue:
I just read draft-ietf-dhc-dna-ipv4-13.txt.

I support the goals, but I think the current document misses the target.
I have a long list of comments, but these are the three major ones:

1. Hints and heuristics

It seems to me that the reliance on hints and heuristics is vastly
overstated. The biggest and best heuristic of network attachment is the
MAC address of the default gateway, which can be verified with a trivial
ARP request (i.e. the proposed DNA mechanism itself), thereby negating
much of the usefulness of all the other hints. The document even has a
section for IP-layer hints, and then says they're pointless and shouldn't
be used. Given that an ARP request can be answered in under 1ms, any hint
or heuristic has to be significantly faster than that, or it's
self-defeating.

[BA] I agree with Stuart that link layer hints may not be very useful. If
the host attempts verification of multiple MLNs, then it is likely to be
less sensitive to bad hints, and more likely to figure things out without
any hints at all.  This should be pointed out.

[Stuart]
The argument in Appendix A that you don't want to delay the normal DHCP
INIT-REBOOT process while waiting for DNA is bogus. There's no reason why
a host that wants rapid attachment to the network (which is, after all,
the whole point of DNA) wouldn't do both simultaneously, and then just
wait and see which approach yields a fruitful result quickest.

[BA] I agree that an implementation could choose to do both DHCP and
DNAv4 in parallel. This will be useful in circumstances where
hints are unreliable or the router may have gone down, or doesn't respond
for some reason. We should clarify this.

[Stuart]
2. Why only one MLN candidate?

Why limit a host to picking just one candidate network to verify?

The draft says, "In the absence of other information, the MLN defaults
to the network to which the host was most recently attached." Consider a
laptop computer that moves daily between Ethernet at work and Ethernet at
home. If each time it picks its most recently attached network as its
best guess, it's going to be wrong 100% of the time.

It would be much better if a device simply sent ten ARP Requests for the
last ten default gateway addresses it has seen, and then sees which, if
any, are answered.

If you don't want to hit the network with ten back-to-back wire-rate ARP
Requests, then they can be staggered 100ms apart, starting with the most
recently seen network, then the previous, and so on.
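
To make the staggering concrete, the send times could be computed as
in the Python sketch below. The function name and its defaults are
assumptions taken from the numbers in this message (ten gateways,
100ms apart); nothing here sends an actual ARP Request.

```python
def stagger_schedule(gateways_by_recency, interval_ms=100, limit=10):
    """Return (send offset in ms, gateway) pairs, most recently seen first.

    gateways_by_recency: gateway addresses ordered from most to least
    recently seen; only the first `limit` entries are scheduled.
    """
    return [
        (index * interval_ms, gateway)
        for index, gateway in enumerate(gateways_by_recency[:limit])
    ]

# Example: three remembered gateways are probed at 0ms, 100ms and 200ms.
schedule = stagger_schedule(["192.0.2.1", "198.51.100.1", "203.0.113.1"])
```

Setting interval_ms to 0 gives the back-to-back case, where all
requests go out at once.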

Most network gateways should be able to answer an ARP Request almost
instantaneously, since answering ARP Requests is something they have to
do all the time anyway, and if they're slow at that, then that would
adversely impact the performance of pretty much all IP traffic flowing
through the gateway.

This stream of ARP Requests would of course be going on concurrently with
DHCP INIT-REBOOT processing, and as soon as one of them yields a fruitful
result, the device should stop.

It's simple, it's fast, and it yields the desired result.

[BA] As Stuart points out, an implementation can test reachability to
multiple MLNs in parallel. We should clarify that it is
ok for the implementation to do that. Assuming that the number of MLNs
chosen is reasonable, I don't think it's necessary to rate limit.
