Re: [dna] Issues with dna-simple-14

Hi Erik,
  I am in the process of producing version -15. I will get back to you with the proposed changes and try to address your comments in this revision.

Thanks
Suresh
________________________________________
From: dna-bounces@ietf.org [dna-bounces@ietf.org] On Behalf Of Erik Nordmark [erik.nordmark@oracle.com]
Sent: Thursday, April 01, 2010 12:56 AM
To: dna@ietf.org
Subject: [dna] Issues with dna-simple-14

Summary:

While I think I can implement this protocol based on the IETF
conversations, I think it would be hard to get a correct and
interoperable implementation based solely on the specification.

Most of the issues are around the clarity of the spec, but there are
some things that are missing and one thing which makes no sense.

Section 3 defines "operable IPv6 address", but this term is barely used
in the document. The descriptions would be more exact if the document
actually used the term, and also explained the implications of having an
inoperable address. For instance, I think the intent is that an
inoperable address not be configured on an interface, hence it would not
be used for non-DNA traffic. (At least that is what the pseudo-code
comments lead me to believe.)

Section 4.3 and the pseudo-code says that all addresses should be marked
deprecated. That doesn't make sense because a deprecated address can
still be used, thus can have issues with DAD when the host has moved to
a different network. I think the right description is that all addresses
except the link-local one should be made inoperable in section 4.3.
(That matches the subsequent text that talks about when an address is
made operable.)

---

Section 2.4 first paragraph is confusing. It would make more sense to say:
    Detecting Network Attachment is performed by hosts after detecting a
    link-layer "up" indication.  The host simultaneously sends unicast
    Neighbor Solicitations (NSs) and multicast Router Solicitations
    (RSs) in order to determine whether previously encountered routers
    are present on the link, and if they are not, acquire the new
    configuration information.

Section 2.4 last paragraph talks about MAX_RA_DELAY_TIME, but that is
not the only source of delay. RFC 4861 also has
    In addition, consecutive Router Advertisements
    sent to the all-nodes multicast address MUST be rate limited to no
    more than one advertisement every MIN_DELAY_BETWEEN_RAS seconds.
Thus 3.5 seconds is the worst-case delay in the absence of packet loss.
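
To make the 3.5 second figure concrete, here is a small sketch of the
arithmetic. It assumes the RFC 4861 default values (MAX_RA_DELAY_TIME of
0.5 seconds, MIN_DELAY_BETWEEN_RAS of 3 seconds); the function name and
its parameter are only illustrative and do not come from the draft.

```python
# RFC 4861 router constants, in seconds (default values).
MAX_RA_DELAY_TIME = 0.5
MIN_DELAY_BETWEEN_RAS = 3.0


def worst_case_ra_delay(seconds_since_last_multicast_ra):
    """Upper bound on how long a router may wait before answering an RS
    with a multicast RA, ignoring packet loss.

    If the router sent a multicast RA less than MIN_DELAY_BETWEEN_RAS
    seconds ago, it has to wait out the rest of that interval, and then
    adds a random delay of at most MAX_RA_DELAY_TIME.
    """
    rate_limit_wait = max(0.0, MIN_DELAY_BETWEEN_RAS - seconds_since_last_multicast_ra)
    return rate_limit_wait + MAX_RA_DELAY_TIME


# Worst case: the router sent a multicast RA just before the RS arrived.
print(worst_case_ra_delay(0.0))
# No recent multicast RA: only the random delay applies.
print(worst_case_ra_delay(10.0))
```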

Section 4.1.1 appears to have a SHOULD for a particular implementation
technique (a periodic timer), which is inappropriate. And it is missing
text that specifies what happens when a particular router stops
advertising a prefix, while other routers continue to advertise that
prefix. Both those can be addressed by rephrasing the text to be:
    The host SHOULD maintain the SDAT table by removing entries when the
    valid lifetime for the prefix and address expires, that is, at the
    same time as the prefix is removed from the Prefix List in [RFC4861].
    But a host SHOULD also remove a router from a SDAT entry when that
    router stops advertising a particular prefix. When three consecutive
    RAs from a particular router have not included a prefix, then the
    router should be removed from the corresponding SDAT entry.
    Likewise, if a router starts advertising a prefix for which there
    already exists a SDAT entry then that router should be added to the
    SDAT entry.

    A host MAY maintain SDAT entries from some number of previously
    visited networks.  When the host attaches to a previously unknown
    network it might need to discard some older SDAT entries.
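
The router add/remove rule in the proposed text could be sketched roughly
as follows. This is only an illustration: the SdatEntry layout, the
process_ra function and the MISSED_RA_LIMIT name are assumptions made for
the example, not something the draft defines. Removal of whole entries on
lifetime expiry is deliberately left out.

```python
from dataclasses import dataclass, field

MISSED_RA_LIMIT = 3  # "three consecutive RAs" from the proposed text


@dataclass
class SdatEntry:
    """Hypothetical SDAT entry: one prefix plus the routers advertising it."""
    prefix: str
    # router address -> number of consecutive RAs from it lacking the prefix
    missed: dict = field(default_factory=dict)


def process_ra(sdat, router, advertised_prefixes):
    """Update router membership of existing SDAT entries for one received RA.

    sdat maps prefix -> SdatEntry. Entries are neither created nor expired here.
    """
    for prefix, entry in sdat.items():
        if prefix in advertised_prefixes:
            # Router advertises the prefix: add it, or reset its miss count.
            entry.missed[router] = 0
        elif router in entry.missed:
            # Known router sent an RA without the prefix.
            entry.missed[router] += 1
            if entry.missed[router] >= MISSED_RA_LIMIT:
                del entry.missed[router]
```

With this shape, a second router that starts advertising an already known
prefix is added on its first RA, and a router is dropped from the entry
only after three RAs in a row that omit the prefix.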

The second paragraph in 4.4 says that a unicast NS is sent to each
router, but it fails to say that the link-layer address in the SDAT
entry should be used.

The second paragraph in section 4.4 doesn't seem to take into account a
SDAT entry having multiple routers. If a host has SDAT entries for e.g.,
four previously visited networks, each having three routers advertising
the same (set of) prefix(es), then it doesn't make sense to probe all
the routers that advertised a particular prefix, but instead try to
"spread" the probes across SDATs for different prefixes by first sending
a NS to one router for each prefix. Perhaps that can be addressed by
loosening up the language in that paragraph and adding text to the end
of the section after the text about "6".
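
One way to "spread" the probes, sketched under assumptions: the
choose_probe_targets function and the dictionary layout are made up for
this example, and the limit of 6 is taken from the text about "6" in the
draft. The sketch picks one router per prefix first, then a second router
per prefix, and so on, and carries the router's link-layer address from
the SDAT entry along with each target.

```python
from itertools import zip_longest

MAX_NS_PROBES = 6  # the limit of six NSs mentioned in the draft


def choose_probe_targets(sdat_routers, limit=MAX_NS_PROBES):
    """Pick unicast NS targets round-robin across SDAT entries.

    sdat_routers maps prefix -> list of (router_ip, router_lladdr) tuples.
    Returns at most `limit` tuples; a router listed under several prefixes
    is probed only once.
    """
    targets = []
    seen = set()
    # Each round takes the n-th router of every prefix (None when exhausted).
    for round_of_routers in zip_longest(*sdat_routers.values()):
        for router in round_of_routers:
            if router is None or router in seen:
                continue
            seen.add(router)
            targets.append(router)
            if len(targets) == limit:
                return targets
    return targets
```

For four previously visited networks with three routers each, this probes
one router from every network before probing a second router from any of
them.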

Section 4.4 and elsewhere use the term "test node" but that term isn't
defined anywhere. Given that it is the same as the router selected from
the SDAT entry I don't see why we need to define a unique and different
term for this. The particular text in 4.4 that uses "test node" is
redundant with section 4.1.1 (but there is a difference wrt SHOULD vs.
MUST - perhaps 4.1.1 needs to say MUST.)

Typo: s/learnt/learned/

Section 4.5.1 says the SLLA SHOULD be included in the NS. If the
link-local address is a duplicate that will override the NCE in the
router. Can't we assume that the router retains a NCE for the host and
omit the SLLA?

Section 4.7 talks about a "responding Neighbor Advertisement" but I
think the well-defined term to use is a "solicited Neighbor
Advertisement". I think the use of "test node" is superfluous in that
section.

Section 4.7 uses the vague language "utilizes the addresses" and also
"the detected network", when it would be more accurate to say "mark the
addresses as operable" and "the responding router", respectively. (The
protocol doesn't pretend to detect a network/link, but merely detect
whether an old router is present.)

Section 4.7 is underspecified in that it doesn't state what should
happen when a RA is received from an unknown router (not in any SDAT
entry) and no prefix overlap. What is the intended behavior?

The last paragraph in section 4.7 about REACHABLE, and the corresponding
text in the pseudo-code is in conflict with RFC 4861, and I don't think
that is the intent. The 4861 behavior is to not change the state for RAs
since we can't tell that an RA is solicited hence we don't know there is
bidirectional reachability.
The 4861 state table has these two relevant entries:
    !INCOMPLETE     NS, RS, RA, Redirect    Update link-layer     STALE
                    Different link-layer    address
                    address than cached.

    !INCOMPLETE     NS, RS, RA, Redirect    -                   unchanged
                    Same link-layer
                    address as cached.

While one could try to improve 4861 by defining a "solicited RA" as one
that is unicast to the host's IP address, that seems counter to keeping
DNA simple.
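
The two table rows quoted above can be read as the following sketch. The
function name and the use of plain strings for states are assumptions
made for the example; only the two quoted rows are modelled.

```python
def nce_state_after_packet(state, cached_lladdr, received_lladdr):
    """Apply the two quoted RFC 4861 rows to a Neighbor Cache Entry.

    Covers an NS, RS, RA or Redirect that carries a link-layer address,
    for an entry that is not INCOMPLETE. Returns (new_state, new_lladdr).
    """
    if state == "INCOMPLETE":
        raise ValueError("the quoted rows only cover entries that are not INCOMPLETE")
    if received_lladdr != cached_lladdr:
        # Different link-layer address than cached: update it, go to STALE.
        return "STALE", received_lladdr
    # Same link-layer address as cached: state is unchanged.
    return state, cached_lladdr
```

Note that neither branch ever produces REACHABLE: an RA with the same
link-layer address leaves a STALE entry STALE.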

The intent of the text about SEND_NA_GRACE_PERIOD in section 4.7 makes
sense. But the pseudo-code doesn't behave that way. As soon as a RA is
received in the pseudo-code, then the router is added to
IsRouterOnNAIgnoreList, thus a subsequent SEND-protected NA will be
ignored. I don't have a good suggestion on how to fix the pseudo-code here.

Section 5 says
        /* Only for RAs received as response to DNA RSs */
but I actually don't know how to test for that. Is the intent to only do
this for unicast RAs? That would be odd because RFC 4861 doesn't require
routers to ever unicast RAs.

Section 5 doesn't include the Router L2 address in the call to
Send_Neighbor_Solicitation.

Section 5 doesn't include the limit of at most 6 NSs.

Section 5 says
            /* If address is configured on interface, remove it */
but I thought all inoperable addresses need to be removed after the
link-up. (Otherwise, the previous comment about "address is operable"
makes no sense.)

Section 5 says
        /* Only for NAs received as response to DNA NSs */
but I don't know how to test for this.
Should it (and the above similar RA test) only apply when we have some
inoperable addresses on the interface? I don't think so since we want to
handle conflicting RAs after we have received a NS that made the last
address operable. Thus I'm at a loss about the intent of those two comments.

Section 6 has a definition of SEND_NA_GRACE_TIME which doesn't match the
earlier text. As I understand it, it is the time from when a non-SEND RA
arrives (that conflicts with a SEND NS) until we declare the RA a
winner. Is that correct?

    Erik

_______________________________________________
dna mailing list
dna@ietf.org
https://www.ietf.org/mailman/listinfo/dna