Re: shim6-proto-07 review

marcelo bagnulo braun <marcelo@it.uc3m.es> Thu, 28 December 2006 17:58 UTC

Cc: shim6-wg <shim6@psg.com>
To: Iljitsch van Beijnum <iljitsch@muada.com>

Hi Iljitsch,

sorry for the delay and thank you for the review...

I have sent a separate reply to the most substantial issues. for the 
remaining issues (which are many and not all of them editorial), i 
comment below...

On 11 Dec 2006, at 19:54, Iljitsch van Beijnum wrote:

> (Note that I'm behind on the shim6 list so I'm not aware of recently 
> discussed issues, so these may be duplicated here.)
>
> This is a review of the shim proto 07 draft. It contains both nits and 
> more fundamental issues, they are presented in the order of the text, 
> but let me get two other things out of the way first:
>
> 1. Always including the context tag
>

sent a separate email for this

>
> 2. Congestion
>
> There is no discussion of congestion issues when the shim moves 
> ongoing communication to another locator pair, which will generally 
> make the communication flow over a different path. We've had some 
> discussions about this before, where the suggestion was made to go 
> into slow start after a rehoming event. The counter argument: but 
> maybe the new path is just as fast as the old one. My counter counter 
> argument: suppose a file transfer over a 1 Gbps link, the Gbps link 
> goes down and the session is rehomed to a low speed link (GPRS, modem, 
> ADSL with limited uplink capacity). The send window used when the 
> session went along over the Gbps link will be so large that massive 
> congestion ensues, and also, all buffers will be filled up which 
> guarantees that the congestion will persist for a relatively long 
> time, possibly a handful of seconds.
>
> There are no easy answers here, but congestion control is one of the 
> core concerns in the development of the internet, so I don't think we 
> can get away with ignoring this completely.
>

i can see this is an issue...

what do you suggest to do here? could you send text?

>
>    o  Preserve established communications in the presence of certain
>       classes of failures, for example, TCP connections and UDP 
> streams.
>
> Shouldn't this be "communication"?

i don't think so...

it could be "Preserve an established communication" or "Preserve 
established communications"

i think the second one is better, because the first option reads as if 
we were talking about one specific communication...

>
>
>    o  Have minimal impact on upper layer protocols in general and on
>       transport protocols in particular.
>
> And applications.

ok

>
>
> Early in the text, the phrase "site multihoming" is used. There has 
> been some discussion on this list as to whether shim6 actually is site 
> multihoming, and readers of the draft may not know that all of this is 
> the result of wgs chartered to work on "site" multihoming. So I 
> suggest adding text to clear up any potential confusion, for example:
>
> "The shim protocol is a site multihoming solution in the sense that it 
> allows existing communication to continue when a site that has 
> multiple connections to the internet experiences an outage on a subset 
> of these connections or further upstream. However, shim processing is 
> performed in individual hosts rather than through site-wide 
> mechanisms."

ok, will add this... did you have a place in the draft in mind for 
this text?

>
>
>    Finally, this proposal also does not try to provide a new network
>    level or transport level identifier name space distinct from the
>    current IP address name space.
>
> The terms "identifier" and "locator" are used extensively even though 
> the shim is NOT an actual identifier/locator separation solution... 
> Suggested text (immediately following the sentence above):
>
> "The shim proposal doesn't fully separate the identifier and locator 
> functions that have traditionally been overloaded in the IP address. 
> However, throughout this document the term "identifier", or more 
> specifically, Upper Layer Identifier (ULID) refers to the identifying 
> function of an IPv6 address, and "locator" to the network layer 
> routing and forwarding properties of an IPv6 address."
>

ok will add that

>
>    solution.  While this document doesn't specify all aspects of this,
>    it is believed that the approach can be extended to handle the non-
>    routable address case..
>
> Extra period.

ok

> (Note that the quaint custom of inserting an extra space after a 
> sentence is generally "discouraged" in style manuals.)
>

it is only after each paragraph, not after each sentence afaict

>
>    the original locators become invalid at the same time and depending
>    on the time that is required to update the DNS and for those updates
>    to propagate.
>
> Why is the DNS relevant here?
>
>

because at least one valid locator must be published in the DNS in 
order for a shim host to be reachable, since the DNS is used as a 
rendezvous mechanism. OTOH, this is not specific to the shim6 protocol, 
but generic for any host... do you prefer that we remove the DNS 
consideration from here?

>    But IP addresses are also used as ULID,
>
> Addresses is plural, ULID singular... Probably make this "ULIDs".
>

ok

>
>    In the worst case we could end up with two separate hosts using the
>    same ULID while both of them are communicating with the same host.
>
>    This potential source for confusion is avoided requiring that any
>    communication using a ULID MUST be terminated when the ULID becomes
>    invalid (due to the underlying prefix becoming invalid).
>
> This makes me uncomfortable. How do you know that an address has 
> become terminally invalid, rather than accidentally unusable?
>

this was discussed during the last meeting in san diego and there was 
consensus that if a prefix is removed from the site, existing shim 
contexts that are using a ulid containing the prefix must be terminated 
when the prefix is removed.
however, what i read from your comment is that you don't have a problem 
with this idea; your problem is with how to implement it, right?



>  I contend that the distinction can't be made in a stack in a 
> meaningful way,

what about when the address becomes invalid?

>  so the above requirement will in practice only serve to disrupt 
> communication unnecessarily. Rather, I would require some 
> administrative "cooling off" period to avoid using the same ULID by a 
> different host (only possible with CGA not HBA anyway). For instance, 
> there must be 24 hours between decommissioning and recommissioning of 
> address space, and we garbage collect shim state after 24 hours of not 
> being used.
>

but what do you mean by the decommissioning of the prefix? wouldn't 
this be the time the prefix is removed from the site? if so, the 
current draft recommends that the existing shim6 contexts that are 
using ulids containing this prefix are terminated

in other words, i fail to see why this is easier to implement than what 
is currently in the draft. I mean, if the admin can determine the 
decommissioning time, it can also terminate the existing shim6 
contexts...

> I don't see how regular nomadic behavior will result in two hosts 
> using the same address in quick succession, and they can further 
> reduce the potential for problems by not using temporary addresses as 
> ULIDs.
>

i am not sure what part of the draft this comment refers to, but 
if a nomadic host creates a ulid using a temporary prefix and moves 
away, and then another nomadic host attaches itself to the same network 
and generates the same address, it is possible that the two nomadic 
hosts end up using the same address. Now, a ULID collision is worse 
than a locator collision, in particular because once the first 
nomadic host leaves the visited network, the locator becomes 
unreachable, so it can no longer be used, even if the host wants to 
use it. OTOH, if the address is being used as a ulid, the first nomadic 
host can keep on using it, even if it is no longer located in the 
network from which it obtained the address. so i guess the comment 
makes sense to me.

>
>    layer map to/from different locators.  The shim6 layer maintains
>    state, called ULID-pair context, per ULID pairs
>
> "Pairs" should probably be singular.
>

ok

>
>    fields, and even though those locators may be changed by the
>    transmitting shim6 layer. .
>
> Extra  .
>

ok

>    The result of this consistent mapping is that there is no impact on
>    the ULPs.  In particular, there is no impact on pseudo-header
>    checksums and connection identification.
>
>
> The problem here is that some intermediate system, such as a firewall 
> or a smart NIC, may take it upon itself to check the TCP or UDP 
> checksum and discard the packet if the checksum fails.

comments in a separate email

>
>
>    Inherent in a scalable multihoming mechanism that separates locators
>    from identifiers is that each host ends up with multiple locators.
>
> This says explicitly that we do id/loc...
>

yes i can rephrase it using the id locator function terminology that 
you suggested previously

>
>    This means that at least for initial contact, it is the remote peer
>    that needs to select which peer locator to try first.  In the case 
> of
>    shim6 this is performed by applying RFC 3484 address selection.
>
> This is incorrect: the application (or layer working on its behalf) 
> needs to select an initial ULID, which automatically becomes the 
> initial locator.

agree, will rephrase this

>
>
>    This document uses the terms MUST, SHOULD, RECOMMENDED, MAY, SHOULD
>    NOT and MUST NOT defined in RFC 2119 [1].  The terms defined in RFC
>    2460 [2] are also used.
>
> Please list them.

ok, will try

>
>
>    FQDN                Fully Qualified Domain Name
>
> Hm, if you don't know what FQDN is you probably also don't know what 
> it is when spelled out... How about adding "full DNS name"?
>

i guess the goal here is just to describe the acronym, not to provide a 
definition

>
>    document), such as having the ISPs relax there ingress filters, or
>    selecting the egress such that it matches the IP source address
>    prefix.
>
> There -> their
>

ok

>
>    o  Some heuristic on A or B (or both) determine that it is
>       appropriate to pay the shim6 overhead to make this host-to-host
>       communication robust against locator failures.  For instance, 
> this
>       heuristic might be that more than 50 packets have been sent or
>       received, or a timer expiration while active packet exchange is 
> in
>       place.  This makes the shim initiate the 4-way context
>       establishment exchange.
>
> Maybe say something like:
>
> "The purpose of this heuristic is to avoid setting up a shim context 
> when only a small number of packets is exchanged between two hosts."
>

ok, will add this
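
The heuristic quoted above could be sketched roughly as follows. This is 
only an illustration: the class and method names are hypothetical, and 
the threshold of 50 packets is the example value from the draft text, 
not a required constant.

```python
from collections import defaultdict

# Example threshold taken from the draft text ("more than 50 packets");
# an implementation is free to pick another value or use a timer instead.
PACKET_THRESHOLD = 50


class EstablishmentHeuristic:
    """Hypothetical per-ULID-pair packet counter."""

    def __init__(self, threshold=PACKET_THRESHOLD):
        self.threshold = threshold
        self.counts = defaultdict(int)

    def on_packet(self, local_ulid, peer_ulid):
        """Count one sent or received packet.

        Returns True once more than `threshold` packets have been seen
        for this ULID pair, i.e. when it may be worth paying the shim6
        overhead and starting the 4-way context establishment exchange.
        """
        key = (local_ulid, peer_ulid)
        self.counts[key] += 1
        return self.counts[key] > self.threshold
```

Short exchanges never cross the threshold, so no shim context is set up 
for them, which is the purpose stated in the suggested sentence.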

>
>       If the context establishment exchange fails, the initiator will
>       then know that the other end does not support shim6, and will
>       continue with standard unicast behavior for the session.
>
> Unicast? Shouldn't this be "single homed"?
>

will change to standard (non-shim6) behaviour, ok?

>
>    the message allocated.  Thus at a minimum the combination of <peer
>    ULID, local ULID, local context tag> have to uniquely identify one
>    context.
>
> I'm not sure if I understand this.
>
> More in general, the draft seems to suggest that the content of the 
> source address field in received packets may be ignored,

this is explicitly stated in the draft

actually the complete paragraph that the sentence above belongs to 
states:

    A context between two hosts is actually a context between two ULIDs.
    The context is identified by a pair of context tags.  Each end gets
    to allocate a context tag, and once the context is established, most
    shim6 control messages contain the context tag that the receiver of
    the message allocated.  Thus at a minimum the combination of <peer
    ULID, local ULID, local context tag> have to uniquely identify one
    context.  But since the Payload extension headers are demultiplexed
    without looking at the locators in the packet, the receiver will need
    to allocate context tags that are unique for all its contexts.  The
    context tag is a 47-bit number (the largest which can fit in an
    8-octet extension header).

as you can see, it is explicitly mentioned that the locators in the 
packet are not used for demux and this is the assumption that is used 
over the whole draft

actually, the sentence in this paragraph where the uniqueness of peer 
ulid, local ulid and context tag is required is only included to 
provide the context in which the design was made. The actual design 
choice that was made was that context tags are unique in a host.
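
A minimal sketch of that design choice, assuming a simple in-memory 
table: because tags are unique across the whole host, the receiver can 
look up the context by tag alone and ignore the locators in the packet. 
The class and method names are hypothetical, and using a random 47-bit 
value is just one way to pick a tag.

```python
import secrets

CONTEXT_TAG_BITS = 47  # size of the context tag stated in the draft


class ContextTable:
    """Hypothetical receiver-side table keyed by local context tag only."""

    def __init__(self):
        self._by_tag = {}

    def allocate(self, local_ulid, peer_ulid):
        """Pick a tag that is unique among ALL contexts on this host."""
        while True:
            tag = secrets.randbits(CONTEXT_TAG_BITS)
            if tag not in self._by_tag:
                break
        self._by_tag[tag] = (local_ulid, peer_ulid)
        return tag

    def demux(self, tag, src_locator=None, dst_locator=None):
        """Find the ULID pair for a received payload extension header.

        The locators are accepted as arguments but deliberately unused:
        demultiplexing relies on the context tag alone.
        """
        return self._by_tag.get(tag)
```

The lookup returns the same ULID pair whatever source and destination 
locators the packet carried, which is why the tag has to be unique per 
host rather than per ULID pair.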

have you identified other places in the draft where it is mentioned 
that the demux is performed taking into account the locators in 
addition to the context tag?

>  but also that this is not the case. This is a very important decision 
> with far reaching consequences so it should be made carefully. For 
> instance, if the source address may be rewritten arbitrarily, 
> obviously routers can easily do this without much or any coordination. 
> But the potential for security issues is significant in this case.
>

the security level provided by the shim6 protocol relies on the context 
tag. Any attacker needs to learn the context tag before being able to 
inject addresses into an established context. It was concluded that 
this is an acceptable security level and similar to the one available 
in today's internet... do you agree?
>
>    context.  But since the Payload extension headers are demultiplexed
>    without looking at the locators in the packet, the receiver will 
> need
>    to allocate context tags that are unique for all its contexts.
>
> See above.
>
>
>    context tag is a 47-bit number (the largest which can fit in an
>    8-octet extension header).
>
> "while preserving one bit to differentiate the shim signalling 
> messages from the shim header included in data packets, allowing both 
> to use the same protocol number."

ok
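
To illustrate the bit budget, here is a sketch of how a 47-bit tag plus 
one distinguishing bit can share an 8-octet header. The field order 
(8-bit Next Header, 8 unused bits, 1 flag bit, 47-bit tag) and the flag 
value are assumptions made for this example only; they are not taken 
from the text quoted in this thread.

```python
import struct

TAG_BITS = 47
TAG_MASK = (1 << TAG_BITS) - 1


def pack_payload_header(next_header, context_tag):
    """Pack an 8-octet header: 8 + 8 + 1 + 47 = 64 bits (assumed layout)."""
    if not 0 <= context_tag <= TAG_MASK:
        raise ValueError("context tag does not fit in 47 bits")
    word = ((next_header & 0xFF) << 56) | (1 << TAG_BITS) | context_tag
    return struct.pack("!Q", word)


def unpack_payload_header(data):
    """Inverse of pack_payload_header for the same assumed layout."""
    (word,) = struct.unpack("!Q", data[:8])
    return {
        "next_header": word >> 56,
        # Assumed convention: flag set means "payload header",
        # flag clear means "shim control message".
        "is_payload": bool((word >> TAG_BITS) & 1),
        "context_tag": word & TAG_MASK,
    }
```

With 16 bits taken by the first two octets and one bit reserved as the 
flag, 47 bits is what remains of the 64 for the tag.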

>
>
> 4.2 context forking
>
> Never been a fan of this, but it doesn't seem to add too much extra 
> complexity the way it is now.
>

fair enough...

>
>       Such discovery probably requires to be along the path in order to
>       be sniff the context tag value.
>
> Grammar: clause without subject. Who is required to be along the path?
>

will rephrase

>
>    dynamic.  For this reason there is a Update Request and Update
>    Acknowledgement messages, and a Locator List option.
>
> Grammar. "is a" -> "are" would be better.
>

ok

>
>    Even when the list of locators is fixed, a host might determine that
>    some preferences might have changed.  For instance, it might
>    determine that there is a locally visible failure that implies that
>    some locator(s) are no longer usable.  This uses a Locator
>    Preferences option in the Update Request message.
>
> I don't consider reachability status a preference...
>

the flags in the locator preference option, in particular the broken 
flag, are used for this.  I agree that this is not a preference, but 
the other information related to it is a preference, this flag needed 
to be included somewhere, and this seems the natural place...

>
>    Bidirectional Communication (FBD).  FBD uses a Keepalive message
>    which is sent when a host has received packets from its peer but has
>    not yet sent any packets from its ULP to the peer.
>
> No, this works per address (per locator even, not per ULID, IIRC), not 
> per ULP.
>

the meaning here is that FBD generates packets when the ULPs are 
silent. It doesn't mean that it generates packets per ULP, just that 
when all ULPs are silent, FBD generates packets. It may be rephrased as 
follows:

    FBD uses a Keepalive message
    which is sent when a host has received packets from its peer but has
    not yet sent any packets from any of its ULPs to the peer.

would this be better?


>
>    which precedes a routing header).  When tunneling is used, whether
>    IP-in-IP tunneling or the special form of tunneling that Mobile IPv6
>    uses (with Home Address Options and Routing header type 2), there is
>    a choice whether the shim applies inside the tunnel or outside the
>    tunnel, which affects the location of the shim6 header.
>
> How is this coordinated with the other side? If one side does 
> tunneling first and shim second and the other side the other way 
> around, there will be trouble. I don't see an easy way to avoid this.
>

the header order would be enough imho. I mean, when tunneling is 
performed, two IP layers are involved and each ip layer can have its 
own shim sublayer inside. So a shim header can be added for each ip 
layer, and the location of the shim header tells which ip layer it 
applies to. makes sense?

>
>    the control messages; only the payload extension header use the Next
>    Header field.
>
> uses
>
ok

>
>    Next Header:   8-bit selector.  Normally set to NO_NXT_HDR (59).
>
> So what happens when some other header follows the shim header? Could 
> this be used for attacks?
>

this is for shim6 control messages and so far we haven't defined means 
to perform piggybacking of other protocols in shim6 control messages, 
so it is not possible for other protocols to follow the shim header. If 
this is defined in future extensions, they will need to update this 
part of the spec

>
> About the different messages: they are very similar. If I were to 
> implement all of this, I would rather work with one basic structure 
> for all of the messages, even if the _meaning_ of some fields is 
> different as long as their structure is always the same. I think this 
> can easily be done here, by including fields that nearly all messages 
> need (simply leave it zero when a particular message doesn't need a 
> field) and use options for things that a particular message needs that 
> aren't accommodated in the unified structure.
>

i can clearly see the cost of this approach: additional overhead for 
unused fields and a potential source of confusion when fields change 
their name... what is the advantage of such an approach?

> Did I miss the place where HBA information is exchanged?
>

The CGA Parameter data structure option contains all the information 
required by the HBA

> update request: why is this a request?
>

because this is a two way protocol, update and request... (i do agree 
that it is not the best name in this case... suggestion?)

>
>    This message is sent in response to a Update Request message.  It
>    implies that the Update Request has been received, and that any new
>    locators in the Update Request can now be used as the source 
> locators
>    of packets.  But it does not imply that the (new) locators have been
>    verified to be used as a destination, since the host might defer the
>    verification of a locator until it sees a need to use a locator as
>    the destination.
>
> Hm, is it smart to defer verification here?

it depends on what verification we are talking about....

this is explained in detail in section 7.2.  Locator Verification

there are two types of verification: HBA/CGA verification is one type, 
and the other type is the one against flooding attacks, which is 
achieved using probe packets (of the path exploration protocol)

The spec in section 7.2 states that:

    Thus the HBA/CGA
    verification SHOULD be performed by the host before the host
    acknowledges the new locator, by sending an Update Acknowledgement
    message, or an R2 message.

    Before a host can use a locator (different than the ULID) as the
    destination locator it MUST perform the HBA/CGA verification if this
    was not performed before upon the reception of the locator set.  In
    addition, it MUST verify that the ULID is indeed present at that
    locator.  This verification is performed by doing a return-
    routability test as part of the Probe sub-protocol [9].

makes sense?
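
The two checks from section 7.2 can be pictured as two flags kept per 
peer locator, as in this sketch. The type and function names are 
hypothetical; only the rule itself (both checks before a non-ULID 
locator is used as destination) comes from the quoted text.

```python
from dataclasses import dataclass


@dataclass
class PeerLocator:
    """Hypothetical per-locator verification state."""
    address: str
    is_ulid: bool = False           # the locator is the peer ULID itself
    cga_hba_verified: bool = False  # HBA/CGA verification done
    probed: bool = False            # return-routability probe succeeded


def may_use_as_destination(loc):
    """Apply the MUSTs quoted above to one peer locator.

    The quoted rule only covers locators different from the ULID, so the
    ULID itself is accepted here without the extra checks.
    """
    if loc.is_ulid:
        return True
    return loc.cga_hba_verified and loc.probed
```

A locator acknowledged in an Update Acknowledgement would typically 
have only the first flag set; the probe flag is set later, when the host 
actually needs the locator as a destination.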

>  We've already said that the other end may use them as source 
> addresses. If there is a failure and we do the verification then, we 
> may find out that it fails and we have no reasonable course of action.
>
> Also, for CGA verification, don't we need to send the other side a 
> challenge to avoid replays?
>

separate email for this issue

>
>    direction.  When the ULP is sending bidirectional traffic, no extra
>    packets need to be inserted.
>
> This works per address pair, not per ULP.
>

this is similar to what we were discussing above...

the ulps are the ones sending traffic... maybe we can rephrase it as:

When the traffic generated by the ULPs results in a bidirectional flow 
of packets between the peers, no extra packets need to be inserted.

is this ok?
>
> 5.13.  Probe Message Format
>
>    This message and its semantics are defined in [9].
>
>    The idea behind that mechanism is to be able to handle the case when
>    one locator pair works in from A to B, and another locator pair 
> works
>    from B to A, but there is no locator pair which works in both
>    directions.  The protocol mechanism is that as A is sending probe
>    messages to B, B will observe which locator pairs it has received
>    from and report that back in probe messages it is sending to A.
>
> No, this is to test whether locator pairs work or not in the general 
> case.
>

agree, we went on describing the details without previously describing 
the main idea...

we can add your sentence at the beginning of the paragraph, resulting in:

    The goal of this mechanism is to test whether locator pairs work or
    not in the general case. In particular, this mechanism is able to
    handle the case when one locator pair works in from A to B, and
    another locator pair works
    from B to A, but there is no locator pair which works in both
    directions.  The protocol mechanism is that as A is sending probe
    messages to B, B will observe which locator pairs it has received
    from and report that back in probe messages it is sending to A.

would this be ok?

>
>    All of the TLV parameters have a length (including Type and Length
>    fields) which is a multiple of 8 bytes.
>
> Ugh, this is certainly enough to make a grown man cry... Why all of 
> this alignment silliness? BGP works pretty well without it.
>

i think this is common in IPv6 protocols, which are optimized for 
handling 64-bit units.... but maybe someone with more expertise can 
answer this one...

>
>    Consequently, the Length field indicates the length of the Contents
>    field (in bytes).  The total length of the TLV parameter (including
>    Type, Length, Contents, and Padding) is related to the Length field
>    according to the following formula:
>
>    Total Length = 11 + Length - (Length + 3) % 8;
>
> This is almost impossible to understand.
>
> First of all, this assumes familiarity with C or a similar language 
> from the reader to note that % is the modulo operation and that it 
> binds stronger than subtraction. As such, this would be an 
> improvement:
>
> Total Length = 11 + Length - ((Length + 3) mod 8)
>

ok

> However, the logic that underpins this is never spelled out, apart 
> from the requirement that all options be a multiple of 8 bytes long. 
> (Yes, _bytes_, not octets.)
>
> Text:
>
> "The Total Length of the option is the smallest multiple of 8 bytes 
> that allows for the 4 bytes of option header and the option itself. 
> The amount of padding required can be calculated as follows:
>
> padding = 7 - ((Length + 3) mod 8)
>
> And:
>
> Total Length = 4 + Length + padding"
>
>

great, thanks!!

will add this
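
As a quick check of the suggested text, the two formulas can be written 
out as below (function names are hypothetical). The sketch also lets one 
confirm that they agree with the original one-line formula in the draft.

```python
def tlv_padding(length):
    """Padding bytes needed after `length` bytes of option contents."""
    return 7 - ((length + 3) % 8)


def tlv_total_length(length):
    """Total option size: 4 bytes of option header, contents, padding."""
    return 4 + length + tlv_padding(length)


def tlv_total_length_draft(length):
    """The original formula from the draft, with % as the modulo operator."""
    return 11 + length - (length + 3) % 8
```

For example, a Length of 4 needs no padding and gives a total of 8 
bytes, while a Length of 5 needs 7 bytes of padding and gives 16.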

> I see no discussion of size issues. A single option can be made large 
> enough to push a packet beyond 1280 bytes. More realistically, this 
> will happen when multiple options are present. What happens in this 
> case? What is the largest option size and the largest shim packet size 
> implementations must be prepared to handle?
>

separate email for this

>


>    C:             Critical.  One if this parameter is critical, and 
> MUST
>                   be recognized by the recipient, zero otherwise.
>
> You can't force a receiver to recognize something...
>

in general, i think this means that if this bit is set and the receiver 
doesn't understand the content, then it must react accordingly. If the 
option is not critical, the receiver simply ignores it silently

i can rephrase it to express this....
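
The intended receiver behaviour might be sketched like this, combining 
the explanation above with the C=1 rule quoted just below. The names are 
hypothetical, and the sketch only classifies the option; actually 
sending the ICMP error is left out.

```python
from enum import Enum


class Action(Enum):
    PROCESS = "process"                # option understood, handle normally
    IGNORE_OPTION = "ignore-option"    # unknown, C=0: skip it silently
    REJECT_MESSAGE = "reject-message"  # unknown, C=1: do not process message


def classify_option(option_type, critical, known_types):
    """Decide what to do with one option of a received shim6 message."""
    if option_type in known_types:
        return Action.PROCESS
    if critical:
        # Per the draft text: the host SHOULD send an ICMP parameter
        # problem (type 4, code 1) and MUST NOT process the message.
        return Action.REJECT_MESSAGE
    return Action.IGNORE_OPTION
```

So the C bit does not force the receiver to recognize anything; it only 
selects between the two reactions to an option it does not recognize.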

>
>    o  If C=1 then the host SHOULD send back an ICMP parameter problem
>       (type 4, code 1), with the Pointer referencing the first octet in
>       the option Type field.  When C=1 the message MUST NOT be
>       processed.
>
> Why use ICMP for errors? Isn't it easier to define a shim error 
> message?
>  If the correspondent wants to fall back to some other way to set up 
> the shim having to intercept ICMP messages to make that happen is 
> pretty messy.
>
> More in general, most error conditions are handled by silently 
> dropping packets, however, which is a very bad idea because that way, 
> there is no difference between an error and lost messages.
>  So in some cases, a host may continue to resend the offending packet 
> because it doesn't know something went wrong. The main problem with 
> this approach is that you can't debug it from one end: you need to see 
> what happens on both ends to determine why something doesn't work.
>
> Silently dropping packets because of errors is the right approach for 
> security reasons in some cases, but I don't think this applies here. A 
> short error message with an error code and optionally a human-readable 
> message would be much better. As long as these error packets are 
> smaller than the packets that trigger them, there should be little or 
> no security impact, especially considering that we're prepared to talk 
> shim with the correspondent in question to begin with.
>

sent this issue in a separate email

>
>    The responder can choose exactly what input is used to compute the
>    validator, and what one-way function (MD5, SHA1)
>
> Or something else, I presume? So "(such as MD5 or SHA-1)"
>

ok

>
> About the locator option: how many locators are allowed?
>

there is no upper limit other than the ones that fit inside a packet 
with the other required options... As i mentioned above, roughly this 
seemed enough not to become a limitation

>
>       TEMPORARY: 0x02
>
>    The intent of the BROKEN flag is to inform the peer that a given
>    locator is known to be not working.  The intent of TEMPORARY is to
>    allow the distinction between more stable addresses and less stable
>    addresses when shim6 is combined with IP mobility, when we might 
> have
>    more stable home locators, and less stable care-of-locators.
>
> So this has nothing to do with RFC 3041 temporary addresses?

right

> In that case, a different name is probably better.
>

agree... what about "transient"?

>
>    o  For each peer locator, a bit whether it has been verified using
>       HBA or CGA, and a bit whether the locator has been probed to
>       verify that the ULID is present at that location.
>
> "Flag" rather than "bit"?

no preference here, so flag if you find it clearer

>
>
>    | E-FAILED            | Context establishment exchange failed
>
> How do we know this,

because we have retried several times with I1 and never got an R1 back

>  and is it necessary to explicitly take notice of this situation?

yes, so we don't try again for a while

makes sense?

>
>
>    | E-FAILED            | ULID(peer), ULID(local)                     
> |
>    |                     |                                             
> |
>    | NO-SUPPORT          | ULID(peer), ULID(local)
>
> How is ULID(local) relevant here?

because we have tried to establish a shim6 context for a given ULID 
pair.

>  We know there is connectivity (ULP is running)

this is not necessarily the case, we can try to establish the shim6 
context without having an ongoing ulp communication

>  so if we don't get any shim negotiation back or it fails, then this 
> situation can be attributed to the peer as a whole, not to the ULID 
> pair.

agree, but you don't know all the ULIDs of the peer, so i guess that 
caching that a shim6 context establishment exchange has failed for 
this ULID pair is the best you can do at this point...
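
A sketch of such a cache, keyed by ULID pair as discussed above. The 
class name and the hold-down time are hypothetical; the draft text 
quoted here does not say how long "a while" is.

```python
import time


class FailedEstablishmentCache:
    """Hypothetical negative cache for E-FAILED / NO-SUPPORT results."""

    def __init__(self, hold_down_seconds=60.0, clock=time.monotonic):
        # hold_down_seconds is an illustrative default, not a spec value.
        self._hold_down = hold_down_seconds
        self._clock = clock
        self._retry_after = {}

    def record_failure(self, local_ulid, peer_ulid):
        """Remember that the exchange failed for this ULID pair."""
        key = (local_ulid, peer_ulid)
        self._retry_after[key] = self._clock() + self._hold_down

    def may_retry(self, local_ulid, peer_ulid):
        """True if no failure is cached, or the hold-down has expired."""
        key = (local_ulid, peer_ulid)
        expiry = self._retry_after.get(key)
        if expiry is None or self._clock() >= expiry:
            self._retry_after.pop(key, None)
            return True
        return False
```

Because the key is the ULID pair, a failure towards one ULID of a peer 
says nothing about its other ULIDs, which the host may not even know.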
>
>
>    In all the cases the result is that the peer without state receives 
> a
>    shim message for which it has to context for the context tag.
>
> To -> no?

ok

>
>
>    case we can not use the recovery mechanisms since there needs to be
>    separate context tags for the two ULID pairs.
>
> Needs -> need
>

ok

>
> Regarding section 7.9: shouldn't there be checks to make sure that 
> seemingly duplicate packets contain the same information as the 
> earlier packets they are supposedly the duplicate of?
>

I am not following you here...

are we talking about an initiator that receives a duplicate R1 back or 
a receiver that receives a duplicate I1?

If a receiver receives a duplicate I1 it doesn't do anything special, 
it just replies with an R1. i don't see any problem here

If an initiator receives a duplicate R1 (this may be because the 
initiator has retried with multiple I1s), it will process the first one 
received and send an I2. When the second one is received, there will 
be no shim6 context in I1-SENT state, so it will be discarded. I do not 
see the point in verifying whether the other information in the packet 
is ok or not... what would be the point in doing this verification?

> What if validators don't match? Eventually this shouldn't be a problem 
> but I expect some initial trouble here because you're doing hashes 
> over a fairly large number of values, a small mistake somewhere means 
> the hash doesn't work, some feedback in the form of an error message 
> would be good.

you mean for the first packet or for the following (duplicated)  
packets? for the first packet, this is a specific case of the general 
problem of whether it is ok to silently discard wrong packets (for 
which i have started a new thread)
For the second case, i don't see any point in doing this...

>
>
> It occurs to me that there is nothing or very little in the protocol 
> that precludes shim negotiation using non-ULID addresses.

not sure what you mean by non ulid addresses....

>  We probably need a few minor tweaks to the reachability protocol to 
> also allow this, but then there is no fundamental reason to not allow 
> shim setup using non-ULID addresses, and by extension, unreachable 
> ULIDs = a separate identifier space.

if you mean that the shim protocol can support unreachable ULIDs like 
ULAs, then the answer is yes, it has been taken into account and should 
work fine with unreachable ulids...(you cannot do deferred setup but 
the spec supports it)

>  If it's this easy, we should definitely make sure there isn't some 
> minor obstacle somewhere,

this has been done, and i hope there are no obstacles. If you have 
identified some, please let me know

>  so that we can add this feature easily in the future when we've 
> worked out the additional issues such as locator discovery.
>
>
>
>    o  Where Ls(peer) has at least one locator in common with the newly
>       created or updated context.
>
> Why? I don't see how that buys us anything. Also, it's fairly trivial 
> to insert a bogus locator to meet the requirement that there is one in 
> common between the old and new sets.
>

this is not a security check. The goal of this verification is to find 
out whether we are in the context confusion situation i.e. if we have a 
single peer that is using the same context tag value for two different 
contexts. We can identify this situation because of the overlapping of 
at least one locator in the locator set (this means it is the same host 
in both contexts)
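
As a rough illustration of the check described above, here is a minimal 
sketch. The Context class, its field names and the function name are 
hypothetical and only meant to show the logic: same context tag plus at 
least one common locator in Ls(peer) means context confusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    # Hypothetical representation of a shim6 context (names are assumptions).
    peer_tag: int              # context tag associated with the peer
    peer_locators: frozenset   # Ls(peer), here as address strings


def is_context_confusion(old: Context, new: Context) -> bool:
    """Return True when both contexts use the same tag and share a locator.

    A shared locator means both contexts belong to the same host, so the
    same tag value is being used for two different contexts.
    """
    same_tag = old.peer_tag == new.peer_tag
    shared_locator = bool(old.peer_locators & new.peer_locators)
    return same_tag and shared_locator
```

Note that this only detects the situation; as said, it is not a 
security check.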

> And why verify whether the source address is in Ls(peer)? The security 
> mechanisms do all the checking we need.
>

could you point me to the text you are referring to? i mean, there is 
no such verification in section 7.15, which was the section the 
previous comment was referring to...

>
>       context.  In this case, we are in the Context confusion 
> situation,
>       and the host MUST NOT use the old context to send any packets.  
> It
>       MAY just discard the old context (after all, the peer has
>       discarded it), or it MAY attempt to re-establish the old context
>       by sending a new I1 message and moving its state to I1-SENT.  In
>       any case, once that this situation is detected, the host MUST NOT
>       keep two contexts with overlapping Ls(peer) locator sets and the
>       same context tag in ESTABLISHED state, since this would result in
>       demultiplexing problems on the peer.
>
> What if an attacker is trying to interfere with legitimate 
> communication? We must be VERY sure that the new shim messages come 
> from the same host as the one that created the existing state if we're 
> going to mess with that existing state.
>

separate email for this

>


> About the randomness of the context tag: I don't think we have to 
> require that the entire context tag random in a cryptographically 
> strong sense. If this makes implementation easier, why not allow an 
> implementation to use part of the CT to be used as a lookup key (which 
> is relatively easy to predict) as long as enough bits are really 
> random? In my opinion, 20 good random bits is enough here. Suggested 
> text (but no suggested place to put it):
>
> "It is important that context tags are hard to guess for off-path 
> attackers. Therefore, if an implementation uses structure in the 
> context tag to facilitate efficient lookups, at least 20 bits of the 
> context tag must be unstructured and populated by completely random 
> bits. For this purpose, bits derived from one of the generally used 
> one-way hash functions such as SHA-1 may be considered random."

yes, structure in context tags can be useful for other purposes as 
well, so i think your recommendation is ok
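
To make the suggestion concrete, a structured tag could be built 
roughly as follows. This is only a sketch: the 47-bit tag width is 
assumed here, and the split into a lookup key plus 20 random bits, as 
well as the function names, are hypothetical choices for illustration.

```python
import secrets

TAG_BITS = 47                       # assumed context tag width
RANDOM_BITS = 20                    # minimum unstructured bits from the suggested text
KEY_BITS = TAG_BITS - RANDOM_BITS   # bits left over for a predictable lookup key


def make_context_tag(lookup_key: int) -> int:
    """Build a tag whose high bits are a lookup key and low bits are random."""
    if not 0 <= lookup_key < (1 << KEY_BITS):
        raise ValueError("lookup key does not fit in the structured part")
    # secrets.randbits draws from the OS CSPRNG, so these bits are hard to guess.
    return (lookup_key << RANDOM_BITS) | secrets.randbits(RANDOM_BITS)


def lookup_key_of(tag: int) -> int:
    """Recover the structured part, e.g. to index a context table."""
    return tag >> RANDOM_BITS
```

An off-path attacker who can predict the lookup key still has to guess 
the 20 random bits.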

>
>
>    A host MUST silently discard any received Update Acknowledgement
>    messages that do not satisfy all of the following validity checks in
>    addition to those specified in Section 12.2:
>
>    o  The Hdr Ext Len field is at least 1, i.e., the length is at least
>       16 octets.
>
> Added bonus when the header structure is unified: no need to repeat 
> the above over and over throughout the text.
>
>
>    NO_R1_HOLDDOWN_TIME = 1 min
>
>    ICMP_HOLDDOWN_TIME = 10 min
>
> This seems rather short, basically a shim host talking to a non-shim 
> host would retry setting up the shim every minute or every 10 minutes 
> even though there is good reason to assume this won't be successful. 
> Something like several hours seems more appropriate. (And only when 
> packets are actively exchanged.)
>

well, NO_R1_HOLDDOWN_TIME is about how long we should wait before we 
retry when no R1 has been received. Note that this may be due to 
network outages that can be fixed in a short time. So, we have a good 
reason to assume the retry will be successful: someone fixed the 
network path :-)

in the other case, i agree, what about:

    NO_R1_HOLDDOWN_TIME = 5 min

    ICMP_HOLDDOWN_TIME = 60 min
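
As a sketch of how an implementation might apply these two values (the 
function name and the "no_r1"/"icmp" labels are hypothetical, not from 
the draft):

```python
# Proposed hold-down values from this message, in seconds.
NO_R1_HOLDDOWN_TIME = 5 * 60
ICMP_HOLDDOWN_TIME = 60 * 60

# Hypothetical labels for why the last setup attempt failed.
HOLDDOWN_BY_FAILURE = {
    "no_r1": NO_R1_HOLDDOWN_TIME,   # no R1 received, possibly a transient outage
    "icmp": ICMP_HOLDDOWN_TIME,     # ICMP error, peer probably does not speak shim6
}


def may_retry_setup(now: float, last_failure: float, failure_kind: str) -> bool:
    """Return True once the hold-down period for this failure kind has passed."""
    return now - last_failure >= HOLDDOWN_BY_FAILURE[failure_kind]
```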


>
>    network transit path.  Second, in case that IPSec is implemented as
>    Bump-In-The-Wire (BITW) [7] it is expected that the shim6 sub-layer
>    is also implemented in the same fashion.
>
> Not strong enough:
>
> "in case that IPSec is implemented as Bump-In-The-Wire (BITW) [7], 
> either the shim MUST be disabled, or the shim MUST also be implemented 
> as Bump-In-The-Wire, in order to satisfy the requirement that IPsec is 
> layered above the shim."
>

ok

>
>       could require a 2-way handshake "did you really loose the state?"
>       in response to the error message.
>
> lose
>

ok

>
>    o  The validator included in the R1 and R1bis packets are generated
>       as a hash of several input parameters.  However, most of the
>       inputs are actually determined by the sender, and only the secret
>       value S is unknown to the sender.  However, the resulting
>       protection is deemed to be enough since it would be easier for 
> the
>       attacker to just obtain a new validator sending a I1 packet than
>       performing all the computations required to determine the secret
>       S. However, it is recommended that the host changes the secret S
>       periodically.
>
> Too many howevers...
>

will try to rephrase
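
For illustration only, the idea of the quoted paragraph (a hash over 
sender-determined inputs plus a secret S known only to the responder) 
can be sketched as below. The use of SHA-256, the length-prefixed 
encoding and the function names are assumptions made for this example, 
not what the draft specifies.

```python
import hashlib
import os


def new_secret() -> bytes:
    """Generate a fresh secret S; calling this periodically rotates the secret."""
    return os.urandom(16)


def validator(secret: bytes, *sender_inputs: bytes) -> bytes:
    """Hash the secret S together with the sender-determined inputs.

    Each input is length-prefixed so that different field splits cannot
    produce the same byte stream (an assumption of this sketch).
    """
    h = hashlib.sha256(secret)
    for item in sender_inputs:
        h.update(len(item).to_bytes(2, "big"))
        h.update(item)
    return h.digest()
```

The responder can recompute the validator from the received packet and 
its current S, so it does not need to keep per-initiator state.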

>
>    o  Study whether a host explicitly fail communication when a ULID
>       becomes invalid (based on RFC 2462 lifetimes or DHCPv6), or 
> should
>       we let the communication continue using the invalidated ULID (it
>       can certainly work since other locators will be used).
>
> Some kind of grammar problem, not obvious to me what is meant here.
>

we have discussed this problem and already agreed on terminating the 
communication, so i will remove this paragraph

>
> Appendix B.  Simplified State Machine
>
>    The states are defined in Section 6.2.  The intent is that the
>    stylized description below be consistent with the textual 
> description
>    in the specification, but should they conflict, the textual
>    description is normative.
>
> Haven't looked at this.
>
>
>    that the Flow Label carries context information as proposed in the
>    now expired NOID draft. .
>
> Extra  .

ok

>
>
>    It may happen, that later on, one of the hosts, e.g.  Host A looses
>    the shim context.
>
> loses
>

ok

>
>    Mechanisms for detecting context. loss
>
> Extra word?
>
>

    Mechanisms for detecting context loss.

:-)

> There are discussions in the appendixes, maybe make this a separate 
> document?
>

i am ok either way....

>
> The Locator List Option Format only specifies two verification methods 
> at this time: CGA or HBA. What about the case where a locator can be 
> verified using either CGA or HBA? Maybe it makes more sense to have 
> each method be a bit so they can be present or absent independently.
>

if this turns out to be useful we could define a verification code that 
would mean CGA or HBA, but i don't see why this would be useful... do you?

>
>    approach eliminates the possibility of a context confusion situation
>    because premature garbage collection, but it does not prevents
>
> prevent
>
ok

>
>    [9]  Arkko, J. and I. Beijnum, "Failure Detection and Locator Pair
>         Exploration Protocol for IPv6  Multihoming",
>
> Please make this "I. van Beijnum"
>

will try if the xml tool allows me to....

>
> Note to self: look at implications of the fact that keepalive and 
> probe messages (as defined here) don't trigger R1bis in the 
> reachability draft.
>