Re: Comments on draft-ietf-shim6-proto-00.txt

Erik Nordmark <erik.nordmark@sun.com> Fri, 07 October 2005 15:18 UTC

Envelope-to: shim6-data@psg.com
Delivery-date: Fri, 07 Oct 2005 15:30:29 +0000
Message-ID: <434691B0.1060408@sun.com>
Date: Fri, 07 Oct 2005 08:18:08 -0700
From: Erik Nordmark <erik.nordmark@sun.com>
User-Agent: Mozilla Thunderbird 1.0.6 (X11/20050720)
MIME-Version: 1.0
To: Jari Arkko <jari.arkko@piuha.net>
CC: shim6 <shim6@psg.com>
Subject: Re: Comments on draft-ietf-shim6-proto-00.txt
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit

Jari Arkko wrote:

> Technical:
> 
> Missing a discussion on the relationship of the shim6
> processing wrt other processing that is taking place
> at the IP layer, including at least IPsec but probably
> also Mobile IPv6. I know from past experience that
> it's quite hard to define the relationships and processing
> order correctly, and I suspect that it's increasingly hard
> for shim6.

That's covered in abbreviated form (the string "TBD" :-) )

Seriously, the intent is to have the same placement of the shim as we've
discussed before, and flesh out the details a bit more than in the
l3shim draft.

> Also, what is the relationship of Shim6 processing to
> things in the host that depend on literal addresses,
> such as IPsec policies?

With a l3shim below the IP endpoint sublayer, the IPsec policies operate
on ULIDs.

> Another missing discussion: the document refers to
> SCTP as if it would be obvious how it can use Shim6. I'm
> not sure that's the case. Or at least it's not obvious to me :-)

I suspect we don't need to do anything special to "not break" SCTP. But
there are choices whether SCTP would see all locators as ULIDs or not.
(If it sees all of them, then SCTP might fail over between ULIDs while
the shim underneath takes a ULID-pair context and fails it over to
another locator pair, which is probably a bit suboptimal.)
But I think it would make sense to put this type of discussion in a
short separate draft, since this makes it easier to get feedback from
the SCTP folks (and transport folks in general.)

> Yet another missing discussion: is there some interaction
> with this protocol and the protocol defined in Marcelo's
> draft that talked about communication with non-shim6
> peers? It would appear that some aspects (e.g. input from
> RAs) are common.

That relates to reachability detection and exploration, which I think
the WG decided to put in a separate draft. It probably makes sense to
keep the packet formats in one place, but define the detailed behavior
for those items in a separate one.

> And one more: the document is relatively silent on
> (un)reachability detection mechanisms beyond shim6-based
> probing. We do have ND(NUD), L2, etc. mechanisms that
> should be taken into account. If your L2 tells you that it's
> lost the connection, there's no point in probing at L3,
> we need to find another interface!

See above.

>>   multihoming can be provided for IPv6 with failover and load spreading
>>   properties
>>
> I'm a bit concerned that we have not figured out all the details
> regarding load spreading.

FWIW this text is what I named level (0) in another email today.
But the text needs to be fleshed out.

>>   In addition, the non-shim6 messages, which we call payload packets,
>>   will not contain the ULIDs after a failure.  This introduces the
>>   requirement that the <peer locator, local locator, local context tag>
>>   MUST uniquely identify the context.  Since the peer's set of locators
>>   might be dynamic the simplest form of unique allocation of the local
>>   context tag is to pick a number that is unique on the host.  Hosts
>>   which serve multiple ULIDs using disjoint sets of locators can
>>   maintain the context tag allocation per such disjoint set.
>>  
>>
> Not sure if this always needs to be true.
> It might be that the shim6 failover protocol signaling
> is used to tell the peer what the new locators are. If
> that's the case, then the local context tag alone does
> not need to be unique, you could rely also on the
> addresses.

Yes, the language was meant more as an example than a requirement.

But I'm not sure what the actual requirement is. For example,
  - the packets sent from A to B for ULID A1 to B1 use context tag X
    B knows that A has locators A1, A2.
  - there is a request from ULID C1 to B1, which makes B allocate a new
    context tag. Suppose B chooses tag X.
  - Later C1 tells B that it has locators A1, A2, C1.
    Now B has a potential problem, since later the C1 to B1 context might
    switch to using the same locators as the A1 to B1 context.

It seems like B must know all the possible future locators of a peer
before allocating a context tag, *or* B must have a way of refusing a
peer to use a locator for a context in order to avoid confusion.
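
To make the collision concrete, here is a rough Python sketch of the
second option (B refusing a locator update). It is an illustration
only: ContextTable, add() and update_peer_locators() are made-up names,
not anything from the draft, and it assumes B demultiplexes purely on
<peer locator, local locator, context tag>.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Context:
    peer_ulid: str
    local_ulid: str
    tag: int
    peer_locators: Set[str] = field(default_factory=set)
    local_locators: Set[str] = field(default_factory=set)


class ContextTable:
    """Hypothetical per-host table of shim contexts (not from the draft)."""

    def __init__(self) -> None:
        self.contexts: List[Context] = []

    def _clashes(self, ctx: Context, peer_locators: Set[str]) -> bool:
        # Two contexts are ambiguous if some
        # <peer locator, local locator, tag> triple could select either.
        return any(
            other is not ctx
            and other.tag == ctx.tag
            and bool(other.peer_locators & peer_locators)
            and bool(other.local_locators & ctx.local_locators)
            for other in self.contexts
        )

    def add(self, peer_ulid: str, local_ulid: str, tag: int) -> Context:
        # Initially the only known locators are the ULIDs themselves.
        ctx = Context(peer_ulid, local_ulid, tag, {peer_ulid}, {local_ulid})
        if self._clashes(ctx, ctx.peer_locators):
            raise ValueError("tag already in use for these locators")
        self.contexts.append(ctx)
        return ctx

    def update_peer_locators(self, ctx: Context,
                             locators: Iterable[str]) -> bool:
        """Accept the peer's new locator set, or refuse it if ambiguous."""
        new_set = set(locators)
        if self._clashes(ctx, new_set):
            return False
        ctx.peer_locators = new_set
        return True


# The scenario from the list above, seen from host B.
X = 42
table = ContextTable()
a_ctx = table.add("A1", "B1", X)
table.update_peer_locators(a_ctx, {"A1", "A2"})
c_ctx = table.add("C1", "B1", X)          # B happens to reuse tag X
accepted = table.update_peer_locators(c_ctx, {"A1", "A2", "C1"})
print("C1's locator update accepted:", accepted)
```

The other option (knowing all future locators before picking a tag)
would move the same check into add().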

So a question is whether we need a protocol mechanism by which a host
can refuse to let a peer use some of its locators in the locator set
for the context.

> Also, there seems to be security issue in using just
> the context tag to do the demux. (Or is there some
> crypto hash somewhere too?) If I learn or guess
> your tag, does that mean that I can start sending
> traffic that appears to come from you, even if I
> use a different source IP and my host is under
> ingress filtering restrictions?

The context lookup on the receiver is based on <source locator,
destination locator, context tag>, thus this is stronger than today
where it is sufficient to spoof the source address; you also need to
find the right context tag to be able to inject packets.
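
As a sketch of what that lookup amounts to (names such as install()
and lookup() are hypothetical, and the addresses are documentation
prefixes chosen for illustration):

```python
import ipaddress
from typing import Dict, NamedTuple, Optional, Tuple


class DemuxKey(NamedTuple):
    src_locator: ipaddress.IPv6Address
    dst_locator: ipaddress.IPv6Address
    context_tag: int


# Maps the demux key to the ULID pair that the ULP should see.
contexts: Dict[DemuxKey, Tuple[str, str]] = {}


def _key(src: str, dst: str, tag: int) -> DemuxKey:
    # IPv6Address normalizes different textual spellings of one address.
    return DemuxKey(ipaddress.IPv6Address(src),
                    ipaddress.IPv6Address(dst), tag)


def install(src: str, dst: str, tag: int, ulids: Tuple[str, str]) -> None:
    contexts[_key(src, dst, tag)] = ulids


def lookup(src: str, dst: str, tag: int) -> Optional[Tuple[str, str]]:
    """Return the ULID pair, or None if no context matches all three."""
    return contexts.get(_key(src, dst, tag))


install("2001:db8:a::2", "2001:db8:b::1", 0x1234,
        ("2001:db8:a::1", "2001:db8:b::1"))

print(lookup("2001:db8:a::2", "2001:db8:b::1", 0x1234))  # matching packet
print(lookup("2001:db8:a::2", "2001:db8:b::1", 0x9999))  # wrong tag
print(lookup("2001:db8:e::5", "2001:db8:b::1", 0x1234))  # wrong source
```

An attacker has to get all three fields right, so guessing the tag
alone from a different source address does not match any context.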


> I'd like to understand this better. Is the information flow
> always from the host to the middleboxes and routers?
> If yes, then the host can keep its choices about the flow
> labels. If not, then we may have a problem.

I'm not sure if RFC 3697 specifies that level of constraint for new flow 
signaling mechanisms. For the unsignaled traffic the host can pick 
whatever flow label it wants.

I think RSVP assumes that the host would provide the flow label (as it 
provides IP addresses and port numbers to specify an IPv4 flow).
Any idea if NSIS is different?

> In a crash followed by reboot it does not help that shim6
> recovers the state. TCP and SCTP state is gone anyway.
> For UDP and ICMP shim6 recovery would help.

On the contrary, I think it is required, so that in the TCP case there
can be a reset going back.

If the sender is using shim state and locators different than the ULIDs, 
but the receiver has lost the shim state, and there is no indication in 
the received packet that it must be processed by the shim, then IP layer 
will just pass it up to the ULP. Since the pseudo-header checksum is 
different, this will result in a ULP checksum error.

So without *any* recovery mechanism, TCP would send and never hear a RST 
back. If TCP then times out and gives up, even if it creates a new 
connection (between the same ULID pairs), the sender would still apply 
the shim to the TCP SYN packets, and the same checksum error would be 
the result.
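
The checksum mismatch can be illustrated with a small Python sketch. It
assumes the usual IPv6 pseudo-header layout (source, destination,
32-bit upper-layer length, three zero bytes, next header) and uses
made-up addresses and an all-zero 20-byte dummy segment; the function
names are mine.

```python
import ipaddress
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum with end-around carry."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: str, dst: str, length: int, next_header: int) -> bytes:
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", length, next_header))


def ulp_checksum(src: str, dst: str, next_header: int, segment: bytes) -> int:
    data = pseudo_header(src, dst, len(segment), next_header) + segment
    return ~ones_complement_sum(data) & 0xFFFF


TCP = 6
segment = bytes(20)              # dummy TCP header, checksum field zero

ULID_A = "2001:db8:a::1"         # what the sending ULP used
LOCATOR_A = "2001:db8:c::1"      # what is on the wire after failover
ULID_B = "2001:db8:b::1"

sent = ulp_checksum(ULID_A, ULID_B, TCP, segment)
# A receiver that has lost its context sees the locator as the source
# and verifies against that instead of the ULID.
seen = ulp_checksum(LOCATOR_A, ULID_B, TCP, segment)
print(f"sender computed {sent:#06x}, receiver expects {seen:#06x}")
```

The two values differ, so the segment is dropped as corrupt before TCP
ever gets a chance to answer with a RST.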

So for the crash and reboot case we need some mechanism that can be used 
to indicate that the context state is gone.
Perhaps a "do you still have the context" every minute or so when other 
packets are being sent?

But there is another potential reason for the protocol field 
overloading, which I haven't fully worked out.

With an overloaded protocol field, the order in which the receiver does 
its work is specified in the chain of protocol fields/extension headers 
in the packet. (In effect, the overloaded "udp over shim6" protocol 
field acts as an indication that there is a zero length shim6 extension 
header followed by a UDP header.) If we don't have this, then at what 
point in time would the receiver look for a context based on the <source 
locator, destination locator, flow label>? (If there might be routing 
header type 2 or home address option, it might be critical to do this 
lookup at the correct place.)
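
A toy Python sketch of that ordering argument follows. The two
"over shim6" protocol values are invented for illustration (no such
values are assigned); 43, 6 and 17 are the assigned values for the
routing header, TCP and UDP.

```python
ROUTING_HEADER = 43
TCP = 6
UDP = 17

# Hypothetical overloaded protocol values, NOT assigned by anyone.
TCP_OVER_SHIM6 = 241
UDP_OVER_SHIM6 = 242
OVERLOADED = {TCP_OVER_SHIM6: TCP, UDP_OVER_SHIM6: UDP}


def processing_order(chain):
    """List receiver steps in the order the next-header chain dictates."""
    steps = []
    for value in chain:
        if value in OVERLOADED:
            # Acts like a zero-length shim6 header followed by the ULP.
            steps.append("shim6 context lookup")
            steps.append(f"deliver to protocol {OVERLOADED[value]}")
        elif value in (TCP, UDP):
            steps.append(f"deliver to protocol {value}")
        else:
            steps.append(f"process extension header {value}")
    return steps


for step in processing_order([ROUTING_HEADER, TCP_OVER_SHIM6]):
    print(step)
```

Here the position of the overloaded value in the chain fixes when the
context lookup happens relative to the routing header. Without it, the
receiver has no in-packet indication of where to do the lookup.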


>>   Thus with 7 or so additional protocol field values we can do a
>>   reasonable job of overloading the flow label field and get detection
>>   of lost context state.
>>  
>>
> This is primarily an optimization. I wonder if it would make
> sense to limit the optimization to the 80% useful case which
> to me would be TCP, ESP, UDP, or even less. No sense in
> optimizing ICMP messages, I think.

I guess we first need to decide whether we want this protocol field 
overloading, or whether we do an 8 byte extension header when the 
locators are not the ULIDs.

> It isn't clear to me how this would be done. Presumably
> it's the application that is going through the IPs retrieved
> from DNS, not the IP layer. Is this something that we
> need to handle?

I think we need to say how failure during initial communication is
handled. I'm not certain of the extent to which we need to optimize this
case, hence I don't know if the shim needs to be involved at all.


> It might be possible to combine these functions. But let's
> do the individual design first for each, and combine later
> if possible.
> 
> Also, the assistance from payload packets in the explore
> phase is not discussed.

Ack.

> Finally, I hope everyone is OK with the design that is
> extremely tightly integrated with IPv6. I know it's
> in our charter, but I still worry about it since I see
> so many things that eventually had to work on both
> v6 and v4 (people working on mobile IP now to run on both,
> IPsec has done so for a very long time, IP multimedia
> systems had to run on v4 too, etc). We won't be able
> to do this for Shim6.

We don't have an approach to securing IPv4 since HBA/CGA doesn't apply.
(The only thing that multi6 discussed that would fit that would be to 
use the NOID-style DNS verification.)

So the ULIDs have to be IPv6 addresses.
For CGA, perhaps there can be locators that are IPv4-mapped addresses 
(i.e., just IPv4 addresses).

    Erik