Re: [Fwd: I-D ACTION:draft-nordmark-shim6-esd-00.txt]

marcelo bagnulo braun <marcelo@it.uc3m.es> Tue, 07 March 2006 08:18 UTC

Cc: shim6 <shim6@psg.com>
From: marcelo bagnulo braun <marcelo@it.uc3m.es>
Subject: Re: [Fwd: I-D ACTION:draft-nordmark-shim6-esd-00.txt]
Date: Tue, 07 Mar 2006 10:18:23 +0200
To: Erik Nordmark <erik.nordmark@sun.com>

On 07/03/2006, at 2:11, Erik Nordmark wrote:

> marcelo bagnulo braun wrote:
>
>> I agree that we should care about robustness
>> But my point is that if someone is running the shim with non-routable 
>> IDs, then they probably care about robustness, and so they 
>> should know that they must properly populate the reverse DNS.
>
> They might think they have populated it correctly, but if it "never"
> gets used there is no easy way for them to tell.
>
> I think some careful judgment is needed here, to ensure that it isn't 
> only during failures in combination with other events (such as 
> referrals or long-lived application sessions), that the ID->locators 
> would be looked up in the DNS.
>
> Taking a car analogy. Folks are reasonably confident that the brakes 
> on the car work, because they use them when slowing down, and when 
> stopping at intersections.
> If the brakes were only used for emergencies (such as somebody 
> jumping into the street just in front of you, or the car in front 
> suddenly slamming on its brakes), then it would be too late to discover 
> you were out of brake fluid.
>

right, but you probably don't want to periodically crash into things to 
verify the airbag... ;-)

I can see your point and i agree that this needs to be taken into 
account.

Perhaps an option could be to get the identifiers and the locator from 
the direct DNS when the fqdn is resolved AND also perform the reverse 
mapping from the ID to the locator set, to verify that the information 
is there.
So, when the fqdn is resolved, the host has all the information and can 
proceed with the communication establishment. But also, if after 
performing the reverse lookup it discovers that it is not properly 
populated, some kind of error message can be returned, so that the 
admin of the zone is informed.
In this way, you can periodically verify the reverse zone, but reduce 
the cost involved in the verification operation. I mean there are two 
costs involved in the verification: the packet overhead due to the 
query and the added latency of waiting to obtain the locator 
information. IMHO the second one is the most problematic. With this 
option we would be eliminating the second one and just keeping the 
first one.
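
A minimal sketch of the option above, assuming a hypothetical in-memory stand-in for the DNS. The dictionaries `FORWARD` and `REVERSE` and the names `resolve` and `report` are invented for illustration; no real resolver API or record type is implied. The forward answer alone is enough to start the communication, and the reverse check only produces a report, so it adds query overhead but no setup latency.

```python
# Hypothetical stand-in for DNS contents; real zones and record types are not modeled.
FORWARD = {
    "host.example.com": {"id": "id-1",
                         "locators": {"2001:db8:1::1", "2001:db8:2::1"}},
}
REVERSE = {
    "id-1": {"2001:db8:1::1"},  # the reverse zone is missing one locator
}

def resolve(fqdn, report):
    """Return (id, locators) from the forward lookup; verify the reverse mapping on the side."""
    answer = FORWARD[fqdn]
    ident, locators = answer["id"], set(answer["locators"])
    # The reverse check never changes the result; it only reports what is missing.
    missing = locators - REVERSE.get(ident, set())
    if missing:
        report(f"reverse mapping for {ident} lacks {sorted(missing)}")
    return ident, locators

errors = []
ident, locators = resolve("host.example.com", errors.append)
```

In a real host the reverse query would presumably be issued in the background, so that the caller gets `ident` and `locators` without waiting for it.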


>
>>> We can encode the address type in different ways. One way would be 
>>> to use a flag field in the verification method. Another would be to 
>>> use the IPv4-mapped address format (::ffff:1.2.3.4). Both would work 
>>> AFAIK.
>>>
>> sure
>> but i guess that my point is that if we already see that we may 
>> need additional flags about the address information, it would be a 
>> good option to redefine the verification method as a generic flag 
>> octet, in order to support future flexibility (this is somewhat 
>> independent of the option selected for the v4 address; it is more a 
>> general observation)
>
> Yes, but if we want to support different types of locators (IPv6, 
> IPv4, and perhaps IPv4+UDP port for NAT traversal), then maybe we 
> should have an explicit locator type? (and length?)
>

well, i guess that we don't need more than 128 bits, so probably if the 
locator field is 128 bits, this would be enough.
w.r.t. the explicit locator type, perhaps the option would be to use 
3 bits of the flag field to express this...
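
To illustrate, here is a rough sketch of one possible encoding, assuming a single flag octet whose low 3 bits carry the locator type, followed by a fixed 128-bit locator field. The type codes, the bit positions and the placement of the UDP port are all assumptions made for the example; the draft defines none of them.

```python
import ipaddress

# Hypothetical type codes; nothing in the draft assigns these values.
LOC_IPV6, LOC_IPV4, LOC_IPV4_UDP = 0, 1, 2
TYPE_MASK = 0b111  # assumed layout: low 3 bits of the flag octet hold the type

def encode(loc_type, address, port=0, other_flags=0):
    """Pack a locator as 1 flag octet followed by a 16-octet (128-bit) field."""
    flags = ((other_flags << 3) & 0xFF) | (loc_type & TYPE_MASK)
    if loc_type == LOC_IPV6:
        body = ipaddress.IPv6Address(address).packed
    else:
        # IPv4 address in the last 4 octets, UDP port (if any) in the 2 octets before it.
        v4 = ipaddress.IPv4Address(address).packed
        body = bytes(10) + port.to_bytes(2, "big") + v4
    return bytes([flags]) + body

def decode(data):
    """Return (type, address string, port) from the 17-octet encoding."""
    loc_type = data[0] & TYPE_MASK
    body = data[1:17]
    if loc_type == LOC_IPV6:
        return loc_type, str(ipaddress.IPv6Address(body)), 0
    port = int.from_bytes(body[10:12], "big")
    return loc_type, str(ipaddress.IPv4Address(body[12:16])), port

wire = encode(LOC_IPV4_UDP, "192.0.2.7", port=4500)
```

The point of the sketch is only that an IPv4 address plus a UDP port fits comfortably in 128 bits, so an explicit length field is not needed for these three types.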
> ...

>> wouldn't the Sent locator option and the received locator option be 
>> useful for dealing with NATs?
>> I mean using these, a host can find out if the addresses were 
>> rewritten (whether by a router or a nat doesn't matter i guess), and 
>> discover its own addresses and eventually add them to the Ls
>
> Yes, that is potentially useful. Hadn't thought about that (I try to 
> avoid spending precious brain cycles on inventing yet another NAT 
> traversal mechanism).

> But the semantics might be different.
> With a NAT rewrite it presumably means that some other address (the 
> one in the source in the Sent locator option) is no longer useful.
> But with router rewriting, that address might still be useful but just 
> isn't recommended (e.g., due to TE) for this communication.
>

agree but i guess that it is easy to determine that in the NAT case (in 
the general case). I mean, if the other end sees a public address and 
we are using a private address, then the private address is unreachable 
from the peer and must be substituted with the public one received in 
the option.

If both v4 addresses received are public, then this is TE and both are 
ok.

(The problem is that now some very big sites are starting to use public 
addresses when they run out of private addresses 
(http://www.arin.net/policy/proposals/2004_3.html) which would break 
the simplicity of the approach. But even in this case, the local host 
should be able to know which are the addresses reserved for private use 
and apply the same heuristic mentioned above.)
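
The heuristic could look roughly like the sketch below, which assumes IPv4 only and treats the RFC 1918 ranges as "private". The function names and the returned labels are hypothetical, and the documentation addresses in the usage lines merely stand in for public ones.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
PRIVATE_NETS = [ipaddress.ip_network(n)
                for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_private(addr):
    ip = ipaddress.IPv4Address(addr)
    return any(ip in net for net in PRIVATE_NETS)

def classify_rewrite(sent, received):
    """Compare the locator we sent with the one the peer reports having received."""
    if sent == received:
        return "unchanged"
    if is_private(sent) and not is_private(received):
        # The private address is unreachable from the peer: substitute the public one.
        return "nat"
    if not is_private(sent) and not is_private(received):
        # Both are usable; the rewrite only expresses a preference (TE).
        return "te"
    return "unknown"  # cases the heuristic above does not cover

nat_case = classify_rewrite("10.0.0.5", "198.51.100.9")
te_case = classify_rewrite("198.51.100.9", "203.0.113.4")
```

A site that uses public addresses internally would have to extend `PRIVATE_NETS` with its own locally known ranges for the same test to keep working.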


>> In addition, as you mention in the draft the failure detection 
>> mechanism could be used to preserve the NAT state if detected.
>
> Detection and preservation are different things.
> To preserve one would have to determine how often the NAT needs to see 
> packets in order for it to not discard the state. This is hard. Or one 
> has to send a packet every N seconds all the time, which generates 
> lots of extra load.
>

yes, but shim is supposed to be able to deal with failures and the NAT 
losing its state can be seen as a failure. So, the shim will be able 
to recover from this and even recreate the NAT state. Probably we need 
some additional logic to be smart enough to identify that what is going 
on is that the NAT is losing state and increase the keepalive 
frequency.
I mean, in the case of the shim, the communication can be recovered 
from a NAT losing state and the state in the NAT can be easily 
recovered.
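
As a very rough sketch of that additional logic: the class below halves the keepalive interval whenever a failure is repaired on the same locator pair, on the assumption that such a failure points to an expired NAT binding rather than a path outage. Both that detection signal and the numbers are assumptions for illustration, not something the failure detection mechanism specifies.

```python
class Keepalive:
    """Hypothetical keepalive timer that tightens when NAT state loss is suspected."""

    def __init__(self, interval=60.0, minimum=5.0):
        self.interval = interval  # seconds between keepalives
        self.minimum = minimum    # never send more often than this

    def on_failure(self, recovered_on_same_pair):
        # Assumed signal: if re-probing the *same* locator pair repairs the
        # context, the path itself was fine and the NAT probably dropped its state.
        if recovered_on_same_pair:
            self.interval = max(self.minimum, self.interval / 2)
        return self.interval

k = Keepalive(interval=60.0, minimum=5.0)
first = k.on_failure(True)    # suspected NAT timeout: interval is halved
second = k.on_failure(False)  # real path failure: interval is left alone
```

This does not answer how often a given NAT needs to see packets; it only adapts after the fact, which is the cheaper of the two options discussed.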

>> Probably the piece that is missing is some kind of rendezvous server 
>> for initial contact for hosts behind NAT
>
> Which is hard.
>
> One can observe some similarities between a NAT rendez-vous server, 
> and a mobility home agent, in that they both could be a (set of) fixed 
> locators where one can send packets (and have them be forwarded on, 
> and later route optimized away), with a mechanism for the host to keep 
> the rendez-vous server updated with the current locators of the host.
>
> But this is harder for NAT than for IP mobility; with full cone NATs 
> it could be a single IPv4 address plus port as the locator. But with 
> stricter NATs then each peer has to find a different IP+port to talk 
> via the NAT.
>

yes this is a completely new piece that is missing and i guess that in 
this case, it would make sense to try to import any solution designed 
somewhere else to deal with this problem

> All good reasons for me to focus on
>  - completing the ID/loc split,
>  - TE feedback
>

i agree with the approach, just wanted to explore this a bit

> and let somebody else think about NAT traversal.
>
>
>> I guess that locator selection incorporates several elements:
>> - Reachability
>> - Preferences with respect to the different types of locators (CoA vs. 
>> HoA, scope, private etc)
>> - TE/policing issues (including local preferences and remote 
>> preferences (the preferences of the peer))
>> RFC3484 can be used to express the three items somehow (with its 
>> limitations) (note that it does not help to determine if an address 
>> is reachable or not, but it does take this information into account 
>> when performing dest address selection)
>> In addition, the locator selection is performed:
>> - when the communication is initiated (whether when a routable id is 
>> selected (just RFC3484) or when a non routable id is used and the 
>> shim selects a correspondent locator)
>> - when a failure occurs,
>> - because of TE considerations, a host may choose to change the 
>> locator
>> I would say that in any of those cases, it is important to take into 
>> account all the aforementioned 3 items, so i guess that RFC3484 may 
>> be a good candidate for locator selection
>
> s/good candidate for/provides some additional constraints on/
>
>> But not only availability must be taken into account for exploring 
>> and rehoming i guess.
>> First of all, you probably want to take into account considerations 
>> like scope, HoA vs. CoA, private vs. public addresses when selecting 
>> which address to explore
>
> Sure. But the tradeoffs are quite different in a shim6 locator 
> selection context than in address selection.
>
> For instance, for private vs. public in order to provide pseudonymity 
> the host has to keep multiple sets of <temporary ULID, set of 
> temporary locators> and never combine ULID/locators across the sets, 
> since that would provide a linkage over time where the host can be 
> tracked even though it changes its ULIDs and locators over time.
>
> For HoA/CoA the tradeoffs are different when used as address today 
> (where using a CoA can easily cause communication to fail), and using 
> it as a locator with shim6.
>
> So RFC 3484 does help to list some things to consider, but doesn't 
> provide useful answers for the issues with shim6.
>

i agree with the points made above

So what we need is a locator selection algorithm that is likely to take 
into account the same issues that RFC3484 does, but with different 
trade-offs being made, right?


>
>>> One of the interesting things is the combination of the unroutable 
>>> ULID and getting router feedback during the context establishment 
>>> exchange (which will be done before any ULP packets are exchanged in 
>>> that case).
>>> If we only allow rewriting on packets with the payload extension 
>>> header we wouldn't have that capability of early locator selection 
>>> according to the policy implicit in the router's rewriting.
>>>
>> but how early is this compared to the case where no rewriting is 
>> supported for control packets?
>> I mean, in the case of no rewriting for control packets, the first 
>> payload packet would get rewritten and the capability would only get 
>> delayed for 2 packets i.e. the shim context establishment packets, 
>> which i don't think is an issue...
>> but probably i am missing the point you are making...
>
> My point is that if we have unroutable ULIDs, then there isn't any 
> extra cost I can see for allowing the rewrite on the I1/R1 etc 
> messages; those messages get a bit larger that's all.
>
> Without rewriting in I1/R1/etc we get rewriting for the first data 
> packet resulting in an Update Request message to inform the sender of 
> a rewrite, potentially followed by an Update Request in the reverse 
> direction to inform the peer that it can use that new locator as a 
> destination.
>
>
> And if we are pursuing router rewriting, a simple test for "can this 
> packet be rewritten" might be important for performance. With 
> rewriting on shim6 control packets, this test is
> 	if (ip6->ip6_nxthdr == IPPROTO_SHIM6)
>
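
For illustration, the same single-field test can be written against raw packet bytes. This is only a sketch: the helper names are hypothetical, and the protocol number 140 is the value IANA later assigned to Shim6, which is an assumption relative to this thread since no number is mentioned in it.

```python
IPV6_HEADER_LEN = 40
NEXT_HEADER_OFFSET = 6  # offset of the Next Header octet in the fixed IPv6 header
PROTO_SHIM6 = 140       # assumed: the value IANA later assigned to Shim6

def can_rewrite(packet: bytes) -> bool:
    """True if the fixed IPv6 header is immediately followed by a shim6 header."""
    if len(packet) < IPV6_HEADER_LEN or packet[0] >> 4 != 6:
        return False
    return packet[NEXT_HEADER_OFFSET] == PROTO_SHIM6

def make_header(next_header: int) -> bytes:
    # Minimal 40-octet IPv6 header with zeroed addresses, for illustration only.
    return bytes([0x60, 0, 0, 0, 0, 0, next_header, 64]) + bytes(32)

shim_packet = make_header(PROTO_SHIM6)
tcp_packet = make_header(6)
```

The check reads one octet at a fixed offset and never walks the extension header chain, which is what makes it attractive on a router's fast path.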


i may agree with that.
It is just that i am concerned about the added complexity to the 
protocol itself, which is already quite complex...

regards, marcelo



>
>    Erik
>