Re: [RAM] The mapping problem: rendezvous points?

Iljitsch van Beijnum <iljitsch@muada.com> Fri, 07 September 2007 13:12 UTC

From: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: [RAM] The mapping problem: rendezvous points?
Date: Fri, 07 Sep 2007 15:10:59 +0200
To: David Conrad <drc@virtualized.org>
Cc: ram@iab.org

[sorry, replying to an even older message... but I think this is  
important]

On 9-May-2007, at 20:07, David Conrad wrote:

> If an application works today, I'm having trouble understanding how  
> the introduction of additional O(10-100ms) latencies in the event  
> of a cache miss (which will likely only occur at the first packet  
> in a packet train) is going to have significant impact.  In those  
> odd cases that there is an impact, how much work would it be to  
> modify the application operation to be more tolerant of network  
> delays?

I'm assuming an encapsulation device close to the source host that  
serves more than a single host, possibly even a great number of them.

If nothing much is happening and the caches are empty, and a host  
then starts a session towards some remote destination, then in a  
pull model the encapsulating device must first look up a mapping  
before it can perform the required encapsulation. So the first  
packet must be dropped. At the start of a session, that would be the  
first TCP SYN (or the equivalent packet for other transport  
protocols). At this point it doesn't matter all that much how fast  
the mapping information becomes available, because the application  
has to wait for a retransmission of the first packet anyway.  
Obviously it doesn't help if the mapping system is so slow that the  
retransmitted packet is dropped as well. But I think 10 - 100 ms is  
extremely optimistic: in practice the DNS is about an order of  
magnitude slower than that, and a round trip across the world takes  
~ 350 ms.
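
To make the drop-on-miss behaviour concrete, here is a rough sketch  
of the forwarding decision in a pull model (Python; all names, such  
as mapping_cache and send_map_request, are invented for illustration  
and don't come from any particular proposal):

    # Sketch of a pull-model encapsulator: on a cache miss, the
    # packet is dropped and a map-request goes out asynchronously.
    mapping_cache = {}   # destination prefix -> locator
    pending = set()      # map-requests already in flight

    def send_map_request(prefix):
        pending.add(prefix)
        print(f"map-request for {prefix} sent, answer arrives later")

    def forward(dst_prefix, payload):
        locator = mapping_cache.get(dst_prefix)
        if locator is not None:
            print(f"encapsulate {payload} toward {locator}")  # hit
            return
        # Miss: nothing to encapsulate with, so the packet is lost.
        # The host only recovers via its own retransmission timer.
        if dst_prefix not in pending:
            send_map_request(dst_prefix)
        print(f"drop {payload}: no mapping for {dst_prefix}")

    forward("96.0.0.0/6", "TCP SYN")           # first packet: dropped
    mapping_cache["96.0.0.0/6"] = "locator-A"  # map-reply arrives
    forward("96.0.0.0/6", "retransmitted SYN") # now encapsulated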

So the difference with the DNS is that there, you can proceed as  
soon as an answer arrives, while here you have to sit out a  
retransmission timeout first, and both timing out too fast and  
timing out too slow increase latency.
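
A back-of-the-envelope calculation shows how little the lookup speed  
matters once the first packet is dropped. Assuming a typical 3  
second initial TCP SYN retransmission timeout with exponential  
backoff (my assumption; all numbers purely illustrative):

    # When does the first SYN finally get through? Drop-on-miss makes
    # the host wait out its retransmission timer; holding the packet
    # (or redirecting it) lets it proceed as soon as the answer is in.
    INITIAL_RTO = 3.0  # seconds; typical initial SYN timeout

    def drop_model(lookup_time):
        t, rto = 0.0, INITIAL_RTO
        while True:
            t += rto              # next SYN retransmission
            if t >= lookup_time:  # mapping is cached by now
                return t
            rto *= 2              # exponential backoff

    def hold_model(lookup_time):
        return lookup_time        # proceed once the answer is in

    for lookup in (0.05, 0.35, 4.0):
        print(f"{lookup:.2f} s lookup:"
              f" drop model {drop_model(lookup):.2f} s,"
              f" hold model {hold_model(lookup):.2f} s")

Even a 50 ms lookup costs the application a full 3 second timeout in  
the drop model.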

But that's only when the sky is blue and the sun is shining. When an  
encapsulation device suddenly receives a lot of traffic from  
sessions that have been running for a while, so it arrives at high  
speed, but the device somehow has no mapping state for it (the state  
timed out, traffic was rerouted between encapsulation devices, maybe  
the device even rebooted), a great many packets will be lost while  
the mapping lookups are done. Also, if a large number of flows  
require mapping requests at the same time, it's entirely possible  
that these requests must be rate limited, increasing the delays even  
further.
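
A toy calculation of that rate-limiting effect (all numbers  
invented): with a cold cache, 1000 destinations needing mappings and  
map-requests limited to 100 per second, the last flows can't even  
ask for their mappings until ten seconds in:

    # Flow i must wait for a rate-limit slot before its map-request
    # can even be sent, so its outage grows with its queue position.
    RATE = 100          # map-requests allowed per second
    LOOKUP_RTT = 0.35   # seconds until a map-reply comes back

    for i in (0, 99, 999):
        request_sent = i / RATE   # wait for a rate-limit slot
        ready = request_sent + LOOKUP_RTT
        print(f"flow {i + 1:>4}: mapping usable after {ready:.2f} s")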

I guess it would be possible for the encapsulation device to send a  
message back to the source host indicating that the packet was  
dropped and should be retransmitted right away. I'm guessing TCP  
already does something similar for ICMP "packet too big" messages  
today.

However, I would prefer "initial non-optimizedness" over "initial  
loss". If there is some place where packets for destinations that  
aren't mapped yet can be sent so that they arrive anyway, even if  
that path is slower, this takes a lot of pressure off the  
encapsulation devices and makes the downsides of pull and caching  
less problematic: mapping requests can be done at the rate that best  
suits the encapsulation device rather than as fast as possible to  
avoid impacting applications.
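
In terms of the earlier sketch the change is small: on a cache miss,  
encapsulate toward a default rendezvous point instead of dropping,  
and queue the map-request at whatever pace suits the device (again,  
all names invented):

    # On a miss, traffic takes the slower rendezvous path but is not
    # lost; the real mapping is fetched in the background.
    RENDEZVOUS = "rendezvous-locator"
    mapping_cache = {}
    request_queue = []   # drained at a rate the device can afford

    def forward(dst_prefix, payload):
        locator = mapping_cache.get(dst_prefix)
        if locator is None:
            if dst_prefix not in request_queue:
                request_queue.append(dst_prefix)  # no hurry
            locator = RENDEZVOUS   # slower path, but no loss
        print(f"encapsulate {payload} toward {locator}")

    forward("96.0.0.0/6", "TCP SYN")           # via rendezvous point
    mapping_cache["96.0.0.0/6"] = "locator-A"  # map-reply arrives
    forward("96.0.0.0/6", "next packet")       # optimal path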

I suppose there are some applications that will try to judge network  
characteristics based on the first few packets, and in such a  
situation, they would come to unnecessarily conservative conclusions.  
However, this could be addressed by changing these applications  
rather than restricting the internet architecture.

The issue of determining rendezvous locations that Marshall brought  
up is simple enough: have each rendezvous point handle a very short  
prefix covering a huge block of address space, such as 96.0.0.0/6.  
This of course sucks if you have 97.0.0.1 and you're in South Africa  
while the relevant rendezvous point for 96.0.0.0/6 is in Japan, but  
since the detour only happens for the first few packets this  
shouldn't be a huge issue.
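
Picking the rendezvous point then requires no lookup at all, since  
it is a fixed function of the destination address: with /6 blocks  
there are only 64 of them, and the top six bits of the address  
select one directly. A sketch:

    # Every address in 96.0.0.0/6 (96.0.0.0 - 99.255.255.255) maps
    # to the same rendezvous point, with no mapping lookup needed.
    import ipaddress

    def rendezvous_index(addr, prefix_len=6):
        ip = int(ipaddress.IPv4Address(addr))
        return ip >> (32 - prefix_len)  # top bits select the block

    for a in ("96.0.0.1", "97.0.0.1", "99.255.255.254", "100.0.0.1"):
        print(a, "-> rendezvous point", rendezvous_index(a))
    # the first three share index 24; 100.0.0.1 falls in the next /6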
