Re: [p2pi] WG Review: Application-Layer Traffic Optimization (alto)

"Michael J. Freedman" <mfreed@CS.Princeton.EDU> Wed, 22 October 2008 00:12 UTC

Message-ID: <48FE6FF3.6030008@cs.princeton.edu>
Date: Tue, 21 Oct 2008 20:12:35 -0400
From: "Michael J. Freedman" <mfreed@CS.Princeton.EDU>
User-Agent: Thunderbird 2.0.0.17 (Macintosh/20080914)
MIME-Version: 1.0
To: Nicholas Weaver <nweaver@ICSI.Berkeley.EDU>
References: <20081006203532.B1D673A68AF@core3.amsl.com> <BE82361A0E26874DBC2ED1BA244866B9276373BA@NALASEXMB08.na.qualcomm.com> <48EEB19C.4000303@bbn.com> <48EEE549.1080208@qualcomm.com> <48EF477E.4080708@telecomitalia.it> <48EF706C.9050508@qualcomm.com> <48EFA0BE.1040809@alcatel-lucent.com> <ca722a9e0810101221yb84ac3ar8ff0f267718c88c9@mail.gmail.com> <48EFD2BC.8050706@qualcomm.com> <48F000FD.5000302@telecomitalia.it> <3C654581-ABA5-45B9-A36D-E0BD9B52366B@nokia.com> <48F4E20B.8000609@cavebear.com> <EE0DFEE3-0646-45F7-B194-199B83B2E9D4@icsi.berkeley.edu> <48FE4424.2000309@alcatel-lucent.com> <7307B5F6-254F-4A22-8C68-93A83D09E6A7@ICSI.Berkeley.EDU>
In-Reply-To: <7307B5F6-254F-4A22-8C68-93A83D09E6A7@ICSI.Berkeley.EDU>
Cc: "p2pi@ietf.org" <p2pi@ietf.org>
Subject: Re: [p2pi] WG Review: Application-Layer Traffic Optimization (alto)

Nicholas Weaver wrote:
> Ono is leveraging off Akamai's DNS information:  You do a few queries to 
> akamai'ed (and other CDN'ed) domains, and use the results as a 
> coordinate space.
> 
> The question is the simpler coordinate: "who's MY resolver and who else 
> uses it".

CoralCDN did something very similar to this at first.  In fact, in 
addition to inserting a <DNS resolver, serverid> entry in the Coral DHT, 
it also tracerouted the resolver (of our proxy servers).  If the 
resulting last-hop IP addresses were a "relatively good match" for the 
resolver, we would store these "network topology" hints in our DHT as 
well.  (This conditional check was to avoid cases when "last 
traceroute-able hops" were quite distant from the destination, 
especially given the fact that many hosts and routers don't actually 
reply to traceroute probes these days.)
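
A minimal sketch of the conditional check described above. The message
does not say how a "relatively good match" was measured, so the
shared-prefix test, the `min_bits` threshold, and the function names
here are all assumptions made for illustration, and a plain dict stands
in for the entries that would go into the DHT.

```python
import ipaddress


def common_prefix_bits(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses have in common."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    return 32 - diff.bit_length()


def hints_to_store(resolver: str, server_id: str, last_hops, min_bits: int = 16):
    """Build the entries to insert for one proxy (hypothetical helper).

    The resolver itself is always stored. A last-hop address is stored
    only if it shares at least `min_bits` leading bits with the resolver,
    which is an assumed stand-in for the "relatively good match" test.
    """
    entries = {resolver: server_id}
    for hop in last_hops:
        if common_prefix_bits(hop, resolver) >= min_bits:
            entries[hop] = server_id
    return entries


# Documentation-range addresses; the second hop is far from the resolver
# in address space, so it is dropped rather than stored as a hint.
entries = hints_to_store("192.0.2.53", "proxy-1", ["192.0.2.1", "198.51.100.7"])
```

A prefix comparison is only a rough proxy for topological closeness; it
is used here because it needs no measurement beyond the traceroute
itself.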

The goal of this was to perform topological clustering of our peers, 
much as you suggest:  When a client would perform a DNS request, we 
would synchronously traceroute the client's DNS resolver (we built a 
very fast tool that would return in ~2 network RTTs), do a DHT lookup on 
the resolver and its last hops, and use any returned information 
in our selection of "nearby" CoralCDN web proxies.
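
The lookup side of this can be sketched as follows, under the
assumption that the DHT behaves like a map from an IP address (a
resolver or a last hop) to the set of proxy ids registered under it.
The dict, the function name, and the ordering rule (resolver matches
before last-hop matches) are hypothetical and are not taken from
CoralCDN's code.

```python
from typing import Dict, List, Set


def lookup_nearby(dht: Dict[str, Set[str]], resolver: str,
                  last_hops: List[str]) -> List[str]:
    """Collect candidate proxies for a client's resolver.

    Keys are tried in order: the resolver first, then each last hop.
    Proxy ids are returned without duplicates, in first-seen order.
    An empty list corresponds to a miss in the hint data.
    """
    candidates: List[str] = []
    for key in [resolver, *last_hops]:
        # sorted() only makes the order deterministic for the example.
        for server_id in sorted(dht.get(key, ())):
            if server_id not in candidates:
                candidates.append(server_id)
    return candidates


# Toy contents using documentation-range addresses.
dht = {
    "192.0.2.53": {"proxy-a"},
    "192.0.2.1": {"proxy-a", "proxy-b"},
}
hit = lookup_nearby(dht, "192.0.2.53", ["192.0.2.1"])
miss = lookup_nearby(dht, "203.0.113.9", [])
```

On a miss the caller would have to fall back to whatever selection it
uses without topology hints.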

Unfortunately, it didn't really work.  I can't cite the precise numbers 
off-hand, but we were seeing hit rates in the low single-digits.  Thus, 
the delay clients were experiencing for all this network hackery was 
totally out of proportion to its benefit.  (And when this hackery would 
occasionally take more than two seconds, the client's DNS resolver would 
happily reissue DNS requests to other Coral DNS servers, losing all the 
previous work already done.)

In short, we stopped using this technique after CoralCDN was deployed 
for a few months.  In fact, the experience of needing to return accurate 
responses very quickly led us to design and build another system, OASIS, 
that tries to build stable network maps that other services can use as a 
peer-selection oracle.

Now, one should be careful in trying to extend these lessons to the P2P 
file-sharing setting.  First, CoralCDN focused on web content, and thus had 
much tighter latency requirements than a lot of file-sharing applications. 
Second, our use of a decentralized DHT, especially deployed on flaky 
PlanetLab servers, made meta-data loss more prevalent than one might 
find in cluster storage.  Third, and perhaps most importantly, the
"CoralCDN proxies" we were mapping clients to were largely deployed on 
university networks, while our client population was well-distributed 
over the entire Internet:  at the time, 100,000s-1M distinct client IP 
addresses (and 10,000s of distinct DNS resolvers) per day.

On the flip side, I wouldn't completely discount these results.  Many of 
the very peers that contribute large amounts of upstream bandwidth on 
P2P systems are *also* on university networks (both because of their 
high upstream capacity and their lack of NATing compared to home users).

As an aside, ALTO seems to have many of the same goals as the IETF 
SONAR/HOPs work in the mid-to-late 90s, although perhaps with more focus 
on ISP buy-in.  Thoughts?

Regards,
--mike


References:

   CoralCDN, http://www.coralcdn.org/
   NSDI 2004 paper: http://www.coralcdn.org/docs/coral-nsdi04.pdf

   OASIS, http://oasis.coralcdn.org/
   NSDI 2006 paper: http://www.coralcdn.org/docs/oasis-nsdi06.pdf


> On Oct 21, 2008, at 2:05 PM, Vijay K. Gurbani wrote:
> 
>> Nicholas Weaver wrote:
>>> Hey, stupid thought...
>>> Could you do proximity based on "who's your DNS resolver"?  Do a few 
>>> name lookups: one to register YOU as using YOUR DNS resolver to the 
>>> remote coordinator, and one to get "who are other peers using the 
>>> same resolver"?
>>> An ugly, UGLY hack, but it might be interesting to think about.
>>> Has anyone done this already?
>>
>> Doesn't Ono [1] do that?  Or am I missing something?  CDNs also
>> use DNS redirects to redirect clients to nearest servers.
>>
>> [1] David R. Choffnes and Fabián E. Bustamante. Taming the Torrent:
>>    A practical approach to reducing cross-ISP traffic in P2P
>>    systems, Proc. of ACM SIGCOMM 2008, August 2008.