Re: [p2pi] Refining the ALTO problem statement [Was: Re: discussing P2PI-related standardization in Dublin]
stefano previdi <sprevidi@cisco.com> Fri, 13 June 2008 13:01 UTC
Return-Path: <p2pi-bounces@ietf.org>
X-Original-To: p2pi-archive@ietf.org
Delivered-To: ietfarch-p2pi-archive@core3.amsl.com
Received: from [127.0.0.1] (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 24B6F3A6A16; Fri, 13 Jun 2008 06:01:28 -0700 (PDT)
X-Original-To: p2pi@core3.amsl.com
Delivered-To: p2pi@core3.amsl.com
Received: from localhost (localhost [127.0.0.1]) by core3.amsl.com (Postfix) with ESMTP id 70FFF3A6A16 for <p2pi@core3.amsl.com>; Fri, 13 Jun 2008 06:01:27 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.566
X-Spam-Level:
X-Spam-Status: No, score=-2.566 tagged_above=-999 required=5 tests=[AWL=0.033, BAYES_00=-2.599]
Received: from mail.ietf.org ([64.170.98.32]) by localhost (core3.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UdvZjjKqlb4d for <p2pi@core3.amsl.com>; Fri, 13 Jun 2008 06:01:25 -0700 (PDT)
Received: from av-tac-bru.cisco.com (odd-brew.cisco.com [144.254.15.119]) by core3.amsl.com (Postfix) with ESMTP id 5F86B3A6924 for <p2pi@ietf.org>; Fri, 13 Jun 2008 06:01:25 -0700 (PDT)
X-TACSUNS: Virus Scanned
Received: from strange-brew.cisco.com (localhost [127.0.0.1]) by av-tac-bru.cisco.com (8.11.7p3+Sun/8.11.7) with ESMTP id m5DD1um02688; Fri, 13 Jun 2008 15:01:56 +0200 (CEST)
Received: from [144.254.189.71] (dhcp-rom2-144-254-189-71.cisco.com [144.254.189.71]) by strange-brew.cisco.com (8.11.7p3+Sun/8.11.7) with ESMTP id m5DD1qA09863; Fri, 13 Jun 2008 15:01:52 +0200 (CEST)
User-Agent: Microsoft-Entourage/11.4.0.080122
Date: Fri, 13 Jun 2008 15:01:32 +0200
From: stefano previdi <sprevidi@cisco.com>
To: Laird Popkin <laird@pando.com>
Message-ID: <C4783C4C.68E15%sprevidi@cisco.com>
Thread-Topic: [p2pi] Refining the ALTO problem statement [Was: Re: discussing P2PI-related standardization in Dublin]
Thread-Index: AcjNVZoF2Ksm3zlIEd2gXAAX8vOM8g==
In-Reply-To: <565621476.393261213359293462.JavaMail.root@dkny.pando.com>
Mime-version: 1.0
Cc: p2pi@ietf.org
Subject: Re: [p2pi] Refining the ALTO problem statement [Was: Re: discussing P2PI-related standardization in Dublin]
X-BeenThere: p2pi@ietf.org
X-Mailman-Version: 2.1.9
Precedence: list
List-Id: P2P Infrastructure Discussion <p2pi.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/listinfo/p2pi>, <mailto:p2pi-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/pipermail/p2pi>
List-Post: <mailto:p2pi@ietf.org>
List-Help: <mailto:p2pi-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/p2pi>, <mailto:p2pi-request@ietf.org?subject=subscribe>
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Sender: p2pi-bounces@ietf.org
Errors-To: p2pi-bounces@ietf.org
Hi Laird, I understand your concern and let me try to answer from a routing implementation perspective. My thoughts are as follows: - the options of having a list of IP addresses to rank may be faced to some limitations but I probably have different view. I believe one alternative is to have a ranking of _pair_ of IP addresses (I believe there's a proposal made already on this direction). It doesn't address scalability per se but it allows other form of oracle/proximity capability. - From a routing algorithm perspective (and its implementation). The localization of 10 Vs. 1000 addresses may not incur an exponential complexity. I fact, routing algorithms/implementations are faced to this problem for years now... ;-) In the last years a lot of work has been done in the routing layer in order to improve performance and to minimize the impact of routing computations. We are now leveraging all this work in order to address other type of routing paradigms (using the same input). - State can be kept even if it results from different computations. This allows an "oracle/proximity" provider to minimize the amount of computation and to be able do deliver results on-the-fly. - There are apps models that have some form of hierarchy (e.g: trackers) and some others not. Should we have different models in the oracle too, or should we try to be as much as possible transparent/agnostic ? I'd vote for the latter. - I can't speak much about the legal/psychological issue (disclosing p2p addresses to SP) as I come from bottom layers, but what I can say is that, technically, the SP doesn't need any form of oracle in order to understand who among his customers is a p2p addicted... ;-) - In one point we're in violent agreement: This is a great discussion. Thanks. s. > From: Laird Popkin <laird@pando.com> > > This is a great discussion. > > One point I'll make related to the signaling protocol for querying the > "oracle" is that while the "send in a list of peer IP addresses and receive a > sorted list back" approach is appealing in its simplicity, it has some serious > limitations that would need to be taken into account. > > It's important to keep in mind that optimizing peer selection for p2p > transfers has the most impact for large swarms. That is, if there are only 20 > people on the planet that can serve you data, (1) it doesn't take too long to > connect to all of them so optimized peer selection doesn't save much peer > connection time, and (2) the odds of any of them being in the same ISP is > fairly low, so optimized peer selection doesn't affect data flow much. > Optimizing peer selection is focused on swarms with very large numbers of > peers. For example, we often see swarms with thousands or tens of thousands of > peers. These are a small percentage of swarms, of course, but they are > responsible for a disproportionately large volume of data. For example, when > Yale surveyed "major tracker sites" they saw that a very small percentage of > swarms had 100+ peers, but that those swarms were responsible for well over > 50% of downloads. For perspective, at Pando, we see swarms with thousands to > tens of thousands of peers fairly often. > > It is for these large sarms where the "list of IPs" approach runs into some > pragmatic issues: > - The "oracle" must be sent the current complete set of IPs in each swarm. > This means that the queries would be very large (e.g. a list of 1,000 IP > addresses). This is not very cacheable, because swarm composition is highly > dynamic. And it has to be a complete list, or the value of the guidance > degrades dramatically. (Conceivable this could be implemented as 'delta' > updates, but that raises oracle state issues, makes the software and protocol > much more complex, etc.). > > - The "oracle" would have to maintain state (or recompute on every query) for > every swarm in every p2p network. > - The volume of queries would be very large (perhaps on the order of 100K > queries per second, across all p2p networks operating within an ISP). > - The queries cannot be cached (because the first announce is the most > important to optimize, and following queries need to include all new peers, > and each query generates a response unique to the peer being provided the > guidance). > - The ISP would need to scale the "oracle" capacity to satisfy this volume. > - It exposes the entire membership and data flow of the p2p network to the > ISPs, which raises obvious legal and privacy issues. Imagine asking your > lawyers whether they want to run Pirate's Bay's Trackers. > - It makes every Tracker announce slow, expensive and unreliable instead of > fast, cheap and reliable. > > This is why we shifted (in the P4P approach) away from the more obvious IP > list approach to moving the guidance into the Tracker's memory. Admittedly > that does require ISPs to expose some abstract guidance information (IP > prefixes and percentages), but in return: > - The communication from the "oracle" to the p2p network is a single > asynchronous update (e.g. as little as one message every hour) instead of one > message per peer announce per swarm. > - The p2p network's Tracker can process peer requests in-memory, without > having to make a blocking network call to an external "oracle". This is > several orders of magnitude faster, more reliable, and far less resource > intensive. > - There's minimal information exposed between the ISP and the P2P network > (which is, from a legal and privacy angle, better for both P2P and ISP). > > - Laird Popkin, CTO, Pando Networks > mobile: 646/465-0570 > > ----- Original Message ----- > From: "stefano previdi" <sprevidi@cisco.com> > To: "Enrico Marocco" <enrico.marocco@telecomitalia.it>, p2pi@ietf.org > Sent: Friday, June 13, 2008 4:52:29 AM (GMT-0500) America/New_York > Subject: Re: [p2pi] Refining the ALTO problem statement [Was: Re: discussing > P2PI-related standardization in Dublin] > > Enrico, > > I mostly agree mostly with everything below. My suggestion (as a routing guy > more than an apps guy) is also to scope what could be extended/enhanced in > routing protocols in order to improve localization mechanisms (whatever they > are). > > As an example, we have extended ISIS routing protocol in order to carry some > application related information (at this stage is an empty container) and > probably these kind of enhancements may have a use in localization/oracle > services. > > s. > > >> From: Enrico Marocco <enrico.marocco@telecomitalia.it> >> >> Peterson, Jon wrote: >>> The proposal for the ALTO BoF (which roughly covers (3)) is now in the >>> BoF tracker. You can read about it at its home page: >>> >>> http://alto.tilab.com/ >> [...] >>> we hope that discussions surrounding the BoF could elucidate the proper >>> division of labor and approach to this problem. >> >> To follow up on Jon's thoughts and to reflect discussions that happened >> during the MIT workshop and on the mailing list, Vijay and I are going >> to update the problem statement draft the ALTO BoF proposal is based on >> (draft-marocco-alto-problem-statement-00). >> >> As a prelude to focus the ALTO BoF, it would be great to start >> coalescing community agreement on ensuring that the scope of the problem >> is well defined and understood. >> >> Essentially, the problem that ALTO will attempt to solve is this: >> applications (P2P and otherwise) that make intensive use of the network >> bandwidth or want to make decisions to optimize some function (see use >> cases) may be interested in choosing all or some of their peers based on >> recommendations provided by some network entity -- colloquially called >> an oracle -- that has more of a global view on the peers in the network, >> their topological distances, and their capabilities than the application >> itself does. >> >> In thinking about the problem, it may help to consider the following: >> >> + Use cases: current version of the draft identifies four use cases for >> ALTO solutions (file-sharing, VoIP, p2p streaming and DHTs) but others >> come to mind (e.g. cache/http-mirror selection, CDN). At the MIT >> workshop, Ted Hardie coined the term "unattended consequences" of >> peer-to-peer nodes where traffic runs constantly for as long as the >> resources to that node are popular. Video feeds to remote stations >> were identified as a category. >> >> + Third-party services: even if topology information is directly >> available to network operators, nothing should prevent third-parties >> to provide peer selection services alternatively to or in competition >> with ISPs. An incomplete list of proposed solutions for providing >> such services is online at http://alto.tilab.com/resources.html; >> >> + Type(s) of information that are likely provided by the oracle (note >> that a third-party oracle may not have access to some of this >> information): >> * Given a list of peer candidate IP addresses, sort them according to >> preferences (topological proximity, bandwidth constraints, etc.) and >> return to querying application; >> * Ability for an application to determine its network location and >> proximity to other nodes; >> * Support for cost and charging information for application so that >> applications can make appropriate trade-offs; >> * Capabilities and characteristics of peers (i.e., last hop bandwidth, >> etc.); >> * Network policies; >> * ... >> >> + Type(s) of information that is provided to the oracle for rendering an >> equitable decision: >> * List of peer candidate IP addresses; >> * Parameter to optimize (cost, delay, ...); >> * ... >> >> This is mainly what we have captured so far; however, feedback, >> suggestions and contributions are not only welcome, but at this time >> even required for the successful setting of a BOF. >> >> Besides, the draft currently identifies two elements for which >> standardization work would be required, namely a discovery mechanisms >> for locating oracles and a signaling protocol for querying them. Do >> people think the core solution for ALTO should include anything else? >> >> -- >> Ciao, >> Enrico >> _______________________________________________ >> p2pi mailing list >> p2pi@ietf.org >> https://www.ietf.org/mailman/listinfo/p2pi > > > _______________________________________________ > p2pi mailing list > p2pi@ietf.org > https://www.ietf.org/mailman/listinfo/p2pi _______________________________________________ p2pi mailing list p2pi@ietf.org https://www.ietf.org/mailman/listinfo/p2pi
- Re: [p2pi] Refining the ALTO problem statement [W… Enrico Marocco
- Re: [p2pi] Refining the ALTO problem statement [W… Laird Popkin
- Re: [p2pi] Refining the ALTO problem statement [W… stefano previdi
- Re: [p2pi] Refining the ALTO problem statement [W… Stephane Bortzmeyer
- Re: [p2pi] Refining the ALTO problem statement [W… Vijay K. Gurbani
- Re: [p2pi] Refining the ALTO problem statement [W… Laird Popkin
- Re: [p2pi] Refining the ALTO problem statement [W… Stephane Bortzmeyer
- Re: [p2pi] Refining the ALTO problem statement [W… Stephane Bortzmeyer
- Re: [p2pi] Refining the ALTO problem statement [W… Vijay K. Gurbani
- Re: [p2pi] Refining the ALTO problem statement [W… Vijay K. Gurbani
- Re: [p2pi] Refining the ALTO problem statement [W… Laird Popkin
- Re: [p2pi] Refining the ALTO problem statement [W… Vijay K. Gurbani
- Re: [p2pi] Refining the ALTO problem statement [W… Enrico Marocco
- Re: [p2pi] Refining the ALTO problem statement [W… Enrico Marocco
- Re: [p2pi] Refining the ALTO problem statement [W… Laird Popkin
- Re: [p2pi] Refining the ALTO problem statement [W… Y. R. Yang
- Re: [p2pi] Refining the ALTO problem statement [W… Stephane Bortzmeyer
- Re: [p2pi] Refining the ALTO problem statement [W… stefano previdi