Re: [p2pi] ALTO and caching (Was: Re: Charter and problem statement)

Laird Popkin <laird@pando.com> Wed, 16 July 2008 19:57 UTC

Date: Wed, 16 Jul 2008 15:57:58 -0400
From: Laird Popkin <laird@pando.com>
To: Stas Khirman <stas@khirman.com>
Cc: Stanislav Shalunov <shalunov@bittorrent.com>, p2pi@ietf.org
Subject: Re: [p2pi] ALTO and caching (Was: Re: Charter and problem statement)

A few comments:

I believe that "- due necessity to deal with business sensitive information, ALTO "optimization" process will be deployed and operated by ISPs and not by any third-party" should not be a requirement for ALTO. While some ISPs want to use proprietary information to provide guidance, and thus want to run their own P4P iTrackers (and, by analogy, future ALTO servers), other ISPs prefer just as strongly to have an independent third party (like IANA) operate between the ISPs and the P2P networks. Given this, I believe that ALTO should not mandate a specific option, but should preserve flexibility for the ISPs to choose either model.

Also, "discovery" and "optimization" are quite similar issues (because caches are p2p nodes): both communicate related information between the same participants at the same time. So while a few minor details differ, they feel (to me) like issues that should be addressed within the same framework and protocol. Given that ALTO is extensible, I'd prefer that we address cache location within ALTO rather than invent two very similar protocols.
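
As a rough illustration of the "caches are p2p nodes" point, here is a minimal Python sketch in which a cache is simply one more candidate in the same guidance response. Everything in it is hypothetical: the `Candidate` type, the `cost` values, the `is_cache` flag, and the example addresses are made up for illustration and are not taken from any ALTO or P4P specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    address: str
    cost: int               # hypothetical guidance cost, lower is better
    is_cache: bool = False  # hypothetical flag; a cache is otherwise just a peer


def rank_candidates(candidates):
    """Rank caches and ordinary peers together, using one rule for both."""
    return sorted(candidates, key=lambda c: c.cost)


# Made-up candidates: two ordinary peers and one ISP-operated cache.
candidates = [
    Candidate("192.0.2.10", cost=5),
    Candidate("192.0.2.99", cost=1, is_cache=True),
    Candidate("198.51.100.7", cost=2),
]

ranked = rank_candidates(candidates)
for c in ranked:
    print(c.address, c.cost, "cache" if c.is_cache else "peer")
```

In this sketch the cache comes first only because its assumed cost is lowest. A deployment could equally choose to always list caches first, but that would be a policy decision rather than a need for a second protocol.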

The definition of "peer selection optimization" is (IMO) incorrect, unless I am misinterpreting your terminology. Let me try to provide an example to illustrate. Consider a swarm with 10,000 peers, of which a given peer knows about 40 peers, and is connected to 10. The greatest contribution that ALTO can make is to help the p2p network determine the best 40 peers to tell a peer to connect to. The speed and performance savings there are large, because it would take the peer a long time (perhaps 2,000 minutes) to exchange data with all 10,000 peers and determine which are best/closest. There is minimal, if any, value in telling a peer which of the 40 known peers is closest, because the peer can fairly rapidly (perhaps 1 minute) measure throughput with all 40 peers and determine the best/closest peers without ALTO. Given this, the assertion that "peer set discovery process has to be accomplished BEFORE optimization" is incorrect, as the ALTO optimization is used to provide the optimal peer set to be 'discovered' by a peer.
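
To make the swarm example concrete, here is a minimal Python sketch of a tracker using guidance to decide which 40 of 10,000 peers to hand to a requesting peer. It is an illustration under stated assumptions, not a description of how ALTO or P4P actually works: the `pid` field (a made-up network-location label), the `guidance_cost` function, and the randomly generated swarm are all hypothetical.

```python
import heapq
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Peer:
    address: str
    pid: int  # hypothetical network-location label assigned by the ISP


def guidance_cost(requester_pid: int, candidate_pid: int) -> int:
    """Hypothetical ISP-supplied cost (lower is better).

    Here it is just the distance between two made-up location labels.
    """
    return abs(requester_pid - candidate_pid)


def select_peer_set(requester, swarm, size=40):
    """Pick the `size` lowest-cost peers for `requester` from the whole swarm."""
    candidates = (p for p in swarm if p != requester)
    return heapq.nsmallest(
        size, candidates, key=lambda p: guidance_cost(requester.pid, p.pid)
    )


# Made-up swarm of 10,000 peers spread over 100 hypothetical locations.
rng = random.Random(2008)
swarm = [
    Peer(f"10.0.{i // 250}.{i % 250}", rng.randrange(100)) for i in range(10_000)
]

requester = swarm[0]
peer_set = select_peer_set(requester, swarm)
print(len(peer_set), "peers handed to", requester.address)
```

The point of the sketch is where the guidance is applied: the tracker consults it once over the full swarm, so the peer never has to probe all 10,000 peers itself. Once it holds the 40, it can still measure throughput on its own to pick the 10 it connects to.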

- Laird Popkin, CTO, Pando Networks
  mobile: 646/465-0570

----- Original Message -----
From: "Stas Khirman" <stas@khirman.com>
To: "Vijay K. Gurbani" <vkg@alcatel-lucent.com>, "Stanislav Shalunov" <shalunov@bittorrent.com>
Cc: p2pi@ietf.org
Sent: Wednesday, July 16, 2008 2:51:34 PM (GMT-0500) America/New_York
Subject: Re: [p2pi] ALTO and caching (Was: Re: Charter and problem statement)

To summarize my points from recent threads:

"Peer selection optimization" is the process of selecting an "optimal" subset
of peers from the set of discovered peers that have specific content or part
of it. This process could/should be implemented without knowledge of the
specific protocol (or protocols) supported by peers and the content they
share, but rather based on peers' capabilities, ISP topology, utilization and
other ISP-sensitive information.

A few important assumptions/requirements for "optimization" process:
- due necessity to deal with business sensitive information, ALTO
"optimization" process will be deployed and operated by ISPs and not by any
third-party
- in general, good optimization requires knowledge of the set of all peers to
choose from (and their capabilities/utilisation). Merging a few "optimized"
subsets may result in a sub-optimal decision. Thus, the peer set discovery
process has to be accomplished BEFORE optimization.
- Content and protocol obscurity is a very important requirement for ALTO
"optimization", as it removes concerns about end-user privacy and ISP
legal exposure, which are among the main reasons for ISPs' reluctance to
deal with P2P optimization today (P2P caching, for example).

So far, there are no reliable methods/protocols to satisfy the "optimization"
task. I strongly believe that, once offered, a content/protocol-agnostic ALTO
"peer selection optimization" will be quickly adopted by application
developers and ISPs.

"Resource discovery" is a process to discover [additional] peers that may be
useful in content delivery. Cache discovery is a special case of it.

A few important assumptions/requirements for the "discovery" process:
- The resource discovery process does not need to use ISP-sensitive
information, such as topology, peering agreements, etc.
- The resource discovery process requires some context information, such as
type of resource, content id, etc.
- Resource discovery does not need to deliver an "optimal" set of peers. As a
matter of fact, it probably does not have enough knowledge to do so.
- Resource discovery and associated resources may be owned and run by various
entities, not just the end-user ISP. For example, a group of regional ISPs may
run a shared caching service at the peering point. Resource discovery may be
run by a third party unrelated to the given ISP (as a matter of fact, a
BitTorrent tracker is a "discovery" service as well).
- "Discovery" and "optimization" services may or may not run on the same
equipment. IMHO, there is no technical need to mix them together at all.
- Discovery may have a number of issues with end-user privacy and ISP legal
exposure.

There are already a few proprietary "discovery protocols"
(CacheLogic/BitTorrent CDP, for example), but none has been widely
adopted. Considering the expressed interest, it would be beneficial to include
an open, flexible "resource discovery" in the ALTO scope. However, due to
legal issues, some ISPs may be reluctant to deploy any kind of ALTO
"discovery" components.

Considering the business, legal and technical points above, I advocate keeping
"optimization" and "discovery" as separate, independent methods, protocols,
implementations, components, etc.


Stas


> -----Original Message-----
> From: p2pi-bounces@ietf.org [mailto:p2pi-bounces@ietf.org] On Behalf Of
> Vijay K. Gurbani
> Sent: Wednesday, July 16, 2008 8:13 AM
> To: Stanislav Shalunov
> Cc: p2pi@ietf.org
> Subject: [p2pi] ALTO and caching (Was: Re: Charter and problem statement)
> 
> I have re-named this thread to track it better and summarize
> the discussion.
> 
> Stanislav Shalunov wrote:
> > The cache discovery portion of ALTO could be equivalent to BEP 22, or it
> > could literally be the mechanism in BEP 22 and that would be quite
> > sufficient for cache discovery.
> 
> Stanislav: I think we're making progress.  So, in that spirit,
> let's try to flesh this out some more.
> 
> Who do you see doing cache discovery?  The ALTO server or the
> individual peers?
> 
> If we take the view that a cache is a selfless peer who wants
> nothing in return, then its discovery is no different than
> discovering other peers.  Then this is best relegated as a
> normal peer-discovery algorithm/process implemented in any
> given P2P overlay.
> 
> Now, let's tackle the ALTO server doing cache discovery.  If the
> ALTO server can use a standard means to discover a cache, why
> can't the peer use the same means?
> 
> I believe that if cache discovery is moved to the realm of
> the ALTO server, it becomes more of cache "dissemination" problem
> than of cache "discovery."  In other words, the ALTO server
> will be pre-configured by the ISP  -- or if it is not operated by
> the ISP, will use other means -- to maintain a set of known caches.
> It has been suggested that caches "register" to the ALTO server.
> I think that this is probably not a good idea for a variety of
> reasons.
> 
> Then, peers will query the ALTO server using a protocol yet to
> be defined (let's refer to this as the TBD protocol.)  This
> TBD protocol will have an extension that the peer can use to
> provide the ALTO server with some hints (overlay protocol, i.e.,
> BitTorrent, eDonkey; hash of the content, etc.)  The TBD
> protocol will arrange it such that in a response, the querying
> peer is told to contact the cache first.
> 
> With some additional work, this extension to the TBD protocol
> could be used to also find other resources like VoIP relays,
> and it can also be applicable for non-P2P uses like CDN.
> 
> Is this an approach that seems feasible?  If so, we can word-
> smith something to this effect and add it to the draft
> charter.
> 
> Thanks,
> 
> - vijay
> --
> Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
> 2701 Lucent Lane, Rm. 9F-546, Lisle, Illinois 60532 (USA)
> Email: vkg@{alcatel-lucent.com,bell-labs.com,acm.org}
> WWW:   http://www.alcatel-lucent.com/bell-labs
> _______________________________________________
> p2pi mailing list
> p2pi@ietf.org
> https://www.ietf.org/mailman/listinfo/p2pi

_______________________________________________
p2pi mailing list
p2pi@ietf.org
https://www.ietf.org/mailman/listinfo/p2pi