Re: [p2pi] Real life torrent statistics

Laird Popkin <laird@pando.com> Wed, 20 August 2008 00:05 UTC

Date: Tue, 19 Aug 2008 20:04:59 -0400
From: Laird Popkin <laird@pando.com>
To: The 8472 <the8472@infinite-source.de>
Message-ID: <685421868.179521219190699129.JavaMail.root@dkny.pando.com>
In-Reply-To: <1104054922.179501219190149690.JavaMail.root@dkny.pando.com>
MIME-Version: 1.0
X-Originating-IP: [10.10.20.79]
Cc: p2pi@ietf.org, p4pwg@yahoogroups.com
Subject: Re: [p2pi] Real life torrent statistics


You're right that swarms with 10 peers can't be optimized - the p2p network will connect all 10 peers to each other and move as much data as possible. 



That being said, I suspect that optimizing a small number of very popular swarms would have a significant impact on overall data flow, because one swarm with 20,000 peers balances out a lot of the 'long tail'. Haiyong Xie of Yale mentioned to me that there was an analysis of this last year, based on spidering one of the popular torrent web sites. Its conclusion was that well over 50% of the downloaders were in swarms with 100+ active peers, which is where they estimate that P4P optimization applies. Similarly, when I was in the music business and tracked such things, 2% of the files in the p2p networks accounted for the large majority of the download activity.



I think it would be a very interesting analysis to spider some torrent web sites and see how bandwidth is distributed between the 'head' and the 'long tail'.
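
As a rough illustration of that head-versus-tail comparison, here is a minimal sketch. It assumes a list of per-swarm peer counts has already been gathered by some spider (not shown), and it uses peer count as a stand-in for bandwidth, which is only an approximation. The function name `head_share` and the sample numbers are made up for illustration; the threshold of 100 peers is the figure mentioned above.

```python
from typing import Iterable


def head_share(swarm_sizes: Iterable[int], threshold: int = 100) -> float:
    """Return the fraction of all peers that are in swarms of at least `threshold` peers."""
    sizes = list(swarm_sizes)
    total = sum(sizes)
    if total == 0:
        # No peers at all: define the share as 0 rather than dividing by zero.
        return 0.0
    head = sum(s for s in sizes if s >= threshold)
    return head / total


# Made-up swarm sizes, for illustration only: one very large swarm,
# two mid-sized swarms, and a long tail of 1000 swarms with 10 peers each.
sizes = [20000, 500, 150] + [10] * 1000
share = head_share(sizes)
print(f"{share:.1%} of peers are in swarms with 100+ peers")
```

With real data the interesting question is how this share moves as the threshold changes, and whether peer count tracks actual traffic closely enough to stand in for it.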


- Laird Popkin, CTO, Pando Networks 
  mobile: 646/465-0570 

----- Original Message ----- 
From: "The 8472" <the8472@infinite-source.de> 
To: "Stas Khirman" <stas@khirman.com> 
Cc: p2pi@ietf.org, p4pwg@yahoogroups.com 
Sent: Tuesday, August 19, 2008 4:40:39 PM (GMT-0800) America/Los_Angeles 
Subject: Re: [p2pi] Real life torrent statistics 

Stas Khirman wrote:

> To estimate the feasibility of ALTO/P4P for real-life torrents, I collected <ip,port> information for peers from one of the most popular “PirateBay” torrents (almost 20k peers) and mapped their IPs to the corresponding ASes. Please find attached my working notes with some interesting statistics.

Ahh, there is a problem with this one. With torrents you have a significant long tail when it comes to swarm sizes and content. I'm not certain about the distribution, but the long tail will probably outweigh the... let's say top 100 torrents. Torrents with only 10-20 peers spread throughout several ASNs are much harder to optimize than the top 100.

This problem is aggravated by swarm fragmentation due to private trackers, and because BitTorrent does not aim to coalesce all torrents with the same content (which may differ in piece size, file names, etc.).

> Also, I find the geo distribution of the peers surprising: the majority were in the UK, not in the US (probably because the content is available in US theaters). Places 3-5 are taken by Sweden, Poland and Canada (in total, more peers than in the US).

This will probably be different if you sample torrents from regional trackers or torrents aimed at other audiences. During some DHT tracing on the weekend I saw a significant proportion of DHT traffic coming from Asian countries, though I suspect that an inefficient implementation of the DHT in a client that's popular in China plays some role in this distribution.

> Certainly, the observed “heavy” neighboring of peers is a function of swarm size. I intend to investigate a few medium/small size swarms to have a multi-point picture for any future discussions.

As I mentioned above, we should try to get the big picture, i.e. how relevant the long tail is, measured in aggregate bandwidth. If the small torrents actually make up the bulk of the traffic, then any solution will require a high degree of cooperation between ISPs, e.g. caches that cooperate with each other.
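
To make the dependence of "neighboring" on swarm size concrete, here is a minimal sketch of one possible measure: the fraction of peers in a swarm that share their AS with at least one other peer. It assumes the IP-to-AS mapping has already been done by some external lookup (not shown). The function name `same_as_neighbor_fraction` is hypothetical, and the sample addresses and AS numbers come from the documentation and private-use ranges, not from any real swarm.

```python
from collections import Counter
from typing import Dict, Hashable


def same_as_neighbor_fraction(peer_to_asn: Dict[Hashable, int]) -> float:
    """Return the fraction of peers that have at least one other peer in the same AS."""
    if not peer_to_asn:
        return 0.0
    # Count how many peers of this swarm fall into each AS.
    peers_per_as = Counter(peer_to_asn.values())
    # A peer has a same-AS neighbor exactly when its AS holds two or more peers.
    with_neighbor = sum(n for n in peers_per_as.values() if n >= 2)
    return with_neighbor / len(peer_to_asn)


# Illustrative swarm only: two peers in AS 64512, one peer alone in AS 64513.
small_swarm = {
    ("192.0.2.1", 6881): 64512,
    ("192.0.2.2", 6881): 64512,
    ("198.51.100.7", 51413): 64513,
}
fraction = same_as_neighbor_fraction(small_swarm)
print(f"{fraction:.0%} of peers have a neighbor in their own AS")
```

Running this over swarms of different sizes would give the kind of multi-point picture mentioned above: in a 10-20 peer swarm spread over several ASNs the fraction will often be near zero, which is what makes those swarms hard to localize.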

-- 
The 8472 
independent developer for the Azureus Vuze Bittorrent client 

_______________________________________________
p2pi mailing list
p2pi@ietf.org
https://www.ietf.org/mailman/listinfo/p2pi