Re: [ppsp] [decade] Object naming in -req and -arch

Arno Bakker <arno@cs.vu.nl> Fri, 13 July 2012 07:12 UTC

Message-ID: <4FFFCA6D.7060306@cs.vu.nl>
Date: Fri, 13 Jul 2012 09:12:45 +0200
From: Arno Bakker <arno@cs.vu.nl>
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20120312 Thunderbird/11.0
MIME-Version: 1.0
To: Peng Zhang <pzhang.thu@gmail.com>
References: <20120710162606039401143@chinamobile.com> <2039343B-5F6B-4777-864E-B4F00B5A258E@gmail.com> <4FFE6D67.80705@cs.vu.nl> <D282B585-33DD-4A0D-8CD3-9CF525C56446@gmail.com>
In-Reply-To: <D282B585-33DD-4A0D-8CD3-9CF525C56446@gmail.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: ppsp <ppsp@ietf.org>, decade <decade@ietf.org>
Subject: Re: [ppsp] [decade] Object naming in -req and -arch
Precedence: list
Reply-To: arno@cs.vu.nl

On 12/07/2012 22:28, Peng Zhang wrote:
>
> On Jul 12, 2012, at 2:23 AM, Arno Bakker wrote:
>
>> The gains of using MHT depend on the chunk size. For PPSP we prefer
>> chunks of 1K that fit in an UDP packet carried over Ethernet. In
>> that case, for a 4 GB file, there are 4 M chunks, resulting in 80
>> MB of leaf hashes when SHA1 is used. Transferring that beforehand
>> as in BitTorrent definitely increases latency ;o)
> Yes, if the chunk size is only 1KB, and each chunk is verified
> individually, we cannot afford to send all hashes beforehand. While
> in the worst case without optimization, almost 2*80M = 160M hashes
> needs to be sent to the receiver, will that be a large overhead
> compared to 4G? Do we really need such a small chunk size? Maybe I
> miss some previous discussion on this.

Hi

For PPSP we want to use UDP as we don't need the in-order and 
reliability features of TCP, and want flexibility to use different 
congestion control algorithms and handle NATs. With Ethernet as the 
dominant MAC layer at present and an unreliable transport, we don't want 
datagrams to exceed the Ethernet MTU; otherwise the chance of losing a 
datagram increases (a UDP datagram fragmented into N IP packets is not 
delivered if even one of those IP packets is lost). Hence, we use chunks of ~1K.
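
To make the fragmentation argument concrete, here is a rough sketch. It
assumes IPv4 without options (20-byte header), an 8-byte UDP header, a
1500-byte Ethernet MTU, and independent loss of IP packets with
probability p. The function names are made up for illustration, and the
loss model is a simplification, not a measurement.

```python
import math

ETHERNET_MTU = 1500   # bytes, assumed standard Ethernet MTU
IPV4_HEADER = 20      # bytes, assumes no IP options
UDP_HEADER = 8        # bytes


def fragments_needed(udp_payload: int, mtu: int = ETHERNET_MTU) -> int:
    """Number of IP packets needed to carry one UDP datagram."""
    # Fragment payloads must be a multiple of 8 bytes (except the last).
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    return math.ceil((udp_payload + UDP_HEADER) / per_fragment)


def datagram_loss_probability(p: float, n_fragments: int) -> float:
    """Probability that a datagram is lost when each of its fragments
    is lost independently with probability p (simplifying assumption)."""
    return 1.0 - (1.0 - p) ** n_fragments


if __name__ == "__main__":
    for payload in (1024, 8192):
        n = fragments_needed(payload)
        print(payload, n, datagram_loss_probability(0.01, n))
```

Under these assumptions a 1K chunk fits in a single IP packet, while an
8K chunk needs six, so its chance of being lost is roughly six times
higher at small p.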

A good practice in P2P networks is to not forward data you have not 
verified. So to forward the 1K chunks directly we need to be able to 
verify their integrity at this granularity: enter Merkle Hash Trees.
We think the resulting overhead due to the size of the tree is 
acceptable, as it is easy to optimize the number of hashes transmitted
in our use cases.
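
As an illustration of per-chunk verification, here is a minimal sketch
of a binary Merkle hash tree over 1K chunks using SHA1. It is not the
PPSP wire format: the helper names are hypothetical, the number of
chunks is assumed to be a power of two, and no optimization of the
transmitted hashes is attempted.

```python
import hashlib


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def build_tree(chunks):
    """Return the tree as a list of levels, leaf hashes first.
    Assumes len(chunks) is a power of two."""
    level = [sha1(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        level = [sha1(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Sibling hashes on the path from leaf `index` up to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling at this level
        index //= 2
    return proof


def verify(chunk, index, proof, root):
    """Recompute the root from one chunk and its sibling hashes."""
    h = sha1(chunk)
    for sibling in proof:
        h = sha1(h + sibling) if index % 2 == 0 else sha1(sibling + h)
        index //= 2
    return h == root


if __name__ == "__main__":
    chunks = [bytes([i]) * 1024 for i in range(8)]
    levels = build_tree(chunks)
    root = levels[-1][0]
    proof = proof_for(levels, 5)
    print(len(proof), verify(chunks[5], 5, proof, root))
```

Taking 4 GB as 2^32 bytes and 1K as 2^10 bytes gives 2^22 chunks, so a
chunk verified in isolation needs at most 22 sibling hashes (440 bytes
with SHA1). A receiver that has already verified neighbouring chunks
knows many of those hashes, which is where the optimization comes from.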

CU,
     Arno
