Re: [decade] [ppsp] Object naming in -req and -arch

Peng Zhang <pzhang.thu@gmail.com> Wed, 11 July 2012 21:24 UTC

From: Peng Zhang <pzhang.thu@gmail.com>
Date: Wed, 11 Jul 2012 17:25:03 -0400
To: zhangyunfei <zhangyunfei@chinamobile.com>
Cc: decade <decade@ietf.org>, arno@cs.vu.nl

Hi Arno,

	Thanks for the clarification. As I understand it, using an MHT instead of a single hash enables an immediate integrity check before the whole media file is received, which is critical for streaming applications. But the client could also download the hashes of every piece up front, just like in BitTorrent. Compared with that approach, does using an MHT significantly reduce startup latency or server load? Thanks.
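
	To make the comparison concrete, here is a rough sketch of how a single chunk can be checked against an MHT root hash using only its sibling hashes, rather than a full per-piece hash list. It assumes SHA-256 and a power-of-two number of chunks, and the helper names (build_tree, proof_for, verify_chunk) are mine for illustration. It is not meant to match the exact tree layout in the PPSP draft.

```python
import hashlib


def h(data: bytes) -> bytes:
    """Hash function used for both leaves and inner nodes (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def build_tree(chunks):
    """Return the tree as a list of levels, leaves first, root last.

    For simplicity this sketch requires a power-of-two number of chunks.
    """
    assert chunks and len(chunks) & (len(chunks) - 1) == 0
    levels = [[h(c) for c in chunks]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def proof_for(levels, index):
    """Collect the sibling hash at each level on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling is the neighbour in the same pair
        index //= 2
    return proof


def verify_chunk(chunk, index, proof, root):
    """Recompute the root from one chunk plus its sibling hashes and compare."""
    node = h(chunk)
    for sibling in proof:
        # Even index: node is the left child; odd index: node is the right child.
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root


chunks = [bytes([i]) * 1024 for i in range(8)]
levels = build_tree(chunks)
root = levels[-1][0]          # would serve as the content name
proof = proof_for(levels, 5)  # hashes a sender would ship alongside chunk 5
```

	In this sketch the receiver needs log2(N) sibling hashes to check an isolated chunk, whereas a flat BitTorrent-style list means fetching and authenticating all N piece hashes before the first chunk can be checked. Whether that difference matters in practice is exactly my question.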

	DECADE supports integrity checks at the piece/chunk level, so a client can verify that a received piece corresponds to its piece name. DECADE is unaware of which file a piece belongs to, and thus does not provide an end-to-end integrity check. At the file level, we leave the integrity check to applications. Thus, imo, we should not include MHT in the design of DECADE. But MHT can be built on top of DECADE, so applications can still use MHT to implement integrity checks themselves.
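
	As a minimal sketch of the piece-level check I mean, assume the piece name is simply derived from a hash of the piece's bytes. The "sha-256;<hex>" name format and the function names below are hypothetical, chosen only for illustration; the actual DECADE naming scheme (e.g. one based on NI) may differ.

```python
import hashlib


def chunk_name(data: bytes) -> str:
    """Derive a self-certifying name from the chunk bytes (hypothetical format)."""
    return "sha-256;" + hashlib.sha256(data).hexdigest()


def verify_chunk_name(name: str, data: bytes) -> bool:
    """Check that the received bytes match the name they were requested under."""
    return chunk_name(data) == name


piece = b"example piece payload"
name = chunk_name(piece)
```

	Note that this check says nothing about which file the piece belongs to or where it sits in that file, which is why file-level integrity stays with the application.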


Best,

Peng.

On Jul 10, 2012, at 4:26 AM, zhangyunfei wrote:

> Arno's reply on MHT.
>  
> BR
> Yunfei
>  
> zhangyunfei
>  
> From: Arno Bakker
> Date: 2012-07-10 15:49
> To: ppsp@ietf.org
> Subject: Re: [ppsp] [decade] Object naming in -req and -arch
> Hi all
>  
> I'll try to clarify the rationale and practical overhead of the Merkle 
> Hash Trees in PPSP. For static content, MHTs enable content integrity 
> protection using self-certified naming. Using a hash tree instead of a 
> single hash is useful in all situations where the content is distributed
> in parts (=a sequence of objects as you mention it) that are immediately 
> used. In particular, when the parts are delivered to a higher level app 
> upon receipt they must be integrity checked beforehand. This applies to 
> streaming, but perhaps also to other P2P apps using DECADE.
>
> Even if parts are not immediately used, an integrity check on parts can 
> help to improve efficiency in a P2P context. An end-to-end integrity 
> check when the content is completely downloaded is sufficient, but for
> efficiency it would be nice to know if the individual parts are correct
> instead of finding out at the end, especially for large content.
>  
> Note that Merkle Hash Trees support both partial and end-to-end 
> integrity checks. When a peer has a copy of the content and the name of 
> the object (=its root hash in the MHT) he can calculate the MHT from the 
> content and compare the calculated root hash to the name. He does not 
> need to receive any of the intermediary hashes from others, if that is 
> not required.
>  
> Which brings us to the topic of overhead. As discussed in Sec. 5.5 of
> http://www.ietf.org/id/draft-ietf-ppsp-peer-protocol-02.txt
> the size of the MHT depends on the number of chunks (objects) at the 
> base of the tree. That number depends on the size of the chunks that are 
> processed immediately in the P2P application. For PPSP over UDP over 
> Ethernet these chunks are small. For other P2P apps the chunks may be 
> bigger.
>  
> How much of the MHT tree actually needs to be sent over the wire to a 
> receiving peer depends on the download policy used. For a linear 
> download only part of the tree needs to be transmitted, as the other 
> part of the tree is calculated by the receiving peer while downloading.
> In the example in Sec. 5.5, only 7 of the 16 hashes in the tree are 
> actually transmitted.
>  
> Note that swift, the protocol on which the PPSP peer protocol is based,
> was actually designed as a generic transport protocol, unifying regular
> downloads, VOD and live streaming. So it still supports efficient
> non-streaming download policies like BitTorrent's rarest first. In other
> words, its origins fit the general distribution nature of DECADE.
>  
> Regards,
>       Arno
>  
>  
> On 10/07/2012 07:23, Y. Richard Yang wrote:
> > Hi Peng, Dirk,
> >
> > I am cc'ing the ppsp list as well. You raised a good point on the
> > distinction between one object and a sequence of objects. To generalize,
> > we can discuss even a set of objects (no ordering), and a set of
> > equivalence objects (dynamic streaming that interleaves different
> > resolutions). Your argument against MHT is higher overhead in the
> > general case (end-to-end arguments). How much exactly is the overhead?
> > DECADE may benefit from the analysis from PPSP. Since streaming is
> > considered the main app, how much overhead is there if DECADE has to
> > build on top?
> >
> > Thanks!
> >
> > Richard
> >
> > On Tuesday, July 10, 2012, Peng Zhang wrote:
> >
> >     Hi Dirk and all,
> >
> >     I agree that the NI specification meets the basic requirement of
> >     DECADE (without optimization on "early name generation").
> >
> >     As for the Merkle Hash Tree, or MHT, it is an integrity-assurance
> >     method for a sequence of objects. It is critical to the PPSP
> >     protocol. But I wonder whether we should incorporate it in our design:
> >
> >     First, DECADE is targeted at general content distribution
> >     applications, and for applications other than P2P streaming, there
> >     is no great value in using a Merkle Hash Tree. It may cause high
> >     overhead for these applications because "metadata", including
> >     signatures and full hashes, must be exchanged.
> >
> >     Still, we can discuss more on how to better incorporate PPSP based
> >     on NI without hurting the general application of DECADE. Thanks.
> >