[ftpext] draft-ietf-ftpext2-hash - partial hashes

Robert Oslin <rto@globalscape.com> Fri, 14 January 2011 23:30 UTC

The open issues section of draft-ietf-ftpext2-hash states: "Current version of the draft defines full file hashes, but not partial file hashes."

Partial hashes are valuable in real-world scenarios and should be considered for inclusion in the HASH specification.

Use Case: 

	User initiates download of a multi-gigabyte file, such as an ISO image.
	The transfer is interrupted.
	[Later] User re-initiates download of (supposedly) the same file.

At this point the client software must decide whether or not to resume the transfer. Filename is the first criterion checked, followed by size. However, neither accounts for a corrupted partial file (even one with an identical byte count), or for the possibility that the remote file has changed, especially given the difficulty of comparing timestamps between client and host systems.

By requesting the hash for the portion of the remote file matching the bytes for the partial local file, the client could determine whether the local file is indeed valid and partial and subsequently resume the transfer from the appropriate byte offset.
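
To make the comparison concrete, here is a minimal sketch of the client side of that check. It is only an illustration under stated assumptions: the function names, the choice of SHA-256, and the way the server's partial hash reaches the client are all hypothetical, since the draft does not currently define a partial-hash request.

```python
import hashlib


def hash_prefix(path, length, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of the first `length` bytes of the file at `path`."""
    digest = hashlib.new(algorithm)
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError("file is shorter than the requested length")
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def can_resume(local_path, local_size, remote_prefix_hash, algorithm="sha256"):
    """Compare the local partial file against the server's hash of the same byte range.

    `remote_prefix_hash` is assumed to be the hex digest the server reported
    for the first `local_size` bytes of the remote file (hypothetical input).
    """
    local_hash = hash_prefix(local_path, local_size, algorithm)
    return local_hash.lower() == remote_prefix_hash.lower()
```

If can_resume() returns True, the client would resume from byte offset local_size; otherwise it would start the transfer over.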

A similar need arises when the client has no prior knowledge of the transfer (no queue or cache mechanism), finds a local file with the same name, and must determine whether that local file is just a segment of the larger file located on the remote host.

Below is an example of overwrite logic performed today by our FTP client using hashes and size comparisons: 

If a user requests to download a file and a file with the same name exists locally, the client will determine whether the file sizes are the same or whether the destination (local) file is smaller (indicating a possible partially transferred file). If the file sizes are the same, the client will compute the hash for the entire local file and ask the server to provide the hash for the corresponding remote file. The client will then skip the transfer if the hashes are identical, or overwrite the file if the hashes do not match. If the remote (source) file is larger, the client will ask the server for a partial hash covering as many bytes as the local (destination) file size. If the partial hash matches, the client will resume the transfer from that byte offset. If the hashes are different, the client will overwrite the file (for example, because the local partial file was corrupted or is not the same file).
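
The overwrite logic described above can be sketched roughly as follows. This is not our client's actual code: the names Action, decide and fetch_remote_hash are hypothetical, and the branch for a local file larger than the remote one is an assumption, since that case is not covered in the description.

```python
from enum import Enum


class Action(Enum):
    SKIP = "skip"
    RESUME = "resume"
    OVERWRITE = "overwrite"


def decide(local_size, remote_size, local_hash, fetch_remote_hash):
    """Return (action, resume_offset) for a same-name download.

    `local_hash` is the hash of the entire local file.
    `fetch_remote_hash(length)` is a hypothetical callback that asks the
    server for the hash of the first `length` bytes of the remote file.
    """
    if local_size > remote_size:
        # Assumption: not described in the message; treated as a mismatch.
        return Action.OVERWRITE, 0

    # When the sizes are equal this is simply the full-file hash.
    remote_hash = fetch_remote_hash(local_size)
    if remote_hash.lower() != local_hash.lower():
        # Corrupted partial file, or not the same file at all.
        return Action.OVERWRITE, 0

    if local_size == remote_size:
        return Action.SKIP, None
    return Action.RESUME, local_size
```

For example, decide(40, 100, h, fetch) yields (Action.RESUME, 40) when the server's hash of the first 40 bytes equals h, and (Action.OVERWRITE, 0) when it does not.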

Robert Oslin
Director of Product Management
Tel: 1 (210) 293-7902
Fax: 1 (210) 690-8824
www.globalscape.com