Re: [ftpext] draft-ietf-ftpext2-hash - partial hashes

Mat Berchtold <mb@smartftp.com> Mon, 17 January 2011 09:29 UTC

From: Mat Berchtold <mb@smartftp.com>
To: Anthony Bryan <anthonybryan@gmail.com>, Robert Oslin <rto@globalscape.com>
Date: Mon, 17 Jan 2011 09:31:40 +0000
Cc: "ftpext@ietf.org" <ftpext@ietf.org>
Subject: Re: [ftpext] draft-ietf-ftpext2-hash - partial hashes

Hello Anthony,

>RANG for HASH
I understand that RANG is a replacement for REST. Using the newly introduced command to extend other commands such as HASH might not be what RANG was originally intended for. My concerns:

- FEAT Reply
One concern is how to announce the feature in the FEAT reply. Right now the RANG draft proposes RANG STREAM. Assuming that the server supports RANG for HASH as well, what does the FEAT line look like? RANG STREAM;HASH? What if a command introduced in the future wants to use the RANG concept as well?

- Multiple commands for one function
Needing two commands (RANG followed by HASH) to perform one function makes the protocol more complex than it needs to be.

>Anthony's proposed FEAT reply for partial hashes
I think partial hashes should be a MUST and not optional. Hence there is no need for a special FEAT response.

>MLST style reply
I'm also in favor of the MLST-style reply for commands with more than one return value. For the HASH output, everything besides the "Value" fact should be optional. Maybe MFF-style arguments should be considered as well:
C>HASH Range=0-1000; File.ext
S>213 Value=f0ad929cd...;Algorithm=sha-256;Range=0-1000;

And my favorite would be to pass the algorithm the same way to the HASH command:
C>HASH Algorithm=sha-256;Range=0-1000; File.ext
S>213 Value=f0ad929cd...;Algorithm=sha-256;Range=0-1000;
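
To make the fact-style syntax above a bit more concrete, here is a minimal Python sketch of how a client might build such a HASH command and parse the reply. It is only an illustration of the proposal, not anything taken from a draft: the function names are hypothetical, and treating fact names as case-insensitive is an assumption borrowed from how MLST facts behave.

```python
def build_hash_command(path, **facts):
    """Build a fact-style HASH command line (hypothetical helper).

    Facts are emitted as 'Name=value;' pairs before the path, in the
    MFF-like form suggested above.
    """
    args = "".join(f"{name}={value};" for name, value in facts.items())
    return f"HASH {args} {path}" if args else f"HASH {path}"


def parse_hash_reply(line):
    """Parse a fact-style HASH reply into (reply_code, facts).

    Only the 'Value' fact is treated as mandatory, as proposed above.
    Fact names are lower-cased (assumption: case-insensitive names).
    """
    code, _, rest = line.partition(" ")
    facts = {}
    for item in rest.strip().split(";"):
        if not item:
            continue  # skip the empty piece after the trailing ';'
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"malformed fact: {item!r}")
        facts[name.strip().lower()] = value.strip()
    if "value" not in facts:
        raise ValueError("reply lacks the mandatory Value fact")
    return int(code), facts


if __name__ == "__main__":
    print(build_hash_command("File.ext", Algorithm="sha-256", Range="0-1000"))
    print(parse_hash_reply("213 Value=f0ad929cd...;Algorithm=sha-256;Range=0-1000;"))
```

One property of this shape is that a client can ignore facts it does not know, so further facts could be added later without changing the reply format.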

Regards,
Mat

-----Original Message-----
From: ftpext-bounces@ietf.org [mailto:ftpext-bounces@ietf.org] On Behalf Of Anthony Bryan
Sent: Monday, 17 January, 2011 09:59
To: Robert Oslin
Cc: ftpext@ietf.org
Subject: Re: [ftpext] draft-ietf-ftpext2-hash - partial hashes

On Fri, Jan 14, 2011 at 6:32 PM, Robert Oslin <rto@globalscape.com> wrote:
> draft-ietf-ftpext2-hash open issues indicates: "Current version of the draft defines full file hashes, but not partial file hashes."
>
> Partial hashes are valuable and applicable to real-world scenarios and should be considered for inclusion in the HASH specification.
>
> Use Case:
>
>        User initiates download of multi-gigabyte file, such as ISO or similar.
>        Transaction is interrupted
>        [Later] User re-initiates download of same (supposedly) file
>
> At this point the client software must make a decision as to whether to resume the transfer or not. Filename is the first criterion observed, followed by size. However, these do not take into consideration a corrupted partial file (even with identical byte count), or that the remote file could have changed, especially given the difficulty in assessing time differences between client and host systems.
>
> By requesting the hash for the portion of the remote file matching the bytes for the partial local file, the client could determine whether the local file is indeed valid and partial and subsequently resume the transfer from the appropriate byte offset.

>
> A similar need occurs when the client has no prior knowledge of the transfer (no queue or cache mechanism), a file with the same name is identified, and the client must determine whether the local file is just a segment/part of the larger file located on the remote.
>
> Below is an example of overwrite logic performed today by our FTP client using hashes and size comparisons:
>
> If a user requests to download a file and a file with the same name exists locally, the client will determine if the file sizes are the same or if the destination (local) size is smaller (indicating a possible partially transferred file). If file sizes are the same then the client will compute the hash for the entire file and ask the server to provide the hash for the corresponding remote file. The client will then skip the transfer if the hashes are identical or overwrite the file if the hashes do not match. If the remote (source) file is larger the client will ask the server for a partial hash, up to the bytes that match the local (destination) file size. If the partial hash matches then the client will resume the transfer from the byte offset. If the hashes are different then the client will overwrite the file. (e.g. local partial file was corrupted or is not the same file).
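
The overwrite logic quoted above can be sketched roughly as follows in Python. This is an illustrative reading of the description, not the actual client code: `decide_transfer` and `remote_hash` are hypothetical names, `remote_hash(n)` stands in for whatever command ends up returning the hash of the first `n` bytes of the remote file, SHA-256 is chosen arbitrarily, and the case where the local file is larger than the remote one is not covered by the description, so overwriting there is an assumption.

```python
import hashlib


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def decide_transfer(local_size, local_digest, remote_size, remote_hash):
    """Return 'skip', 'resume' or 'overwrite'.

    remote_hash(n) is a hypothetical callback returning the server's
    hash of the first n bytes of the remote file.
    """
    if local_size == remote_size:
        # Same size: compare full-file hashes.
        if remote_hash(remote_size) == local_digest:
            return "skip"
        return "overwrite"
    if local_size < remote_size:
        # Local file may be a partial copy: compare against a partial hash.
        if remote_hash(local_size) == local_digest:
            return "resume"  # continue from byte offset local_size
        return "overwrite"
    # Assumption: a local file larger than the remote one is replaced.
    return "overwrite"


if __name__ == "__main__":
    remote = b"0123456789" * 100  # stand-in for the remote file

    def fake_server(n):
        return sha256_hex(remote[:n])

    partial = remote[:250]
    print(decide_transfer(len(partial), sha256_hex(partial), len(remote), fake_server))
```

A real client would hash the local file in chunks rather than holding it in memory; the in-memory bytes here only keep the example short.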

Robert, thanks for joining us & for posting.

a very quick introduction for Robert would mention that he works on CuteFTP and is the originator of the XCRC command.

I think everyone's been unanimous in that we want HASH to support partial file hashes.

I think there are a few things to iron out.

1) how to (optionally) select the byte range to be hashed.

we proposed a new byte RANGe selection command, http://tools.ietf.org/html/draft-bryan-ftp-range , which needed to be fleshed out of course but would look something like this:

   C> RANG 802816 1000000
   S> 350 Byte range starting at 802816, ending at 1000000.


2) how to show that partial hashes are supported, or if that's even needed? add a "p" to the FEAT? or just use an error code if someone tries to do a partial file hash and it's not allowed or unsupported on the server?

      C> FEAT
      S> 211-Extensions supported:
      S>  ...
      S>  HASH SHA-256p;SHA-512p;SHA-1p*;MD5p
      S>  ...
      S> 211 END
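
As a rough illustration of what a client would have to do with a FEAT line like the one above, here is a short Python sketch. It assumes the "p" suffix means "partial hashes supported" as floated in this message, and that a trailing "*" marks the currently selected algorithm; the function name is hypothetical.

```python
def parse_hash_feat(line):
    """Parse a 'HASH ...' FEAT line into ({algorithm: partial_ok}, selected).

    Assumptions: a trailing '*' marks the selected algorithm and a
    trailing 'p' (before any '*') marks partial-hash support.
    """
    keyword, _, rest = line.strip().partition(" ")
    if keyword.upper() != "HASH":
        raise ValueError("not a HASH feature line")
    algorithms = {}
    selected = None
    for token in rest.split(";"):
        token = token.strip()
        if not token:
            continue
        is_selected = token.endswith("*")
        if is_selected:
            token = token[:-1]
        partial_ok = token.endswith("p")
        if partial_ok:
            token = token[:-1]
        algorithms[token] = partial_ok
        if is_selected:
            selected = token
    return algorithms, selected


if __name__ == "__main__":
    print(parse_hash_feat(" HASH SHA-256p;SHA-512p;SHA-1p*;MD5p"))
```

The sketch also shows a weakness of the suffix idea: an algorithm whose name itself ends in "p" could not be told apart from one that advertises partial support.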

3) the server response which shows it's a partial file hash and not a full file hash. it would probably be good to have the range in there, and it could be mandatory, where if it was a full file hash it would list the start & end of the file


   C> HASH filename.ext
   S> 213 SHA-256 f0ad929cd259957e160ea442eb80986b5f... filename.ext
-802816 1000000

from Lothar:

S> 226 SHA-256 f0ad929cd259957e160ea442eb80986b5f... filename.ext\
 802816 1000000 ASCII transfer complete

from Sob:

Rather than inventing new custom reply format, wouldn't it be better to adopt MLSx style? It's simple, readable, extensible, ...

E.g.:

  S> 213 Hash.SHA-256=f0ad929cd...;Range=802816-1000000; filename.ext

(I intend to reply to the other hash messages backlog shortly)
--
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
  )) Easier, More Reliable, Self Healing Downloads