Re: Resumable Uploads

Ken Murchison <murch@andrew.cmu.edu> Mon, 22 April 2013 17:46 UTC


On Thu, 18 Apr 2013, Felix Geisendörfer wrote:
> I'm interested in finding out how to perform resumable uploads over HTTP
> while being compliant with existing specifications. The result of this
> work will be shared with the community to create interoperable
> server/client software to simplify file uploading on the web.

Here are my thoughts on this, some of which have already been mentioned
by others.

- The client should/must have a way to signal to the server that it
supports partial uploads so that the server can respond accordingly.
I'd suggest a "partial-upload" (or similar) preference to be used with
the Prefer [1] request header.

- Allow progress of the upload to be reported back to the client via a
Progress [2] (or similar) response header.

- PATCH seems to be designed explicitly for the purpose of updating
existing resources and would make sense to use for completing/repairing
an upload.  I'd suggest use of multipart/byteranges as the "patch" document.

- A server will acknowledge that it supports "partial-upload" by 
including an "Accept-Patch: multipart/byteranges" header in its responses.

- Allow HEAD requests to include range request semantics in the presence 
of "Prefer: partial-upload".
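
A rough sketch of the client-side check implied by the points above: after
sending "Prefer: partial-upload", the client looks for multipart/byteranges
in the Accept-Patch response header before attempting to resume. The
function name and the dict-of-headers input are assumptions made for
illustration only.

```python
def supports_partial_upload(response_headers):
    """Return True if Accept-Patch lists multipart/byteranges.

    response_headers is assumed to be a plain dict of header name to
    header value; header names are matched case-insensitively.
    """
    accept_patch = ""
    for name, value in response_headers.items():
        if name.lower() == "accept-patch":
            accept_patch = value
    # Accept-Patch is a comma-separated list of media types, each of
    # which may carry parameters after a semicolon.
    media_types = [
        item.split(";")[0].strip().lower()
        for item in accept_patch.split(",")
    ]
    return "multipart/byteranges" in media_types
```

A client that gets False back would simply fall back to re-sending the
whole resource with PUT.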

An example of how this might work (I've left out most headers for brevity):

Initial request:

PUT /file.pdf HTTP/1.1
Expect: 100-continue
Prefer: partial-upload
Content-Type: application/pdf
Content-Length: 1000

HTTP/1.1 100 Continue
Accept-Patch: multipart/byteranges

[ 1000 bytes of data ]
HTTP/1.1 102 Processing				<<< optional progress status
Progress: 256/1000

HTTP/1.1 102 Processing				<<< optional progress status
Progress: 512/1000

<<< connection lost >>>


Since in the case above the client already knows that the server 
processed bytes 0-511, it can try to resume the upload immediately 
(saving a round-trip). Otherwise it can check the current status with a 
HEAD range request:

HEAD /file.pdf HTTP/1.1
Prefer: partial-upload
Range: bytes=0-

HTTP/1.1 206 Partial Content
Accept-Patch: multipart/byteranges
Content-Range: bytes 0-511/1000
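
As a minimal sketch, the resume offset could be derived from either header
like this. It assumes the Progress value has the "done/total" form used in
the example above (the actual draft syntax may differ) and that the server
reports the stored data as a single range starting at byte 0. Both function
names are hypothetical.

```python
import re

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")
_PROGRESS = re.compile(r"^(\d+)/(\d+)$")


def resume_offset_from_content_range(value):
    """Return the first byte the client still has to send."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError("unrecognized Content-Range: %r" % value)
    first, last = int(match.group(1)), int(match.group(2))
    if first != 0 or last < first:
        raise ValueError("expected one stored range starting at byte 0")
    return last + 1


def resume_offset_from_progress(value):
    """Return the resume offset from a 'done/total' Progress value."""
    match = _PROGRESS.match(value.strip())
    if match is None:
        raise ValueError("unrecognized Progress: %r" % value)
    done, total = int(match.group(1)), int(match.group(2))
    if done > total:
        raise ValueError("progress exceeds total length")
    return done
```

With the values from the example, both "bytes 0-511/1000" and "512/1000"
give a resume offset of 512.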



Resumption of upload:

PATCH /file.pdf HTTP/1.1
Prefer: partial-upload
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
Content-Length: xxx

--THIS_STRING_SEPARATES
Content-Type: application/pdf
Content-Range: bytes 512-999/1000

[ 488 bytes of data ]
--THIS_STRING_SEPARATES--

HTTP/1.1 102 Processing				<<< optional progress status
Progress: 256/488

HTTP/1.1 204 No Content				<<< completed successfully
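
A sketch of how a client might assemble the resumption request above. It
builds a single-part multipart/byteranges body and computes Content-Length
from it. The function name and signature are assumptions, and a real client
would stream the data rather than hold it all in memory.

```python
def build_byteranges_patch(data, offset, content_type, boundary):
    """Return (headers, body) for a PATCH that sends data[offset:].

    data is the complete resource as bytes; offset is the first byte
    the server does not yet have.
    """
    total = len(data)
    if offset < 0 or offset >= total:
        raise ValueError("offset must lie inside the resource")
    remaining = data[offset:]
    last = total - 1
    part_head = (
        f"--{boundary}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Range: bytes {offset}-{last}/{total}\r\n"
        "\r\n"
    ).encode("ascii")
    closing = f"\r\n--{boundary}--\r\n".encode("ascii")
    body = part_head + remaining + closing
    headers = {
        "Prefer": "partial-upload",
        "Content-Type": f"multipart/byteranges; boundary={boundary}",
        "Content-Length": str(len(body)),
    }
    return headers, body
```

For a 1000-byte file and an offset of 512 this produces the
"Content-Range: bytes 512-999/1000" part shown above, carrying 488 bytes.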


This scheme can probably be tweaked to work with chunked uploads, but I 
haven't thought much about it yet.

In cases where a client does a HEAD/GET on a partial resource without 
"Prefer: partial-upload", I don't know what the server should do.  There 
are at least 4 options:

    * Treat the resource as complete and return 200
    * Treat the resource as partial and always return 206 (will probably
      break clients)
    * Treat the resource as non-existent and return 404
    * Fail the request with a 403 (or similar)


To append data to an existing resource we could extend the Content-Range 
ABNF a little to allow a PATCH request as follows:

PATCH /log.txt HTTP/1.1
Content-Type: multipart/byteranges; boundary=THIS_STRING_SEPARATES
Content-Length: xxx

--THIS_STRING_SEPARATES
Content-Type: text/plain
Content-Range: bytes +200/*			<<<  append 200 bytes to existing length

[ 200 bytes of data ]
--THIS_STRING_SEPARATES--

HTTP/1.1 204 No Content
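
On the server side, the extended form could be resolved to an absolute byte
range roughly as follows. This is only a sketch: it assumes "bytes +N/*"
means "append N bytes at the current end of the resource", as in the
example above, since the exact ABNF change is not spelled out here. The
function name is hypothetical.

```python
import re

_APPEND = re.compile(r"^bytes \+(\d+)/\*$")
_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def target_range(value, current_length):
    """Return the (first, last) byte positions the part should occupy."""
    value = value.strip()
    match = _APPEND.match(value)
    if match is not None:
        # Assumed append form: the part starts at the current end.
        count = int(match.group(1))
        if count == 0:
            raise ValueError("cannot append zero bytes")
        return current_length, current_length + count - 1
    match = _RANGE.match(value)
    if match is not None:
        first, last = int(match.group(1)), int(match.group(2))
        if last < first:
            raise ValueError("last byte precedes first byte")
        return first, last
    raise ValueError("unrecognized Content-Range: %r" % value)
```

Appending 200 bytes to a 1000-byte log would then map to bytes 1000-1199,
while the ordinary "bytes 512-999/1000" form passes through unchanged.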


[1] http://tools.ietf.org/html/draft-snell-http-prefer
[2] http://tools.ietf.org/html/draft-decroy-http-progress

-- 
Kenneth Murchison
Principal Systems Software Engineer
Carnegie Mellon University