Partial uploads, resumable requests, and progress of long-running operations

Austin Wright <aaa@bzfx.net> Sun, 14 July 2019 09:21 UTC


Hello HTTP WG,

I am seeking to submit some new HTTP features that help with large requests: they allow user agents to resume an unsafe request if it is interrupted, upload large documents in multiple parts, receive realtime updates on processing, and advertise these client capabilities to the server. I was surveying HTTP server applications and noted a proliferation of rituals for performing large uploads and monitoring the progress of operations, all in server-specific, non-interoperable fashions. This seems like an area of HTTP that is begging for standardization, so I wrote a specification and a proof-of-concept implementation. [1]

I've split these features into three documents, each of which may be implemented separately, as desired by origin servers and user agents:

- "partial-upload" specifies a PATCH media type that writes to a specific byte range of a server resource, allowing a file to be uploaded in many smaller requests. Previously, applications would have to define a service-specific mechanism for accepting multiple requests, representing multiple segments of the single upload (e.g. [2]).

- "resume-request" specifies a way to address the current request body and/or response message. Then, if the request is interrupted, a client may resume the upload by appending data in a PATCH request. Likewise, by making a GET request to the response-message or response Content-Location, clients may resume an interrupted response. Previously, an upload would have to be retried from the start, and responses might be lost entirely.

- "progress" specifies a header allowing the server to update the client on progress it is making while generating a response. Previously, applications would have to define a custom mechanism for reading the progress of an operation, typically a read via a separate HTTP request, or WebSockets.
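
As a rough illustration of the first two documents (a sketch, not code from the drafts or the proof-of-concept), the following Python shows how a client might plan the byte ranges for a segmented upload, and how a resumed upload reduces to the same calculation with a non-zero starting offset. The function name `segments` is hypothetical, and describing each segment in `Content-Range` syntax ("bytes first-last/total") is an assumption made for illustration; the drafts define the actual wire format.

```python
from typing import Iterator, Tuple


def segments(total: int, chunk_size: int, start: int = 0) -> Iterator[Tuple[int, int, str]]:
    """Yield (first, last, content_range) for each piece still to be sent.

    `start` is the number of bytes the server already holds, so a resumed
    upload begins at that offset instead of 0. (Hypothetical helper.)
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= start <= total:
        raise ValueError("start must lie within the document")
    first = start
    while first < total:
        last = min(first + chunk_size, total) - 1  # byte ranges are inclusive
        # Assumption: each segment is labelled using Content-Range syntax.
        yield first, last, f"bytes {first}-{last}/{total}"
        first = last + 1


# A fresh 250-byte upload sent in pieces of at most 100 bytes.
fresh = list(segments(250, 100))

# The same upload resumed after the server reports it already has 180 bytes.
resumed = list(segments(250, 100, start=180))
```

Each tuple would correspond to one PATCH request; the resumed case needs only a single request covering the bytes the server has not yet received.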

Each of these features is implemented with new headers over 1xx interim responses; no additional HTTP requests are needed, except as necessary after an interrupted connection (e.g. because of a TCP reset). These features may be implemented by HTTP client libraries or directly in the user-agent, without any special support required by application developers or end users. Further, they are intended to be feature compatible with all similar patterns seen in the wild today.
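
To make the idea of headers carried on 1xx interim responses concrete, here is a minimal Python sketch of a client separating interim responses from the final response in a raw HTTP/1.1 byte stream. It is an illustration only: the helper `split_responses` is hypothetical, the use of 102 (Processing) as the interim status is an assumption, and the `Progress` values "1/3" and "2/3" are placeholders rather than the syntax defined in the drafts.

```python
from typing import Dict, List, Tuple

Response = Tuple[int, Dict[str, str]]


def split_responses(raw: bytes) -> Tuple[List[Response], Response]:
    """Split an HTTP/1.1 response stream into (interim responses, final response).

    Only status lines and headers are parsed; anything after the final
    response's header block (such as a body) is ignored.
    """
    interim: List[Response] = []
    for block in raw.split(b"\r\n\r\n"):
        if not block:
            continue
        lines = block.decode("iso-8859-1").split("\r\n")
        status = int(lines[0].split(" ", 2)[1])  # "HTTP/1.1 102 Processing" -> 102
        headers: Dict[str, str] = {}
        for line in lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        if 100 <= status < 200:
            interim.append((status, headers))  # 1xx: more responses follow
        else:
            return interim, (status, headers)
    raise ValueError("no final response found")


# Placeholder stream: two interim responses, then the final response.
raw = (
    b"HTTP/1.1 102 Processing\r\nProgress: 1/3\r\n\r\n"
    b"HTTP/1.1 102 Processing\r\nProgress: 2/3\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
)
interim, final = split_responses(raw)
```

A client library that surfaces the interim headers this way can report progress without issuing any extra requests on the connection.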

The repository includes a working proof-of-concept; however, at present it requires this patch to Node.js [3]. With any luck, this will be merged and available in the next release of Node.js.

I intend to submit these for consideration as Internet standards; I suppose that would be through this working group. The documents are split up the way they are so that they can be considered separately; I predict the new media type for patching byte ranges would move much faster than the others.

Please review the documents and provide any feedback you may have!

Thank you,

Austin Wright.

[1] <https://github.com/awwright/http-progress> A few months ago I considered using the 102 (Processing) status code to convey progress of an operation, but realized it had no mechanism to convey additional information, so I wrote up a document describing the “Progress” header. However, I had some difficulty figuring out how the client might receive the final status if a long response was interrupted: while 202 Accepted seems to be the standard solution, it offers no guidance to user agents to get an actual status code. Separately, I wrote that you might be able to use multipart/byteranges in a PATCH request body to patch part of a resource. Shortly thereafter I realized these two problems are related, and wrote resume-request, then a proof-of-concept.

[2] <https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingRESTAPImpUpload.html> which, despite Amazon’s description, I do not believe to be RESTful, because its use requires prior knowledge of how that service works; a standard user agent would not be able to discover or make use of this functionality.

[3] <https://github.com/nodejs/node/pull/28459> This patch is required to receive headers in 1xx interim responses; presently Node.js v12.4.0 only exposes the status code, discarding the headers (even though they are fully parsed by the internal parser).