Re: Draft for Resumable Uploads

Austin Wright <aaa@bzfx.net> Wed, 06 April 2022 07:56 UTC

From: Austin Wright <aaa@bzfx.net>
Date: Wed, 06 Apr 2022 00:53:08 -0700
In-Reply-To: <17ffd4d64d2.c4f12f9734385.3620821323075353432@zoho.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, Guoye Zhang <guoye_zhang@apple.com>, ietf-http-wg <ietf-http-wg@w3.org>
To: Eric J Bowman <mellowmutt@zoho.com>
Subject: Re: Draft for Resumable Uploads
Archived-At: <https://www.w3.org/mid/904B5382-ADCA-461F-B47C-583874D4FB55@bzfx.net>


> On Apr 5, 2022, at 22:16, Eric J Bowman <mellowmutt@zoho.com> wrote:
> 
> Hi Austin, you ask good questions!
> 
> >
> > Finally, I’m not sure I fully understand what the response 
> > would indicate exactly. Shall the server actually support 
> > “sparse documents” (where some bytes are undefined)? 
> >
> 
> It's all up to how the client supports rendering the media type at hand. Regardless of serving my deliberately-broken image files as 200 or 206 for several years, some browsers "filled it in" with blocks of grey, others used their broken-image icon; some browsers went from broken-icon to filled-in image, while others regressed from filled-in image to broken-image icon. Some browsers behaved differently depending on platform, i.e. comes down to image-rendering libraries. All the server can really do, is somehow indicate to the client that the requested representation is incomplete, if the server knows that for a fact.
> 
> >
> > This would allow users to upload segments out-of-order,
> > and in parallel from different uplinks. If the client does 
> > not support sparse documents, how should the server 
> > respond? (Fill in the undefined regions with zeros?)
> >
> 
> The server only cares about media-type, really. Oh, you need PNG? I have that. Just wanna let you know it's broken! Do with it what you will. I'll allow PUT/PATCH if you no likey and are authorized for those methods.
> 
> Distributed out-of-order uploads... Hmmm. I had to give that some thought, but it doesn't change my position. Any number of participating clients contributing to a file upload, just have to agree how to "fill in the blanks" on the returned representation until it's 200 OK.
> 
> I think you and mnot are correct that we need better-defined PATCH media types, I believe that's where to solve this problem, but how any media type is rendered has traditionally and properly been a client-side concern in HTTP.

So then I think a simple modification of my “Partial Uploads” document (draft-wright-http-partial-upload-01 <https://tools.ietf.org/id/draft-wright-http-partial-upload-01.html>) would work well.

First, it defines the message/byterange media type, for making changes to a specific byte range. This is the bulk of the desired functionality, I think.
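As a rough illustration, a client might assemble the body of such a PATCH request like this. This is only a sketch: it assumes the message/byterange body is a Content-Range field, a blank line, and then the raw bytes, and the helper name build_byterange_body is hypothetical. The draft itself remains the authority on the exact syntax.

```python
from typing import Optional


def build_byterange_body(chunk: bytes, offset: int, total: Optional[int] = None) -> bytes:
    """Wrap `chunk` as a message/byterange body (layout is an assumption).

    `offset` is the position of the first byte of `chunk` in the target
    resource; `total` is the complete length if known, otherwise "*" is sent.
    """
    if not chunk:
        raise ValueError("chunk must not be empty")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    last = offset + len(chunk) - 1
    complete = "*" if total is None else str(total)
    # Header section of the embedded message, terminated by an empty line.
    header = f"Content-Range: bytes {offset}-{last}/{complete}\r\n\r\n"
    return header.encode("ascii") + chunk
```

The result would be sent as the body of a PATCH request with Content-Type: message/byterange.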

Second, 2__ Sparse Resource would indicate that the server has filled in some regions of the resource, so the representation might not be valid according to the media type definition. But I’m not confident that all user agents would handle a 2__ Sparse Resource safely. If the resource represents executable code, the result could be very bad. Maybe I should remove this? It seems wrong to me that a server could send back a document it knows is invalid according to the media type. Maybe there should instead be an error for clients that request a sparse resource without indicating that they can support the response.

Finally, I would add to that document a Prefer header that indicates the client supports sparse resources and can accept a 206 Partial Content response that excludes any undefined byte ranges (unlike 2__, which would zero out these regions). If the request is a Range request (with a Range header), the server would exclude undefined ranges from its response. The client could specify in the Prefer header whether it prefers multipart responses or only single-part responses with a Content-Range header.

Synchronizing an interrupted upload might look something like this:

> HEAD /upload-target HTTP/1.1
> Prefer: sparse=single

< HTTP/1.1 206 Partial Content
< Content-Range: bytes 400-499/1000
< Content-Length: 100

Note how sparse=single forces the response to carry a Content-Range header (instead of embedding the ranges in a multipart response), and how the presence of this header indicates that the server supports sparse resources.
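On the client side, working out what is left to send could look roughly like the following. This is a sketch under two assumptions that are mine rather than part of the proposal: the header value uses the bytes unit that the Content-Range syntax requires, and the single reported range is the only defined region of the resource. The function name missing_ranges is hypothetical.

```python
import re
from typing import List, Tuple

# Matches a satisfied byte range with a known complete length,
# e.g. "bytes 400-499/1000".
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+)$")


def missing_ranges(content_range: str) -> List[Tuple[int, int]]:
    """Return inclusive (first, last) byte ranges not covered by the header."""
    match = _CONTENT_RANGE.match(content_range.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {content_range!r}")
    first, last, total = (int(group) for group in match.groups())
    if first > last or last >= total:
        raise ValueError(f"inconsistent Content-Range: {content_range!r}")
    gaps = []
    if first > 0:
        gaps.append((0, first - 1))          # everything before the defined range
    if last < total - 1:
        gaps.append((last + 1, total - 1))   # everything after the defined range
    return gaps
```

For the exchange above, missing_ranges("bytes 400-499/1000") yields the two ranges 0-399 and 500-999, which the client would then upload.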

By the way, I’m calling these “sparse resources” by analogy with sparse files <https://en.wikipedia.org/wiki/Sparse_file>. Servers could potentially implement sparse resources by mapping them directly onto sparse files.
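To make the mapping concrete, here is a minimal sketch of a server storing an out-of-order segment by seeking to its offset and writing it. The function name write_range is hypothetical. Whether the skipped region is actually stored sparsely depends on the operating system and filesystem; either way, bytes that were never written read back as zeros.

```python
import os


def write_range(path: str, offset: int, chunk: bytes) -> None:
    """Write `chunk` at `offset` in `path` without truncating existing data."""
    # "r+b" keeps existing content; "w+b" is only used to create a new file.
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        f.seek(offset)   # may seek past the current end of the file
        f.write(chunk)   # the gap before `offset` reads back as zero bytes
```

This does not record which ranges are defined. A real server would have to track that separately in order to tell an unwritten region apart from a region of uploaded zeros.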

Thanks,

Austin.

> 
> -Eric
> 
>