Re: Draft v1 Update for Resumable Uploads

Glenn Strauss <gs-lists-ietf-http-wg@gluelogic.com> Mon, 20 June 2022 02:07 UTC

Date: Sun, 19 Jun 2022 22:03:12 -0400
From: Glenn Strauss <gs-lists-ietf-http-wg@gluelogic.com>
To: Lucas Pardue <lucaspardue.24.7@gmail.com>
Cc: Guoye Zhang <guoye_zhang@apple.com>, HTTP Working Group <ietf-http-wg@w3.org>
Archived-At: <https://www.w3.org/mid/Yq/VYMe+VUDdnY1c@xps13>

On Sun, Jun 19, 2022 at 04:52:58PM +0100, Lucas Pardue wrote:
> Hi Glenn,
> 
> I'd like to respond to one aspect directly:
> 
> On Sun, Jun 19, 2022 at 7:04 AM <gs-lists-ietf-http-wg@gluelogic.com> wrote:
> 
> > On Thu, Jun 16, 2022 at 02:30:59PM -0700, Guoye Zhang wrote:
> > > Our previous resumable upload draft generated a lot of discussions.
> >
> > At least in my case, I attempted to be polite after you submitted a
> > draft without first doing a survey of existing RFCs.  You admitted no
> > knowledge of WebDAV RFCs, which I deemed a large oversight considering
> > the nature of the tus-v2 protocol.
> >
> > > I’m glad to announce that we have a new draft ready to address many
> > > feedbacks that suggested adopting the PATCH method.
> >
> > The draft abstract begins with unsubstantiated claims to justify itself,
> > and I believe that almost all of those claims are also misleading.
> >
> > "HTTP clients often encounter interrupted data transfers as a result of
> > canceled requests or dropped connections. [...] it is often desirable to
> > issue subsequent requests that transfer only the remainder of the
> > representation."
> >
> > The multiple uses of "often" are misrepresentations, IMHO.
> >
> > A large percentage of HTTP requests are GET/HEAD and have no body.
> > A sizable percentage (if not more) of HTTP POST requests are small,
> > e.g. using POST as an alternative to GET along with XSRF tokens.
> >
> > What data do you have to support the claims in the draft Abstract?
> > What percentage of requests have request bodies, and further have
> > request bodies that are sufficiently large that it is excessively
> > wasteful to resend the entire representation? (and when safe to do so!)
> >
> 
> The text you quote from the tus v2 draft is based on an almost verbatim
> section of text from RFC 9110 Section 14 [1], which describes Range
> Requests. Quote:
> 
> > Clients often encounter interrupted data transfers as a result of
> > canceled requests or dropped connections. When a client has stored a
> > partial representation, it is desirable to request the remainder of that
> > representation in a subsequent request rather than transfer the entire
> > representation.

In RFC 9110, that paragraph *immediately* follows the heading "Range
Requests".  In that context, "Clients often encounter interrupted data
transfers" implicitly refers to downloads, as described in the section.

On the other hand, when quoted in the Abstract of the tus draft titled
"tus - Resumable Uploads Protocol", I mistook that to refer to uploads.
Hence, I found the statements misleading, since "interrupted data
transfers" is ambiguous and could refer to downloads or uploads.

> I wrote the tus v2 text to explicitly juxtapose resumable HTTP downloads
> against uploads. To quote the abstract in full:
> 
> > HTTP clients often encounter interrupted data transfers as a result
> > of canceled requests or dropped connections.  Prior to interruption,
> > part of a representation may have been exchanged.  To complete the
> > data transfer of the entire representation, it is often desirable to
> > issue subsequent requests that transfer only the remainder of the
> > representation.  HTTP range requests support this concept of
> > resumable downloads from server to client.  This document describes a
> > mechanism that supports resumable uploads from client to server using
> > HTTP.

When a client uses an idempotent HTTP request method, such as GET, then
the client may repeat a disconnected request, and may attempt to utilize
a Range request.

When a client is providing a request body upload with a non-idempotent
method, such as POST or PATCH, then the request is not necessarily safe to
repeat.  Appropriate behavior to attempt to recover may be
application-specific.
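
To make the distinction concrete, here is a minimal sketch in Python of
how a generic client might decide what to do after a dropped connection.
The helper name recovery_plan and its return shape are hypothetical and
exist only for illustration. The set of idempotent methods is taken from
RFC 9110 Section 9.2.2, and the Range header form "bytes=N-" is the
open-ended byte range from RFC 9110 Section 14.

```python
# Idempotent methods as listed in RFC 9110, Section 9.2.2.
IDEMPOTENT_METHODS = frozenset(
    {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}
)


def recovery_plan(method, bytes_received=0):
    """Hypothetical helper: suggest a recovery step after a disconnect.

    Returns a dict with an "action" and any extra request "headers".
    """
    method = method.upper()

    # Non-idempotent methods (e.g. POST, PATCH): a generic client cannot
    # know whether repeating the request is harmless, so defer to the
    # application.
    if method not in IDEMPOTENT_METHODS:
        return {"action": "application-specific", "headers": {}}

    headers = {}
    # For an interrupted GET, ask only for the bytes not yet received.
    # Bytes 0..bytes_received-1 are already stored locally.
    if method == "GET" and bytes_received > 0:
        headers["Range"] = f"bytes={bytes_received}-"

    return {"action": "retry", "headers": headers}
```

For example, recovery_plan("GET", 1024) suggests a retry with a Range
header starting at byte 1024, while recovery_plan("POST") only reports
that recovery is application-specific.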

I want to emphasize: **application-specific**

Hence, I still think that Abstract is misleading and an inaccurate
attempt to convey equivalence between client download using an
idempotent method, and client upload using a non-idempotent method.
They are not equivalent; there are very important distinctions that
should be explicitly stated before trying to describe similarities.

> A different reading is that,
> having seen multiple HTTP-based custom resumable upload approaches from
> different vendors, there is an unaddressed use case.

I do agree with the sentiment, just not the proposed solution.

> Hence it is not seldom
> that for applications that feature uploads, resumption is desirable.

Again:
upload recovery is **application-specific** with a non-idempotent method
(automatic retrying by a client is not generically or universally safe)

"tus - Resumable Uploads Protocol" is application-specific as I read it.
I think the draft should proceed to describe an application-specific
protocol and should severely scale back its attempt to generically
extend HTTP for resumable uploads.

Cheers, Glenn