Re: [httpapi] Byte range PATCH

Austin William Wright <aaa@bzfx.net> Sun, 19 February 2023 05:18 UTC

From: Austin William Wright <aaa@bzfx.net>
Date: Sat, 18 Feb 2023 22:18:24 -0700
In-Reply-To: <DM6PR01MB59643D37BCB22B0EC5863253A3D59@DM6PR01MB5964.prod.exchangelabs.com>
Cc: HTTP APIs Working Group <httpapi@ietf.org>
To: Darrel Miller <darrel@tavis.ca>
Archived-At: <https://mailarchive.ietf.org/arch/msg/httpapi/o1kv6W6hyimIiNCAJdjSEpAjMVE>
Subject: Re: [httpapi] Byte range PATCH
List-Id: Building Blocks for HTTP APIs <httpapi.ietf.org>


> On Feb 5, 2023, at 08:47, Darrel Miller <darrel@tavis.ca> wrote:
> 
> Austin,
>  
> Thank you for sharing your proposal.  It is definitely the case that API providers do need this ability to perform partial updates to a resource.
>  
> However, it was my assumption that the updates to RFC 9110 that explicitly call out the potential for doing partial PUT https://www.rfc-editor.org/rfc/rfc9110.html#name-partial-put using the Content-Range header were the acknowledgement that this is what folks are doing and it’s OK to do it.
>  
> There are several assertions in your draft that suggest you don’t agree with this.
>  
> “this technique cannot generally be used in PUT, because the server may ignore the Content-Range header”
>  
> “However, if an upload is interrupted, no mechanism exists to upload only the remaining data; the entire request must be retried.”
>  
> “Although the Content-Range field cannot be used in the request headers without risking data corruption...”

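Yes, I didn’t directly acknowledge RFC 9110. Although it mentions partial PUT, Content-Range still cannot be used with servers not known to support it: a server that ignores the header would treat the partial content as the complete representation, corrupting the resource. The spec says this outright: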

> though such support is inconsistent and depends on private agreements with user agents

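In contrast, a patch media type wouldn't depend on any prior knowledge; it could safely be used opportunistically (i.e. even if server support is unknown), since a server that doesn't recognize the media type would reject the request rather than misapply it.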
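
As a rough illustration, here is a toy Python model of the two failure modes (a sketch, not real server code). Both handler functions and the media type name `application/example-byte-patch` are made up for this example; the 415 response assumes the server follows the usual PATCH error handling for unsupported patch documents.

```python
# Toy model of two servers that lack byte-range write support.
# Both handlers and the media type name are hypothetical.

def put_ignoring_content_range(resource: bytes, headers: dict, body: bytes):
    """Server that does not know Content-Range on PUT.

    Unknown headers are ignored, so the partial body replaces
    the entire stored representation.
    """
    return 200, body


def patch_unknown_media_type(resource: bytes, headers: dict, body: bytes):
    """Server that does not understand the patch media type.

    It refuses the request (415 Unsupported Media Type) and
    leaves the stored representation untouched.
    """
    return 415, resource


original = b"0123456789"
tail = b"XYZ"  # the client only wants to overwrite bytes 7-9

status_put, after_put = put_ignoring_content_range(
    original, {"Content-Range": "bytes 7-9/10"}, tail
)
status_patch, after_patch = patch_unknown_media_type(
    original, {"Content-Type": "application/example-byte-patch"}, tail
)

print(status_put, after_put)      # success status, but the resource is now just the tail
print(status_patch, after_patch)  # error status, resource unchanged
```

In the PUT case the client sees a success status and has no signal that the resource was truncated; in the PATCH case it gets an error and can fall back to a full upload.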

> In section 14.5 there is guidance that a server SHOULD respond with a 400 if Content-Range is received but not supported.  If this were a MUST, would that mitigate the concerns you raised?

Unfortunately, clients wouldn’t be able to rely on this behavior, as it would be incompatible with existing HTTP servers. And it seems unlikely that new HTTP servers (Node.js applications, etc.) would honor this requirement.

Also, it doesn’t make sense to me that specific features like this should be anticipated in the HTTP core semantics. This would introduce an exception to the “unknown headers are ignored” guideline of HTTP extensibility. Byte range patching should be an optional feature that servers can implement or ignore. And of the major extension points offered in HTTP (a header, method, or content-type), a new method or content-type is the correct option.

> Regarding your proposal, I do think there is value to being able to provide multiple parts to the content of the PATCH request.  I could imagine certain sync scenarios where that would be an efficient way of  communicating changes.

Yeah, an important use case for me is working with media files, where you need to append data and also update an index or header, all in one operation.
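
A minimal Python sketch of that kind of two-part write, using ordinary seek/write calls. The file layout (a 4-byte little-endian length field followed by the payload) and the function name `apply_byte_ranges` are invented for illustration; they are not taken from the draft or from any real media format.

```python
import os
import struct
import tempfile


def apply_byte_ranges(path: str, parts: list) -> None:
    """Write each (offset, data) part at its offset.

    A part whose offset equals the current file size is an append.
    """
    with open(path, "r+b") as f:
        for offset, data in parts:
            f.seek(offset)
            f.write(data)


# Hypothetical toy format: 4-byte little-endian payload length, then payload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("<I", 4) + b"AAAA")
    path = tmp.name

old_size = os.path.getsize(path)  # 4-byte header + 4-byte payload
apply_byte_ranges(path, [
    (old_size, b"BBBB"),          # append new payload at the end
    (0, struct.pack("<I", 8)),    # rewrite the length field at the start
])

with open(path, "rb") as f:
    result = f.read()
os.remove(path)
```

This sketch makes no attempt at atomicity; a server applying both parts as one operation would presumably need to ensure readers never observe the appended data with the stale header.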

> I’m not sure I understand the value of having a Content-Type as part of the header. I would have thought that patching would happen assuming the target is effectively application/octet-stream.  It would seem problematic to suggest doing a byte range update to a content-type that is text/plain when the charset is either not known or unspecified.  I don’t see how any kind of media type semantics would impact how the patch is performed.  I am only considering the case where range units are bytes.

For existing resources, Content-Type is not meaningful and would probably be omitted. However, when creating new resources, there has to be a substitute for the Content-Type header in a PUT request with Content-Range.

Given this request:

PUT /file.mp3 HTTP/1.1
If-None-Match: *
Content-Range: bytes 0-40/*
Content-Type: image/png

How would you perform an equivalent PATCH request that creates the resource with the correct content type? The content-type of the resource has to be included inside the "envelope" of the patch.
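
To show the shape of the idea, here is a Python sketch that builds such a PATCH request. Everything about the envelope here is an assumption for illustration: the media type name `application/example-byte-patch`, the header-block-then-bytes layout of the body, the `example.com` host, and the function name are placeholders, not the syntax defined in the draft.

```python
# Hypothetical placeholder; the real media type is defined in the draft.
PATCH_MEDIA_TYPE = "application/example-byte-patch"


def build_patch_request(host: str, target: str, resource_type: str,
                        first: int, data: bytes) -> bytes:
    """Build a PATCH request whose body carries its own fields.

    The outer Content-Type describes the patch document; the inner
    Content-Type and Content-Range describe the target resource and
    the bytes being written.
    """
    last = first + len(data) - 1
    envelope = (
        f"Content-Range: bytes {first}-{last}/*\r\n"
        f"Content-Type: {resource_type}\r\n"
        "\r\n"
    ).encode("ascii") + data
    head = (
        f"PATCH {target} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "If-None-Match: *\r\n"
        f"Content-Type: {PATCH_MEDIA_TYPE}\r\n"
        f"Content-Length: {len(envelope)}\r\n"
        "\r\n"
    ).encode("ascii")
    return head + envelope


# Same target, range, and resource type as the PUT example; 41 zero bytes as stand-in data.
request = build_patch_request("example.com", "/file.mp3", "image/png", 0, bytes(41))
print(request.decode("latin-1"))
```

The outer Content-Type tells the server how to interpret the patch, so the type of the resource being created can only travel inside the patch body.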

> My only other thought here is, are we reinventing things like VCDIFF https://www.rfc-editor.org/rfc/rfc3284 at this point?

Byte range patch is simpler and maps directly to standard filesystem operations, and so is more likely to be adopted. Even a small VCDIFF file could require a very large file to be entirely rewritten, and not all servers are interested in supporting this. The only thing the two really have in common is that they’re generic ways to patch an existing document (unlike JSON Patch, these patch formats are not interested in the meaning of the data being written).

That said, maybe there ought to be a VCDIFF media type, for patching an executable file or tarball, etc. I’m sure there’s a lengthy list of prior work that would have benefitted from a standard.

Thanks,

Austin.

>  
> Darrel
>  
>  
> From: Austin William Wright <mailto:aaa@bzfx.net>
> Sent: January 23, 2023 4:14 PM
> To: HTTP APIs Working Group <mailto:httpapi@ietf.org>
> Subject: [httpapi] Byte range PATCH
>  
> Hello HTTP APIs,
>  
> I’m seeking to standardize a media type for writing to particular byte offsets in a target document. This seems to be a common problem in HTTP applications, including in my own work with streaming parsers.
>  
> This sort of operation is common in filesystems. For example, you append audio data to a .wav file, and then you update the header at the start of the file. I don’t know of any way to do this operation over HTTP, except a PUT that replaces the entire resource.
>  
> This has been discussed in the HTTP WG list, as part of “resumable uploads” <https://httpwg.org/http-extensions/draft-ietf-httpbis-resumable-upload.html>, and I emailed there some months ago about this I-D. But aside from the resumable uploads being in HTTP WG, this seems more like an API-related feature.
>  
> Here’s the I-D I’ve written: draft-wright-http-patch-byterange-01 <https://www.ietf.org/archive/id/draft-wright-http-patch-byterange-01.html>
>  
> Editor’s copy: https://awwright.github.io/http-progress/draft-wright-http-patch-byterange.html
>  
> This is the successor to version 00; in response to some feedback, I added a binary format. And I discuss what is necessary to support “splicing” (an insertion that shifts other content around) and indeterminate-length uploads, which is a requested feature for the “resumable uploads” spec.
>  
> Is HTTP APIs the best place to pursue this? I’m hoping I can discuss this at IETF 116.
>  
> Thanks,
>  
> Austin Wright.