Re: [httpapi] Feedback on draft-ietf-httpapi-patch-byterange-00

Hi Marius,

The technique you’re proposing seems quite similar to the Partial PUT technique already in HTTP (RFC 9110, Section 14.5) <https://www.rfc-editor.org/rfc/rfc9110.html#name-partial-put>. The major difference is that a message/byterange PATCH would still trigger an error (415 Unsupported Media Type) if the server doesn’t support the feature, which is the key reason to use message/byterange (instead of Content-Range in PUT).

However, in resumable uploads, I’m not sure that this protection is necessary. If the server returns a URL that represents the contents being uploaded, the only reason a client would ever make a PATCH or PUT request to that resource is to resume the request. It’s unlikely the server would ignore the Content-Range headers in the PUT, or otherwise misunderstand the intent.

Given this, I’d like to suggest that Partial PUT may be a better solution for you. My objective with “Byte range PATCH” isn’t the message/byterange media type itself, but rather to establish that the Content-Range field has a precise and valuable meaning beyond 206 responses: its usage in requests and responses is exactly symmetrical. You can make a Range request and get back a 206 response, then use the Content-Range from that response in a PUT or PATCH request to overwrite exactly the same bytes.
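To make the symmetry concrete, here is a minimal Python sketch, not taken from the draft or from any implementation. The function names (parse_content_range, read_range, write_range) are hypothetical, the resource is modeled as an in-memory byte buffer, and only the "bytes first-last/length" form of Content-Range is handled:

```python
import re

# "bytes first-last/complete-length" form of Content-Range (RFC 9110, Section 14.4).
# The unsatisfied-range form ("bytes */600") is deliberately not handled in this sketch.
CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def parse_content_range(value):
    """Return (first, last, total); total is None when the length is '*'."""
    m = CONTENT_RANGE.fullmatch(value)
    if m is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(m.group(1)), int(m.group(2))
    total = None if m.group(3) == "*" else int(m.group(3))
    if last < first or (total is not None and last >= total):
        raise ValueError("inconsistent Content-Range: %r" % value)
    return first, last, total


def read_range(resource, first, last):
    """What a 206 response to 'Range: bytes=first-last' would carry."""
    content_range = "bytes %d-%d/%d" % (first, last, len(resource))
    return content_range, bytes(resource[first:last + 1])


def write_range(resource, content_range, body):
    """Apply a PUT/PATCH carrying the same Content-Range to a bytearray."""
    first, last, _total = parse_content_range(content_range)
    if len(body) != last - first + 1:
        raise ValueError("body length does not match Content-Range")
    if first > len(resource):
        # Assumption: refuse to create a gap; a real server might zero-fill instead.
        raise ValueError("range starts past the end of the resource")
    resource[first:last + 1] = body
```

Feeding the Content-Range and body returned by read_range straight into write_range on another copy of the resource overwrites exactly the bytes that were read.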

---

Now, that said, I think “message/byterange” is still the best way to go (or a binary equivalent, see below). Being the theoretically pure solution, it’s likely to be more broadly useful.

First, encapsulating all patch semantics in the payload makes things more consistent. Putting parameters in the HTTP headers would rigidly bind the semantics of byte range patches to PATCH requests. While Byte Range PATCH is designed for PATCH requests, the message/byterange media type ought to be usable elsewhere, like in GET responses. For example, if I ask for the diff of a resource compared to its parent, the server might respond with a message/byterange document. I can re-play this change by uploading it in a PATCH request. In contrast, placing “Content-Range” in the GET response headers here would conflict with 206 partial content responses.

Furthermore, PATCH seems to imply the request body is completely self-descriptive. Some clients may be written assuming the HTTP headers don’t matter. While it may be tempting to utilize HTTP fields for free data storage, some software may restrict access to these headers. (Web browsers come to mind.)

So even if you decide it’s not practical to place patch parameters in the request body, there’s still value in having a media type that does it this way.

However, there are still some benefits to using a binary patch format that you should consider. It would let clients communicate information about the resumed request as it is streaming, for example “I don’t know the total length of the upload, but here’s some bytes continuing from byte 1000… (2000 bytes later) And now I know the upload is only 6000 bytes total; here’s the last of them… (3000 more bytes...) upload complete.” This would require multiple HTTP requests otherwise.

Finally, compared to the other possible complications, I think parsing the syntax is one of the simpler parts of accepting an uploaded patch, or at least message/byterange is. While I realize many here in HTTP land tend to dislike regular expressions, in this circumstance they are very useful, since they can precisely implement the syntax in just a few lines of code, for example: https://github.com/awwright/http-progress/blob/3c95528c56b3fe49041321ae21a4d3a1923ac501/demo-patch-byterange-server/httpd.js#L155-L184
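As a rough illustration of that point, here is a small Python sketch of splitting a message/byterange payload into its fields and its content with one regular expression. This is not the code from the linked httpd.js; the function name parse_message_byterange is hypothetical, and the sketch assumes CRLF line endings and keeps only the last value when a field is repeated:

```python
import re

# One field line: a token name, a colon, optional whitespace, the value, CRLF.
FIELD_LINE = re.compile(
    rb"([!#$%&'*+\-.^_`|~0-9A-Za-z]+):[ \t]*(.*?)[ \t]*\r\n"
)


def parse_message_byterange(payload):
    """Split a payload into (fields, content).

    Field names are lowercased. Raises ValueError if a field line is
    malformed or the blank line ending the field section is missing.
    """
    fields = {}
    pos = 0
    # The field section ends at the first empty line.
    while not payload.startswith(b"\r\n", pos):
        m = FIELD_LINE.match(payload, pos)
        if m is None:
            raise ValueError("malformed field line at offset %d" % pos)
        name = m.group(1).decode("ascii").lower()
        fields[name] = m.group(2).decode("latin-1")
        pos = m.end()
    return fields, payload[pos + 2:]
```

For the example request in the draft, this yields a content-range of "bytes 100-299/600", a content-type of "text/plain", and the 200 bytes of content.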

I think most of the complexity will be in handling conditions like running out of disk space/resource management, multiple Content-Range headers, interrupted request payloads, etc.

---

To summarize, I think there are genuinely different needs with varying solutions, and PATCH supports the widest variety of uses, though PUT may be simpler for some people. The most important thing is that we use the Content-Range field consistently.

So far I’ve been assuming this is for resumable requests. Do you have a specific example? I’d like to brainstorm a few different solutions and contrast them.

Thanks,

Austin.

> On Oct 23, 2023, at 02:57, Marius Kleidl <marius=40transloadit.com@dmarc.ietf.org> wrote:
> 
> Dear Austin, dear WG,
> 
> I am in favor of having a standardized approach for PATCH requests targeted at specific offsets. However, the mechanism described in this draft appears a bit too complicated to me. By defining a new media type, the draft can put the target range into the request content instead of a header field. For example, using the message/byterange media type (copied from the draft):
> 
> PATCH /uploads/foo HTTP/1.1
> Content-Type: message/byterange
> Content-Length: 272
> 
> Content-Range: bytes 100-299/600
> Content-Type: text/plain
> 
> [200 bytes...]
> 
> This makes the client and server logic more complex and error-prone, as the client has to encode the target range into the request content and the server has to perform additional parsing to separate the target range from the actual chunk of data. I would be keen on having a simpler approach where the request content is the chunk of data to be applied. The target range and target content type can be placed into the header fields. Thinking out loud, such a request might look like the following:
> 
> PATCH /uploads/foo HTTP/1.1
> Content-Type: message/byterange; target=text/plain
> Content-Length: 200
> Content-Range: bytes 100-299/600
> 
> [200 bytes...]
> 
> Clients can then set the chunk as the request content, and servers can use the content without any additional parsing being necessary. Note that this approach does not invalidate the use case for multipart/byteranges. multipart/byteranges and a simplified version of message/byterange can exist next to each other, serving different purposes.
> 
> Best regards
> Marius Kleidl