Re: [httpapi] Feedback on draft-ietf-httpapi-patch-byterange-00

Austin William Wright <aaa@bzfx.net> Tue, 24 October 2023 06:25 UTC

From: Austin William Wright <aaa@bzfx.net>
Message-Id: <14D8781F-E68F-45BB-87D8-E00001311747@bzfx.net>
Content-Type: multipart/alternative; boundary="Apple-Mail=_5F6EE4C3-3289-4C28-8358-7D465666884A"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3731.700.6\))
Date: Mon, 23 Oct 2023 23:24:56 -0700
In-Reply-To: <CANY19Nsr-WEMEe6an7kta1Z+L+=6wMHwBrEfc2kAX4e4nE+X7w@mail.gmail.com>
Cc: HTTP APIs Working Group <httpapi@ietf.org>, Jonathan Flat <jflat@apple.com>, Guoye Zhang <guoye_zhang@apple.com>
To: Marius Kleidl <marius=40transloadit.com@dmarc.ietf.org>
References: <CANY19Nsr-WEMEe6an7kta1Z+L+=6wMHwBrEfc2kAX4e4nE+X7w@mail.gmail.com>
X-Mailer: Apple Mail (2.3731.700.6)
Archived-At: <https://mailarchive.ietf.org/arch/msg/httpapi/QBorgGSW9KHgzIAcnl00zZlbe18>
Subject: Re: [httpapi] Feedback on draft-ietf-httpapi-patch-byterange-00

Hi Marius,

The technique you’re proposing seems quite similar to the Partial PUT technique already in HTTP (RFC 9110: 14.5) <https://www.rfc-editor.org/rfc/rfc9110.html#name-partial-put>. The major difference is that your approach would still trigger an error (415 Unsupported Media Type) if the server doesn’t support the feature, which is the key reason to use message/byterange (instead of Content-Range in PUT).

However, in resumable uploads, I’m not sure that this protection is necessary. If the server returns a URL that represents the contents being uploaded, the only reason a client would ever make a PATCH or PUT request to that resource is to resume the request. It’s unlikely the server would ignore the Content-Range headers in the PUT, or otherwise misunderstand the intent.

Given this, I’d like to suggest that Partial PUT may be a better solution for you. My objective with “Byte range PATCH” isn’t the message/byterange media type itself, but rather that the Content-Range field has a precise and valuable meaning beyond 206 responses: its usage in requests and responses is exactly symmetrical. You can make a Range request and get back a 206 response, then use the Content-Range from that response in a PUT or PATCH request to overwrite exactly the same bytes.
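
As a rough illustration of that symmetry, here is a minimal Python sketch. It is not taken from the draft or from any implementation: the function names (parse_content_range, read_range, write_range) and the in-memory bytearray standing in for the resource are assumptions made for the example, and only the "bytes first-last/length" form of Content-Range is handled.

```python
import re

# Content-Range for a satisfied byte range: "bytes first-last/complete-length",
# where complete-length may be "*" if unknown. The unsatisfied form
# ("bytes */N") is deliberately not handled in this sketch.
CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def parse_content_range(value):
    """Return (first, last, total); total is None when given as '*'."""
    m = CONTENT_RANGE.match(value)
    if m is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last = int(m.group(1)), int(m.group(2))
    total = None if m.group(3) == "*" else int(m.group(3))
    if last < first or (total is not None and last >= total):
        raise ValueError("inconsistent Content-Range: %r" % value)
    return first, last, total


def read_range(resource, first, last):
    """What a 206 response would carry: (Content-Range value, bytes)."""
    content_range = f"bytes {first}-{last}/{len(resource)}"
    return content_range, bytes(resource[first:last + 1])


def write_range(resource, content_range, chunk):
    """Apply a partial PUT/PATCH: overwrite exactly the named bytes."""
    first, last, _total = parse_content_range(content_range)
    if len(chunk) != last - first + 1:
        raise ValueError("chunk length does not match Content-Range")
    # Assumption for this sketch: writes that would leave a gap are rejected.
    if first > len(resource):
        raise ValueError("range starts beyond the end of the resource")
    resource[first:last + 1] = chunk
```

The Content-Range value returned by read_range can be passed unchanged to write_range, along with replacement bytes of the same length, to overwrite exactly the bytes that were read.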

---

Now, that said, I think “message/byterange” is still the best way to go (or a binary equivalent, see below). Being the theoretically pure solution, it’s likely to be more broadly useful.

First, encapsulating all patch semantics in the payload makes things more consistent. Putting parameters in the HTTP headers would rigidly bind the semantics of byte range patches to PATCH requests. While Byte Range PATCH is designed for PATCH requests, the message/byterange media type ought to be usable elsewhere, like in GET responses. For example, if I ask for the diff of a resource compared to its parent, the server might respond with a message/byterange document. I can re-play this change by uploading it in a PATCH request. In contrast, placing “Content-Range” in the GET response headers here would conflict with 206 partial content responses.

Furthermore, PATCH seems to imply the request body is completely self-descriptive. Some clients may be written assuming the HTTP headers don’t matter. While it may be tempting to utilize HTTP fields for free data storage, some software may restrict access to these headers. (Web browsers come to mind.)

So even if you decide it’s not practical to place patch parameters in the request body, there’s still value in having a media type that does it this way.

However, there are still some benefits to using a binary patch format that you should consider. It would let clients communicate information about the resumed request as it is streaming, for example “I don’t know the total length of the upload, but here’s some bytes continuing from byte 1000… (2000 bytes later) And now I know the upload is only 6000 bytes total; here’s the last of them… (3000 more bytes...) upload complete.” This would require multiple HTTP requests otherwise.

Finally, compared to the other possible complications, I think parsing the syntax is one of the simpler parts of accepting an uploaded patch, or at least message/byterange is. While I realize many here in HTTP land tend to dislike regular expressions, in this circumstance they are very useful, since they can precisely implement the syntax in just a few lines of code, for example: https://github.com/awwright/http-progress/blob/3c95528c56b3fe49041321ae21a4d3a1923ac501/demo-patch-byterange-server/httpd.js#L155-L184
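
For readers who do not want to follow the link, the general idea can be sketched in Python. This is an independent, simplified sketch and not the code at the URL above; the function name parse_message_byterange is made up for the example. It assumes the message/byterange body is a series of CRLF-terminated field lines, then an empty line, then the bytes to write, as in the example quoted below.

```python
import re

# One field line: token ":" optional whitespace, value, optional whitespace, CRLF.
FIELD_LINE = re.compile(
    rb"([!#$%&'*+\-.^_`|~0-9A-Za-z]+):[ \t]*(.*?)[ \t]*\r\n"
)


def parse_message_byterange(body):
    """Split a message/byterange body into (fields, chunk).

    fields maps lowercased field names to their values; chunk is the
    remaining bytes after the blank line that ends the field section.
    """
    fields = {}
    pos = 0
    # The field section ends at the first empty line.
    while not body.startswith(b"\r\n", pos):
        m = FIELD_LINE.match(body, pos)
        if m is None:
            raise ValueError("malformed or truncated field section")
        name = m.group(1).decode("ascii").lower()
        if name in fields:
            # Assumption for this sketch: repeated fields are rejected.
            raise ValueError("duplicate field: " + name)
        fields[name] = m.group(2).decode("latin-1")
        pos = m.end()
    return fields, body[pos + 2:]
```

The Content-Range value in the returned fields still has to be parsed and checked against the length of the chunk before anything is written.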

I think most of the complexity will be in handling conditions like running out of disk space/resource management, multiple Content-Range headers, interrupted request payloads, etc.

---

To summarize, I think there are genuinely different needs with varying solutions, and that PATCH supports the widest variety of uses, though PUT may be simpler for some people. The most important thing is that we use the Content-Range field consistently.

So far I’ve been assuming this is for resumable requests. Do you have a specific example? I’d like to brainstorm a few different solutions and contrast them.

Thanks,

Austin.

> On Oct 23, 2023, at 02:57, Marius Kleidl <marius=40transloadit.com@dmarc.ietf.org> wrote:
> 
> Dear Austin, dear WG,
> 
> I am in favor of having a standardized approach for PATCH requests targeted at specific offsets. However, the mechanism described in this draft appears a bit too complicated to me. By defining a new media type, the draft can put the target range into the request content instead of a header field. For example, using the message/byterange media type (copied from the draft):
> 
> PATCH /uploads/foo HTTP/1.1
> Content-Type: message/byterange
> Content-Length: 272
> 
> Content-Range: bytes 100-299/600
> Content-Type: text/plain
> 
> [200 bytes...]
> 
> This makes the client and server logic more complex and error-prone, as the client has to encode the target range into the request content and the server has to perform additional parsing to separate the target range from the actual chunk of data. I would be keen on having a simpler approach where the request content is the chunk of data to be applied. The target range and target content type can be placed into the header fields. Thinking out loud, such a request might look like the following:
> 
> PATCH /uploads/foo HTTP/1.1
> Content-Type: message/byterange; target=text/plain
> Content-Length: 200
> Content-Range: bytes 100-299/600
> 
> [200 bytes...]
> 
> Clients can then set the chunk as the request content and the server can use the content without any additional parsing being necessary. Note that this approach does not invalidate the use case for multipart/byteranges. multipart/byteranges and a simplified version of message/byterange can exist next to each other, serving different purposes.
> 
> Best regards
> Marius Kleidl