Re: Digest: use in requests

Lucas Pardue <lucaspardue.24.7@gmail.com> Tue, 29 December 2020 12:38 UTC

From: Lucas Pardue <lucaspardue.24.7@gmail.com>
Date: Tue, 29 Dec 2020 12:35:40 +0000
Message-ID: <CALGR9oZJLwM2VZpJmxvAuomvhfc6XkFZwSucHO3_uhSWVy35Gw@mail.gmail.com>
To: Julian Reschke <julian.reschke@gmx.de>
Cc: HTTP Working Group <ietf-http-wg@w3.org>
Subject: Re: Digest: use in requests
Archived-At: <https://www.w3.org/mid/CALGR9oZJLwM2VZpJmxvAuomvhfc6XkFZwSucHO3_uhSWVy35Gw@mail.gmail.com>

I missed one, and would not be surprised if there are more:

* Microsoft OneDrive [1]

"To upload the file, or a portion of the file, your app makes a PUT request
to the *uploadUrl* value received in the *createUploadSession* response.
You can upload the entire file, or split the file into multiple byte
ranges, as long as the maximum bytes in any given request is less than 60
MiB.

The fragments of the file must be uploaded sequentially in order. Uploading
fragments out of order will result in an error."
Example:
PUT https://sn3302.up.1drv.com/up/fe6987415ace7X4e1eF866337
Content-Length: 26
Content-Range: bytes 0-25/128
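
As a rough illustration of the sequential fragments described above, here is a
minimal Python sketch that derives the Content-Range value for each fragment.
The function name content_ranges and the 26-byte fragment size are my own
assumptions, chosen only to match the 0-25/128 example; none of this is taken
from the OneDrive documentation.

```python
def content_ranges(total_size, fragment_size):
    """Yield (first_byte, last_byte, header_value) for each fragment, in order."""
    if total_size <= 0 or fragment_size <= 0:
        raise ValueError("sizes must be positive")
    first = 0
    while first < total_size:
        # The last fragment may be shorter than fragment_size.
        last = min(first + fragment_size, total_size) - 1
        yield first, last, f"bytes {first}-{last}/{total_size}"
        first = last + 1


# Hypothetical 128-byte file split into fragments of at most 26 bytes.
fragments = list(content_ranges(128, 26))
for first, last, header in fragments:
    print(f"Content-Length: {last - first + 1}  Content-Range: {header}")
```

For the 128-byte example this yields five fragments, the first being
"bytes 0-25/128" and the last "bytes 104-127/128".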

[1] -
https://docs.microsoft.com/en-us/onedrive/developer/rest-api/api/driveitem_createuploadsession?view=odsp-graph-online#example
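
For the per-chunk integrity check discussed in the quoted thread below, here
is a minimal Python sketch of one possible reading: each chunk carries a
Digest computed over that chunk's bytes only, and the digest of the complete
resource travels in a separate custom field. Whether Digest should mean this
in a request is exactly the open question, so treat it as an assumption. The
names chunk_requests and verify_chunk, and the Example-Upload-Digest field,
are hypothetical. The "sha-256=<base64>" value format follows the existing
RFC 3230-style Digest syntax.

```python
import base64
import hashlib


def digest_value(data: bytes) -> str:
    """Format a SHA-256 hash as 'sha-256=<base64>'."""
    return "sha-256=" + base64.b64encode(hashlib.sha256(data).digest()).decode("ascii")


def chunk_requests(payload: bytes, chunk_size: int):
    """Yield (chunk, headers) pairs for an upload split into chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = len(payload)
    for first in range(0, total, chunk_size):
        chunk = payload[first:first + chunk_size]
        last = first + len(chunk) - 1
        yield chunk, {
            "Content-Length": str(len(chunk)),
            "Content-Range": f"bytes {first}-{last}/{total}",
            # Assumption: Digest covers only the bytes sent in this request.
            "Digest": digest_value(chunk),
        }


def verify_chunk(chunk: bytes, headers: dict) -> bool:
    """Check a generic library could run without knowing the upload protocol."""
    return digest_value(chunk) == headers["Digest"]


payload = bytes(range(128))
chunks = list(chunk_requests(payload, 26))
# Hypothetical custom field carrying the digest of the complete resource.
final_headers = {"Example-Upload-Digest": digest_value(payload)}
print(all(verify_chunk(chunk, headers) for chunk, headers in chunks))
```

The point of the split is that verify_chunk needs nothing beyond the request
itself, while checking the assembled resource stays with the upload protocol.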

On Tue, Dec 29, 2020 at 12:17 PM Lucas Pardue <lucaspardue.24.7@gmail.com>
wrote:

> Hi Julian,
>
> Just adding my 2c as responses in-line:
>
> On Tue, Dec 29, 2020 at 10:28 AM Julian Reschke <julian.reschke@gmx.de>
> wrote:
>
>> Hm, that seems like an odd choice for a protocol spec. If the spec
>> doesn't say what the Digest means for any request, it's not really
>> defining a protocol.
>>
>> I would *hope* that we can define things so that Digests can be
>> automatically produced and checked by user agents (browsers) and servers
>> (such as a servlet container).
>>
>
> FWIW, subresource integrity (SRI) is implemented in browsers. The
> specifics are different: the hash applies to the identity encoding, so UAs
> need to reverse any content encoding before validation. The fundamentals
> carry over, so it should be possible, but I've not seen any signals that
> browsers are interested in automatic Digest validation (yet?).
>
>
>> > Reading
>> > https://httpwg.org/http-core/draft-ietf-httpbis-semantics-latest.html#rfc.section.6.4.1.p.1
>> > ```The purpose of a payload in a request is defined by the method semantics```
>> > iiuc the receiver, aware of the request semantic, knows its purpose
>> > and how to process it, including whether it conveys a partial
>> > representation or not.
>>
>> But "partial representation" is a term defined by HTTP; there is (or
>> should be) an algorithm that - when inspecting *any* HTTP message -
>> tells you whether it's "partial" or not. In HTTP, this is defined by the
>> appearance of "Content-Range" for some specific response status codes.
>>
>
>> <snip>
>>
>> It *really* would be good to discuss something *concrete* here.
>>
>> Let's consider an upload protocol that sends multiple chunks, and then
>> lets the server combine these into the final resource.
>>
>> In that protocol, Digest on each chunk would be used to check the
>> integrity of each chunk.
>>
>> For the final step of creating the final full resource, the client could
>> send the expected digest of the final resource in a *custom* field
>> defined for the upload protocol (it would use the same algorithms etc,
>> but use a different way to convey it to the server).
>>
>> With that, generic libraries could at least verify Digests on each of
>> the chunks.
>>
>
> This is indeed the most likely use case. A very quick survey indicates
> that there seem to be some examples of PUT requests with Content-Range in
> the wild. I have no experience with these, nor knowledge of how popular
> they actually are.
>
> * Amazon S3 Glacier [1]
>
> "This multipart upload operation uploads a part of an archive. You can
> upload archive parts in any order because in your Upload Part request you
> specify the range of bytes in the assembled archive that will be uploaded
> in this part."
>
> Example:
> PUT /AccountId/vaults/VaultName/multipart-uploads/uploadID HTTP/1.1
> Host: glacier.Region.amazonaws.com
> Date: Date
> Authorization: SignatureValue
> Content-Range: ContentRange
> Content-Length: PayloadSize
> Content-Type: application/octet-stream
> x-amz-sha256-tree-hash: Checksum of the part
> x-amz-content-sha256: Checksum of the entire payload
> x-amz-glacier-version: 2012-06-01
>
> * Google Drive [2]
>
> "Upload the content in multiple chunks. Use this approach if you need to
> reduce the amount of data transferred in any single request. You might need
> to reduce data transferred when there is a fixed time limit for individual
> requests, as can be the case for certain classes of Google App Engine
> requests."
>
> "Add these HTTP headers:
>     Content-Length. Set to the number of bytes in the current chunk.
>     Content-Range. Set to show which bytes in the file you upload. For
> example, Content-Range: bytes 0-524287/2000000 shows that you upload the
> first 524,288 bytes (256 x 1024 x 2) in a 2,000,000 byte file."
>
> * Google Cloud Storage [3]
>
> "This page describes how to make a resumable upload request in the Cloud
> Storage JSON and XML APIs. This protocol allows you to resume an upload
> operation after a communication failure interrupts the flow of data."
>
> Example:
> curl -i -X PUT --data-binary @CHUNK_LOCATION \
>     -H "Content-Length: CHUNK_SIZE" \
>     -H "Content-Range: bytes CHUNK_FIRST_BYTE-CHUNK_LAST_BYTE/TOTAL_OBJECT_SIZE" \
>     "SESSION_URI"
>
> * draft-wright-http-partial-upload-01 (expired) [4]
>
> "This document specifies a new media type intended for use in PATCH
>    payloads that allows a resource to be uploaded in several segments,
>    instead of a single large request."
>
> Example:
> PATCH /uploads/foo HTTP/1.1
>    Content-Type: message/byterange
>    Content-Length: 283
>    If-Match: "xyzzy"
>    If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT
>
>    Content-Range: bytes 100-299/600
>    Content-Type: text/plain
>    Content-Length: 200
>
> Finally, Dropbox [5] does things a little differently and uses the
> Dropbox-API-Arg JSON header field to communicate a cursor containing an
> offset of the bytes uploaded so far (which I guess means that parallel
> transfers aren't supported).
>
> Example:
> curl -X POST https://content.dropboxapi.com/2/files/upload_session/append_v2 \
>     --header "Authorization: Bearer" \
>     --header "Dropbox-API-Arg: {\"cursor\": {\"session_id\": \"1234faaf0678bcde\",\"offset\": 0},\"close\": false}" \
>     --header "Content-Type: application/octet-stream" \
>     --data-binary @local_file.txt
>
> To conclude, I'm not exactly sure how these examples influence the
> discussion. It seems that there are concrete cases of "partial
> requests", but it's unclear to me whether these break HTTP semantic rules
> and/or whether they should be documented more formally. The examples I've
> seen are for APIs that also have their own custom means for integrity
> checks, or still use Content-MD5. It would be nice if something like Digest
> covered all avenues and we could get folks to switch to it, but I've not
> seen any signals that such APIs are interested in Digest. Therefore, I'm
> wary of Digest taking on too much work to describe something without any
> implementer interest. In the interest of progress, if partial requests for
> uploads are something people think needs standardising, I think that could
> be done as an independent follow-on work item, e.g. a document that updates
> Digest.
>
> Cheers
> Lucas
>
> [1] -
> https://docs.aws.amazon.com/amazonglacier/latest/dev/api-upload-part.html
> [2] -
> https://developers.google.com/drive/api/v3/manage-uploads#http---multiple-requests
> [3] - https://cloud.google.com/storage/docs/performing-resumable-uploads
> [4] - https://tools.ietf.org/html/draft-wright-http-partial-upload-01
> [5] -
> https://www.dropbox.com/developers/documentation/http/documentation#files-upload_session-append:2
>
>