Re: [Ppm] Httpdir early review of draft-ietf-ppm-dap-09

Hi everyone!
Thanks for your comments, Mark!

Mark, do you have an opinion on what should happen in cases like our report
upload? POST doesn't really fit because it doesn't convey the idempotency
and PUT doesn't really fit because it is not expected that you can ever GET
the report back.

Regarding:

> - In section 4.3, it might be good to reference RFC 6570.
>
I agree with Tim that I would rather not see any changes made here since
the power of RFC 6570 is not really needed.

And regarding:

> Using prefixes is better than static URLs. Ideally, the client would fetch
> something that contains the links (or templates for them) to allow complete
> flexibility, much as HTML does for the Web. Have you considered that
> approach?

I would prefer if we stick to only allowing prefixes to keep things simple.
Since we already have some OOB agreements between the involved parties the
definition of the paths could be removed from the spec and also declared
OOB but I don't see the benefit.

BR
Simon

On Tue, Jan 9, 2024 at 11:27 PM Tim Geoghegan <timgeog@gmail.com> wrote:

> As promised/threatened, some additional PRs that came out of this
> discussion:
>
> require error responses to be consistent -
> https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/pull/542
> Hoist report ID into upload request path -
> https://github.com/ietf-wg-ppm/draft-ietf-ppm-dap/pull/543
>
> Tim
>
> On Tue, Jan 9, 2024 at 8:40 AM Tim Geoghegan <timgeog+ietf@gmail.com>
> wrote:
>
>> Thanks again for sharing your thoughts. I am working on some further DAP
>> changes to implement more of your recommendations and will follow up here
>> with pull requests links as I post them. Just a few more comments inline,
>> below.
>>
>> Tim
>>
>> On Sun, Jan 7, 2024 at 4:59 PM Mark Nottingham <mnot@mnot.net> wrote:
>>
>>> On 3 Jan 2024, at 8:06 am, Tim Geoghegan <timgeog+ietf@gmail.com> wrote:
>>> >
>>> > - 'Errors can be reported in DAP both at the HTTP layer and within
>>> challenge
>>> > objects as defined in Section 8.' It should be noted that for HTTP
>>> software to
>>> > work correctly, the appropriate status code must be used; indicating
>>> an error
>>> > in the object without doing so may lead to unintended
>>> retries/caching/etc.
>>> >
>>> > Good point. The other problem is that I don't think the reference to a
>>> "challenge object" makes any sense. What we mean is that a message in the
>>> response body might contain error information. So we need to fix that text.
>>> That aside, would it suffice to amend this section to say that if the
>>> object in the response body indicates an error, then the HTTP status MUST
>>> indicate either a client or server error?
>>>
>>> That's one way to do it, yes -- although that might be too constraining
>>> (e.g. when it's an error that doesn't have impact on / significance to HTTP
>>> components).
>>>
>>
>> I understand the distinction you're making between the HTTP/transport
>> layer and the DAP/application layer, but personally I dislike APIs that
>> indicate success at the transport layer but then an error in the message,
>> because inevitably clients only look at the HTTP status. For instance the
>> infamous AWS S3 CompleteMultipartUpload, which will give you a 200 OK but
>> then only in the response body will tell you that your object was not
>> actually committed to storage (
>> https://docs.aws.amazon.com/AmazonS3/latest/API/API_CompleteMultipartUpload.html).
>> This has burnt pretty much everyone who uses the S3 HTTP API. The official
>> AWS SDKs do the right thing for you, but if you're using those, I'd argue
>> you're not really interacting with the HTTP API but rather a
>> Python/Java/Go/whatever abstraction over it, with its own error
>> representation and handling concepts. But I digress...
>>
>>
>>> > - In section 4.3, it might be good to reference RFC 6570.
>>> >
>>> > Speaking as the individual who wrote the current text: I deliberately
>>> avoided using RFC 6570 here because its feature set is much, much richer
>>> than what DAP needs, and so I opted for the weaker, simpler templates that
>>> we have now.
>>> >
>>> > Speaking as a document editor/WG member: I have no doubt that the
>>> templates and expansions DAP uses can be expressed in RFC 6570 syntax, and
>>> will switch over to it if the WG consensus is to do so, but personally I
>>> don't see a compelling reason to go that way (I'm open to hearing arguments
>>> about this of course).
>>>
>>> AIUI these are just an editorial convention / explanatory device, so I
>>> don't feel strongly about this, but 6570 does designate "level 1" for such
>>> simple cases.
>>>
>>
>> Oh, that's neat, I hadn't realized it offered different feature sets.
>> I'll give RFC 6570 another read and see what adopting it for our URL
>> templates would look like.
>>
>> > - Section 4.4.1 and afterwards use static paths for subresources,
>>> hindering
>>> > flexibility and violating at least the spirit of BCP 190. If the
>>> configuration
>>> > were to allow specification of URI templates, it would be much more
>>> flexible.
>>> >
>>> > We tried to write the text to allow serving of the API relative to any
>>> prefix. So in a template like "{aggregator}/hpke_config" (4.4.1),
>>> "{aggregator}" can have arbitrary path components in it relative to the
>>> hostname. Did we express this incorrectly?
>>> >
>>> > Past that, are you objecting to DAP spelling out templates like
>>> "{aggregator}/hpke_config" or
>>> "{helper}/tasks/{task-id}/aggregation_jobs/{aggregation-job-id}"? Are you
>>> suggesting that a deployment should be able to use a template like
>>> "{helper}/foo/{task-id}/bar/{aggregation-job-id}" if it so wishes?
>>>
>>> Using prefixes is better than static URLs. Ideally, the client would
>>> fetch something that contains the links (or templates for them) to allow
>>> complete flexibility, much as HTML does for the Web. Have you considered
>>> that approach?
>>>
>>
>> I assume you're talking about something like the ACME directory (
>> https://datatracker.ietf.org/doc/html/rfc8555#section-7.1.1)? Yes, the
>> DAP authors are aware of that design. Personally I don't think it's right
>> for DAP. I may be revealing my inexperience as an API designer and server
>> operator but I don't see how the flexibility in request path layout is
>> worth the additional complexity of a directory endpoint. Additionally, in a
>> protocol like DAP, which will see far, far greater volume of traffic than
>> ACME, forcing an extra HTTP round trip for some report uploads -- to check
>> where HPKE configs are and where uploads should go -- seems wasteful.
>>
>> But of course that's my opinion as an individual WG member. If someone in
>> the WG wants to champion the idea of a directory and builds rough consensus
>> around the idea, then I'd support it.
>>
>>
>>> > - 4.4.2 uses PUT to create a report, but doesn't seem to make the
>>> uploaded
>>> > report available for later GETs at the same URL. If this stays the
>>> same (i.e.,
>>> > if the server is going to manage the URL of the report), it should be
>>> POST. It
>>> > should only be PUT if the client is effectively deciding what the URL
>>> should be
>>> > (by PUTting to a URL it can later GET). Also, if it can't be GET
>>> later, the
>>> > response status code should be 200, not 201. If it can and POST is
>>> used, it
>>> > should be 201 + a Location header pointing to where it was created.
>>> >
>>> > First, there is a unique identifier for a report that does get chosen
>>> by the client, but it only appears inside the request body (this is
>>> `Report.report_metadata.report_id`). We could hoist it up to the request
>>> path, yielding a template `{leader}/tasks/{task-id}/reports/{report-id}`,
>>> and then you'd have a unique URI for each report. I'm inclined to do this
>>> much no matter what, for consistency with the aggregation job and
>>> collection job resource paths.
>>>
>>> If you do that, PUT makes sense.
>>>
>>> > That said, I'm not sure if it makes sense for that to be a GETtable
>>> resource. DAP doesn't need GET on reports to function and more importantly,
>>> I worry about the implementation burden for servers. In many cases, the
>>> initial uploaded form of a report isn't useful after the first round of
>>> preparation. Servers may wish to discard per-report information
>>> aggressively to limit the growth of their databases, which makes it
>>> difficult to serve up anything meaningful in response to a GET request. So
>>> I'm reluctant to add a requirement or a MUST about GET on this resource.
>>>
>>> Understood, but I wonder how much of a burden it really is for a server
>>> to return some state that it has sitting there already.
>>>
>>> > I really like having this be a PUT request because of the implied
>>> idempotency semantics. What if we leave it as a PUT and change the path
>>> template to incorporate the unique report ID, but then don't say anything
>>> about GET requests (or at most put in a MAY)? I think then implementations
>>> would be free to either respond to GET requests with HTTP 200 and a `struct
>>> Report`, or with HTTP 204 or 410 if they happen to have chucked out the
>>> relevant database row.
>>>
>>> Yes, if the resource isn't available any more, 404 or 410 is entirely
>>> appropriate. PUT isn't a guarantee that the same bytes will always be
>>> returned -- or even returned on the next request. However, it should make
>>> sense, from a systemic view.
>>>
>>> > If that's inappropriate, then I think we can switch to a POST as you
>>> recommend, but what about idempotency? Would we just have to state in DAP
>>> that this particular POST is safe to retry?
>>>
>>> Well, this is under development and should ship soon: <
>>> https://datatracker.ietf.org/doc/draft-ietf-httpapi-idempotency-key-header/
>>> >
>>>
>>
>> In our case, we don't need a synthetic idempotency key because we already
>> have a natural one: the unique report ID. I meant that by switching to
>> POST, we'd lose the inherent implication that the request is safe to retry,
>> and I was asking how protocols should make it clear that a POST request is
>> idempotent. In any case I think this is moot, because we'll stick with PUT
>> for this request and move the unique identifier into the request path.
>>
>>
>>> > - Use of 201 in 4.6.1 seems incorrect if I understand correctly.
>>> >
>>> > The request in question is PUT
>>> {leader}/tasks/{task-id}/collection_jobs/{collection-job-id}, which is
>>> intended to create a collection job resource with the specified ID, in the
>>> specified task. RFC 9110 9.3.4 (
>>> https://www.rfc-editor.org/rfc/rfc9110.html#section-9.3.4-2) states:
>>> >
>>> > If the target resource does not have a current representation and the
>>> PUT successfully creates one, then the origin server MUST inform the user
>>> agent by sending a 201 (Created) response.
>>> > So 201 seems correct: on success, the collection job gets created. The
>>> same paragraph in 9110 discusses using 200 or 204 when successfully
>>> mutating an existing resource, but DAP explicitly forbids that in this
>>> request.
>>>
>>> Ah, sorry, I misread that.
>>>
>>> Cheers,
>>>
>>> --
>>> Mark Nottingham   https://www.mnot.net/
>>>
>>> --
> Ppm mailing list
> Ppm@ietf.org
> https://www.ietf.org/mailman/listinfo/ppm
>