Re: [alto] alto transport streams (Re: early reviews)

Mark Nottingham <mnot@mnot.net> Fri, 22 July 2022 03:00 UTC

From: Mark Nottingham <mnot@mnot.net>
In-Reply-To: <CANUuoLoGYYO3PczKzCyvnrNoqGUa0Zfn8rr-x+Hz2dzqLFBVww@mail.gmail.com>
Date: Fri, 22 Jul 2022 13:00:40 +1000
Cc: Spencer Dawkins <spencerdawkins.ietf@gmail.com>, Martin Duke <martin.h.duke@gmail.com>, IETF ALTO <alto@ietf.org>
Message-Id: <EF370A5A-1005-4F81-98F8-CB7117B9513F@mnot.net>
References: <CANUuoLoGYYO3PczKzCyvnrNoqGUa0Zfn8rr-x+Hz2dzqLFBVww@mail.gmail.com>
To: "Y. Richard Yang" <yry@cs.yale.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/HgmIGTygdDHeM0FCiheUAq1UcJM>
Subject: Re: [alto] alto transport streams (Re: early reviews)

Hi Richard,

I'm just going to give some impressions while reading this, FWIW.

> On 22 Jul 2022, at 6:20 am, Y. Richard Yang <yry@cs.yale.edu> wrote:
> 
> Hi Mark, Martin, Spencer,
> 
> First, thank you so much for the reviews. Sorry for the delay in responding, due to summer and travel.
> 
> One common comment we see from all three of you, in different texts, is essentially on the service model of the current document (https://datatracker.ietf.org/doc/draft-ietf-alto-new-transport/): how much control on the ordering of the messages (Mark/Martin), in-order or not (Spencer). Hence, after the meeting early morning today, we feel that we should start a single thread to discuss this issue. We will send a summary of our responses to each of your individual reviews later.
> 
> To help make clear the problem and the current design, let's start by summarizing the transport model of ALTO/SSE (RFC8895; https://datatracker.ietf.org/doc/html/rfc8895). We will state the problem as abstractly as possible to see the essence of the problem and the potential design choices:
> - P1: A server offers multiple resources R = {R[1], R[2], R[3], R[4], ...}, where each R[i] is represented by a URI;

Good so far; that's pretty much a description of every use of HTTP.

> - P2: A client can request from the server the monitoring of a subset M of the resources: M \subset R, e.g., M = {R[1], R[2], R[3]}; for example, R[1] is a network map, R[2] is a cost map using the network map R[1], and R[3] is an endpoint property map; for simplicity, assume the requested resources numbered 1 to |M|

So, in HTTP terms, that would be creating a new resource whose semantic is "I am a monitor for _this_ set of resources". The tricky part here is how to realise that in terms of HTTP -- i.e., what does a GET response look like?

One answer would be a feed format like RSS or Atom. Do you support GET in this fashion?

> - P3: The server can push a sequence of incremental updates to the client for each resource in M. Denote {U[i,1], U[i,2], ...} as the update sequence from the server to the client for R[i] in M, where U[i,1] is the first response for the requested resource R[i], U[i,2] is an incremental update (patch) on top of U[i,1], U[i,3] is the incremental update on top of U[i,2], ...

Here's where I get more concerned. Server Push is unproven, and indeed has been shown to be an anti-pattern for its originally intended use case. Folks are still interested in it for API use cases (and I'd see this as one of those), but it's still very wild west, with no real widespread deployment experience that I'm aware of. See especially <https://httpwg.org/specs/rfc9205.html#server-push>. 

The first question I'd ask here is whether polling a resource (like a feed document) is sufficient. That pattern works very well with HTTP, and is well-understood -- _much_ more so than Server Push. 

The downside, of course, is latency, but it's not unknown to deploy applications with very high polling frequencies (e.g., 1/s). Have you considered this approach? Would modifying it to long-polling (where the client always keeps a request 'hanging' until the server has an update) help? Keep in mind that with HTTP/2 and above, long-polling has very few downsides.
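
A minimal sketch of the long-polling pattern described above, using an in-memory stand-in rather than a real HTTP server. The names `MonitorFeed`, `publish`, and `get` are hypothetical, and the integer version merely plays the role an ETag would play in a conditional GET; nothing here is taken from the ALTO drafts.

```python
import threading


class MonitorFeed:
    """In-memory stand-in for a feed resource that supports long-polling."""

    def __init__(self):
        self._cond = threading.Condition()
        self._version = 0   # plays the role of an ETag (assumption)
        self._body = None

    def publish(self, body):
        """Server side: store a new representation and wake waiting clients."""
        with self._cond:
            self._version += 1
            self._body = body
            self._cond.notify_all()

    def get(self, known_version=None, wait=0.0):
        """Client side: return (status, version, body).

        If the caller already holds the current version, the request is
        held for up to `wait` seconds. It answers 200 as soon as something
        changes, or 304 if nothing changed in time.
        """
        with self._cond:
            if known_version == self._version and wait > 0:
                self._cond.wait_for(
                    lambda: self._version != known_version, timeout=wait
                )
            if known_version == self._version:
                return 304, self._version, None
            return 200, self._version, self._body
```

With `wait=0` this degenerates to plain polling; with a non-zero `wait` the client simply re-issues `get()` with the last version it saw each time a call returns.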

Another option would be to invert the relationship and have what's currently the server open connections to the current client and PUT / PATCH updates to them. Much more natural HTTP, but of course you need an identity for the client and a clear path to it. This approach also has considerable deployment experience (commonly known as 'webhooks').

> - P3.1: For flexibility, each U[i,j] can choose its own encoding; for example, U[1,1] is application/alto-networkmap+json, U[1,2] is application/merge-patch+json; for concrete examples, please see the current draft (search "Promised Stream 4" and "Promised Stream 6")

Sure.

> - P4: Consider the dependency among the information resources in P3: {U[1,1], U[1,2], ...}, {U[2,1], U[2,2], ...}, and {U[3,1], ....}. There are two types:
> - P4.1 We have that U[i,j+1] depends on U[i,j] due to incremental updates---in the general case, the client needs to have received and processed U[i,j] to apply U[i,j+1], unless U[i,j+1] is a snapshot (not an incremental update). 

Right. The closest things that we have for managing this sort of thing in HTTP today are conditional requests; e.g., If-Match.
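
To illustrate the same-resource dependency in P4.1, here is a rough sketch of applying an incremental update only when it was computed against the representation the client actually holds, in the spirit of an If-Match precondition. `merge_patch` follows the JSON Merge Patch algorithm (RFC 7386); `VersionedDoc`, `PreconditionFailed`, and the tag strings are hypothetical names for illustration, not anything defined by ALTO.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7386) and return the patched value."""
    if not isinstance(patch, dict):
        return patch                      # non-object patch replaces the target
    if not isinstance(target, dict):
        target = {}
    result = dict(target)
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null means "remove this member"
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


class PreconditionFailed(Exception):
    """The update's base tag does not match what we hold (compare HTTP 412)."""


class VersionedDoc:
    def __init__(self, tag, body):
        self.tag = tag
        self.body = body

    def apply(self, base_tag, new_tag, patch):
        # Refuse the patch unless it was built on top of our current version.
        if base_tag != self.tag:
            raise PreconditionFailed(
                f"have {self.tag!r}, update expects {base_tag!r}"
            )
        self.body = merge_patch(self.body, patch)
        self.tag = new_tag
```

A snapshot (a full replacement rather than a patch) would skip the tag check and simply overwrite `tag` and `body`.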

> - P4.2 It is possible that U[i', j'] depends on U[i, j], where i' != i: for example, a cost map depends on the correct version of the network map.

That's a purely application-level semantic, correct? I.e., resource A isn't useful without resource B, and furthermore representation 1 of resource A requires representation 4 of resource B to be interpreted correctly. This is similar to issues that people face in caching CSS, JavaScript and the like today -- generally it's solved by giving representations with breaking changes different URLs, and referencing them directly from their dependants. We've talked about other ways to solve this, but that's current practice.

> If one goes academic, there is a dependency graph that can describe the dependencies.
> 
> In RFC8895, since it is a single HTTP connection, all of the {U[i,j]} are "linearized" by the server into a single sequence, and Section 6.7 of RFC8895 specifies the linearization (serialization) requirements (P4.1 and P4.2). As we know from concurrency control, linearization is strong and leaves performance on the table. One of the motivations for the new document is to take advantage of the more relaxed concurrency model allowed by HTTP/2 and HTTP/3.
> 
> So what are the design points that the design team has considered:
> 
> - D1 (linearization): the server pushes all {U[i,j]} in a single stream. Due to P3.1, there must be an internal structure to separate the U[i,j] boundaries and different U[i,j] can have different media types, leading back to RFC8895.

I don't understand why it's necessary to do this. AFAICT there are no ordering constraints created by state-changing operations; rather, you just have dependencies that can be resolved once all of the updates are available. 

For example, if you *are* going to stay with Server Push, I immediately wonder what the value of having a monitor resource is, rather than just having the appropriate resources push updates directly using their own identity. Of course you still need some sort of subscription mechanism, but that doesn't mean that the actual updates need to be coalesced into one HTTP response.

> 
> - D2 (max concurrency): Each U[i,j] is sent in an independent (concurrent) push stream (and hence can use its own media type as it is), and the application layer handles dependencies: ALTO has a built-in dependency check for cross-resource dependencies (P4.2), and the sequence numbers of incremental updates allow the application (ALTO client) to figure out same-resource dependencies (P4.1). The application (ALTO client) buffers all streams to realize the ordering. 

Right, this seems more reasonable, although again I wonder about the use of Push.
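
As a sketch of what D2 asks of the client, the buffer below accepts updates in whatever order the streams deliver them and applies each one only once its dependencies are met. The class `UpdateBuffer`, the `(resource, seq)` keys, and the rule that a cross-resource dependency is satisfied once the named sequence number (or a later one) has been applied are all assumptions made for illustration; a real ALTO client would compare version tags as the spec requires.

```python
from collections import defaultdict


class UpdateBuffer:
    """Buffers out-of-order updates and applies them in dependency order."""

    def __init__(self):
        self.applied = defaultdict(int)   # resource -> highest applied seq (0 = none)
        self.pending = {}                 # (resource, seq) -> depends_on or None
        self.log = []                     # order in which updates were applied

    def receive(self, resource, seq, depends_on=None):
        """Record update U[resource, seq]; depends_on is (resource, seq) or None."""
        self.pending[(resource, seq)] = depends_on
        self._drain()

    def _ready(self, resource, seq, depends_on):
        # Same-resource rule (P4.1): must be the next update in sequence.
        if seq != self.applied[resource] + 1:
            return False
        # Cross-resource rule (P4.2): the named update must already be applied.
        if depends_on is not None:
            dep_resource, dep_seq = depends_on
            if self.applied[dep_resource] < dep_seq:
                return False
        return True

    def _drain(self):
        # Keep sweeping the pending set until a full pass applies nothing.
        progress = True
        while progress:
            progress = False
            for (resource, seq), dep in list(self.pending.items()):
                if self._ready(resource, seq, dep):
                    del self.pending[(resource, seq)]
                    self.applied[resource] = seq
                    self.log.append((resource, seq))
                    progress = True
```

For example, if a cost-map update that depends on network-map update 1 arrives first, it stays in `pending` until the network-map update has been received and applied.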

> - D3 (max concurrency with server barrier): It is D2 but requires that the server does not push U[i',j'] if there is a U[i,j], s.t., (1) U[i,j] is still being sent by the server to the client and (2) U[i',j'] depends on U[i,j].
> 
> The intention of the current document is to reflect D3. Note that for D3, the ALTO client still needs to handle the dependency correctly, as the streams may be only buffered at the kernel, and not processed. But the benefit of D3 over D2 is that it implements essentially some flow control, to avoid a faster server overwhelming a slower client.

The streams might be buffered anywhere in the handling chain. Given that H2/H3 already have flow control, is this really necessary? Or are you trying to manage buffer sizes 'above' the HTTP layer?

> One design point that one may consider is 
> - D4 (linearization of same resource and concurrency for different resources): the U[i,j] of the same R[i] are sent in the same stream (and hence linearized); this can be a reasonable design, but due to P3.1, it is still not clean; see D1 discussion.

Indeed.

> I hope that the preceding has clarified the stream control issue that the design faced. One can see that the issue can be considered as a generic issue--what to specify when embedding one concurrency model (P4.1/P4.2) into a given concurrency structure (HTTP/2 or HTTP/3). We took a quick look at DoH. It looks like the dependency model of DoH might be simpler: there is no same-resource dependency (for a given DNS resource type), because there are no incremental updates (the DNS update model appears to be an idempotent overwrite model), and no cross-resource dependency. Please correct us if we have missed it.
> 
> The current document tried to separate the issue into Sec. 8 (ALTO/H2 Stream Management). One way forward we can see is to (1) only specify that each U[i,] is mapped to its own stream (to handle P3.1), and (2) not specify the dependency among U[i,j] (i.e., specify as close to nothing as possible) and let the client figure out the dependency at the message level; we add requirement language such as "The client MUST check the dependency ...;"

Overall, this seems like a very awkward layering of an application onto HTTP. I don't know enough about the underlying requirements to tell whether a more natural approach is possible today, but I suspect it might be. 

Being able to get updates to the state of a set of resources is a pretty common requirement, so I'd very much like to see this addressed in a generic way that's both reusable for other applications and that works well with the other parts of HTTP. I wrote a little more about this here: <https://www.mnot.net/blog/2022/02/20/websockets>.

Another thing that comes to mind is that what you're trying to do seems to have _some_ overlap with what the BRAID folks are trying to do: <https://datatracker.ietf.org/doc/html/draft-toomim-httpbis-braid-http>. I'd be very interested to hear if you thought it'd be helpful to have that document move forward.

Hope this helps,

--
Mark Nottingham   https://www.mnot.net/
