Re: [alto] Benjamin Kaduk's No Objection on draft-ietf-alto-incr-update-sse-20: (with COMMENT)

Dear Ben,

Thank you so much for the thorough, thoughtful review. Sorry for the late
reply---I did no
t get the initial email containing your review. Please see inline.

On Wed, Mar 11, 2020 at 8:02 PM Benjamin Kaduk via Datatracker <
noreply@ietf.org> wrote:

>
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-alto-incr-update-sse-20: No Objection
>
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
>
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
>
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-alto-incr-update-sse/
>
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> Thanks for your response to the genart reviewer; I'm happy to see that
> change staged for the next version.
>
>
Thanks!

> It's a bit interesting to see that we present JSON merge patch and JSON
> patch in the reverse order in which they were defined (RFC 7396/7386 vs.
> 6902).
>
>
It is indeed interesting. The historical reason was that the early versions
defined
only JSON merge patch, and then added JSON patch later, motivated by the
need
to handle potential inefficiencies (see the end of Sec. 3.1.1 and the
beginning of
Sec. 3.2.1). Although in reverse chronological order, the flow and
transition appear
to be good so far, and let's keep the flow. OK?

> It's very surprising to me that we replicate the algorithm for JSON
> merge patch and SSE but just refer to RFC 6902 for the JSON patch
> algorithm.
>

Good catch on consistency. As we have removed the algorithm for the merge
patch,
they are treated equally now.

> What happens if either party closes the update stream without the proper
> clean-up message(s)?
>
>
Very good discussion. I see you may mean two cases: (C1) the overall update
stream;
and (C2) one of the substreams. The case of C1 is a special case of C2.

If the server closes (e.g., due to an internal error) a substream without
sending
a control update of a substream, the client may receive a partial update
message.
Hence, it is important that the client uses a transaction based
implementation
to commit updates.

If the client closes a substream, there can be two cases: (1) the client
uses stream
control, and the server is in the middle of sending a data update for the
substream; it is
recommended that the server finishes sending the data update, and the
client is
recommended to know that the close is effective only after getting the
notification
control update. (2) the client could just close the http connection (and
send FIN)
without properly sending the stop as in (1). In this case, the server TCP
connection
will be notified, and the server code should handle it.

Is the above you are looking for?

Section 1
>
>    considerations by both ALTO servers and clients; Section 13 discusses
>    a design feature that is not supported; Section 10 discusses security
>    issues; The last two sections review the requirements for future ALTO
>    services to use SSE and IANA considerations, respectively.
>
> I think this last remark perhaps should not have survived a section
> reordering to put the not-supported design feature (section 13) last.
>

Although I do not see that we make section-level changes (e.g., add
sections
at the end or reordering sections) at this point, self-updating reference
is
always a better idea. We will revise the last remark. Thanks for the
suggestion.

> Section 2
>
>    Stream Control Server: An stream control server providing the stream
>    control service.
>
> Perhaps we could avoid defining a "stream control server" as a "stream
> control server" that does some things.
>

Sounds good. We will remove the redundant structure of the definitions,
e.g.,
Update Stream: An update stream is an HTTP connection ...
->
Update Stream: An HTTP connection ...

We will revise all definitions to be consistent in style.

>
>    Control Update Message: A control update message is a message in an
>    update stream for the update stream server to notify the ALTO client
>    of related control information of the update stream.  The first
>    message of an update stream is a control update message and provides
>    the URI using which the ALTO client can send stream control requests
>    to the stream control server.  Additional control update messages in
>    an update stream allow the update stream server to notify the ALTO
>    client of status changes (e.g., the server will no longer send
>    updates for an information resource).
>
> The way this is written makes me want to double-check that I have the
> flow/terminology right: there's an update stream of update messages from
> server to client, that can include data updates but also can include
> control update messages.
>

Right.

> Control update messages give the client
> information about the stream control service, potentially distinct from
> the (regular) update service, and the client sends requests on the
> stream control service, that are expected to result in changes to what
> the server sends through the (regular) update service? Would it ever
>
happen that a server sends a message on the control service connection, or
> that a control update message is present on the control service?
>
>
Good summary. Let me try to make some small edit/comments based on your
sentences and we may use them:

"Control update messages give the client information about the stream
control service,
potentially distinct from the (regular) update service, ..."
->
"Control update messages give the client information about the transmission
states of data
updates, and hence are distinct from the (regular) data updates. A control
update message
may be triggered by an internal event at the server, such as server
overloading and hence
stoping some updates, or as a result of a client sending a request through
the stream control
service."

"... that are expected to result in changes to what the server sends
through the (regular)
update service?"

->

Yes, the client should expect to get a notification (control update) on the
result of its
request (e.g., start a new substream to update a resource). The timing of
when the
notification happens, however, is asynchronous.

"Would it ever happen that a server sends a message on the control service
connection, or
that a control update message is present on the control service?

->

A control update message is always sent at the same update stream
(connection).

Section 3.1
>
> Is it worth noting an explanation of why the HTTP PATCH method is
> unsuitable for ALTO SSE (presumably because it goes client-to-server and
> we require the opposite)?
>
>
Good point and yes, although one can think of the ALTO client as a holder
of the info
and hence could be considered as a server. But the info at the client is
not required
to be persistent. We will add a sentence to explain a bit.

> Section 3.1.1
>
>    one JSON value into another.  Specifically, JSON merge patch treats
>    the two JSON values as trees of nested JSON objects (dictionaries of
>
> I suggest clarifying "the two JSON values" (e.g., "the old and new JSON
> values").
>
>
There is some revision to the text to address the comments by Suresh and
Adam. We
will add the clarification (i.e., "the old ...) in the revised text---the
two JSON values are
still there.

> Section 3.3
>
>    Despite the two features of HTTP/2, this design chooses an HTTP/1.x
>    based design for the simplicity of HTTP/1.x.  An HTTP/2 based design
>
> I suggest to say "HTTP/1.x-compatible design" rather than "HTTP/1.x
> based design" -- the SSE API is part of HTML5, and as such is usable
> with both those versions of HTTP (and future versions).
>
>
Good suggestion. We will revise using the suggested phrase.

> Section 4.1
>
>    An update stream can provide updates to both GET-mode resources, such
>    as ALTO network and cost maps, and POST-mode resources, such as ALTO
>    endpoint property service.  Also, to avoid creating too many update
>    streams, this design allows an ALTO client to use one update stream
>    to receive updates to multiple requests.  In particular, the client
>    may request to receive updates for the same resource but with
>    different parameters for a POST-mode resource.  The updates for each
>
> This implies by omission that update-stream consolidation is not
> possible for GET-mode resources or different POST-mode resources; I'd
> suggest to reword to clarify that this is in addition to being able to
> consolidate updates for multiple resources into a single stream.
>
>
Interesting catch on potential ambiguity (implication by omission). We will
reword to
clarify as suggested: in addition to being able to consolidate updates for
multiple
resources into a single stream.

>    The objective of an update stream is to continuously update the ALTO
>    client on data value changes given by the client's requests.  This
>
> nit: "data value changes given by the client's requests" almost reads
> like it's the client's requests that are producing the data value
> changes, rather than the client's requests indicating which data values
> to report changes to.
>
>
Interesting point. How about the following:
   The objective of an update stream is to continuously push to an ALTO
client
   data value changes to a set of resources, where the set of resources is
specified
   by the ALTO client's requests.  This

  // Note I used plural requests because the client can issue multiple
stream control
   requests to determine the set.

 Make sense?

Section 4.2
>
>    The ALTO client can then use the URI to ask the update stream server
>    to (1) send data update messages for additional resources, (2) stop
>
> nit(?): if the client is talking to the stream control server, is it
> really asking the *update stream* server to do anything?
>
>
Good, precise comment. How about the following:

  The ALTO client can then use the URI to ask the stream control server
  specified in the URI to request the update stream server to (1) send
  data update messages for additional resources, (2) stop

> Section 5.3
>
>    control-uri:  the URI providing stream control for this update stream
>         (see Section 7).  The server MUST send a control update message
>         with an URI as the first event in an update stream.  If the URI
>
> It seems like it's worth mentioning whether/when the server can send a
> control-uri at other times.
>
>
OK. We will add a sentence that this happens only once:

       control-uri:  the URI providing stream control for this update stream
        (see Section 7).  The server sends a control update message
         notifying the client of the control-uri. This control update
message notifying the
         control-uri will be sent once and MUST be the first event in an
update stream.

   Will this clarify?

> (I agree with Alexey about describing the 'description' as
> "human-readable", which it sounds like is going to happen.)
>
>
Already fixed. Thanks for the note.

> Section 6.3
>
>    If this update stream can provide data update messages with
>    incremental changes for a resource, the "incremental-change-media-
>    types" field has an entry for that resource-id, and the value is the
>    media-type of the incremental change.  Normally this will be
>
> Should this be "media-type or types", since a comma-separated list is
> allowed?
>
>
Yes, good catch. Will change to types.

>    When choosing the media-type to encode incremental changes for a
>    resource, the update stream server SHOULD consider the limitations of
>    the encoding.  For example, when a JSON merge patch specifies that
>    the value of a field is null, its semantics is that the field is
>    removed from the target, and hence the field is no longer defined
>    (i.e., undefined); see the MergePatch algorithm in Section 3.1.1 on
>    how null value is processed.  This, however, may not be the intended
>    result for the resource, when null and undefined have different
>    semantics for the resource.  In such a case, the update stream server
>    SHOULD choose JSON patch over JSON merge patch.
>
> I don't see how this is SHOULD (vs. MUST).  Won't things break if you
> don't?
>
>
Good specification.  SHOULD -> MUST

> Section 6.5
>
>    add: specifies the resources (and the parameters for the resources)
>         for which the ALTO client wants updates.  We say that the add-
>         request creates a substream.  The ALTO client MUST assign a
>         unique substream-id (Section 5.2) for each entry, and uses those
>         substream-ids as the keys in the "add" field.
>
> Unique within what scope?  (Globally unique could prove challenging...)
>
>
Good to clarify. How about the following:

        The ALTO client MUST assign a substream-id that is unique to the
        update stream (Section 5.2) for each entry, and uses those
        substream-ids as the keys in the "add" field.

>    incremental-changes:  the ALTO client specifies whether it is willing
>         to receive incremental changes from the update stream server for
>         this substream.  If the "incremental-changes" field is "true",
>         the update stream server MAY send incremental changes for this
>         substream (assuming the update stream server supports
>         incremental changes for that resource).  In this case, the
>
> Since MAY is not an obligation, I think the parenthetical is not
> strictly speaking needed.
>
>
Good point. The text in the parentheses will be removed.

>         changes" to "false".  An alternative design of incremental-
>         changes control is a more fine-grained control, by allowing a
>         client to select a subset of incremental methods from the set
>         announced in the server's capabilities.  But this adds
>         complexity to the server, which is more likely to be the
>         bottleneck.  Note that the ALTO client cannot suppress full
>
> I suggest rewording to be more clear that this alternative design is not
> part of this specification (and in fact was rejected from
> consideration?).
>
>   Good to clarify. How about the following:

      An alternative design of incremental-changes control which was
      considered is a more fine-grained control, by allowing a client to
select a
      subset of incremental methods from the set announced in the server's
      capabilities.  But the alternative design was not adopted in this
      document because it adds complexity to the server, which is more
likely to
      be the bottleneck.

>         replacement.  When the ALTO client sets "incremental-changes" to
>         "false", the update stream server MUST send a full replacement
>         instead of an incremental change to the ALTO client.  The update
>         stream server MAY wait until more changes are available, and
>         send a single full replacement with those changes.  Thus an ALTO
>         client which declines to accept incremental changes may not get
>         updates as quickly as an ALTO client which does.
>
> Do we have any requirement that incremental changes be sent "as quickly
> as possible" or can they be delayed as well?
>
>
Good discussion.

The current design has no specification on the delay requirement. It is a
(more Internet-style, best-effort) design that allows the server to send
as fast as it can/desire. The client and server may have external agreements
on how fast to push; an analogy is that different levels of stock quote
delays
are often based on business contracts/offerings. QoS is a future topic.
It neither has higher-level flow control to avoid a faster server overwhelms
a slow client, although the client can indeed perform certain control by
not reading from the buffer. The Operations Consideration has some
discussions
on potential issues of such delays.

> Section 6.7.1
>
>    o  The first event MUST be a control update message with the URI of
>       the update stream control service Section 7 for this update
>       stream.
>
> I thought it was allowed to send a null control-uri in this message.
>

Indeed. The first message must be the control-uri. Suppose the IRD
announces
"support-stream-control": false, then the control-uri should be null.
For null control-uri, as an example, it can be
     event: application/alto-updatestreamcontrol+json
     data: {"control-uri": null}

 We will add a couple of sentences to make it clear.

>
>    o  As soon as possible after the ALTO client initiates the
>       connection, the update stream server MUST send a full replacement
>       for each resource-id requested with a version tag.  In this case
>       the update stream server MAY omit the initial full replacement for
>       that resource, if the "tag" field the ALTO client provided for
>       that resource-id matches the tag of the update stream's current
>
> This second sentence contradicts the MUST in the first one; please
> rephrase to avoid the internal inconsistency.  (Also, please clarify
> what "this case" is, since we haven't mentioned a specific one by that
> point in the text.)
>
>
Good comment. Let me rephrase:

        As soon as possible after the ALTO client initiates the connection,
the
        update stream server checks the "tag" field for each added update
request.
        If the "tag" field is not specified in an added update request, the
update stream server
        MUST first send a full replacement for the request. If the the
"tag" field
        is specified, the client can accept incremental changes, and the
server can
        compute an incremental update based on the "tag", the update stream
server
        MAY omit the initial full replacement.

Is this clearer?

>    o  If this update stream provides update for resource-ids and R0 and
>       R1, and if R1 depends on R0, then the update stream server MUST
>
> nit: s/update/updates/
>
>
OK.

> Section 6.7.2
>
>    A update stream server MAY offer several different update stream
>    resources that provide updates to the same underlying resource (that
>    is, a resource-id may appear in the "uses" field of more than one
>    update stream resource).  In this case, those update stream resources
>    MUST return the same update data.
>
> The phrase "update data" is used only once in this document, and in
> light of the previous discussion about sending the "same updates" but
> potentially different patch events, it seems that greater specificity is
> called for in what is intended here.
>

This is a wonderful comment! There were substantial discussions early on
on the subject and let me try to clarify.

Although we want the design to be as transparent to implementation
as possible, the spec can have impacts on the implementation.

Main implementation concern: allow scaling of the push updates to a
resource,
by allowing instantiating multiple update streams to the same resource,
where
the multiple update streams can be instantiated either from the same update
stream resource (e.g., a service provider using DNS or VIP load balancing
to direct clients to multiple update stream servers) or multiple update
stream
resources (e.g., the need to use multiple URIs to direct to multiple update
stream servers).

These update stream servers are likely frontend to a backend which commits
a sequence of internal updates, and the internal updates are likely totally
serialized using a consistent data store update model such as Paxos/RAFT,
if the backend needs high reliability:
s0, s1, s2, s3, ... where each vi is a consistent snapshot.

An early discussion is to allow a weaker consistency model:
- client 1 may receive updates to obtain a sequence of updates to derive
snapshots
s0, s2, s3, ...

Client 2 may receive updates to obtain a different sequence of snapshots
s0, s1, s3, ...

Such subsequence of the global sequence (safety) allows better
scaling---the liveness
requirements of all updates are eventually sent is related to the delay
requirement discussed
above. This weaker consistency model gives better flexibility but can be
harder to understand,
should an application uses it. Hence, the decision is to use a simpler
consistency model: each
update stream delivers s0, s1, s2, ...

If you can OK, we will clarify that according to the simpler model.

> Section 6.7.3
>
>    JSON Patch and Merge Patch provide the incremental encoding benefit
>    but can be applied to only a single JSON object.  If an update stream
>    service (1) supports a resource providing a multipart media type and
>    (2) specifies an incremental media type for the resource in its
>    capabilities, the server MUST (1) use substream-id.content-id in its
>    `event` field, (2) include the content-id in the multipart message,
>    and (3) the content identified by the content-id must be a single
>    JSON object.
>
> Which message is "the multipart message"?  Is it on the update stream?
>
>
Good need for clarification. How about the following:

   JSON patch and merge patch provide the incremental encoding benefit
   but can be applied to only a single JSON object.  If an update stream
   service supports a resource providing a multipart media type, which
   we refer to as a multipart resource, then the update
   stream service needs to handle the issue that the message of a full
multipart
   resource can include multiple JSON objects. To address the issue, when
an
   update stream service specifies that it supports JSON patch or merge
patch incremental
   updates for a multipart resource,  the service MUST
   ensure that (1) each part of a multipart message is a single JSON
object,
   (2) each part is specified by a static content-id in the initial full
message, (3) each
   data update event applies to only one part; and (4) each data update
specifies
   substream-id.content-id as the `event` field of the event, to identify
the part
   to be updated.

Does the preceding clarify?

> Section 7
>
>    An stream control service allows an ALTO client to remove resources
>
> nit: s/An/A/
>
>
Thanks. Will fix.

> Section 7.1
>
> I agree with Alexey to reference RFC 3986 here.
>
>    It is expected that the update stream server will assign a unique
>    stream id to each update stream instance and will embed that id in
>    the associated stream control URI.  However, the exact mechanism is
>    left to the update stream server.  ALTO clients MUST NOT attempt to
>    deduce a stream id from the control URI.
>
> Is this stream ID used for anything outside of the update stream
> server's internal processing?
>
>
It should not. How about the following revision:

   It is expected that there is an internal mechanism to map a stream
control
   URI to the unique update stream instance to be controlled. For example,
   the update stream service may assign a unique, internal stream id to
   each update stream instance. However, the exact
   mechanism is left to the update stream service provider.

>    To prevent an attacker from forging a stream control URI and sending
>    bogus requests to disrupt other update streams, stream control URIs
>    SHOULD contain sufficient random redundancy to make it difficult to
>    guess valid URIs.
>
> I think this needs to be a MUST.
> It is probably also worth referencing
> https://www.w3.org/TR/capability-urls/ .
>
>
Good suggestion. We will add the reference and revise.

> Section 7.5
>
>    An stream control service accepts the same input media type and input
>
> nit: s/An/A/
>
> Thanks, will fix.

> Section 7.6
>
>    o  If any substream-id in the "remove" field was not added in a prior
>       request, the error code of the error message MUST be
>       E_INVALID_FIELD_VALUE; the "field" field SHOULD be "remove" and
>       the "value" field SHOULD be the array of the invalid substream-
>
> nit: s/the array/an array/; we are defining this array for the first
> time.
>
>
Thanks, will change.

>       id in the same request.  However, it is legal to remove a
>       substream-id twice.
>
> I suggest making this requirement to track
> previously-used-but-now-closed substream-ids more explicit.  (It also
> comes uo for the next bullet point.)
>
>
OK. Will add the requirement, which is reasonable.

>    o  If any substream-id in the "add" field has been used before in
>       this stream, the error code of the error message MUST be
>       E_INVALID_FIELD_VALUE, the "field" field SHOULD be "add" and the
>       "value" field SHOULD be the array of invalid substream-ids.
>
> nit: s/the array of/an array of the/ (same reasoning as before)
>
>
OK. See above.

> Section 8
>
> [Just noting that I, as well, did not perform a detailed check of the
> examples.]
>
>
Sure we will double-check ourselves.

> Section 8.4
>
>    If the ALTO client needs the "bandwidth" property for additional
>    endpoints, the ALTO client can send an "add" request on the stream
>    control URI:
>
> The "add" request also includes a priv:ietf-load request that we don't
> mention here.
>
>
OK. We will mention.

> Section 9.3
>
>    An update stream server can avoid such issues by offering update
>    streams only for filtered cost maps which do not allow constraint
>    tests.
>
> Maybe say something about "having to handle such complicated behavior"
> instead of "such issues" (which are not particularly clearly indicated)?
>
>
Good suggestion.
   "An update stream server can avoid having to handle such complicated
behavior
   by offering update streams only for filtered cost maps which do not
allow
   constraint tests."

> Section 10
>
> I greatly appreciate the way these security considerations are framed,
> fiting into the context of the base protocol and then with the
> additional risks discussed separately; thank you!
>
>
> Thanks!

> Is there anything interesting to say about this mechanism making it
> easier to use an updated resource with a stale version of a resource
> used by that first resource?
>
>
Not sure which mechanism. Could you please give a bit more context?

> With the possibility for the update server and stream control server to
> be different entities, there is a risk of information skew or
> communications breakdown between them, as well as invalid or spoofed
> messages claiming to be on that private communications path.  It would
> be worth a brief discussion of what sort of problems can arise from that
> separation of roles.
>
>
Good addition. We will add.

> Just to check: SSE guarantees that we get events in order, so there are
> no reordering considerations within a stream.  IIRC we did have some
> discussion about synchronization of the same updates appearing on
> multiple streams and of dependent updates within a single stream, which
> should take care of the main points in this space.
>
>
Good comments. See above on the cross-stream consistency model.

> I was also not able to read very closely the SSE spec, but it sounded
> like perhaps the client-side implementation was allowed (or required?)
> to introduce linefeed characters into the data buffer for the event in
> question, e.g., to reflect the line breaks in the actual transferred
> encoding.  Do we need to say anything about the sender behavior to avoid
> putting line breaks in places that would have semantic consequences for
> the ALTO updates?
>
>
Specifying the server behavior can help to avoid bugs. There is some
discussion on the line length (Sec. 9.5) and figure events (Sec. 11). This
discussion on server behavior can be added
(and will be added) in Section 9.5. What do you think?

> Section 10.1
>
>    To avoid these attacks on the update stream server, the server SHOULD
>    choose to limit the number of active streams and reject new requests
>    when that threshold is reached.  An update stream server SHOULD also
>
> This transfers the attack from the update server to the honst clients
> trying to use it (as noted in the following paragraph); it may be
> possible to use somewhat more clever logic involving IP reputation,
> rate-limiting, and compartmentalizing the overall threshold into smaller
> thresholds that apply to subsets of potential clients.  (Client
> authentication is also a potential mitigation for some of these attacks,
> as is mentioned in the following paragraph.)  None of this is
> necessarily incompatible with the current "SHOULD choose to limit"
> guideline, of course, though we may want to give a bit more sense of the
> nuances involved.
>
>
Good text, and we will adopt some of them right after the first paragraph.

> Section 10.2
>
>    Also, under overloading, the client may no longer be able to detect
>    whether an information is still fresh or has become stale.  In such a
>    case, the client should be careful in how it uses the information to
>    avoid stability or efficiency issues.
>
> Is it possible to get into a state where stale ALTO data causes a client
> to behave in a way worse for the network than not using any ALTO data at
> all?
>
>
Yes. This is possible, and this a risk that ALTO clients should always pay
attention
to. We plan to move some text on this from the recent cost calendar to this
as well.

> Section 10.3
>
> I thought https was a MUST for ALTO, which would provide the needed
> confidentiality even in the absence of mutual authentication.
>
>
There still can be cases where http may be OK, when neither confidentiality
nor
authentication is an issue.

> Section 11
>
>    At the low level encoding level, new line in SSE has its own
>    semantics.  Hence, this design requires that resource encoding does
>    not include new lines that can confuse with SSE encoding.  In
>    particular, the data update message MUST NOT include "event: " or
>    "data: " at a new line as part of data message.
>
> It may also be worth forbidding these not at a new line, since a modular
> server implementation might apply an arbitrary line-breaking policy.
>
>
Yep. Will add, as above mentioned.

> Section 13
>
> I suggest adding a little more introduction at the start to give the
> background that the following description is a protocol design and
> considerations that could have been used, but was rejected.
>
>
Good suggestion. We are moving it to the appendix in the newer version.

Thank you so much for the extremely careful, insightful review!

Richard

>
>
> _______________________________________________
> alto mailing list
> alto@ietf.org
> https://www.ietf.org/mailman/listinfo/alto
>