Re: [alto] AD review of draft-ietf-alto-incr-update-sse-19

Mirja Kuehlewind <ietf@kuehlewind.net> Tue, 18 February 2020 09:32 UTC

Content-Type: text/plain; charset="utf-8"
Mime-Version: 1.0 (Mac OS X Mail 12.4 \(3445.104.11\))
From: Mirja Kuehlewind <ietf@kuehlewind.net>
In-Reply-To: <CANUuoLoLWQrYibrOMsAv_VNNDq0qC_D1D-=drNigw1uvVY3A5g@mail.gmail.com>
Date: Tue, 18 Feb 2020 10:32:09 +0100
Cc: draft-ietf-alto-incr-update-sse.all@ietf.org, IETF ALTO <alto@ietf.org>
Content-Transfer-Encoding: quoted-printable
Message-Id: <993A509C-70B7-45B1-A1ED-44B4458A80C7@kuehlewind.net>
References: <02F985BC-F6A2-4A2C-A2B7-F348D659BFDE@kuehlewind.net> <CANUuoLr3EWWaOmO5LBWwMKub4-dQ+Lrkr1aU9QiJCRqCRuNe=w@mail.gmail.com> <59E63194-3AF2-4601-92AD-439E80982F65@kuehlewind.net> <CANUuoLoLWQrYibrOMsAv_VNNDq0qC_D1D-=drNigw1uvVY3A5g@mail.gmail.com>
To: "Y. Richard Yang" <yry@cs.yale.edu>
Archived-At: <https://mailarchive.ietf.org/arch/msg/alto/3wyPjPCzLn3iv8fT5cvneFve7Xo>
Subject: Re: [alto] AD review of draft-ietf-alto-incr-update-sse-19
Precedence: list

Hi Richard,

Please see inline.

> On 17. Feb 2020, at 21:03, Y. Richard Yang <yry@cs.yale.edu> wrote:
> 
> Hi Mirja,
> 
> Thanks a lot for the ultra-fast response. Please see below.
> 
> On Mon, Feb 17, 2020 at 4:21 AM Mirja Kuehlewind <ietf@kuehlewind.net> wrote:
> Hi Richard,
> 
> Please see below.
> 
> > On 17. Feb 2020, at 05:44, Y. Richard Yang <yry@cs.yale.edu> wrote:
> > 
> > Hi Mirja,
> > 
> > Please see below.
> > 
> > On Fri, Feb 14, 2020 at 4:32 PM Mirja Kuehlewind <ietf@kuehlewind.net> wrote:
> > Hi authors,
> > 
> > I reviewed draft-ietf-alto-incr-update-sse in order to potentially get this document through the process before I step down as AD in March. I have a couple of technical questions below that I would some feedback about before we move on to IETF last call and some fully editorial point that can be updated anytime before final publication (e.g. also after IETF last call has ended).
> > 
> > However, I have one high level question that I would like to at least understand to justify it in the IESG before we moved on. Sorry that I have not raised this earlier in the working group but I realised it just now:
> > 
> > HTTP/2 (and HTTP/3) support multiplexing, why is a separate service needed for stream control, instead of just using a different stream on the same HTTP connection? 
> > Also note that I don’t find the justification, why HTTP/2 Server Push is not used, not very convincing and I’m afraid to get push-back from the ART ADs. Especially given the problems describes in section 9.5. However, we can give it a try. I think I would recommend to have the justification in section 13.1 earlier in the document, e.g. in section 3.
> > 
> > 
> > We are indeed designing an ALTO incremental based on HTTP/2 (/3). For now, we plan to clarify that the design is for earlier HTTP and move 13.1 to Sec. 3. How does this sound?
> 
> It’s okay.
> 
> 
> However as you are already working on a design for HTTP/2 or HTTP/3, would it be better to finish this work up now and specify it from the beginning over this more modern protocols? Or is there a goos reason why HTTP/1 support is needed?
> 
> Thanks. We will finish this work now, for HTTP/1/1.1, as it is mostly done and HTTP/1/1.1 is widely adopted. . HTTP/2 is definitely gaining momentum (https://w3techs.com/technologies/details/ce-http2 states 43.3% in Feb. 2020) and we will do the work during the next phase as soon as we are done with this one.

I don’t think that is a relevant number for alto because that's about Webservers already offering http/2. And I would assume an alto server is usually not co-located with another Webserver. For alto the most important question would be if the http sever implementation you are using is supporting http/2 and I think in terms of implementation most platforms do. So if you run an alto server and want to use SSE with http/2 you need to be willing to update you server; that’s it.

Anyway, I think you need a bit more justification in the draft why this is http/1.1 based.

> 
>  
> >  
> > And one more request before we move on, please add the actual Content-Length values to the examples and confirm that you did run the examples though a syntax checker!
> > 
> > 
> > Yes. All fixed.
> 
> Thanks!
> 
> >  
> > 
> > Here are my other technical questions/comments:
> > 
> > 1) In section 3.2.1
> > "This document adopts
> >    the JSON merge patch message format to encode incremental changes,
> >    but uses a different HTTP method, i.e., it uses POST instead of
> >    PATCH.”
> > I don’t quite understand this sentence. As you use SSE, I guess you are using neither POST nor PATCH. The point here is that SSE has a quite different communication model than otherwise in HTTP, so I’m not sure what this statement actually tells me.
> > 
> > 
> > Excellent review. We have revised the paragraph to be the following:
> > 
> > "JSON merge patch is intended to allow applications to update server resources by sending only incremental changes. <xref target="RFC7396"/> defines the encoding of incremental changes (called JSON merge patch objects)  to be used by the HTTP PATCH method <xref target="RFC5789"/>. This document adopts from <xref target="RFC7396"/> only the JSON merge patch object encoding and does not use the HTTP PATCH method.”
> 
> Much better. Thanks!
> 
> > 
> > The goal of the revision above is to call out explicitly what we adopt---JSON merge objects. What do you think?
> >  
> > 2) Why is the multipart content-type supported if in [RFC7285], each "resource is encoded as a single JSON object" (as you state in section 5.2)?
> > In relation to this question, I have to say that I find it also rather inappropriate for an RFC that section 8.1 shows a "non-standard ALTO service” in the example. 
> > Further I was surprised to see this line in section 8.5:
> > "event: multipart/related;boundary=...”
> > This doesn’t seem to be sufficiently specified in the draft…? Can you please further explain in the draft.
> > 
> > 
> > We are taking a pass across the document to clarify multipart, on both motivation and full specification. A high-level guidance that we need is the following: all existing (i.e., published) ALTO resources have a single JSON object, but the extension, path vector (PV), is best handled by using multipart, to avoid complex consistency handling: a resource consisting of two related JSON objects. Hence, the SSE is revised to handle this. The "non-standard ALTO service example" is based on PV but PV has not been published and we do not want to create document dependency to slow down the finish of SSE. How about the following two options: (1) we remove the "non-standard example"; (2) we make clear that SSE can be used by both current ALTO services and potential future services, and we label the "non-standard" as just one example?
> 
> I would propose to actually reference to the path vector draft. Just saing that is exists and using it in a (non-normative) examples, does not create a normative reference.
> 
> 
> OK. Sounds good and will do now.
>  
> >  
> > 3) section 5.3:
> > "The server MUST send a control update message
> >         with an URI as the first event in an update stream. “
> > I missed this statement on first read. I think it’s not great to have this as port of the message format definition. I would proposed to move this to the beginning of section 5 and also add some more text on the general message flow. I know this is described (non-normatively) in section 4, however, section 4 does rather describe the function and services provided and not really the concrete message flow.
> > 
> > 
> > Excellent suggestion. We will move it to the beginning of Sec. 5 with a concrete global message flow specification, with the following new starting paragraph of Sec. 5:
> > 
> > "        We now define the details of ALTO SSE. This section starts with the specification 
> >         of the global message flow (<xref target="ALTO.SSE.MessageFlow"/>), and then specifies the details of the data update messages (<xref target="ALTO.SSE.UpdateEvents"/>)
> >         and the control update messages (<xref target="ALTO.SSE.ControlEvents"/>) respectively.”
> 
> Thanks! Sounds good.
> 
> >  
> > 4) In section 6.3 you say:
> >        "An
> >         alternative design of incremental-changes control is a more
> >         fine-grained control, by allowing a client to select the subset
> >         of incremental methods from the set announced in the server's
> >         capabilities (see Section Section 6.4).”
> > Does that mean that a client that wants to use incremental changes must support all methods implemented by the server? That seems to make deployment of new methods rather complicated.
> > 
> > 
> > Yes. To be precise, the server may use a subset of incremental encoding for a given resource. To allow "old" client to still proceed, the client can set the acceptance to be false. Here is a revision to try to clarify a bit more:
> > 
> >         If the "incremental-changes" field is "true",
> >         the update stream server MAY send incremental changes for this
> >         substream (assuming the update stream server supports
> >         incremental changes for that resource).  In this case, the
> >         client MUST support all incremental methods from the set
> >         announced in the server's capabilities for this resource; see Section Section 6.4
> >         for server's announcement of potential incremental methods.  If
> >         a client does not support all incremental methods from the set
> >         announced in the server's capabilities, the client can set
> >         "incremental-changes" to "false", and the update stream server
> >         then MUST NOT send incremental changes for that substream.
> 
> Thanks for the clarification. That’s good.  However, I was indirectly question if that is really the right design, rather than given the client also an option to indicate which of the methods it supports. Is there a reason why it’s design this was? Was that discussed in the group?
> 
> 
> It was discussed in the WG. 
> 
> A "standard negotiation" protocol is:
> A -> B: I support set SA
> B -> A: I pick subset SB \subset SA
> 
> The decision, after the discussion, was that the following design:
> S -> C (IRD): this SSE supports set S + full-replacement
> C -> S (in request): I accept S+full-replacement (true) or only full-replacement (false)
> 
> The discussion was that the server (1) has more information (for example, predict/know the set of potential changes, and hence predict either using merge patch or json patch is more efficient), and (2) is more likely to be the bottleneck (e.g., pushing updates to multiple clients). Hence the decision is to allow the server to make the choice to (1) more likely choose a more efficient encoding, and (2) sharing of the encoding overhead (if client C1 chooses encoding e1, and C2 chooses e2, then the server needs to compute both e1 and e2). Switching the design to the "standard negotiation" is a simple modification, and I am actually relatively open to this, but I do feel that the current design is simpler. The authors and others can sure discuss the issues during the weekly meeting. Any remaining comments? 

Okay, maybe add some more explanation in the draft.
> 
>  
> >  
> > 5) section 6.6:
> > "The update stream server MUST close the stream without
> >       sending any events.”
> > What does "close the stream" actually mean? Close the HTTP/TCP connection?
> > 
> > 
> > Excellent catch. It should not be simply close the connection. We will clarify all such error handling cases.
> 
> Thanks! Yes, please double-check the rest of the document to be more clear about connection establishment and closure.
> 
> 
> WIll do.
>  
> >  
> > 6) One question related to section 6.8: Please note that TCP also provides a Keep-alive function. Is there any explanation in [SSE] why that is not used or recommended (sorry didn’t had time t look this up in [SSE] myself and could find anything quick)?
> > 
> > 
> > My understanding is that TCP keep alive may be off or can be quite long: I just checked my Mac and got:
> > %sysctl -A | grep net.inet.tcp.*keep
> > net.inet.tcp.keepidle: 7200000
> > net.inet.tcp.keepintvl: 75000
> > net.inet.tcp.keepinit: 75000
> > net.inet.tcp.keepcnt: 8
> > net.inet.tcp.always_keepalive: 0
> > 
> > Let us dig a bit more and fix the final version.
> 
> These parameters are also configurable. However, it really depends on the network you are in. If there is a NAT with short timers of not. Usually the default TCP values should covers to most common cases.
> > 
> >  
> > 7) section 9.2:
> >   "However, if the ALTO client does not receive
> >    the expected updates, the ALTO client MUST close the update stream
> >    connection, discard the dependent resources, and reestablish the
> >    update stream.”
> > Why is this a MUST requirement?
> > 
> > 
> > Very good point. We gave two approaches, and this sentence is for the second approach: first update the 
> > base (e.g., network map), mark the dependent (e.g., cost map) as invalid, and wait for the dependent to be
> > updated. If the dependent update does not arrive, the system will be in an invalid state. Since the base 
> > has already been updated (committed, using a database term), there is no rollback possible unless we also rollback
> > the base. This sentence gives only a simple transaction-based approach. Let me clarify a bit more.
> 
> So the the idea is that the establishment will send you the full map, right? However, it still seems like a client decision to do that or e.g. just give up; a MUST requirement seems a bit to strong.
> 
> 
> Yes. After close, which is indeed a client decision, and re-establishment, the server will send the full dependent of the dependent to allow recovery. I agree that MUST is a bit strong. A client with a higher level of patience can indeed wait. Let us change to "MAY choose”.

Or SHOULD is fine as well. Maybe rather add a bit more explanation when it might be appropriate to not re-connect


> I will also add that such timeout has the traditional transport timeout caveat, too quick timeout can lead to attack to the server. How does this sound?

That’s actually a really good point, about the timeout. Please add. I guess the timeout should actually be a SHOULD or even MUST. 


>  
> We will send in a new version with all the changes by mid this week (Thursday the lastest), if you see the approaches as we discussed are OK.

That would be great. I think you can just go ahead and submit the new version to the datatracker, so I can start IETF last call right away (if everything is okay).

Thanks!
Mirja



> 
> Thank you so much!!
> 
> Richard
> 
> >  
> > 8) section 10.1:
> >   "To avoid these attacks on the update stream server, the server MAY
> >    choose to limit the number of active streams and reject new requests
> >    when that threshold is reached.”
> > This however should be a SHOULD rather than a MAY.
> > 
> > 
> > OK. Fixed.
> 
> Thanks.
> 
> >  
> > 9) Also section 10.1:
> >   "Therefore an update stream server may prefer to restrict update
> >    stream services to authorized clients, as discussed in Section 15 of
> >    [RFC7285].”
> > This should probably be also a normative MAY.
> > 
> > OK. Fixed.
> 
> Thanks.
> 
> >  
> > 10) section 10.2
> >   "Under overloading, the client may choose to remove the information
> >    resources with high update rates.”
> > Should also be MAY.
> > 
> > 
> > OK. Fixed.
> 
> Thanks.
> 
> >  
> > 11) Section 13.2 states that a stream can not be re-started. However this is not really mentioned explicitly in the body of the document. I think the text in 13.2 needs to be moved to somewhere earlier in the document and in addition clearly state what is the expected behaviour when a connection fails.
> > 
> > 
> > Yes. We are moving 13.1 to the front.
> 
> Great!
> 
> >  
> > 
> > And the editorial comments:
> > 
> > 0) Abstract: For an RFC the abstract is rather long. I’d suggest to simply remove the second paragraph (as it is repeated anyway in the intro).
> > 
> > 1) Please spell out IRD at first occurrence.
> > 
> > 2) This document uses a couple of times “we”. While it is not forbidden to do so, it rather uncommon for an RFC (as “we” is not fully clear and should rather refer to the wg as a whole than the authors only). In many cases “we” can be simply replaced by “in this section”. In other case “we” can basically simply be removed, like here in the intro:
> > 
> > OLD
> > "We intend that the mechanism can also
> >    support new ALTO services to be defined by future extensions…”
> > 
> > NEW
> > “The mechanism can also
> >    support new ALTO services to be defined by future extensions…”
> > 
> > 3) section 5.2:
> > “...encoded using multipart/related [RFC2387].”
> > Is the word "Content-type” missing here? This occurs several times in the document but not sure you say it that way...
> > 
> > 4) also section 5.2:
> > "Each component requiring the service of this document
> >    MUST be identified by a unique Content-ID to be defined in its
> >    defining document.”
> > I’m not sure I understand this sentence? What do you mean by “service of this document” and “to be defined in its defining document”? 
> > 
> > 5) section 5.2 nits:
> > "The encoding of a full replacement
> >    is defined its defining document (e.g., network and cost map messages
> >    by [RFC7285], and uses media type defined in that document.”
> > Missing ending brackets and s/media type/the media type/ or “media types defined in those documents”
> > 
> > 6) section 6.3:
> > "the resource-id of an ALTO resource, and MUST be in the
> >         update stream's "uses" list (Section 8.5.2 of Section 6.5)”
> > There is a broken reference here.
> > 
> > 7) section 6.3:
> > "The default value for "incremental-
> >         changes" is "true", so to suppress incremental changes, the ALTO
> >         client MUST explicitly set "incremental-changes" to "false”.”
> > This should be two sentences -> full-stop after “true”
> > 
> > 8) section 6.7.1:
> > “…to stop updates for one or more resources Section 7,…”
> > Missing bracket -> (Section 7)
> > 
> > 9) section 7:
> > "If the
> >    ALTO client and the stream control server the ALTO client MAY send
> >    multiple stream control requests to the stream control server using
> >    the same HTTP connection.”
> > There is something missing in this sentence…?
> > 
> > 10) Sec 8.3:
> > "POST /updates/streams/2718281828459" HTTP/1.1”
> > -> remove " after path
> > 
> > 11) It seems like section 11 should be earlier in this document, maybe somewhere as subsection in section6?
> > 
> > 
> > 
> > All the edits are fixed.
> 
> Great!
> 
> Thanks!
> Mirja
> 
> 
> >  
> > Thank you so much!
> > 
> > Thanks!
> > Mirja
> > 
> 
> 
> 
> -- 
> -- 
>  =====================================
> | Y. Richard Yang <yry@cs.yale.edu>   |
> | Professor of Computer Science       |
> | http://www.cs.yale.edu/~yry/        |
>  =====================================

[alto] AD review of draft-ietf-alto-incr-update-s… Mirja Kuehlewind
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang
Re: [alto] AD review of draft-ietf-alto-incr-upda… Mirja Kuehlewind
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang
Re: [alto] AD review of draft-ietf-alto-incr-upda… Mirja Kuehlewind
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang
Re: [alto] AD review of draft-ietf-alto-incr-upda… Mirja Kuehlewind
Re: [alto] AD review of draft-ietf-alto-incr-upda… Y. Richard Yang