[core] John Scudder's No Objection on draft-ietf-core-new-block-12: (with COMMENT)
John Scudder via Datatracker <noreply@ietf.org> Fri, 21 May 2021 16:22 UTC
Return-Path: <noreply@ietf.org>
X-Original-To: core@ietf.org
Delivered-To: core@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 861523A168A; Fri, 21 May 2021 09:22:26 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: John Scudder via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-core-new-block@ietf.org, core-chairs@ietf.org, core@ietf.org, marco.tiloca@ri.se, marco.tiloca@ri.se
X-Test-IDTracker: no
X-IETF-IDTracker: 7.30.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: John Scudder <jgs@juniper.net>
Message-ID: <162161414645.14500.9969284754936809565@ietfa.amsl.com>
Date: Fri, 21 May 2021 09:22:26 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/FvCzJEYV7xVT9JI78cFZH11AwDk>
Subject: [core] John Scudder's No Objection on draft-ietf-core-new-block-12: (with COMMENT)
John Scudder has entered the following ballot position for
draft-ietf-core-new-block-12: No Objection

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-core-new-block/

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

My further comments are resolved in the current GitHub copy of the document, so once it's published as version 13 I think we're good to go as far as I'm concerned. Thanks for the discussion and changes.

--

My initial comments have been resolved or partially resolved in version 12, see https://mailarchive.ietf.org/arch/msg/core/6geK8P9I0jZBPp9a5qSrQwQNhUo/. Note I've added two new ones at the end of the list.

(draft-ietf-core-new-block-11)

1. Section 3.2

   This mechanism is not intended for general CoAP usage, and any use outside the intended use case should be carefully weighed against the loss of interoperability with generic CoAP applications.

I’m curious: is the only reason the mechanism isn’t intended for general usage, the fact some implementations won’t support it? Or does it have other deficiencies that also make it unsuitable?

2. Section 4.1

   Q-Block2 Option is useful with GET, POST, PUT, FETCH, PATCH, and iPATCH requests and their payload-bearing responses (2.01, 2.02, 2.03, 2.04, and 2.05) (Section 5.5 of [RFC7252]).

I found the list of codes incomprehensible on first encountering it, since the concept of response codes hadn’t been introduced yet.
I do understand that the document assumes familiarity with CoAP; nonetheless for basic clarity I think this should say “(response codes 2.01, 2.02…”. Additionally, the reference to RFC 7252 §5.5 doesn’t seem to be especially germane?

By the way, is 2.03 indeed a payload-bearing response? The only other place the spec touches on it is in §4.4, which says “the server could respond with a 2.03 (Valid) response with no payload”.

3. Section 4.1

   To indicate support for Q-Block2 responses, the CoAP client MUST include the Q-Block2 Option in a GET or similar request (FETCH, for example), the Q-Block2 Option in a PUT or similar request, or the Q-Block1 Option in a PUT or similar request so that the server knows that the client supports this Q-Block functionality should it need to send back a body that spans multiple payloads. Otherwise, the server would use the Block2 Option (if supported) to send back a message body that is too large to fit into a single IP packet [RFC7959].

Is this paragraph really supposed to mention both Q-Block2 and Q-Block1? In particular, I’m confused by the mention of both of these in relation to PUT.

4. Section 4.1

   The Q-Block1 and Q-Block2 Options are unsafe to forward. That is, a CoAP proxy that does not understand the Q-Block1 (or Q-Block2) Option MUST reject the request or response that uses either option.

Presumably (hopefully) this is simply describing the behavior of existing spec-compliant proxies when processing the new messages. As such, is the MUST appropriate? I would think not.

5. Section 4.3

   body. Note that the last received payload may not be the one with the highest block number.

“Might not” would be less ambiguous than “may not”.

6. Section 4.4 (also two places in §4.3)

(This comment rehashes, in more detail, the difficulty explained in my DISCUSS. You may want to skip over it until we’ve resolved the DISCUSS, after which this may, or may not, be relevant.)
   The client SHOULD wait for up to NON_RECEIVE_TIMEOUT (Section 7.2)

I read this as meaning the client should wait for as little as zero, or as long as NON_RECEIVE_TIMEOUT; that’s my understanding of “up to”. Is that the intended meaning? If it is, I think it’s worth writing out as I’ve done, for clarity. If it’s not, it definitely needs to be fixed. There’s a similar issue with “up to NON_PARTIAL_TIMEOUT” later in the section.

Referring ahead to Section 7.2 muddies the waters further. Even though the text quoted above says NON_RECEIVE_TIMEOUT is an upper limit on how long to wait, §7.2 says it’s a lower limit instead... maybe? From §7.2:

   NON_RECEIVE_TIMEOUT is the initial maximum time to wait for a missing

“Maximum”, ok great, that means “upper bound” and so lines up with §4.4 although the “initial” is surprising since §4.4 doesn’t say anything about the upper limit increasing. It continues:

   payload before requesting retransmission for the first time. Every time the missing payload is re-requested, the time to wait value doubles. The time to wait is calculated as:

      Time-to-Wait = NON_RECEIVE_TIMEOUT * (2 ** (Re-Request-Count - 1))

But this part says it’s (a) an exact time-to-wait, not a “maximum”, and (b) it says it increases exponentially, so NON_RECEIVE_TIMEOUT isn’t a maximum at all, but a minimum.

This later text in §7.2 implies that perhaps the problem in the above passages is the word “maximum”, and it should simply be deleted:

   For the server receiving NON Q-Block1 requests, it SHOULD send back a 2.31 (Continue) Response Code on receipt of all of the MAX_PAYLOADS payloads to prevent the client unnecessarily delaying. If not all of the MAX_PAYLOADS payloads were received, the server SHOULD delay for NON_RECEIVE_TIMEOUT (exponentially scaled based on the repeat request count for a payload) before sending the 4.08 (Request Entity Incomplete) Response Code for the missing payload(s).

Similarly “up to” in the quote that began this comment should be “at least”.
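
As an illustration of the §7.2 formula quoted above, the following Python sketch tabulates the waits it produces. The function name, the base value of 4 seconds, and the assumption that Re-Request-Count starts at 1 are chosen here for illustration only and are not taken from the draft text quoted in this message.

```python
def time_to_wait(non_receive_timeout: float, re_request_count: int) -> float:
    """Time-to-Wait = NON_RECEIVE_TIMEOUT * (2 ** (Re-Request-Count - 1)).

    Assumes Re-Request-Count is 1 for the first retransmission request.
    """
    if re_request_count < 1:
        raise ValueError("re_request_count is assumed to start at 1")
    return non_receive_timeout * (2 ** (re_request_count - 1))


# Illustrative base value (an assumption, not quoted from the draft).
NON_RECEIVE_TIMEOUT = 4.0

# Waits before the 1st, 2nd, 3rd and 4th requests for the same payload.
schedule = [time_to_wait(NON_RECEIVE_TIMEOUT, n) for n in range(1, 5)]
print(schedule)  # [4.0, 8.0, 16.0, 32.0]
```

Under these assumptions the first wait equals NON_RECEIVE_TIMEOUT and every later wait is longer, so in this calculation the parameter acts as the smallest value rather than the largest.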
Whether you adopt those suggestions or not, it seems as though all this needs to be rewritten with careful attention to conveying what the desired behavior is.

But the plot thickens. Later in §7.2 we have

   It is likely that the client will start transmitting the next set of MAX_PAYLOADS payloads before the server times out on waiting for the last of the previous MAX_PAYLOADS payloads. On receipt of the first payload from the new set of MAX_PAYLOADS payloads, the server SHOULD send a 4.08 (Request Entity Incomplete) Response Code indicating any missing payloads from any previous MAX_PAYLOADS payloads.

The point being that the retransmission request can be triggered by an event other than timer expiration. So in that sense, “maximum” is right (it provides an upper bound on how long to wait before requesting a retransmission), but in another sense it’s wrong because the exponential increase is applied to it. I think the word “maximum” is trying to do too much work, and more words are probably required in order to make this clear.

I also think the problem is exacerbated by the fact both §4.4 and §7.2 are talking normatively about how to use NON_RECEIVE_TIMEOUT. It seems as though the main description is found in §7.2, and some confusion would be avoided by making §4.4 less specific, and simply referring forward to §7.2.

And, as noted in my DISCUSS, example 10.2.3 muddies the waters still further since it illustrates yet another behavior.

7. Section 4.4

   The client SHOULD wait for up to NON_RECEIVE_TIMEOUT (Section 7.2) after the last received payload for NON payloads before issuing a GET, POST, PUT, FETCH, PATCH, or iPATCH request that contains one or more Q-Block2 Options that define the missing blocks with the M bit unset. The client MAY set the M bit to request this and later blocks from this MAX_PAYLOADS set. Further considerations related to the transmission timing for missing requests are discussed in Section 7.2.
I find this whole paragraph pretty confusing with the dueling SHOULD and MAY, where it appears the SHOULD might be doing two jobs at once. I *think* your intent is something like the following?

“The client SHOULD wait as specified in Section 7.2 for NON payloads before requesting retransmission of any missing blocks. Retransmission is requested by issuing a GET, POST, PUT, FETCH, PATCH, or iPATCH request that contains one or more Q-Block2 Options that define the missing block(s). Generally the M bit on the Q-Block option(s) SHOULD be unset, although the M bit MAY be set to request this and later blocks from this MAX_PAYLOADS set, see Section 10.2.4 for an example of this in operation.”

8. Section 5

   If the size of the 4.08 (Request Entity Incomplete) response packet is larger than that defined by Section 4.6 [RFC7252], then the number of missing blocks MUST be limited so that the response can fit into a single packet. If this is the case, then the server can send

Suggestion: “then the number of missing blocks reported MUST...” (The thing being limited is not the actual number of missing blocks. You’re limiting the number you report on.)

9. Section 7.1

   It is implementation specific as to whether there should be any further requests for missing data as there will have been significant transmission failure as individual payloads will have failed after MAX_TRANSMIT_SPAN.

This paragraph seems as though it’s a non-sequitur. It just doesn’t make sense to me. :-(

10. Section 7.2

(This comment relates to the difficulty explained in my DISCUSS. You may want to skip over it until we’ve resolved the DISCUSS, after which this may, or may not, be relevant.)

   NON_TIMEOUT is the maximum period of delay between sending sets of MAX_PAYLOADS payloads for the same body. By default, NON_TIMEOUT has the same value as ACK_TIMEOUT (Section 4.8 of [RFC7252]).

Presumably the use of “maximum” means it’s fine to delay zero seconds (or any value lower than NON_TIMEOUT).

11. General

By the way, none of the timers specify jitter (and indeed, if read literally, jitter would be forbidden). Is this intentional?

12. Section 7.2

   If the CoAP peer reports at least one payload has not arrived for each body for at least a 24 hour period and it is known that there are no other network issues over that period, then the value of MAX_PAYLOADS can be reduced by 1 at a time (to a minimum of 1) and the situation re-evaluated for another 24 hour period until there is no report of missing payloads under normal operating conditions. The newly derived value for MAX_PAYLOADS should be used for both ends of this particular CoAP peer link. Note that the CoAP peer will not know about the MAX_PAYLOADS change until it is reconfigured. As a consequence of the two peers having different MAX_PAYLOADS values, a peer may continue indicate that there are some missing payloads as all of its MAX_PAYLOADS set may not have arrived. How the two peer values for MAX_PAYLOADS are synchronized is out of the scope.

I take it this is just thrown in here as an operational suggestion? It’s not specifying protocol, right? It seems a little misplaced, if so.

13. Section 10.1.3

(This comment relates to the aside in my DISCUSS. You may want to skip over it until we’ve resolved the DISCUSS, after which this may, or may not, be relevant.)

Why doesn’t the server request 1,9,10 in one go? Since its rxmt request is triggered by rx of 11, one would think it could infer 10 had been lost.

14. Section 10.1.4 (also 10.3.3)

(This comment relates to the aside in my DISCUSS. You may want to skip over it until we’ve resolved the DISCUSS, after which this may, or may not, be relevant.)

Why doesn’t reception of a message with More=0 trigger the server to request retransmission of the missing block? Why does it have to wait for timeout?

15. Section 10.2.3

(This comment relates to my DISCUSS. You may want to skip over it until we’ve resolved the DISCUSS, after which this may, or may not, be relevant.)
Why doesn’t reception of QB2:10/0/1024 trigger the client to request retransmission? Why does it have to wait for timeout? Similarly reception of QB2:9/1/1024 later in the example.

16. Section 10.2.4

Since MAX_PAYLOADS is 10, why does the example say “MAX_PAYLOADS has been reached” after payloads 2-9 have been retransmitted? That’s only 8 payloads.

--

I do have a couple new comments raised during my review of the changes in version 12:

(draft-ietf-core-new-block-12)

17. Section 1:

   This document introduces the CoAP Q-Block1 and Q-Block2 Options which allow block-wise transfer to work with series of Non-confirmable messages, instead of lock-stepping using Confirmable messages (Section 3). In other words, this document provides a missing piece of [RFC7959], namely the support of block-wise transfer using Non-confirmable where an entire body of data can be transmitted without the requirement for an acknowledgement (but recovery is available should it be needed).

As far as I can tell the spec does not really remove the requirement for acknowledgement, it just amortizes the acknowledgements by only sending them every MAX_PAYLOADS_SET. Response Code 2.31 is essentially an acknowledgement, and it gets sent that frequently, right? There’s also (if I recall correctly) some flavor of acknowledgement that is sent when the entire body has been transferred. So, I think the new paragraph isn’t accurate.

This observation also applies to this claimed benefit in §3:

   o They support sending an entire body using NON messages without requiring an intermediate response from the peer.

Response Code 2.31 is exactly an intermediate response. I guess maybe your focus is that if the intermediate response isn’t received, transmission continues, albeit more slowly than it would otherwise, and unreliably too, so in that sense the responses aren’t “required”. I think this requires awfully close parsing of the word “required”, though.

18. Section 2:

   MAX_PAYLOADS_SET is the set of blocks identified by block numbers that, when divided by MAX_PAYLOADS, they have the same numeric

Remove “they”

   result. For example, if MAX_PAYLOADS is set to '10', a MAX_PAYLOADS_SET could be blocks #0 to #9, #10 to #19, etc. Depending on the data size, the MAX_PAYLOADS_SET may not comprise all the MAX_PAYLOADS blocks.

I don’t understand the last sentence (“Depending on the data size, the MAX_PAYLOADS_SET may not comprise all the MAX_PAYLOADS blocks.”) Are you trying to say that if the body size isn’t evenly divisible by MAX_PAYLOADS then the final MAX_PAYLOADS_SET will have fewer than MAX_PAYLOADS blocks in it?

(I do think this change, to introduce the term MAX_PAYLOADS_SET, is generally helpful; thanks.)
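
The reading proposed in comment 18 can be sketched in a few lines of Python. This is only an illustration of that interpretation, which the message itself leaves as an open question: the function name and the 23-block body are hypothetical, and "divided by" is assumed to mean integer division.

```python
from typing import Dict, List


def max_payloads_sets(total_blocks: int, max_payloads: int) -> List[List[int]]:
    """Group block numbers by the integer quotient block_num // max_payloads."""
    groups: Dict[int, List[int]] = {}
    for block_num in range(total_blocks):
        groups.setdefault(block_num // max_payloads, []).append(block_num)
    return [groups[key] for key in sorted(groups)]


# Hypothetical body of 23 blocks with MAX_PAYLOADS = 10.
sets = max_payloads_sets(total_blocks=23, max_payloads=10)
print([(s[0], s[-1], len(s)) for s in sets])
# [(0, 9, 10), (10, 19, 10), (20, 22, 3)]
```

With these inputs the first two sets hold blocks #0 to #9 and #10 to #19, matching the example in the quoted text, while the final set holds only three blocks because 23 is not a multiple of 10.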