[core] Benjamin Kaduk's Discuss on draft-ietf-core-new-block-11: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 06 May 2021 01:58 UTC
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-core-new-block@ietf.org, core-chairs@ietf.org, core@ietf.org, marco.tiloca@ri.se, marco.tiloca@ri.se
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <162026630680.17506.6477675472375470197@ietfa.amsl.com>
Date: Wed, 05 May 2021 18:58:27 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/cTz5FQMhM5_4kWAuQ0fgFtk4xgE>

Benjamin Kaduk has entered the following ballot position for draft-ietf-core-new-block-11: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-core-new-block/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I have a concern about the MAX_PAYLOADS congestion-control parameter. In Section 7.2 it is stated that both endpoints only SHOULD have the same value. I don't see how this can be anything less than MUST, given that we attribute semantics to whether NUM modulo MAX_PAYLOADS is zero or non-zero in the processing of the Q-Block2 option. If the endpoints disagree on the value of MAX_PAYLOADS they will disagree on the semantics of Q-Block2 -- how can that be interoperable? (Being able to negotiate the value does not seem inherently problematic, but since it is relevant for protocol semantics it seems like the value must be identical on both endpoints.)

This seems especially important to have clarity on given that the current specification allows for MAX_PAYLOADS to be decreased at runtime in response to congestion feedback over a 24-hour period, with no synchronization between peers provided ("Note that the CoAP peer will not know about the MAX_PAYLOADS change until it is reconfigured".)

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I made some editorial suggestions in a github pull request at https://github.com/core-wg/new-block/pull/21 .
It seems that there are now some merge conflicts; I cannot promise to have availability to try to resolve them particularly quickly, but I can do so "eventually" if needed.

Section 1

   There is a requirement for these blocks of data to be transmitted at higher rates under network conditions where there may be asymmetrical transient packet loss (i.e., responses may get dropped). An example is when a network is subject to a Distributed Denial of Service (DDoS) attack and there is a need for DDoS mitigation agents relying upon CoAP to communicate with each other (e.g., [RFC8782][I-D.ietf-dots-telemetry]). As a reminder, [RFC7959]

I suppose the RFC Editor will do the right thing about referencing 8782 vs 8782bis ... and I am overdue in following up on the IETF LC results for the latter :(

Section 2

We currently introduce the concept of MAX_PAYLOADS by implicit use in a few places before it is actually given a proper definition. I wonder if mentioning here that it is used to group a batch of blocks would help the reader.

Section 3

   o  They support sending an entire body using Non-confirmable (NON) messages without requiring a response from the peer.

I put this change in my github PR, but repeating here for visibility since I am making an assumption: I propose adding "intermediate" for "without requiring an intermediate response from the peer". My understanding is that a final message indicating successful reception is still used (or a selective ack in case of loss), so the contrast to RFC 7959 is in the (lack of) need for intermediate responses for each block.

   o  Mixing of NON and CON during requests/responses using Q-Block is not supported.

There are perhaps subtle differences across "not supported", "forbidden", and "not defined in this specification". Do we perhaps actually mean "forbidden"?

Section 3.2

   (DOTS) that cannot use CON responses to handle potential packet loss and that support application-specific mechanisms to assess whether the remote peer is able to handle the messages sent by a CoAP endpoint (e.g., DOTS heartbeats in Section 4.7 of [RFC8782]).

Can we get greater clarity on what "able to handle" is intended to mean? I can't tell if it's anywhere between "the transport is able to deliver message bodies" and "the software stack implements and enables a particular feature".

Section 4.1

   When the Content-Format Option is present together with the Q-Block1 or Q-Block2 Option, the option applies to the body not to the payload (i.e., it must be the same for all payloads of the same body).

Do we have a normative requirement somewhere that the recipient track and compare the content-format values across blocks? If not, should we?

   Q-Block2 Option is useful with GET, POST, PUT, FETCH, PATCH, and iPATCH requests and their payload-bearing responses (2.01, 2.02, 2.03, 2.04, and 2.05) (Section 5.5 of [RFC7252]).

Do we need an "e.g." in front of the list, to account for the potential future registration of new payload-bearing response codes?

   If Q-Block1 Option is present in a request or Q-Block2 Option in a response (i.e., in that message to the payload of which it pertains),

Can we reword this parenthetical in a less convoluted way? I'm not even sure I'm parsing it properly.

   [RFC7252]). To reliably get a rejection message, it is therefore REQUIRED that clients use a Confirmable message for determining support for Q-Block1 and Q-Block2 Options.

(I know that some other discussion happened on this mechanism, but I forget if there are already plans to add a clarification that this is only needed once per peer within a given set of exchanges.)

   The Q-Block2 Option is repeatable when requesting retransmission of missing blocks, but not otherwise.
   Except that case, any request carrying multiple Q-Block1 (or Q-Block2) Options MUST be handled following the procedure specified in Section 5.4.5 of [RFC7252].

Since these are critical options, the referenced procedures involve rejecting the message, right? Is that important enough to note directly?

   Note that if Q-Block1 or Q-Block2 Options are included in a packet as Inner options, Block1 or Block2 Options MUST NOT be included as Inner options. Similarly there MUST NOT be a mix of Q-Block and Block for the Outer options. [...]

(Hopefully a silly question, but do we make the analogous prohibition against combining Q-Block and regular Block for non-OSCORE cases anywhere? I thought we did, but now I can't find it...)

Section 4.3

   being transferred. The Request-Tag is opaque, the server still treats it as opaque but the client MUST ensure that it is unique for every different body of transmitted data.

(nit) the structure of this sentence seems off, to me. I may just want a comma after "server still treats it as opaque", but looking more closely I might rewrite to more like "The Request-Tag is opaque to the server, but the client MUST ensure that it is unique for every different request body being transmitted".

   Implementation Note: It is suggested that the client treats the Request-Tag as an unsigned integer of 8 bytes in length. An implementation may want to consider limiting this to 4 bytes to reduce packet overhead size. The initial Request-Tag value should be randomly generated and then subsequently incremented by the client whenever a new body of data is being transmitted between peers.

In the vein of draft-gont-numeric-ids-sec-considerations, is the increment necessarily 1 or can there be gaps? Similarly, the risk of information disclosure (via side channel) is reduced if the initial random value is generated anew for each connection. This is maybe implied by the current text but could be stated more clearly.
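
To make the two Request-Tag questions above concrete, here is a minimal sketch of one way a client might follow the quoted implementation note. Everything in it is an assumption made for illustration, not text from the draft: the class name `RequestTagSource`, the choice of one instance per connection (so the random starting value is generated anew each time), the increment of exactly 1, and the wrap-around at the chosen width.

```python
import secrets


class RequestTagSource:
    """Hypothetical per-connection Request-Tag source (not defined by the draft)."""

    def __init__(self, size_bytes: int = 8) -> None:
        # 8 bytes is the suggested width; 4 bytes trades range for less overhead.
        self._size = size_bytes
        self._mask = (1 << (8 * size_bytes)) - 1
        # Assumption: a fresh random starting value for every connection.
        self._next = secrets.randbits(8 * size_bytes)

    def new_body_tag(self) -> bytes:
        """Return the tag for a new request body, then step the counter."""
        tag = self._next.to_bytes(self._size, "big")
        # Assumption: increment by exactly 1, wrapping at the chosen width.
        self._next = (self._next + 1) & self._mask
        return tag


source = RequestTagSource(size_bytes=4)
first = source.new_body_tag()
second = source.new_body_tag()
print(first.hex(), second.hex())
```

With a fixed increment of 1, an observer who sees two tags can tell how many bodies were sent in between, which is the kind of side channel the numeric-ids guidance is concerned with; allowing gaps would change only the increment line.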

   The client MUST send the payloads with the block numbers increasing, starting from zero, until the body is complete (subject to any congestion control (Section 7)). Any missing payloads requested by the server must in addition be separately transmitted with increasing block numbers.

When I first read this, I thought that the block numbers of retransmissions needed to continue to increase in the same sequence as the original transmission, i.e., retransmitted blocks are assigned new block numbers. The examples do not bear this out (and it seems like it would be complicated to specify clearly), so I suggest rephrasing to "in order of increasing block number".

   If the FETCH request includes the Observe Option, then the server MUST use the same token as used for the initial response for returning any Observe triggered responses so that the client can match them up. The client should then release all of the tokens used for this body unless a resource is being observed.

If a resource is being observed, should the client release all the other tokens (than the one used for the initial response)? Also, is the "initial response" the first response for the blockwise transfer (which might be a 2.31 or 4.08 for NON requests), or the first one with response code 2.05?

   2.31 (Continue) This Response Code can be used to indicate that all of the blocks up to and including the Q-Block1 Option block NUM (all having the M bit set) have been successfully received. The token used MUST be one of the tokens that were received in a request for this block-wise exchange. However, it is desirable to provide the one used in the last received request.

Can the client release any tokens upon receipt of such a response?

   4.02 (Bad Option) This Response Code MUST be returned for a Confirmable request if the server does not support the Q-Block Options. Note that a reset message must be sent in case of Non-confirmable request.

Reset only needs to be sent if the server is not ignoring the request entirely, though, right?

%%% The following few comments are interrelated:

   This Response Code returned with Content-Type "application/missing-blocks+cbor-seq" indicates that some of the payloads are missing and need to be resent. The client then retransmits the missing payloads using the same Request-Tag, Size1 and Q-Block1 to specify the block NUM, SZX, and M bit as appropriate.

The new 'M' bit is "as appropriate" for the new flight of messages, or as was sent initially? (The examples in §10.x suggest "as was sent initially".)

   The Request-Tag value to use is determined by taking the token in the 4.08 (Request Entity Incomplete) response, locating the matching client request, and then using its Request-Tag.

The "value to use" here seems to be indicating the value to use in the retransmitted request...

   The token used MUST be one of the tokens that were received in a request for this block-wise exchange. However, it is desirable to provide the one used in the last received request. See Section 5 for further information.

... but here the "token used" seems to be indicating the token to be used in constructing the response that has response code 4.08. If my understanding is correct, we really should have more clarity on which value is "used" for which message.

Additionally, in the last quoted paragraph we refer to Section 5 for further information, which includes a SHOULD-level requirement to "provide the [token] used in the last received request". It is very surprising to have the normative requirements for behavior split across sections in this manner. (Or was the intent that Section 5 also use the "desirable" wording?)

%%%

Section 4.4

   The ETag is opaque, the client still treats it as opaque but the server MUST ensure that it is unique for every different body of transmitted data.

[analogous comment as for Request-Tag]

   Implementation Note: It is suggested that the server treats the ETag as an unsigned integer of 8 bytes in length. An implementation may want to consider limiting this to 4 bytes to reduce packet overhead size. The initial ETag value should be randomly generated and then subsequently incremented by the server whenever a new body of data is being transmitted between peers.

[analogous comment as for Request-Tag]

   The client SHOULD wait for up to NON_RECEIVE_TIMEOUT (Section 7.2) after the last received payload for NON payloads before issuing a GET, POST, PUT, FETCH, PATCH, or iPATCH request that contains one or more Q-Block2 Options that define the missing blocks with the M bit unset. The client MAY set the M bit to request this and later blocks from this MAX_PAYLOADS set. Further considerations related to the transmission timing for missing requests are discussed in Section 7.2.

Does the MAY grant permission to send with M bit set prior to NON_RECEIVE_TIMEOUT, or just permission to send with M bit set in addition to with M bit unset (but still after the timeout)?

   For Confirmable responses, the client continues to acknowledge each packet. Typically, the server acknowledges the initial request using an ACK with the payload, and then sends the subsequent payloads as CON responses. The server will detect failure to send a packet, but the client can issue, after a MAX_TRANSMIT_SPAN delay, a separate GET, POST, PUT, FETCH, PATCH, or iPATCH for any missing blocks as needed.

Starting out with "for confirmable responses" implies that we're going to separately cover non-confirmable responses later, or at some point transition to statements of general applicability (to both confirmable and non-confirmable responses). Where does that happen?

   A client SHOULD maintain a partial body (missing payloads) for up to NON_PARTIAL_TIMEOUT (Section 7.2) or as defined by the Max-Age Option (or its default of 60 seconds (Section 5.6.1 of [RFC7252])), whichever is the less. On release of the partial body, the client should then release all of the tokens used for this body unless a resource is being observed.

[as above, can the client release any subset of tokens in the case of observe?]

   It is RECOMMENDED that the server maintains a cached copy of the body when using the Q-Block2 Option to facilitate retransmission of any missing payloads.

It's surprising to write that the client SHOULD but it is RECOMMENDED that the server cache, when those two requirements keywords have an equivalent strength per BCP 14. Can't we use consistent terminology for the same requirement level?

   If the server detects part way through a body transfer that the resource data has changed and the server is not maintaining a cached copy of the old data, then the transmission is terminated. Any subsequent missing block requests MUST be responded to using the latest ETag and Size2 Option values with the updated data.

This sounds like the server starts responding "in the middle" of the new representation, so the client would need to go back and re-request the initial parts, possibly across multiple groups of MAX_PAYLOADS blocks. It seems like this requirement for client behavior should be more clearly documented somewhere. We do go on to talk about the client removing the stale partial body, but not about completing the new body.

Section 4.5

   For a response that uses Q-Block2, the Observe value MUST be the same for all the payloads of the same body. This is different from Block2 usage where the Observe value is only present in the first block (Section 3.4 of [RFC7959]). This includes payloads transmitted following receipt of the 'Continue' Q-Block2 Option (Section 4.4) by the server.
   If a missing payload is requested by a client, then both the request and response MUST NOT include the Observe Option.

(side note?) It seems very surprising to omit Observe from only retransmitted payloads but keep it in all initial payload transmissions.

Section 4.6

   The Size1 or Size2 option values MUST exactly represent the size of the data on the body so that any missing data can easily be determined.

Is this MUST duplicating the behavior already specified by RFC 7959?

Section 5

   The data payload of the 4.08 (Request Entity Incomplete) response is encoded as a CBOR Sequence [RFC8742]. It comprises of one or more

I think we want some qualifying text that reaffirms that the behavior being described is applicable only to the application/missing-blocks+cbor-seq content-type case, possibly by having the previous discussion state that "this section defines the behavior and semantics for 4.08 responses using the new content-type."

   The Concise Data Definition Language [RFC8610] (and see Section 4.1 [RFC8742]) for the data describing these missing blocks is as follows:

(Should we mention that this is only informational and that the prose description is normative, in line with RFC 8610 being only an informative reference?)

   ; A notional array, the elements of which are to be used
   ; in a CBOR Sequence:

(nit) Is there a reason to use a different wording than the referenced example from RFC 8742?

Section 6

   Implementation Note: By using 8-byte tokens, it is possible to easily minimize the number of tokens that have to be tracked by clients, by keeping the bottom 32 bits the same for the same body and the upper 32 bits containing the current body's request number (incrementing every request, including every re-transmit). This allows the client to be alleviated from keeping all the per-request-state, e.g., in Section 3 of [RFC8974].

If we're going to introduce structure into a nominally opaque identifier, we need to discuss the consequences of that in the security considerations. draft-gont-numeric-ids-sec-considerations has some guidance in this regard.

Section 7.1

   Congestion control for CON requests and responses is specified in Section 4.7 of [RFC7252]. For faster transmission rates, NSTART will need to be increased from 1. However, the other CON congestion control parameters will need to be tuned to cover this change. [...]

I thought there had been some discussion in a different AD's ballot thread on this text, but I can't find it now. I'm happy to defer to the previous discussion if I'm not just imagining it. Anyways, I might suggest phrasing this as "if faster transmission rates are needed, NSTART will need to be increased from 1".

   It is implementation specific as to whether there should be any further requests for missing data as there will have been significant transmission failure as individual payloads will have failed after MAX_TRANSMIT_SPAN.

(editorial) I don't think I can successfully parse this sentence. There may be a few missing words, and splitting into multiple sentences would likely help as well.

Section 7.2

   NON_RECEIVE_TIMEOUT is the initial maximum time to wait for a missing payload before requesting retransmission for the first time. Every time the missing payload is re-requested, the time to wait value doubles. The time to wait is calculated as:

Thank you for being very clear about the exponential backoff procedure :)

   payloads to prevent the client unnecessarily delaying. If not all of the MAX_PAYLOADS payloads were received, the server SHOULD delay for NON_RECEIVE_TIMEOUT (exponentially scaled based on the repeat request count for a payload) before sending the 4.08 (Request Entity Incomplete) Response Code for the missing payload(s).
   If this is a repeat for the 2.31 (Continue) response, the server SHOULD send a 4.08 (Request Entity Incomplete) response detailing the missing payloads after the block number that would have been indicated in the 2.31 (Continue). [...]

I don't understand what "if this is a repeat for the 2.31 (Continue) response" is intended to mean.

   The client does not need to acknowledge the receipt of the entire body.

Does that mean that the last group of response blocks will always be retransmitted NON_MAX_RETRANSMIT times?

Section 10

   QB1: Q-Block1 Option values NUM/More/SZX
   QB2: Q-Block2 Option values NUM/More/SZX

What's depicted in the figure seems to be the actual block size, and not the three-bit SZX value.

Section 10.1.3

Should we indicate somehow in Figure 6 that the 4.08 responses use the new content-format? Also, is there any value in indicating that there might be a race between the client continuing to send the next set of payloads and the initial 4.08 response?

Section 10.2.3

I don't understand why the NON_RECEIVE_TIMEOUT (client) triggers -- shouldn't the delivery of the 11th block indicate that the server thinks it sent a full MAX_PAYLOADS group and thus a selective ACK, after perhaps just a modest reordering delay?

Section 10.3.2

        [[MAX_PAYLOADS has been reached]]
     |     [[MAX_PAYLOADS blocks acknowledged by client using
     |       'Continue' Q-Block2]]
     +--------->| NON FETCH /path M:0x3b T:0xab QB2:10/1/1024
     |<---------+ NON 2.05 M:0x8b T:0xaa O:1334 ET=21 QB2:10/0/1024

Shouldn't the server switch to using T:0xab now?

     +--------->| NON FETCH /path M:0x3c T:0xac QB2:10/1/1024
     |<---------+ NON 2.05 M:0x96 T:0xaa O:1335 ET=22 QB2:10/0/1024

and 0xac here?

Section 10.3.3

     |<---------+ NON 2.05 M:0xa6 T:0xc6 ET=23 QB2:3/0/1024
     |   ...    |
     |     [[NON_RECEIVE_TIMEOUT (client) delay expires]]

Why does the client time out here (at least with the full NON_RECEIVE_TIMEOUT); the final-message indication seems like it would allow for an ~immediate response (delayed only for some reordering threshold)?
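
As an illustration of the DISCUSS point about MAX_PAYLOADS, the following minimal sketch lists the block numbers on which two endpoints would disagree if their values differ. It relies only on the statement that Q-Block2 processing depends on whether NUM modulo MAX_PAYLOADS is zero. The function names, the labels for the two interpretations, and the values 10 and 8 are hypothetical and chosen for the example.

```python
from __future__ import annotations


def interpretation(num: int, max_payloads: int) -> str:
    """Pick one of the two Q-Block2 interpretations for block number `num`.

    The labels are placeholders: the only thing modelled here is that the
    choice depends on whether NUM modulo MAX_PAYLOADS is zero.
    """
    return "zero" if num % max_payloads == 0 else "non-zero"


def disagreements(client_max: int, server_max: int, blocks: int) -> list[int]:
    """Block numbers below `blocks` that the two endpoints read differently."""
    return [
        num
        for num in range(blocks)
        if interpretation(num, client_max) != interpretation(num, server_max)
    ]


# Hypothetical scenario: the client still uses 10, while the server has
# lowered its own value to 8 without the client knowing.
print(disagreements(client_max=10, server_max=8, blocks=41))
# -> [8, 10, 16, 20, 24, 30, 32]
```

The endpoints agree only where the block number is a multiple of both values (0 and 40 here) or of neither; every other set boundary is seen by one side and not the other.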