[core] Adam Roach's No Objection on draft-ietf-core-coap-tcp-tls-09: (with COMMENT)
Adam Roach <adam@nostrum.com> Mon, 22 May 2017 20:48 UTC
From: Adam Roach <adam@nostrum.com>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-core-coap-tcp-tls@ietf.org, Jaime Jimenez <jaime.jimenez@ericsson.com>, core-chairs@ietf.org, jaime.jimenez@ericsson.com, core@ietf.org
Message-ID: <149548610222.24921.6807834750685175839.idtracker@ietfa.amsl.com>
Date: Mon, 22 May 2017 13:48:22 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/TqnNNGxma3yWM_ZpbrkPsXg6C2Y>
Adam Roach has entered the following ballot position for
draft-ietf-core-coap-tcp-tls-09: No Objection

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-core-coap-tcp-tls/

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I have removed my DISCUSS, but want to be clear that I remain quite distressed about the design aspects of this document. I have adjusted my comments below to trim them down to only those issues that remain in -09.

Beyond the comments left over from -08, I am perplexed that the document provides no concrete mechanism for UDP/TCP failover, no discussion of the management aspects of configuring between them, and no discussion of which transport protocol(s) may be considered MTI.

I also wish to highlight a somewhat buried request from my original comments: I believe this document would be vastly improved by splitting it into one document that deals with TCP, and another that deals with WebSockets. They are intended for radically different environments, and a large majority of implementors will care about one but not the other. Combining them into a single document just creates more work for those implementors.
General: this is a very bespoke approach to what could have been mostly solved with a single four-byte “length” header; it is complicated on the wire, and in implementation; and the format variations among CoAP over UDP, TLS, and WebSockets are going to make gateways much harder to implement and less efficient (as they will necessarily have to disassemble messages and rebuild them to change between formats). The protocol itself mentions gateways in several places, but does not discuss how they are expected to map among the various flavors of CoAP defined in this document.

Some of the changes seem unnecessary, but it could be that I’m missing the motivation for them. Ideally, the introduction would work harder at explaining why CoAP over these transports is as different from CoAP over UDP as it is, focusing in particular on why the complexity of having three syntactically incompatible headers is justified by the benefits provided by such variations.

Additionally, it’s not clear from the introduction what the motivation for using the mechanisms in this document is as compared to the techniques described in section 10 (and its subsections) of RFC 7252. With the exception of subscribing to resource state (which could be added), it seems that such an approach is significantly easier to implement and more clearly defined than what is in this document; and it appears to provide the combined benefits of all four transports discussed in this document.

My concern here is that an explosion of transport options makes it less likely that a client and server can find two in common: the limit of the probability of two implementations having a transport in common as the number of transports approaches infinity is zero. Due to this likely decrease in interoperability, I’d expect to see some pretty powerful motivation in here for defining a third, fourth, fifth, and sixth way to carry CoAP when only TCP is available (I count RFC 7252 http and https as the first and second ways in this accounting).

Specific comments follow.

Section 3.3, paragraph 3 says that an initiator may send messages prior to receiving the remote side’s CSM, even though the message may be larger than would be allowed by that CSM. What should the recipient of an oversized message do in this case? In fact, I don’t see in here what a recipient of a message larger than it allowed for in its CSM is supposed to do in response at *any* stage of the connection. Is it an error? If so, how do you indicate it? Or is the Max-Message-Size option just a suggestion for the other side? This definitely needs clarification.

(Aside: it seems odd and somewhat backwards that TCP connections are provided an affordance for fine-grained control over message sizes, while UDP communications are not.)

Section 5 and its subsections define a new set of message types, presumably for use only on connection-oriented protocols, although this is only implied, and never stated. For example, some implementors may see CSM, Ping, and Pong as potentially useful in UDP; and, finding no prohibition in this document against using them, decide to give it a go. Is that intended? If not, I strongly suggest an explicit prohibition against using these in UDP contexts.

Section 5.3.2 says that implementations supporting block-wise transfers SHOULD indicate the Block-wise Transfer Option. I can't figure out why this is anything other than a "MUST". It seems odd that this document would define a way to communicate this, and then choose to leave the communicated options as “YES” and “YOUR GUESS IS AS GOOD AS MINE” rather than the simpler and more useful “YES” and “NO”.
I find the described operation of the Custody Option in the operation of Ping and Pong to be somewhat problematic: it allows the Pong sender to unilaterally decide to set the Custody Option, and consequently quarantine the Pong for an arbitrary amount of time while it processes other operations. This seems impossible to distinguish from a failure-due-to-timeout from the perspective of the Ping sender. Why not limit this behavior only to Ping messages that include the Custody Option?

I am similarly perplexed by the hard-coded “must do ALPN *unless* the designated port takes the magical value 5684” behavior. I don’t think I’ve ever seen a protocol that has such variation based on a hard-coded port number, and it seems unlikely to be deployed correctly (I’m imagining the frustration of: “I changed both the server and the client configuration from the default port of 5684 to 49152, and it just stopped working. Like, literally the *only* way it works is on port 5684. I've checked firewall settings everywhere and don't see any special handling for that port -- I just can't figure this out, and it's driving me crazy.”). Given the nearly universal availability of ALPN in pretty much all modern TLS libraries, it seems much cleaner to just require ALPN support and call it done. Or *don’t* require ALPN at all and call it done. But *changing* protocol behavior based on magic port numbers seems like it’s going to cause a lot of operational heartburn.

[I have removed my comments about section 8.1, as I believe EKR is managing the TLS-related issues for this document]

Although the document clearly expects the use of gateways and proxies between these connection-oriented usages of CoAP and UDP-based CoAP, Appendix A seems to omit discussion or consideration of how this gatewaying can be performed. The following list of problems is illustrative of this larger issue, but likely not exhaustive. (I'll note that all of these issues evaporate if you move to a simpler scheme that merely frames otherwise unmodified UDP CoAP messages.)

Section A.1 does not indicate what gateways are supposed to do with out-of-order notifications. The TCP side requires these to be delivered in-order; so, does this mean that gateways observing a gap in sequence numbers need to quarantine the newly received message so that they can deliver the missing one first? Or do they deliver the newly-received message and then discard the “stale” one when it arrives? I don’t think that leaving this up to implementations is particularly advisable.

Section A.3 is a bit more worrisome. I understand the desired optimization here, but where you reduce traffic in one direction, you run the risk of exploding it in the other. For example, consider a coap+tcp client connecting to a gateway that communicates with a CoAP-over-UDP server. When that client wants to check the health of its observations, it can send a Ping and receive a Pong that confirms that they are all alive and well. In order to be able to send a Pong that *means* “all your observations are alive and well,” the gateway has to verify that all the observations are alive and well. A simple implementation of a gateway will likely check on each observed resource individually when it gets a Ping, and then send a Pong after it hears back about all of them. So, as a client, I can set up, let’s say, two dozen observations through this gateway. Then, with each Ping I send, the gateway sends two dozen checks towards the server. This kind of message amplification attack is an awesome way to DoS both the gateway and the server. I believe the document needs a treatment of how UDP/TCP gateways handle notification health checks, along with techniques for mitigating this specific attack.

Section A.4 talks about the rather different ways of dealing with unsubscribing from a resource. Presumably, gateways that get a reset to a notification are expected to synthesize a new GET to deregister on behalf of the client? Or is it okay if they just pass along the reset, and expect the server to know that it means the same thing as a deregistration? Without explicit guidance here, I expect server and gateway implementors to make different choices and end up with a lack of interop.

From i-d nits (this appears to be in reference to Figure 1):

  ** There is 1 instance of too long lines in the document, the longest one
     being 3 characters in excess of 72.
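
For illustration only, here is a rough Python sketch of the kind of "single four-byte length header" framing mentioned in the General comment: each otherwise unmodified CoAP-over-UDP message is prefixed with its length before being written to a byte stream. The function names, the big-endian byte order, and the buffer handling are assumptions made for this sketch; none of them come from the draft or from RFC 7252.

```python
import struct

# Assumed layout: a 4-byte unsigned big-endian length, then the message bytes.
LENGTH_PREFIX = struct.Struct("!I")


def frame(message: bytes) -> bytes:
    """Prefix an unmodified datagram-style message with its length."""
    return LENGTH_PREFIX.pack(len(message)) + message


def deframe(buffer: bytes):
    """Split a stream buffer into complete messages plus leftover bytes."""
    messages = []
    offset = 0
    while len(buffer) - offset >= LENGTH_PREFIX.size:
        (length,) = LENGTH_PREFIX.unpack_from(buffer, offset)
        start = offset + LENGTH_PREFIX.size
        if len(buffer) - start < length:
            break  # message incomplete; wait for more bytes from the stream
        messages.append(buffer[start:start + length])
        offset = start + length
    return messages, buffer[offset:]
```

Under these assumptions, a gateway would only add or strip the length prefix when moving between UDP and a stream transport, and would pass the message bytes through unchanged.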