Re: [Lwip] draft-gomez-lwig-tcp-constrained-node-networks-03

"Carles Gomez Montenegro" <carlesgo@entel.upc.edu> Thu, 19 October 2017 15:47 UTC

Date: Thu, 19 Oct 2017 17:47:03 +0200
From: Carles Gomez Montenegro <carlesgo@entel.upc.edu>
To: Hannes Tschofenig <hannes.tschofenig@gmx.net>
Cc: jon.crowcroft@cl.cam.ac.uk, michael.scharf@nokia.com, "lwip@ietf.org" <lwip@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/lwip/y1M2QL-Jzn9EGdZU-r8_LOFFiLo>

Hi Hannes,

First of all, sorry for the delay, and thanks a lot again for your
comprehensive review.

As you may have seen, we have published a revision of the draft (-01)
which, among other things, aims to address your comments.

Please find below a set of inline responses:

> Hi Michael, Jon, Carles
>
> it is great that you have worked on this topic and, as stated during the
> Prague IETF meeting, I believe this document should become a WG item of
> the LWIG working group.
>
> I nevertheless have a couple of comments & questions.
>
> - Who do you think is the main audience for this draft? Is it primarily
> written for implementers of embedded TCP stacks or is it rather written
> for embedded developers who have to decide what stack and what features
> of TCP to use?
>
> If you ask me, I prefer it to be the latter. The reason is that there
> are very few implementers who write their own embedded TCP stack. There
> is, however, also an implication if you are aiming for the latter group,
> namely you cannot expect them to know all the details of TCP well. So,
> you need to present them with enough background so that they can make
> informed trade-off decisions.

[Carles] I think the draft should be useful for both types of audience.
I agree the second group is probably larger. We have added a bit more
background in the document, but I’m rather inclined to avoid adding a
significant amount of background. IMHO, something in the style of RFC 3481
(which is quite concise) should work. Anyway, we have also added the
following at the end of Section 1:

   “This document assumes that the reader is familiar with TCP.  A
   comprehensive survey of the TCP standards can be found in [RFC7414].
   Similar guidance regarding the use of TCP in special environments has
   been published before, e.g., for cellular wireless networks
   [RFC3481].”

If needed, more background may be added in further revisions of the draft.

> - The title of the document was probably the reason why I only noticed
> it recently. For some reason the term "Constrained Node Networks" does
> not stick well with me. I would have expected to see something like
> "Guidance for TCP Usage for Internet of Things".

[Carles] I think the most appropriate and better defined term is the one
in RFC 7228 (i.e. “Constrained-Node Networks”). However, we do agree that
“Internet of Things” is more widely recognizable and would increase
visibility of the document, and therefore we have modified the document
title as per your suggestion.

> I am also uncertain why
> you claim that the document defines a profile. It does not really define
> any profiles as far as I can tell.

[Carles] Agreed. This was written already in the -00 version, which
followed a quite different approach from the current version. We have
updated the Abstract and the Introduction accordingly in -01.

> - I would like to see some discussion about the communication patterns.
> For example, in the draft you talk about transactions (and I assume you
> mean request/response interactions). Are you focusing on those only or
> do you also consider cases of firmware updates into account? (In Section
> 4.8 you briefly mention firmware updates.) Have you looked at traffic
> patterns of some IoT applications?

[Carles] We have added a specific subsection (subsection 3.3) on traffic
patterns, which comprises unidirectional transfers, request-response
patterns and bulk data transfers.

> - Section 7 with the information about the protocol stacks is great. I
> hope you will complete the table some time in the near future and
> provide additional information about RAM requirements as well (in
> addition to the codesize).

[Carles] Thanks for the comment. We have now added a few more details (in
the TinyOS column). We will do our best to complete the table and the
details on RAM requirements. We would also like to ask the WG to
provide input, especially regarding information about RAM
requirements and code size.

> More detailed comments:
>
> You write:
>    "In order to meet the requirements that
>    stem from CNNs, the IETF has produced a suite of protocols
>    specifically designed for such environments
>    [I-D.ietf-lwig-energy-efficient]."
>
> The IETF approach on IoT in general has been to re-use as much as
> possible rather than to develop a whole new universe just for IoT. There
> are, however, a few new protocol developments but those are not really
> described well in [I-D.ietf-lwig-energy-efficient] since
> [I-D.ietf-lwig-energy-efficient] talks specifically about energy
> efficiency.

[Carles] Well, 6LoWPAN, RPL and CoAP may be considered as a set of new
protocols... We have added “new” as shown below:

   "In order to meet the requirements that
   stem from CNNs, the IETF has produced new protocols
                                         ^^^
   specifically designed for such environments, and has also reused
   existing ones."

[Carles] The reference focuses on energy efficiency, but does show a
figure (Figure 1 in that document) which illustrates both the traditional
TCP/IP protocol stack and the IoT protocol stack. I am anyway open to
suggestions for other references...

> [I-D.tschofenig-core-coap-tcp-tls] has been replaced by
> [I-D.ietf-core-coap-tcp-tls].

[Carles] Updated in -01. Thanks!

> You write:
>
>    "On the other hand, other application layer protocols not specifically
>    designed for CNNs are also being considered for the IoT space.  Some
>    examples include HTTP/2 and even HTTP/1.1, both of which run over TCP
>    by default [RFC7540][RFC2616], and the Extensible Messaging and
>    Presence Protocol (XMPP) [RFC 6120].  TCP is also used by non-IETF
>    application-layer protocols in the IoT space such as MQTT and its
>    lightweight variants [MQTTS]."
>
> I don't think the reference to [MQTTS] is appropriate. The other variant
> of MQTT, which exists as a standardized protocol is MQTT-SN and it uses
> UDP (if I recall correctly). As such, it does not fit into the argument
> you are making about TCP usage.

[Carles] Agreed. We now just cite the TCP-based one.

> XMPP is also not a good example since it is mostly used on gateways
> rather than low end IoT devices. It is just a very verbose protocol.

[Carles] XMPP was added after a comment/request by Carsten in IETF 96.
Even if it is not very common in low end IoT devices, it may still be good
to keep it in the text for completeness.

> Section 2 about "Characteristics of CNNs relevant for TCP" somehow feels
> a bit misplaced. I am wondering whether there is any loss in value of
> the document if you delete this entire section. RFC 7228, which you
> reference already in the abstract, talks about the constraints of IoT
> devices and there is probably no need to repeat them again (and if you
> think so then maybe it fits better into the introduction).

[Carles] While I think this section may not be strictly required, I still
think it highlights which characteristics of CNNs are relevant for, or may
have an impact on the performance of, TCP. This may help readers better
understand the guidance in section 4. In -01, we have slightly modified
the approach of this section (now section 3).

> Section 3 talks about the scenario and speaks about a model where
> constrained devices connect to unconstrained servers (cloud). What about
> cases where the TCP server itself is running on an IoT device? It
> appears that you consider such a scenario out of scope. Also the text in
> Section 4.1 gives me that impression.

[Carles] We now mention those cases in 3.2 and 3.3. We may need to further
analyze how the guidance in the document changes in the constrained device
to constrained device scenario. It would possibly be good to know how
common this scenario is, though (WG feedback on this is most welcome!).

> Section 4 is where the meat of the document is. I personally would have
> structured the document a bit differently. It seems to me that there is
> the impression in the engineering community that a TCP stack is complex
> (and therefore codesize-wise large) and requires a lot of RAM. I would
> have probably started by informing the reader of where the complexity
> comes from and what "tuning" can be done to make it more lightweight.

[Carles] Tuning an implementation to make it more lightweight is probably
mostly related to the window size (4.3) and options (4.7). We may
provide some introductory text before 4.1 in order to explain in advance
where this “tunability for lightweightness” lies. However, we (authors) are
still trying to determine the best way to structure the content
in Section 4.

> 4.2. Maximum Segment Size (MSS)
>
> Am I reading the recommendations correctly? You have three cases: If the
> underlying layer supports a frame size of ...
>
>  1) < 1280 bytes THEN use an adaptation layer (like 6lowpan to make it
> look like case #2)
>  2) 1280 bytes THEN you are OK.
>  3) > 1280 bytes THEN limit the MTU to 1280 bytes and you SHOULD use the
> Path MTU mechanism.

[Carles] Agreed for 1) and 2). With regard to 3), if the underlying layer
supports a frame size > 1280 bytes, then the recommendation is to set the
MTU to 1280 bytes, in order to avoid having to support Path MTU Discovery.
This has been added at the end of the paragraph:

OLD
   “For the
   sake of lightweight implementation and operation, unless applications
   require handling large data units (i.e. leading to an IPv6 datagram
   size greater than 1280 bytes), it may be desirable to limit the MTU
   to 1280 bytes.”

NEW
   “For the
   sake of lightweight implementation and operation, unless applications
   require handling large data units (i.e. leading to an IPv6 datagram
   size greater than 1280 bytes), it may be desirable to limit the MTU to
   1280 bytes in order to avoid the need to support Path MTU Discovery.”
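
As a rough illustration of the three frame-size cases discussed above, the
following Python sketch picks the MTU and derives the resulting MSS. The
function names are hypothetical (they are not taken from the draft), and
the MSS calculation assumes a 40-byte IPv6 header without extension headers
and a 20-byte TCP header without options.

```python
IPV6_MIN_MTU = 1280      # minimum link MTU that IPv6 requires, in bytes
IPV6_HEADER_LEN = 40     # fixed IPv6 header, no extension headers (assumption)
TCP_HEADER_LEN = 20      # TCP header without options (assumption)


def choose_mtu(link_frame_size):
    """Return (mtu, needs_adaptation_layer) for a link's maximum frame size."""
    if link_frame_size < IPV6_MIN_MTU:
        # Case 1: an adaptation layer (e.g. 6LoWPAN) has to present a
        # 1280-byte MTU to IPv6 on top of the smaller link frames.
        return IPV6_MIN_MTU, True
    # Case 2 (exactly 1280) and case 3 (larger): use 1280 bytes, so that
    # Path MTU Discovery does not have to be supported.
    return IPV6_MIN_MTU, False


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one IPv6 datagram of size `mtu`."""
    return mtu - IPV6_HEADER_LEN - TCP_HEADER_LEN


if __name__ == "__main__":
    for frame_size in (127, 1280, 1500):
        mtu, adaptation = choose_mtu(frame_size)
        print(frame_size, mtu, adaptation, mss_for_mtu(mtu))
```

With these assumptions, a 1280-byte MTU yields a 1220-byte MSS.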

> 4.3 Window Size
>
> You write:
> "
>    A TCP stack can reduce the implementation complexity by advertising a
>    TCP window size of one MSS, and also transmit at most one MSS of
>    unacknowledged data, at the cost of decreased performance.  This size
>    for receive and send window is appropriate for simple message
>    exchanges in the CNN space, reduces implementation complexity and
>    memory requirements, and reduces overhead (see section 4.7).
> "
>
> I don't think it is a matter of implementation complexity on how large
> the window size should be but rather a question of how much RAM you
> have. I think that this section could better describe the performance
> tradeoffs.

[Carles] The reduction in complexity from using a single-MSS window is not
only about the implementation complexity of window management. It also
comes from the fact that the implementation does not need to support
several options (see 4.7). Nevertheless, we have updated the text of this
section.

> You write:
> "
>    A TCP window size of one MSS follows the same rationale as the
>    default setting for NSTART in [RFC7252], leading to equivalent
>    operation when CoAP is used over TCP.
> "
>
> Could you explain the relationship between MSS and the NSTART concept in
> CoAP in more details? I only see an indirect relationship (via the
> congestion control mechanism) but not a direct one. I am also uncertain
> what you mean by the reference to CoAP over TCP.

[Carles] We write in -01:

   “If CoAP is used over TCP with the default setting for NSTART in
   [RFC7252], a CoAP endpoint is not allowed to send a new message to a
   destination until a response for the previous message sent to that
   destination has been received.  This is equivalent to an application-
   layer window size of 1.  For this use of CoAP, a maximum TCP window
   of one MSS will be sufficient.”

> Expand and ideally explain all abbreviations, such as RTO

[Carles] Agreed and done in -01.

> You write:
>
> "  For devices that can afford greater TCP window size, it may be useful
>    to allow window sizes of at least five MSSs, in order to allow Fast
>    Retransmit and Fast Recovery [RFC5681].
> "
>
> Could you expand a bit on what you mean by "can afford"? If you have x
> amount of additional KB RAM then ....

[Carles] Not sure we can express it in “absolute” terms of KB, since the
window size is expressed in terms of MSSs. However, in -01 we expand a bit
on this topic and also give an example for a 1220-byte MSS. In this case,
at least 6100 bytes of RAM are needed.
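
The arithmetic behind this figure can be sketched as follows. This is only
an illustration: the function name is hypothetical, and it counts just the
buffer space for the window's payload in one direction, not per-connection
state or any other stack overhead.

```python
def window_buffer_bytes(mss, window_in_mss):
    """Bytes of payload buffer needed to hold a full window of segments."""
    return mss * window_in_mss


if __name__ == "__main__":
    # Five-MSS window with the 1220-byte MSS from the example above.
    print(window_buffer_bytes(1220, 5))
    # Single-MSS window for comparison.
    print(window_buffer_bytes(1220, 1))
```

For a 1220-byte MSS, a five-MSS window needs 5 x 1220 = 6100 bytes, whereas
a single-MSS window needs 1220 bytes.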

> 4.4 RTO estimation
>
> You write:
>
> "
>    A fundamental
>    trade-off exists between responsiveness and correctness of RTOs
>    [I-D.ietf-tcpm-rto-consider].
> "
>
> Maybe you can explain the reader what the tradeoff is rather than just
> pointing to the document. You make an attempt in the text following the
> statement but it is incomplete (at least according to my reading of the
> tcp-rto-consider document.) At least I would have expected that you
> provide the recommendation from [I-D.ietf-tcpm-rto-consider] regarding
> the RTO setting or mention the timer setting in the DTLS/TLS profiles
> for IoT.

[Carles] We have now provided the tradeoff.

> I also believe that the paragraph about the work on congestion control
> for CoAP isn't really appropriate in this document. I would delete it.
> I understand why Carles wants to have it in there though ;-)

[Carles] Oh :). Text on CoCoA has been reduced in -01, and it is currently
just intended to show that there is room for improvement with regard to
TCP RTO in IoT scenarios.

> 4. TCP connection lifetime
>
> In the discussions regarding using TCP keep-alive messages for CoAP over
> TCP we essentially got no response:
> https://www.ietf.org/mail-archive/web/maprg/current/msg00016.html
>
> I would expect a recommendation whether TCP keep-alives should or should
> not be used. With CoAP over TCP we have also defined a separate
> ping/pong mechanism.

[Carles] Keep-alives may only be useful in scenarios where middleboxes
such as firewalls (with early state deletion) are not present, where they
let constrained devices release the state of inactive connections. Where
such middleboxes do exist, keep-alives are not useful.

> 4.7.  TCP options
>
> You write:
>
> "
>    TCP implementation for a constrained device that uses a single-MSS
>    TCP receive or transmit window size may not benefit from supporting
>    the following TCP options: Window scale [RFC1323], TCP Timestamps
>    [RFC1323], Selective Acknowledgements (SACK) and SACK-Permitted
>    [RFC2018].  Other TCP options should not be used, in keeping with the
>    principle of lightweight operation.
>
>    Other TCP options should not be supported by a constrained device, in
>    keeping with the principle of lightweight implementation and
>    operation.
>    "
>
> The last sentence starting with "Other TCP options ..." appears twice.

[Carles] Corrected in -01. Thanks!

> I am not sure I understand the recommendation: Are you saying that "TCP
> implementation for a constrained device that uses a single-MSS TCP
> receive or transmit window size should not implement any TCP options?"

[Carles] Yes (well, adding that options 0, 1 and 2 MUST be supported),
although TFO may be used as well. Is there maybe any other option we
should take into account?
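
For reference, options 0, 1 and 2 are End of Option List, No-Operation and
Maximum Segment Size. A minimal sketch of an option parser that understands
only these three and skips everything else by its length field could look
like the code below. It is an illustration under stated assumptions, not
code from any existing stack; the function name and error handling are
hypothetical.

```python
OPT_EOL = 0   # End of Option List
OPT_NOP = 1   # No-Operation (padding)
OPT_MSS = 2   # Maximum Segment Size (length 4)


def parse_tcp_options(data):
    """Return the peer's MSS from a TCP options field, or None if absent.

    Options other than kinds 0, 1 and 2 are skipped using their length byte.
    """
    mss = None
    i = 0
    while i < len(data):
        kind = data[i]
        if kind == OPT_EOL:
            break                      # nothing after this is an option
        if kind == OPT_NOP:
            i += 1                     # single-byte option, no length field
            continue
        if i + 1 >= len(data):
            raise ValueError("truncated option")
        length = data[i + 1]
        if length < 2 or i + length > len(data):
            raise ValueError("malformed option length")
        if kind == OPT_MSS and length == 4:
            mss = int.from_bytes(data[i + 2:i + 4], "big")
        i += length                    # skip the option, known or unknown
    return mss


if __name__ == "__main__":
    # NOP, NOP, SACK-Permitted (kind 4, skipped), MSS = 1220, EOL
    print(parse_tcp_options(bytes([1, 1, 4, 2, 2, 4, 0x04, 0xC4, 0])))
```

Skipping unknown options by length keeps the parser small while still
interoperating with peers that send options the device does not support.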

> Then, for all other devices should they implement SACK and TFO?

[Carles] In -01 we explain that SACK is particularly useful for bulk data
transfers (for devices that can afford advertising a TCP window size of
several MSSs). With regard to TFO, it can be useful for all devices
regardless of their window sizes (we may indicate this in -02).

> Maybe you want to explain your rationale a bit more, particularly under
> why you do not consider certain TCP options useful in an IoT environment.

[Carles] The options we have explicitly mentioned in this category are
useless in a single-MSS implementation, since these options are useful
only for larger window sizes than a single MSS. We may need to add some
(possibly brief) explanation like this one in -02.
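
To illustrate the point for one of these options, Window scale: the TCP
header carries the window in a 16-bit field, so scaling only matters once
the window exceeds 65535 bytes. The sketch below checks this for a few
window sizes. The function name is hypothetical and the 1220-byte MSS is
only an example value.

```python
MAX_UNSCALED_WINDOW = 0xFFFF   # largest value of the 16-bit window field


def window_scale_needed(mss, window_in_mss):
    """True if the window in bytes does not fit the 16-bit window field."""
    return mss * window_in_mss > MAX_UNSCALED_WINDOW


if __name__ == "__main__":
    for n in (1, 5, 53, 54):
        print(n, window_scale_needed(1220, n))
```

With a 1220-byte MSS, a window of up to 53 MSSs still fits the unscaled
field, so a single-MSS implementation gains nothing from Window scale.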

> 4.8.  Delayed Acknowledgments
>
> The recommendation is not clear to me. It sounds like you are suggesting
> to almost dynamically adjust the ACKs based on the type of traffic being
> sent.

[Carles] Not dynamically, but there should be an analysis of the type
of traffic pattern(s) expected in a deployment. Based on this, a decision
can be made about which delayed ACK setting appears to be more
beneficial.

> 5. Security Considerations
>
> I don't think that the TCP Authentication Option is a useful option
> for IoT deployments. At least I haven't even heard anyone suggesting it
> to be used so far. Most standards (even outside the IETF) recommend the
> use of TLS.

[Carles] We have added TLS as an example of best current practice.
Nevertheless, for the sake of completeness, we also include TCP options
related to security.

> 8. References
>
> IMHO there are too many references in the normative reference section. I
> would put the background reading into the informative section.

[Carles] Agreed and done in -01.

> Ciao
> Hannes

[Carles] Once again, thank you so much!

Cheers,

Carles