Re: [core] Martin Stiemerling's Discuss on draft-ietf-core-coap-15: (with DISCUSS and COMMENT)

Martin Stiemerling <martin.stiemerling@neclab.eu> Wed, 15 May 2013 19:34 UTC

Message-ID: <5193E302.7010101@neclab.eu>
Date: Wed, 15 May 2013 21:33:22 +0200
From: Martin Stiemerling <martin.stiemerling@neclab.eu>
To: Carsten Bormann <cabo@tzi.org>
References: <20130424092228.10345.76059.idtracker@ietfa.amsl.com> <8C88F8A7-9B07-4D82-8EA8-89794BD32EFC@tzi.org>
In-Reply-To: <8C88F8A7-9B07-4D82-8EA8-89794BD32EFC@tzi.org>
Cc: draft-ietf-core-coap@tools.ietf.org, core-chairs@tools.ietf.org, The IESG <iesg@ietf.org>, core@ietf.org

Hi Carsten, all,

On 04/25/2013 01:26 PM, Carsten Bormann wrote:
> Hi Martin,
>
> thanks for this detailed review.
> I have done some of the changes as simple editorial fixes, these are
> marked like [1310] below and can be reviewed in
> http://trac.tools.ietf.org/wg/core/trac/changeset/1310
> (Overview in http://trac.tools.ietf.org/wg/core/trac/timeline).
>
> Some of the replies are marked "-> Ticket": This means that the
> authors think that the change is a good idea, but probably needs a bit
> more discussion with the WG, so we will handle this as a ticket.
>
> Grüße, Carsten
>
>          ----------------------------------------------------------------------
>          DISCUSS:
>          ----------------------------------------------------------------------
>
>          A well-written document and I have a few points to discuss.
>
>          The congestion avoidance mechanisms look ok, but I assume we will get
>          feedback from implementers and deployments on the parameters and
>          mechanisms. It would be good to get this feedback documented at some
>          point.
>
> Indeed, this will require active attention by the WG.
> Fortunately, researchers are looking at this, and I expect
> additional results to become available soon.

Good to know!

>
>          Here are the issues (based on my own review and input from Joe Touch and
>          Michael Scharf):
>
>          1) IPv6 UDP checksum calculation
>          It is not clear if zero UDP checksums are permitted or not permitted with
>          CoAP?
>          (UDP zero checksums:
>          https://datatracker.ietf.org/doc/draft-ietf-6man-udpzero/)
>          That should be specified.
>
>          2) Handling of UDP-lite
>          Can UDP-lite (RFC 3828) be used or cannot be used in conjunction with
>          CoAP?
>
> Re 1 and 2: We just had a bit of discussion on the WG list, because we
> never had considered this.  The consensus seems to be that CoAP will
> be used on a wide variety of systems, and neither host support nor
> support e.g. in RFC 6282 is available.  (Citing from the discussion:
> "They seem to be specialized optimizations that are not well deployed
> and somehow seem to add overall deployment complexity and performance
> risk to the solution even if they provide some CPU reduction.")
>
> I don't actually think we need to say anything new in the draft
> because UDP is distinct from UDP-lite and we are not referencing 3828;
> neither do we reference 6936-to-be, so we are stuck with the features
> in 0768 and 2460.  (I also believe the discussion in 6936 puts CoAP out
> of its own scope.)  But, of course, we would be open to suggested
> text.

This means UDP-lite and UDP zero checksums should not be used for CoAP.
How about adding this clarifying text to Section 1.1.  "Features":
OLD
    o  UDP [RFC0768] binding with optional reliability supporting unicast
       and multicast requests.
NEW
    o  UDP [RFC0768] binding with optional reliability supporting unicast
       and multicast requests. UDP-lite [RFC3828] and UDP zero checksums
       [RFC6936] are not supported by CoAP.


>
>          3) Fragmentation of messages
>          The recommendation in Section 4.6 about the path MTU is generally valid
>          only for IPv6. For IPv4, 576 bytes is the safe area to work without
>          fragmentation, though in today's WANs 1280 works perfectly, but I am not so
>          sure about the networks envisioned for CoAP. The 576 bytes for IPv4 are
>          mentioned in the implementation note, but deserve text on the same level
>          as for IPv6.
>
> IPv4 simply hasn't received a lot of attention here.  The more

It worries me that IPv4 has not received a lot of attention. Is it a
general issue of the draft or just in this single place?

> normative text is about message size selection; there should be little
> practical difference between IPv4 and IPv6 here.
> The 576 byte MRU is more of a theoretical value.  IPv4 implementations
> will have to live with IP layer fragmentation for the larger message
> sizes just as 6LoWPAN will have to live with adaptation layer fragmentation.

I think I can live with the discussions of the issue in the draft.
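
As a rough illustration of the sizes under discussion, the sketch below
computes how much UDP payload fits into a single unfragmented packet. It
takes the 576-byte and 1280-byte figures from the text above as packet
size limits and assumes minimum-size IP headers (no IPv4 options, no IPv6
extension headers). The function name is made up for this example.

```python
# Fixed minimum header sizes in bytes. IPv4 options or IPv6 extension
# headers would reduce the available space further (assumption: none used).
IPV4_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8


def max_udp_payload(packet_size: int, ip_version: int) -> int:
    """Largest UDP payload that fits in one IP packet of packet_size bytes."""
    if ip_version == 4:
        ip_header = IPV4_HEADER
    elif ip_version == 6:
        ip_header = IPV6_HEADER
    else:
        raise ValueError("ip_version must be 4 or 6")
    return packet_size - ip_header - UDP_HEADER


print(max_udp_payload(576, 4))   # 548
print(max_udp_payload(1280, 6))  # 1232
```

A CoAP message (header, options and payload together) would have to stay
within these numbers to avoid IP-layer fragmentation under the stated
assumptions.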

>
>          4) Ensuring no fragmentation with IPv4
>          The implementation note in Section 4.6 states that for IPv4 it is 'harder
>          to ensure that there is no IP fragmentation'. This neglects the
>          possibility of using the Don't Fragment (DF) flag in the IPv4 header and
>          also that there is possibly feedback from a node en route that the MTU is
>          too big if the DF flag is set, i.e., by means of an ICMP error message.
>          Should there be any recommendation or protocol machinery to deal with
>          path probing? E.g., referencing RFC 4821 (PLPMTUD).
>
> CoAP is meant to be operable without persistent state between
> exchanges.  Normal operation of CoAP in constrained implementations
> (if they even implement IPv4) will not use DF.  More advanced
> implementations may be able to keep state about peers; it should be
> pretty obvious how to do this (and will generally be combined with
> establishing congestion control state).  I have added a reference to
> RFC 4821 to the discussion of path MTU discovery [1310].

Is there a particular reason why DF is not used in CoAP?
Thanks for adding the reference to RFC 4821!

>
>          5) Reaction to network errors that are signalled
>          I wonder why the draft is not discussing any reaction to network failures
>          signalled through ICMP messages. This relates also to my DISCUSS issue no
>          4.
>
> We didn't find much guidance in existing UDP-based protocols on
> handling ICMP messages.  RFC5405 section 3.7 is on a level of "can
> utilize", and the practical problems of robustness and validation of
> messages (including against attacks) make handling ICMP messages in
> constrained implementations difficult.  In any case, even advanced
> forms of ICMP handling are unlikely to impact CoAP protocol processing
> beyond improved local error handling, so we believe the subject is
> best left to a point in time when more operational experience is
> available.

I agree with your points and also with the difficulties in using, or even
receiving, ICMP error messages, but a recommendation to take them into
account when handling network errors would be beneficial for the
protocol. This part looks underspecified, especially since an MTU larger
than 1280 bytes can also cause issues in IPv6 networks.
Not documenting it at all looks rather incomplete.

An incomplete text proposal:
"CoAP implementations should take ICMP error messages into account when
handling error conditions in the transmission of CoAP messages."

I'm not sure where this would fit best in the draft.
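
To make the proposal a bit more concrete: on many socket stacks, ICMP
errors for a connected UDP socket surface as errno values on a later send
or receive call. The sketch below maps a few of those errno values to the
ICMP condition that typically causes them. The mapping reflects common
Linux behaviour and is an assumption, not something the draft specifies;
the names ICMP_HINTS and icmp_hint are invented for this example.

```python
import errno

# Assumed mapping (typical Linux behaviour for connected UDP sockets) from
# errno values to the ICMP condition that usually triggers them.
ICMP_HINTS = {
    errno.ECONNREFUSED: "port unreachable",
    errno.EHOSTUNREACH: "host unreachable",
    errno.ENETUNREACH: "network unreachable",
    errno.EMSGSIZE: "message too big for path MTU",
}


def icmp_hint(exc: OSError):
    """Return a short description of the likely ICMP error, or None."""
    return ICMP_HINTS.get(exc.errno)


print(icmp_hint(OSError(errno.ECONNREFUSED, "refused")))  # port unreachable
```

An implementation could use such a hint for local error reporting, for
example to stop retransmitting early, without changing protocol processing.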

>
>          6) Idempotency
>          The discussion of idempotency is useful, but overlooks message order.
>          I.e., the discussion appears to assume that a sequence of the same
>          actions has the same effect as a single action, but this is true only if
>          different sets of actions (from different sources, or copies of different
>          actions from a single source) aren't interleaved. This should be
>          addressed.
>
> The CoAP specification generally does not attempt to explain all the
> relevant concepts of the Web, but defers to other specifications.
> Section 9.1.2 of RFC2616 contains a discussion about sequences of
> idempotent method executions.  Section 9.1 is explicitly referenced
> from section 5.1, which is the main section discussing idempotence.

Ah, I have missed the reference to S 9.1 in RFC 2616. Cleared.
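
For readers following the thread, the interleaving concern can be shown
with a toy example. The resource path, the values and the two clients are
invented for illustration; the store is just a dictionary standing in for
a server's resource state.

```python
def put(store: dict, path: str, value: str) -> None:
    """An idempotent update: repeating it back to back changes nothing."""
    store[path] = value


store = {}
put(store, "/light", "on")    # request from client A
put(store, "/light", "on")    # immediate duplicate of A's request, harmless
assert store["/light"] == "on"

put(store, "/light", "off")   # request from client B arrives in between
put(store, "/light", "on")    # delayed duplicate of A's request
print(store["/light"])        # on (B's update has been undone)
```

Each PUT is idempotent on its own, yet the delayed duplicate changes the
outcome once another request has been interleaved.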

>
>          7) Protocol reactions to reserved or prohibited values
>          Regarding reserved or prohibited values in the IANA section, it would be
>          useful to be clear about what happens when those values are seen. I.e.,
>          should they be ignored, generate an error, etc.
>
> Good point.  We need to check this in detail.
> -> Ticket.

Ok.

>
>          8) Flow Control/Receiver Buffer
>          The protocol does not have any real means for the receiver to control the
>          amount of data that is being sent to it. I can understand the attempt to
>          provide a simple protocol, but adding a very basic flow control mechanism
>          will not prohibitively increase the complexity of the protocol, while
>          improving robustness.
>          According to Section 2.1, a node can always return a RST if the message
>          cannot be processed for whatever reason.
>          I propose to add an option to the RST message that allows the message
>          receiver to state how much data it is willing to accept from a particular
>          sender or in general (up to the implementation).
>
> (RST messages are empty messages and cannot have options.)  CoAP
> servers currently perform load shedding by not reacting to an incoming
> message at all.  Note that a 5.03 error can also set the Max-Age
> option in place of the "Retry-After" known from HTTP (section
> 5.9.3.4).  There has been discussion on more explicit feedback for
> load shedding, e.g.,
> draft-greevenbosch-core-minimum-request-interval-00; currently, the WG
> feels that finding a good solution (or even understanding the problem
> space) for this requires more study (see minutes from Orlando, where
> we discussed Bert's draft).

This is a huge issue, as the message receiver does not have any means to
reduce the message size to a degree where it can process it. This issue
is orthogonal to controlling the sending rate of the sender.
A Standards Track protocol needs to have measures to ensure that the
receiver can tune how much data is sent to it at once.
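
For reference, the rate-related hint mentioned in the quoted reply (a 5.03
response carrying Max-Age in the role of HTTP's Retry-After) could be used
by a client roughly as sketched below. This covers only the sending rate,
not the message size. The function name and the (class, detail) tuple
representation of the response code are assumptions made for this example.

```python
from typing import Optional, Tuple

SERVICE_UNAVAILABLE = (5, 3)  # CoAP response code 5.03 as (class, detail)


def retry_delay(code: Tuple[int, int], max_age: Optional[int]) -> Optional[int]:
    """Seconds to wait before retrying, or None if no hint was given."""
    if code == SERVICE_UNAVAILABLE and max_age is not None:
        return max_age
    return None


print(retry_delay((5, 3), 30))  # 30
```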

>
>          9) Handling of wrapping message IDs
>          According to Section 4.4.:
>          "The same Message ID MUST NOT be re-used (in communicating with the same
>          endpoint) within the EXCHANGE_LIFETIME (Section 4.8.2)" with
>          EXCHANGE_LIFETIME of 247s.
>          By now it is unrealistic that the message ID of 16 bits will wrap around
>          in that time frame, but protocols live long and at some later time it can
>          be possible.
>          However, the protocol doesn't have any means to detect wrapped message
>          IDs.
>
> Indeed, the onus is on the sender to ensure that the Message ID does
> not wrap around within EXCHANGE_LIFETIME.  In contrast to, say, the
> IPv4 IP ID, the potential problem of Message ID reuse has been
> well-highlighted, and it is receiving additional attention in the LWIG
> drafts that are starting to provide guidance on CoAP implementation.
> Implementations that need more than ~ 250 messages per second (per
> peer endpoint) may need to use multiple source endpoints.

Is this limitation documented in the draft?
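
The "~ 250 messages per second" figure follows from the 16-bit Message ID
space and the EXCHANGE_LIFETIME of 247 s quoted above. A small sketch of
the arithmetic (the function name is made up for this example):

```python
MESSAGE_ID_BITS = 16
EXCHANGE_LIFETIME = 247  # seconds, as quoted from the draft above


def max_message_rate(id_bits: int = MESSAGE_ID_BITS,
                     lifetime: float = EXCHANGE_LIFETIME) -> float:
    """Highest sustained rate (messages/s, per peer endpoint) that never
    reuses a Message ID within the given lifetime."""
    return (2 ** id_bits) / lifetime


print(round(max_message_rate(), 1))  # 265.3
```

Sending faster than this to a single peer would force a Message ID to be
reused before EXCHANGE_LIFETIME has elapsed.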

> We don't think much more can be or should be done here.
>
>          ----------------------------------------------------------------------
>          COMMENT:
>          ----------------------------------------------------------------------
>
>          1) Endpoint vs. host
>          This document uses the term "endpoint" to refer to the combination of
>          address and port, and possibly also security association, that is local
>          to one end of an association. I would have expected the more common term
>          "socket", as originated in TCP parlance, to be used instead (even though
>          here the term is used in a connectionless context).
>
> Most implementers have a quite different idea of a "socket", so this
> language would be rather confusing for them.  The authors might have
> used "transport address", but "endpoint" seemed shorter.

You mean socket as in power socket?
Anyhow, it is a comment only :)

>
>          2) Reaction to network errors due to local link errors
>          Link layers can give some hints if the link is up, down, etc.
>          Traditionally, this has not been taken into account too much when designing
>          transport protocols, but wouldn't it make sense to take it into account
>          for CoAP, as it operates much more in constrained environments?
>
> As a quality of implementation issue: certainly.
> I also expect this to come up in the LWIG work.
> But how would it impact the CoAP specification?

The link-level could indicate the congestion level on a specific link, 
for instance.
However, I can see that this is something that will be learned over time.

>
>          3) Short messages
>          Section 3., paragraph 1:
>
>            CoAP is based on the exchange of short messages which, by default,
>            are transported over UDP (i.e. each CoAP message occupies the data
>            section of one UDP datagram).  CoAP may also be used over Datagram
>
>          What are short messages in terms of bytes? Is this a hidden protocol
>          requirement?
>
> Section 4.6 discusses message sizes and should leave the implementer
> with a pretty good idea what message sizes are a good fit for CoAP.
> I don't think forward-referencing to 4.6 from section 3 is necessary.

FWIW: Section 4.6 discusses MTUs and I am not sure where short messages 
are discussed over there. Short can be anything, even 500 bytes might be 
considered short.

>
>          4) randomization of message IDs
>
>          Section 4.4., paragraph 3:
>
>            Implementation Note:  Several implementation strategies can be
>               employed for generating Message IDs.  In the simplest case a CoAP
>               endpoint generates Message IDs by keeping a single Message ID
>               variable, which is changed each time a new Confirmable or
>               Non-confirmable message is sent regardless of the destination address
>               or port.  Endpoints dealing with large numbers of transactions
>               could keep multiple Message ID variables, for example per prefix
>               or destination address (note that some receiving endpoints may not
>               be able to distinguish unicast and multicast packets addressed to
>               it, so endpoints generating Message IDs need to make sure these do
>               not overlap).  The initial variable value should be randomized.
>
>           the initial variable SHOULD be randomized, just to avoid blind off
>           path attacks, right?
>
> Yes.  We are trying to avoid RFC 2119 language in the implementation notes.
> Since this is about a variable that only exists in a specific
> implementation strategy, a SHOULD wouldn't work very well, anyway.

Ok, but I would add a statement on why randomized message IDs are needed
to make the protocol secure. E.g. "It is strongly recommended that the
initial variable value is randomized, in order to make off-path attacks
on the protocol less likely."
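
A minimal sketch of the simplest strategy from the implementation note,
with the randomized initial value discussed here. The class name is
invented, and incrementing by one is only one possible way of "changing"
the variable; the note does not prescribe it.

```python
import secrets


class MessageIdGenerator:
    """Single 16-bit Message ID variable with a random starting point."""

    def __init__(self) -> None:
        # Random initial value so an off-path attacker cannot guess it.
        self._next = secrets.randbelow(1 << 16)

    def next_id(self) -> int:
        mid = self._next
        # Wrap around within the 16-bit Message ID space.
        self._next = (self._next + 1) & 0xFFFF
        return mid


gen = MessageIdGenerator()
first = gen.next_id()
second = gen.next_id()
print(second == (first + 1) & 0xFFFF)  # True
```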

>
>          5)
>          In Section 4.6.:
>
>           larger than an IP fragment result in undesired packet fragmentation.
>          should read larger than an 'IP packet' instead of 'IP fragment'.
>
> Indeed, [1311].

Ok.

>
>          6)
>          Section 5.4.1., paragraph 7:
>
>            Critical/Elective rules apply to non-proxying endpoints.  A proxy
>            processes options based on Unsafe/Safe classes as defined in
>            Section 5.7.
>
>           I suggest moving this statement to the beginning of this subsection,
>           as it provides important information that shouldn’t be missed.
>
> Since the entire next subsection also discusses the subject, I think
> there is little danger that this will be missed.  (Putting the
> exception early confuses the section, so I would like to avoid this
> change.)

Fine with me.

>
>          7) Dependency between application layer and CoAP
>          Section 5.2.2., paragraph 2:
>
>            The server maybe initiates the attempt to obtain the resource
>            representation and times out an acknowledgement timer, or it
>            immediately sends an acknowledgement knowing in advance that there
>            will be no piggy-backed response.  The acknowledgement effectively is
>            a promise that the request will be acted upon.
>
>          This may or may not be an issue:
>          Assume that the server did send an ACK for a request but never
>          fulfills its promise to send any real 'response'. The request/response
>          exchange initiated by the client is then complete on the CoAP level, but
>          not for the application on top.
>          Is there any recommendation for the application on top of CoAP how to
>          handle such cases?
>
> Generally, we would expect applications to handle this in similar ways
> they are handling other application-layer timeouts.  E.g., many e-mail
> and web applications time out requests after a time on the order of a
> minute.  We think this is another issue best left for discussion after
> some operational experience is available.

Fine with me.
However, do you have a document where the WG lists the open issues to be 
explored? That would be awesome to have in order to revisit the open 
issues after a while.

Thanks,

   Martin
-- 
martin.stiemerling@neclab.eu

NEC Laboratories Europe
NEC Europe Limited
Registered Office:
Athene, Odyssey Business Park, West End  Road, London, HA4 6QE, GB
Registered in England 2832014
