[core] Some comments on: draft-ietf-core-cocoa-03

G Fairhurst <gorry@erg.abdn.ac.uk> Mon, 19 March 2018 16:35 UTC

Message-ID: <5AAFE6B8.2050407@erg.abdn.ac.uk>
Date: Mon, 19 Mar 2018 16:35:04 +0000
From: G Fairhurst <gorry@erg.abdn.ac.uk>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: Carsten Borman <cabo@tzi.org>
CC: jaime@iki.fi, core@ietf.org, Gorry Fairhurst <gorry@erg.abdn.ac.uk>
References: <5AAFD717.6090502@erg.abdn.ac.uk>
In-Reply-To: <5AAFD717.6090502@erg.abdn.ac.uk>
X-Forwarded-Message-Id: <5AAFD717.6090502@erg.abdn.ac.uk>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/T31Zut4ZjXJzGbaevKS_332ayFg>
Subject: [core] Some comments on: draft-ietf-core-cocoa-03

To understand draft-ietf-core-cocoa-03, I read through it. Sorry, it took
a long read to understand what this ID is saying. I finally worked it out,
and some of this commentary may be useful to others, so I include it below.

Gorry
(p.s. I am not a list member, please post to the core WG - I can be reached via the cc line above).

  ----------------

Checking against BCPs:

Section 3.1.1 of RFC8085 provides BCP for timers.

This ID would benefit from saying something like this from RFC8085:
The method uses averaging of multiple recent measurement samples to
account for variance using an exponentially weighted moving average.
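As an illustration of the kind of averaging described above, here is a minimal sketch in the style of the RFC 6298 estimator (gains of 1/8 and 1/4, variance multiplier 4). It is not the estimator defined in the ID; the class name is hypothetical, and clock granularity and lower/upper bounds on the RTO are left out as a simplifying assumption.

```python
class RttEstimator:
    """EWMA RTT estimator in the style of RFC 6298 (illustrative only)."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation
    K = 4          # multiplier applied to the variation

    def __init__(self):
        self.srtt = None    # smoothed RTT, seconds
        self.rttvar = None  # smoothed RTT variation, seconds

    def update(self, sample):
        """Feed one RTT sample (seconds) and return the resulting RTO."""
        if self.srtt is None:
            # First measurement initialises both averages.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # Update the variation first, using the previous smoothed RTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return self.srtt + self.K * self.rttvar
```

With a steady stream of identical samples the variation term shrinks on every update, so the RTO converges toward the measured RTT.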

RFC8085 says this:

Independent latency estimates SHOULD be maintained for each
destination with which an endpoint communicates.

But rather ambiguously this ID says this:
“It may also be worthwhile to perform RTT estimation not just based on
information measured from a single destination endpoint, but also
based on entire hosts (IP addresses) and/or complete prefixes (e.g.,
maintain an RTT estimate for a whole /64). The exact way this can be
used to reduce the amount of state in an initiator is for further
study.”

  - yet this ID does not warn of the issue that a consistent RTT cannot be
predicted when a few (?) of the endpoints you talk to are actually located
remotely across a path, even though they are within the same network
prefix. I think this is a potential issue.
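The concern can be made concrete with a small sketch that contrasts per-endpoint state with per-/64 state. The addresses, RTT values, and helper names below are hypothetical; the point is only that two endpoints with very different path delays collapse onto one key when state is kept per prefix.

```python
import ipaddress

def endpoint_key(addr, port):
    """Key for per-destination state: one estimate per (address, port)."""
    return (ipaddress.ip_address(addr), port)

def prefix_key(addr, prefix_len=64):
    """Key for aggregated state: one estimate per covering prefix."""
    return ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)

# Two endpoints in the same /64; the second is assumed to sit behind a
# longer path (for example a tunnel), so its RTT is much larger.
local = "2001:db8:1:2::10"
remote = "2001:db8:1:2::beef"
samples = [(local, 0.02), (remote, 0.80)]  # RTT samples in seconds

per_prefix = {}
for addr, rtt in samples:
    per_prefix.setdefault(prefix_key(addr), []).append(rtt)

for prefix, rtts in per_prefix.items():
    # Both samples land under one key, spanning 0.02 s to 0.80 s.
    print(prefix, min(rtts), max(rtts))
```

Any single estimate derived from that shared bucket is either far too long for the nearby endpoint or too short for the remote one.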

This ID sets the “Blind RTO Estimate” to 2 seconds, whereas RFC8085 states:
“Such applications SHOULD NOT send more than one UDP datagram every 3
seconds and SHOULD use an even less aggressive rate when possible.”
- I’m not sure why 2 s was chosen in preference to 3 s, although both are
fairly large numbers.

RFC 5033 could probably be mentioned in the intro, although it actually
targets a different use case, and I think that can be said.

----------------

The abstract isn’t very clear about what the “normal and advanced” modes
of operation are; in fact, the advanced mode is also claimed to be simple.
That’s not easy to follow. As I see it, this is advanced RTT estimation for
congestion control.

Section 4 “Advanced CoAP Congestion Control: RTO Estimation”
- This isn’t really “Advanced CoAP Congestion Control” - it is
advanced RTO estimation. As I read it, the congestion control remains the same.

I don’t really understand this phrase:
  
“For an initiator that plans to make multiple requests to one
destination endpoint, it may be worthwhile to make RTT measurements in
order to compute a more appropriate RTO than the default initial timeout
of 2 to 3 s. In particular, a wide spectrum of RTT values is expected in
different types of networks where CoAP is used.”

What does “may be worthwhile” mean? … I don’t understand the context.
Perhaps something like this was what was intended?:

“The initiator as defined in [RFCxxx] uses a default initial timeout of
2 to 3 s. An initiator that plans to make multiple requests to one
destination endpoint, can benefit from making RTT measurements that
enable it to compute a more appropriate RTO. This is expected to improve
performance across the wide range of RTT values that may be expected
across the types of networks where CoAP is used.”

Section 4 says:
“To ensure that the new scheme is not posing a danger to the
network,”

  - This seems rather too bold; I don’t know how to “ensure” this. Although I
do expect it would be good to say something like:
  “suggest there is not a significant risk”.

This also was hard to follow:
“Note that such a mechanism must, during idle periods, decay RTO
estimates that are shorter or longer than the default RTO estimate back
to the default RTO estimate, until fresh measurements become available
again, as proposed in Section 4.3.”

  - why is the ability to grow the RTT during idle periods not actually
a standards requirement? I could not find normative text in section 4.3
either, and I really do think this particular point should be written
using MUST clauses.
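To show what decaying back to the default means in both directions, here is a minimal sketch. The halving rule and the notion of a fixed "idle interval" are assumptions made for illustration, not the rule in Section 4.3 of the ID; the 2 s default is the Blind RTO Estimate mentioned earlier.

```python
DEFAULT_RTO = 2.0  # seconds; the "Blind RTO Estimate" discussed above

def age_rto(rto, idle_intervals, default=DEFAULT_RTO):
    """Move an RTO estimate halfway toward the default per idle interval.

    Hypothetical aging rule: short estimates grow and long estimates
    shrink, so stale values converge on the default from either side.
    """
    for _ in range(idle_intervals):
        rto = (rto + default) / 2
    return rto

print(age_rto(0.5, 1))  # a short estimate grows: 1.25
print(age_rto(8.0, 2))  # a long estimate shrinks: 3.5
```

Whatever rule is chosen, a MUST clause would need to pin down both the trigger (how long counts as idle) and the step taken toward the default.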

Finally, this also is not really very clear to me:

  “A client that has arrived at a RTO estimate shorter than 1 s SHOULD
therefore use a larger backoff factor for retransmissions to avoid
expending all of its retransmissions (MAX_RETRANSMIT, see Section 4.2 of
[RFC7252], normally 4) in the default interval of 2 to 3 s.”
- what does this actually say you need to implement? (I think this is just
a wording issue??)
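One way to read the quoted sentence is as arithmetic on the retransmission schedule. The sketch below computes when each retransmission is sent for a given initial RTO and backoff factor; the factor of 3 is only an illustrative "larger" value, not one taken from the ID, and CoAP's random timeout jitter is ignored.

```python
def retransmit_times(rto, backoff, max_retransmit=4):
    """Return the times (seconds after the first transmission) at which
    each retransmission is sent, given an initial RTO and backoff factor."""
    times = []
    elapsed = 0.0
    timeout = rto
    for _ in range(max_retransmit):
        elapsed += timeout
        times.append(round(elapsed, 3))
        timeout *= backoff  # exponential backoff between attempts
    return times

print(retransmit_times(0.1, 2))  # [0.1, 0.3, 0.7, 1.5]
print(retransmit_times(0.1, 3))  # [0.1, 0.4, 1.3, 4.0]
```

With an RTO of 0.1 s and a factor of 2, all four retransmissions are spent within 1.5 s, inside the default 2 to 3 s interval; with a factor of 3 the last one is pushed out to 4.0 s.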