Re: [Dtls-iot] AD review of draft-ietf-dice-profile-13

"FOSSATI, Thomas (Thomas)" <thomas.fossati@alcatel-lucent.com> Thu, 23 July 2015 06:24 UTC

From: "FOSSATI, Thomas (Thomas)" <thomas.fossati@alcatel-lucent.com>
To: Stephen Farrell <stephen.farrell@cs.tcd.ie>, "dtls-iot@ietf.org" <dtls-iot@ietf.org>
Date: Thu, 23 Jul 2015 06:24:29 +0000
Message-ID: <D1D5A7E5.31F3A%thomas.fossati@alcatel-lucent.com>
References: <559AA9F8.6080005@cs.tcd.ie>
In-Reply-To: <559AA9F8.6080005@cs.tcd.ie>
Archived-At: <http://mailarchive.ietf.org/arch/msg/dtls-iot/ZEJAXyc5F9JhMShgLvgh2ymfibY>

Hi Stephen,

On 06/07/2015 17:16, "dtls-iot on behalf of Stephen Farrell"
<dtls-iot-bounces@ietf.org on behalf of stephen.farrell@cs.tcd.ie> wrote:
>(6) 13: sorry I don't get why a 1 second initial timer
>is an issue because messages "need longer" (we're not in
>a DTN, right:-) I don't think you've justified the 10s
>initial value and that might be bad in fact e.g.  for
>sleepy nodes one might want a much shorter initial
>timeout so the node can go back to snoozing.  (It could
>be that this 10s value was justified on the list, but
>I'm not getting it from the text. Or is it appendix A
>you're thinking of? If so, that seems overly specialised
>for a generic 10s recommendation.)

The relevant text is: "[...] these values are too aggressive and lead to
spurious failures when messages in flight need longer.", which I reckon is
a bit too terse.

The full story is:
* TLS protocol steps can take longer due to the time spent figuring out
  the crypto bits on the constrained side;
* DTLS retransmission, which is per-flight, interacts very badly with low
  bandwidth networks;
so it's essential that the probability of a spurious retransmit is
minimised.

Also, on packet loss, the sending node should not react too aggressively:
if lost packets are re-injected too quickly into a temporarily congested
WSN, congestion worsens.

The 10s is a conservative guess that tries to cater for the worst case
(i.e. very constrained endpoints and/or networks that are congested or
have high delay variance, e.g. GSM-SMS).

Re: sleepy nodes.  Even when starting with a 1s timeout, they need to be
prepared to stay awake for the maximum wait time which is roughly 1
minute.  This is not a MUST, anyway, so sleepy nodes could use a lower
value, both for the initial timer and, most importantly, for the threshold.


I have this new text in my working copy.  (Note I have slightly tweaked
the value -- 10 has become 9 -- to get the same max wait you'd get with
the usual 1s initial timeout):

   TLS protocol steps can take longer due to higher processing time on
   the constrained side.  On the other hand, the way DTLS handles
   retransmission, which is per-flight instead of per-segment, tends to
   interact poorly with low bandwidth networks.

   For these reasons, it's essential that the probability of a spurious
   retransmit is minimized and, on timeout, the sending endpoint does
   not react too aggressively.  The latter is particularly relevant when
   the WSN is temporarily congested: if lost packets are re-injected too
   quickly, congestion worsens.

   An initial timer value of 9 seconds with exponential back off up to
   no less than 60 seconds is therefore RECOMMENDED.

   This value is chosen big enough to absorb large latency variance due
   to either slow computation on constrained endpoints or to intrinsic
   network characteristics (e.g.  GSM-SMS), as well as to produce a low
   number of retransmission events and relax the pacing between them.
   Its worst-case wait time is the same as with a 1s initial timeout
   (i.e. 63s), while triggering fewer than half the retransmissions
   (2 instead of 5).

   In order to minimize the wake time during the DTLS handshake, sleepy
   nodes might decide to select a lower threshold, and consequently a
   smaller initial timeout value.  If this is the case, the
   implementation MUST take into account the considerations about
   network stability described in this section.
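
As a sanity check on the figures quoted above (63s worst-case wait, 2
retransmissions instead of 5), here is a minimal sketch of the arithmetic.
It assumes a simplified model that is not spelled out in the draft text:
the timer doubles after every expiry, and the sender gives up once the
next timer value would exceed the 60s threshold.  The function names are
hypothetical and only serve the illustration.

```python
def backoff_schedule(initial, threshold=60):
    """Return the successive timeout values (in seconds).

    Assumed model: the timer doubles after each expiry and the sender
    gives up as soon as the next value would exceed `threshold`.
    """
    timeouts = []
    t = initial
    while t <= threshold:
        timeouts.append(t)
        t *= 2
    return timeouts


def summarize(initial, threshold=60):
    """Return (worst-case total wait, number of retransmissions)."""
    timeouts = backoff_schedule(initial, threshold)
    total_wait = sum(timeouts)
    # Every expiry except the last triggers a retransmission;
    # the last expiry means the handshake is abandoned.
    retransmissions = len(timeouts) - 1
    return total_wait, retransmissions


for initial in (1, 9, 10):
    print(initial, backoff_schedule(initial), summarize(initial))
```

Under this model an initial value of 1s gives the schedule 1, 2, 4, 8, 16,
32 (63s in total, 5 retransmissions), and 9s gives 9, 18, 36 (63s in total,
2 retransmissions).  An initial value of 10s would give 10, 20, 40, i.e. a
70s worst-case wait, which is consistent with the tweak from 10 to 9.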

How does it sound?

Cheers, t