[clue] Question about timers (retry and timeouts) in draft-ietf-clue-protocol

"Roni Even (A)" <roni.even@huawei.com> Mon, 26 February 2018 14:19 UTC

From: "Roni Even (A)" <roni.even@huawei.com>
To: "clue@ietf.org" <clue@ietf.org>
Thread-Topic: Question about timers (retry and timeouts) in draft-ietf-clue-protocol
Thread-Index: AdOj2NoMzin0OvqsR8qbld+eN+ClsA==
Date: Mon, 26 Feb 2018 14:19:09 +0000
Message-ID: <6E58094ECC8D8344914996DAD28F1CCD86763B@DGGEMM506-MBX.china.huawei.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/DrNmjNP0aUztx5ydLioGl4u-iNU>
Subject: [clue] Question about timers (retry and timeouts) in draft-ietf-clue-protocol
List-Id: CLUE - ControLling mUltiple streams for TElepresence <clue.ietf.org>

Hi,
An important issue raised by Adam during his AD review concerns timeout and retry thresholds. Please provide feedback.

This is the comment:

BLOCKER: General: There are several mentions of timeouts and retry thresholds in the text and its corresponding state machines; however, the document neither defines nor cites a document as defining what these timeout and retry values are. These need to be defined and described. If the timer and retry scheme allows the two ends of the connection to have different values for timeouts and number of retries, then there need to be additional error procedures that allow the MC and MP state machines to stay in sync (if the timer/retry values can be different, it's possible for one state machine to transition to "terminated," while the other is still active, and you need messaging to clean this up). The remainder of this comment is non-blocking: Related to this, the document frequently refers to retries as "expiring" (e.g., "retry expired" on the state diagrams). That doesn't really make sense unless "retry" is the name of a timer rather than a counter; I think you mean to say "exhausted" or something similar.

My view as an individual:

The CLUE protocol is delivered over SCTP ("CLUE entities are required to use ordered SCTP message delivery, with full reliability"; see https://tools.ietf.org/html/draft-ietf-clue-datachannel-14#section-3.3.2), so timeouts and retries are not an issue for the data channel itself.

So the retry and timeout apply at the application level and not to the SCTP transport.

From the discussion on the mailing list during the WGLC:


"The idea here is that the MC avoids entering a loop where the MP keeps on sending an erroneous ADV hence forcing the MC to respond with a NACK. If this situation iterates for a while (# of retries), the MC terminates the ongoing CLUE "session"."

I noticed that the discussion started even earlier. The conclusion was that retry and timeout are needed, but also that we need default values, which were never listed.

My understanding is that when the receiver of a protocol message responds with a negative acknowledgement, it allows the sender to retry with a corrected message. It allows this n times, or quits after x time if no new message arrives.

I think any number of retries would work, but 2 seems OK: if the message sender cannot send a valid message, we should abandon the call. The timeout can be short as long as it still allows for a round trip, so my view is that 1 second is enough.
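To make the proposal concrete, here is a rough Python sketch of the receiver-side behaviour described above. It is not text from the draft: the class name, method names, and action strings are hypothetical, and it assumes that "2 retries" means two further attempts after the first rejected message, and that the 1 second timer restarts on every NACK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NackRetryGuard:
    """Hypothetical receiver-side guard; names and defaults are illustrative."""
    max_retries: int = 2      # assumed: retries allowed after the first rejected message
    timeout_s: float = 1.0    # assumed: wait for a corrected message after each NACK
    failures: int = 0
    deadline: Optional[float] = None
    terminated: bool = False

    def _timed_out(self, now: float) -> bool:
        # The timer only runs while a NACK is outstanding.
        return self.deadline is not None and now > self.deadline

    def on_message(self, valid: bool, now: float) -> str:
        """Return the action for a received message: 'ack', 'nack' or 'terminate'."""
        if self.terminated or self._timed_out(now):
            self.terminated = True
            return "terminate"
        if valid:
            # A valid message clears the retry counter and stops the timer.
            self.failures = 0
            self.deadline = None
            return "ack"
        self.failures += 1
        if self.failures > self.max_retries:
            # Retries exhausted: give up on the session.
            self.terminated = True
            return "terminate"
        self.deadline = now + self.timeout_s
        return "nack"

    def on_tick(self, now: float) -> str:
        """Check the timer when no message has arrived: 'wait' or 'terminate'."""
        if self.terminated or self._timed_out(now):
            self.terminated = True
            return "terminate"
        return "wait"
```

With these defaults, three invalid messages in a row end the session on the third one, and a single NACK followed by more than 1 second of silence also ends it. Whether both ends must use the same values is exactly the open question in Adam's comment.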

Other thoughts?



One side comment: in section 6.2 there are two instances of "number of timeouts"?



Roni Even
CLUE co-chair