Re: [clue] Question about timers (retry and timeouts) in draft-ietf-clue-protocol

"Roni Even (A)" <roni.even@huawei.com> Sun, 08 April 2018 05:48 UTC

From: "Roni Even (A)" <roni.even@huawei.com>
To: Simon Pietro Romano <spromano@unina.it>
CC: "clue@ietf.org" <clue@ietf.org>, Adam Roach <adam@nostrum.com>, Roberta Presta <roberta.presta@unina.it>
Date: Sun, 08 Apr 2018 05:48:22 +0000
Message-ID: <6E58094ECC8D8344914996DAD28F1CCD871B66@DGGEMM506-MBX.china.huawei.com>
References: <6E58094ECC8D8344914996DAD28F1CCD863A56@DGGEMM506-MBX.china.huawei.com> <A57BAD5A-E794-491B-914C-D3C7B78AD691@unina.it>
In-Reply-To: <A57BAD5A-E794-491B-914C-D3C7B78AD691@unina.it>
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/H-o9ZklWBXrUEsP9mCq7snIGntk>

Hi Simon,
I think that Adam’s and your suggestion to remove the timers makes sense, since we are using a reliable protocol. However, I am wondering whether the case where the MP continues to send Advertisements and keeps getting NACKs back from the MC is a real case that needs to be addressed (section 6.1). This would still require a retry count, but we can add text that tells the MP how many times to retry without having an explicit parameter.
Roni

From: Simon Pietro Romano [mailto:spromano@unina.it]
Sent: Friday, April 06, 2018 9:28 PM
To: Roni Even (A)
Cc: clue@ietf.org; Adam Roach; Roberta Presta
Subject: Re: [clue] Question about timers (retry and timeouts) in draft-ietf-clue-protocol

Hello again Roni,


Hi,
An important issue raised by Adam during his AD review concerns timeout and retry thresholds. Please provide feedback.

This is the comment:
BLOCKER: General: There are several mentions of timeouts and retry thresholds in the text and its corresponding state machines; however, the document neither defines nor cites a document as defining what these timeout and retry values are. These need to be defined and described. If the timer and retry scheme allows the two ends of the connection to have different values for timeouts and number of retries, then there need to be additional error procedures that allow the MC and MP state machines to stay in sync (if the timer/retry values can be different, it's possible for one state machine to transition to "terminated," while the other is still active, and you need messaging to clean this up). The remainder of this comment is non-blocking: Related to this, the document frequently refers to retries as "expiring" (e.g., "retry expired" on the state diagrams). That doesn't really make sense unless "retry" is the name of a timer rather than a counter; I think you mean to say "exhausted" or something similar.

My view as individual:

The CLUE protocol is delivered over SCTP: “CLUE entities are required to use ordered SCTP message delivery, with full reliability” (https://tools.ietf.org/html/draft-ietf-clue-datachannel-14#section-3.3.2 ), so retry timeouts are not a problem for the data channel.

So the retry and timeout are for the application level and not for the SCTP transport.

From the discussion on the mailing list during the WGLC:

“The idea here is that the MC avoids entering a loop where the MP keeps on sending an erroneous ADV hence forcing the MC to respond with a NACK. If this situation iterates for a while (# of retries), the MC terminates the ongoing CLUE “session”.”

I noticed that the discussion started even earlier, and the conclusion was that retry and timeout are needed, but that we also need default values, which were never listed.

My understanding is that when the receiver of a protocol message responds with a negative acknowledgement, it allows the sender to retry with a corrected message; it will allow this n times, or quit after time x if no new message arrives.

I think any number of retries would work, but 2 seems OK: if the message sender cannot send a valid message, we should abandon the call. The timeout can be short, as long as it still allows for a round trip, so in my view 1 second is enough.
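To make the retry-count proposal above concrete, here is a minimal sketch in Python. It is not taken from the draft: the class name, method name, and return strings are hypothetical, and the default of 2 is simply the value suggested above. It only illustrates the MC counting consecutive invalid ADVs and giving up once the allowed retries are exhausted.

```python
from dataclasses import dataclass

MAX_RETRIES = 2  # assumption: the "2 is OK" value proposed in this thread


@dataclass
class McAdvHandler:
    """Hypothetical MC-side bookkeeping for consecutive invalid ADVs."""

    max_retries: int = MAX_RETRIES
    nack_count: int = 0
    terminated: bool = False

    def on_advertisement(self, is_valid: bool) -> str:
        """Return the response the MC would send for one received ADV."""
        if self.terminated:
            raise RuntimeError("CLUE session already terminated")
        if is_valid:
            self.nack_count = 0  # a valid ADV resets the counter
            return "ACK"
        self.nack_count += 1
        if self.nack_count > self.max_retries:
            # The original ADV plus max_retries retries were all invalid.
            self.terminated = True
            return "TERMINATE"
        return "NACK"
```

With the default of 2, the MC would NACK the original ADV and the first retry, and terminate the session when the second retry is also invalid.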

Other thoughts?

There is indeed Adam’s reply to my answer, which I’m copying below for the benefit of readers:

In thinking through what this scheme should look like, it occurs to me that the CLUE messages are defined to be sent over a reliable transport (SCTP), which has its own retransmission timers and eventual timeouts. Implementing a retransmission scheme on top of a reliable transport -- especially one as aggressive as you suggest above -- will put more traffic on the network when congestion occurs rather than less.

So, in the final analysis, I think the action here is to remove retransmission timers and retry counts altogether. If the underlying transport takes longer to detect a failure than is sensible for CLUE (and it likely does), then a supervisory timer that declares the session failed might make sense.
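As an illustration of the supervisory timer mentioned above, here is a minimal Python sketch. It is an assumption-laden example, not something specified in the draft: the class name, its methods, and any limit value passed in are hypothetical. The timer does not retransmit anything; it only reports when the session has waited too long for a response, leaving retransmission to SCTP.

```python
import time
from typing import Callable, Optional


class SupervisoryTimer:
    """Hypothetical timer that declares a CLUE session failed after limit_s."""

    def __init__(self, limit_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit_s = limit_s          # assumption: value chosen by the application
        self._clock = clock             # injectable clock, useful for testing
        self._started_at: Optional[float] = None

    def start(self) -> None:
        """Arm the timer, e.g. when a message awaiting a response is sent."""
        self._started_at = self._clock()

    def stop(self) -> None:
        """Disarm the timer, e.g. when the awaited response arrives."""
        self._started_at = None

    def expired(self) -> bool:
        """True if the timer is armed and limit_s seconds have elapsed."""
        if self._started_at is None:
            return False
        return self._clock() - self._started_at >= self.limit_s
```

An endpoint would call `start()` when it begins waiting, `stop()` when the response arrives, and treat `expired()` returning true as a session failure.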


I would personally be inclined to take Adam’s suggestion and get rid of those timers and retry counts altogether. What is your feeling about that?

Thanks,

Simon


Simon Pietro Romano
Universita' di Napoli Federico II
Computer Engineering Department
Phone: +39 081 7683823 -- Fax: +39 081 7683816
e-mail: spromano@unina.it

<<Many tell me that discouragement is the alibi of
idiots. I reflect on it for a moment; and I get discouraged>>. Magritte.




One side comment: in section 6.2 there are two instances of “number of timeouts”?



Roni Even
Clue co-chair