Re: [clue] clue protocol

Simon Pietro Romano <spromano@unina.it> Tue, 06 February 2018 09:04 UTC

Return-Path: <spromano@unina.it>
X-Original-To: clue@ietfa.amsl.com
Delivered-To: clue@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2E0B8126BF6 for <clue@ietfa.amsl.com>; Tue, 6 Feb 2018 01:04:10 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.909
X-Spam-Level:
X-Spam-Status: No, score=-1.909 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, HTML_MESSAGE=0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fRr4nds7SgHm for <clue@ietfa.amsl.com>; Tue, 6 Feb 2018 01:04:04 -0800 (PST)
Received: from unina.it (fmvip.unina.it [IPv6:2001:760:3403:ffff::7]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 3432A120727 for <clue@ietf.org>; Tue, 6 Feb 2018 01:04:04 -0800 (PST)
Received: from smtp2.unina.it (smtp2.unina.it [192.132.34.62]) by leas1.unina.it with ESMTP id w1693ii8009870-w1693iiA009870 (version=TLSv1.0 cipher=ECDHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 6 Feb 2018 10:03:44 +0100
Received: from [143.225.28.167] ([143.225.28.167]) (authenticated bits=0) by smtp2.unina.it (8.14.4/8.14.4) with ESMTP id w1693hVH030567 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 6 Feb 2018 10:03:43 +0100
From: Simon Pietro Romano <spromano@unina.it>
Message-Id: <0D0F98C3-D9BC-4F39-BC59-DD192CE9F8D1@unina.it>
Content-Type: multipart/alternative; boundary="Apple-Mail=_B7558A76-E69B-4D04-86F0-75EEE1B3D127"
Mime-Version: 1.0 (Mac OS X Mail 11.2 \(3445.5.20\))
Date: Tue, 06 Feb 2018 10:03:42 +0100
In-Reply-To: <6E58094ECC8D8344914996DAD28F1CCD8621CD@DGGEMM506-MBX.china.huawei.com>
Cc: Adam Roach <adam@nostrum.com>, "pkyzivat@alum.mit.edu" <pkyzivat@alum.mit.edu>, Paul Kyzivat <paul.kyzivat@comcast.net>, "clue@ietf.org" <clue@ietf.org>
To: "Roni Even (A)" <roni.even@huawei.com>
References: <6E58094ECC8D8344914996DAD28F1CCD8621CD@DGGEMM506-MBX.china.huawei.com>
X-Mailer: Apple Mail (2.3445.5.20)
Archived-At: <https://mailarchive.ietf.org/arch/msg/clue/MgxbB5sAsOItbO9wDL2XT66PndM>
Subject: Re: [clue] clue protocol
X-BeenThere: clue@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: CLUE - ControLling mUltiple streams for TElepresence <clue.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/clue>, <mailto:clue-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/clue/>
List-Post: <mailto:clue@ietf.org>
List-Help: <mailto:clue-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/clue>, <mailto:clue-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 06 Feb 2018 09:04:10 -0000

Yep! I’ll come back with a revised version asap. BTW, Adam was asking for some general feedback from the WG...

Cheers,

Simon
                     				            _\\|//_
                           				   ( O-O )
      ~~~~~~~~~~~~~~~~~~~~~~o00~~(_)~~00o~~~~~~~~~~~~~~~~~~~~~~~~
                    				Simon Pietro Romano
             				 Universita' di Napoli Federico II
                		     Computer Engineering Department 
	             Phone: +39 081 7683823 -- Fax: +39 081 7683816
                                           e-mail: spromano@unina.it <mailto:spromano@unina.it>

		    <<Molti mi dicono che lo scoraggiamento è l'alibi degli 
		    idioti. Ci rifletto un istante; e mi scoraggio>>. Magritte.
               			                     oooO
       ~~~~~~~~~~~~~~~~~~~~~~~(   )~~~ Oooo~~~~~~~~~~~~~~~~~~~~~~~~~
					                 \ (            (   )
			                                  \_)          ) /
                                                                       (_/



> On 6 Feb 2018, at 07:51, Roni Even (A) <roni.even@huawei.com> wrote:
> 
> Hi Simon,
> Can you look at the open comments from Adam so that we can finish the work in CLUE
> Roni
> 
> -----Original Message-----
> From: clue [mailto:clue-bounces@ietf.org] On Behalf Of Adam Roach
> Sent: Friday, November 03, 2017 10:07 PM
> To: Simon Pietro Romano
> Cc: clue@ietf.org
> Subject: Re: [clue] AD Review: draft-ietf-clue-protocol-13
> 
> Thanks for your thorough response! I've removed those issues that you've addressed, and included responses to the remaining issues below.
> 
> IMPORTANT NOTE TO CLUE WORKING GROUP: There is a discussion regarding retransmission timers that you need to have. See the comments below for details.
> 
> 
> On 10/6/17 1:37 PM, Simon Pietro Romano wrote:
>> 
>>> Title: The rule of thumb is that all but a small handful of 
>>> well-known acronyms need to be expanded in titles and abstracts. I 
>>> recognize that "CLUE" is a bit tortured, as acronyms go, but the 
>>> title of this document is, broadly speaking, opaque. Please change it 
>>> to something meaningful, such as "Protocol for Controlling Multiple 
>>> Streams for Telepresence (CLUE)”
>>> 
>> [SPR] Done.
> 
> I think this got missed. The header of the current version still reads:
> 
> CLUE Working Group R. Presta
> Internet-Draft S. P. Romano
> Intended status: Standards Track University of Napoli
> Expires: April 9, 2018 October 6, 2017
> 
>                              CLUE protocol
>                       draft-ietf-clue-protocol-14
> 
>> 
>>> General: There are three instances of excessively long lines in the 
>>> document.
>>> 
>> [SPR] ???
> 
> The reformatting of XML in -14 has fixed this issue.
> 
>> BLOCKER: General: There are several mentions of timeouts and retry 
>> thresholds in the text and its corresponding state machines; however, 
>> the document neither defines nor cites a document as defining what 
>> these timeout and retry values are. These need to be defined and 
>> described. If the timer and retry scheme allows the two ends of the 
>> connection to have different values for timeouts and number of 
>> retries, then there need to be additional error procedures that allow 
>> the MC and MP state machines to stay in sync (if the timer/retry 
>> values can be different, it's possible for one state machine to 
>> transition to "terminated," while the other is still active, and you 
>> need messaging to clean this up).
>> 
>> [SPR] Our idea was to rely on a single, fixed, application-level 
>> (i.e., CLUE-level) timeout value, rather than leveraging a number of 
>> them (T1, T2, T4, plus Timer A through Timer K) as in SIP. The value 
>> of the one and only timer would be the usual RTT estimate (~500ms).
>> Differently from protocols like SIP, we are also leveraging retransmit 
>> counts to get out of a timeout-driven retransmission loop. We’re not 
>> adding such information to the revised version of the draft, because 
>> we’d first like to get your feedback on the approach. If you (and the 
>> rest of the group) believe this is a feasible approach, we can expand 
>> on that in a further version of the document.
> 
> In thinking through what this scheme should look like, it occurs to me that the CLUE messages are defined to be sent over a reliable transport (SCTP), which has its own retransmission timers and eventual timeouts. 
> Implementing a retransmission scheme on top of a reliable transport -- especially one as aggressive as you suggest above -- will put more traffic on the network when congestion occurs rather than less.
> 
> So, in the final analysis, I think the action here is to remove retransmission timers and retry counts altogether. If the underlying transport takes longer to detect a failure than is sensible for CLUE (and it likely does), then a supervisory timer that declares the session failed might make sense.
> 
>> 
>>> Section 5: The XML definition allows zero or more <clueId> elements 
>>> to appear in a message. If more than one is allowed, the document 
>>> should explain how multiple IDs are handled. If they are not, then 
>>> the schema and/or text needs to prohibit having more than one.
>>> 
>> [SPR] Actually, the schema states:
>> 
>> <xs:element name="clueId" type="xs:string" minOccurs="0”/>
>> 
>> This should indicate that clueId can appear either 0 or 1 times in a 
>> valid XML document. The default value for maxOccurs is in fact 1.
>> Don’t you agree on that?
> 
> Ah, you're correct. I'm not used to seeing minOccurs without maxOccurs, but this is entirely fine as it's currently written.
> 
>> 
>>> BLOCKER: Section 5.1: The description of <supportedVersions> 
>>> describes a scheme in which multiple supported versions can be 
>>> listed; and, if the list is omitted, it implies that only the version 
>>> described in <v> is supported. This text does not define (nor does 
>>> any other text that I can find) what <v> should be set to when 
>>> <supportedVersions> is used. Intuitively, it seems that <v> should be 
>>> set to the largest minor version of the smallest major version 
>>> advertised in <supportedVersions>, but that (or whatever the correct 
>>> answer is) needs to be clearly spelled out.
>>> 
>> [SPR] I believe we were giving for granted that “v” should be, in such 
>> case, the highest supported version (i.e., largest major and larger 
>> minor numbers). Now that you make me think around that, I would simply 
>> say that the “v” attribute MUST be set to one of the versions that the 
>> entity in question supports, as per the <supportedVersions> list. I 
>> can, e.g., decide that I send an ‘options’ with “v=12.23” if the 
>> <supportedVersions> list is like in the following:
>> 
>> <version>33.44</version>
>> <version>27.0</version>
>> <version>12.345</version>
>> <version>1.44</version>
> 
> I think this formulation is a problem. I assume the intention behind the "v" field is that some implementation that receives a version number in the "v" field with a major number higher than it understands is supposed to close the connection, since it runs a risk of misinterpreting the contents of messages. So, in your example, if you sent "v=12.23" to my implementation that only spoke major version 1, I would reject it, even though you also support major version 1. That's why I was thinking it should contain the smallest major version being advertised.
> 
> The minor version is obviously less useful in this context, since they're defined to be backwards and forwards compatible, but it's more useful to know the highest minor version supported than some random minor version, as it indicates the full feature set that is supported. 
> The reason it's less useful is that the value can be parsed out of the <version> list.
> 
>>> BLOCKING: The state machines in section 6 and its subsections don't 
>>> have transitions for all possible messages that could arrive in a 
>>> state. This can cause interop issues. Please add text that clearly 
>>> indicates whether such messages do or do not cause a transition.
>>> (This might be as simple as "messages not shown for a state do not 
>>> cause the state to change," but only if you carefully check that this 
>>> is true -- for example, what should an MP state machine do do if it 
>>> gets a "CONF + ACK" in the state "WAIT FOR CONF"?)
>>> 
>> [SPR] In the specific case you mention, I would expect that a 302 is 
>> generated (Invalid Parameter), since the incoming CONF should not 
>> contain the <ack> element. This said, I see your point related to the 
>> potential lack of some edges in the state machine graph. This point is 
>> obviously also related to the “timeout” issue you raised before.
>> Different ways of managing timeouts will definitely have different 
>> impacts on the current state machines.
>> We will defer the modification of the diagrams in question to the next 
>> revision round. In the meanwhile, we’ll give a thought to potential 
>> further unexpected transactions to be taken into account.
> 
> Based on my analysis above, I think the state machines will get a bit simpler, as they won't have retransmission counts any longer.
> 
>> 
>>> Section 6.2 describes a state machine that starts in a state called 
>>> "WAIT FOR ADV." This state does not appear to be timer-supervised, 
>>> meaning that implementations of this state machine can stay in this 
>>> state literally forever. Is that the intention?
>>> 
>> [SPR] The idea would be to open a socket and start listening for 
>> incoming advertisements. Isn’t that OK?
> 
> At some point, I think you're going to want to conclude that the CLUE data channel isn't going to get set up correctly, and signal that to the application in some way. This is probably on the order of a minute or so.
> 
>>> Section 8 indicates that extensions need to indicate "the standard 
>>> version of the protocol the extension refers to." Given that there is 
>>> compatibility within a major version of a protocol, I think this 
>>> means to say "the major standard version of the protocol that the 
>>> extension refers to.”
>>> 
>> [SPR] My extension might be associated with version 3.6 of the 
>> protocol. This means that it will need to be supported by all versions 
>> higher than that (3.7 onwards). Though, this does not entail that 
>> versions 3.5 and lower should understand it. Right?
> 
> I'm a bit confused. Is the notion here that extensions, once introduced, always become a mandatory-to-implement part of the protocol? If that's the case, it's not clear why you would need to advertise them explicitly, since support would be implied by the version number itself. 
> (This isn't unheard of: it's how NFS has historically handled version numbers, for example.) However, I didn't get the impression that CLUE protocol extensions were handled in such a way. If this is the intention, there needs to be a lot more text that makes this MTI aspect of extensions clear, and a reconsideration of why you're signaling them individually rather than inferring them from the version number.
> 
>>> Section 8 contains schema that contains a <version> element. It 
>>> should be clarified that this is the *protocol* version, not the
>>> *extension* version (or, if it's the extension version, that needs to 
>>> be spelled out too, but I think you'll need a new namespace for 
>>> that...?)
>>> 
>> [SPR] I thought this was clear from the third bullet point in the list:
>> 
>> - the standard version of the protocol the extension refers to.
>> 
>> Isn’t this enough? In such a case, what would you suggest to add?
> 
> Ah, I see. It wasn't clear that these three bullets corresponded to the elements in the XML. Perhaps make this clearer with something like "The values of the fields of the <extension> element take the following values:" (and maybe move it closer to the actual XML).
> 
>> 
>>> The remainder of this comment is non-blocking: I also find the 
>>> "SHOULD" in this section to be highly perplexing. Can you explain the 
>>> rationale behind requiring schema, but not requiring any description 
>>> of what the schema *means?)
>>> 
>> [SPR] It is like coding without adding comments/documentation. 
>> Documentation is most welcome, but not mandatory. Don’t you agree?
> 
> I don't. It seems more like committing a header file without the corresponding implementation file. Just because people know what things are called doesn't mean that they can handle them. For interop, you're going to need some description of what the fields mean and how the messages are handled.
> 
>>> 
>>> The schema in section 9 contains:
>>> 
>>> <xs:import namespace="urn:ietf:params:xml:ns:clue-info"
>>> schemaLocation="data-model-schema-17.xsd"/>
>>> 
>>> If you do not intend to bake this "-17" into the document (and I 
>>> can't imagine you do), please add an RFC editor note to change it to 
>>> something else upon publication.
>>> 
>> [SPR] Done. I am not very familiar with RFC editor notes, indeed. Here 
>> is what we added at the beginning of section 9:
>> 
>> "NOTE TO THE RFC-Editor: Please replace "data-model-schema-17.xsd" 
>> with the right schema location for the CLUE data model schema document 
>> (which is still to be defined at the time of this writing) in this 
>> section prior to publication as an RFC.”
>> 
>> Does this work for you?
> 
> That's perfect. Thanks!
> 
>>> Section 10 in general: While it does consume a lot of space, I don't 
>>> think that defining six rather different message types and then 
>>> showing only *one* type in the examples is very illustrative of the 
>>> protocol. I would *STRONGLY* suggest adding at least a response 
>>> message, and ideally the example section should contain at least one 
>>> example for each of the six message types.
>>> 
>> [SPR] Will do in the (hopefully final) revision of the document. Let’s 
>> first double check that the proposed modifications are OK.
> 
> Sounds good.
> 
>> 
>>> Section 12.1 has a strange double-double quote around the URN name, 
>>> and section 12.3 repeats this for the MIME type.
>>> 
>> [SPR] I don’t understand this. Can you please clarify?
>> 
> 
> Replace
> 
>    ""urn:ietf:params:xml:ns:clue-protocol"".
> 
> with
> 
>    "urn:ietf:params:xml:ns:clue-protocol".
> 
> 
> And then replace
> 
>    This section registers the ""application/clue+xml"" MIME type.
> 
> with
> 
>    This section registers the "application/clue+xml" MIME type.
> 
> 
> Thanks!
> 
> /a
> 
> _______________________________________________
> clue mailing list
> clue@ietf.org
> https://www.ietf.org/mailman/listinfo/clue