Re: [tcpm] Some comments on: draft-kuehlewind-tcpm-accecn-reqs

"Scheffenegger, Richard" <rs@netapp.com> Sat, 08 March 2014 23:08 UTC

From: "Scheffenegger, Richard" <rs@netapp.com>
To: "gorry@erg.abdn.ac.uk" <gorry@erg.abdn.ac.uk>, "draft-kuehlewind-tcpm-accecn-reqs@tools.ietf.org" <draft-kuehlewind-tcpm-accecn-reqs@tools.ietf.org>, "tcpm@ietf.org" <tcpm@ietf.org>
Date: Sat, 08 Mar 2014 23:08:23 +0000
Message-ID: <012C3117EDDB3C4781FD802A8C27DD4F260D7F08@SACEXCMBX02-PRD.hq.netapp.com>
References: <531AF534.6050209@erg.abdn.ac.uk>
In-Reply-To: <531AF534.6050209@erg.abdn.ac.uk>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/vdAIpUTQm-CqzgGG-Spu67kaH0I
Cc: Brian Trammell <trammell@tik.ee.ethz.ch>
Subject: Re: [tcpm] Some comments on: draft-kuehlewind-tcpm-accecn-reqs

Hi Gorry,

Thanks for the review!


> (1) I am not clear on the intended scope - is this proposing an update for
> the internet - for cones for the DC environment, all?

The scope was to include all scenarios. (I cannot decipher what you mean by "cones"; I assume you mean "core [internet]" vs. "the [edge and core] internet".)

The requirements really are mostly a list of what we (the authors) could come up with that would have some applicability somewhere. It is up to any specific mechanism intended for deployment to accommodate these requirements as well as it can.

> (2) The requirements listed in section 4  to figure out what is needed,
> desirable or just nice to have. The document could use requirements
> language if needed, but even if not, I really think it should call out
> more clearly what is required.

If we use RFC 2119 language in there, we (the WG) need to agree on a set of *minimum* requirements. And yes, that list will probably be much shorter and more concise than what is now in section 4.





> Detailed comments follow:
> 
> Scope:
> 
> (above point 1) I see the abstract says "Recent new TCP mechanisms (like
> ConEx or DCTCP) need more accurate ECN feedback in the case where more
> than one marking is received in one RTT." - which is fine, but this leaves
> open whether the methods proposed are for used in controlled environments,
> within domains such as Conex or the general Internet.
> (This seems to be a perhaps unintentional change from what I looked at in
> rev -03).
> 
> (above point 1) I really think the work should target methods that *can*
> be deployed in the general Internet. I'd suggest this actually becomes a
> requirement.

Agreed. We probably want to put something very much like that sentence in the abstract.

The new scheme should work at least as well as what we have, but ideally much better.

 
> Section1:
> 
> (above point 1), I read "his document lists requirements for a robust and
> interoperable more accurate TCP/ECN feedback protocol that all
> implementations of new TCP extensions, like ConEx and/or DCTCP, can use.
> " - but from the current I-D, I read the scope as can be used by these
> two, but I think it needs to be applicable to general Internet deployment.
> Please clarify the scope here also.
> 
> Section 2:
> 
> Disagree with text: "However, as the ECN Nonce is a separate extension to
> ECN, even if a sender tries to protect itself with the ECN Nonce, any
> receiver wishing to conceal marked packets only has to pretend not to
> support the ECN Nonce and simply does not provide any nonce sum feedback."
>
> - To me this argument is ridiculous (and I have said so), and I think we
> should not perpetuate this argument!
>
> [[Here is why:  if a sender were to require a NS, then a receiver would
> need to provide that or expect the sender would not continue use of ECN,
> or at least limit is trust of ECN.  So the wording could be improved, but
> RFC3540 says:  "If the receiver has never sent a non-zero nonce sum, the
> sender can infer that the receiver does not understand the nonce, and rate
> limit the connection, place it in a lower-priority queue, or cease setting
> ECT in outgoing segments."]]
> 
> - Saying the nonce sum has a second problem in that it only works if it is
> deployed and actually used seems obvious.
> 
> Other points that may help the argument: The WG should look for sound
> reasons to obsolete specs, saying the NS was not significantly (or not
> known to be) deployed however, does seem true to me. Saying there are
> equivalent methods may be true, and I think this is presented. Saying that
> the NS only validates non-congestion and hence provides much less useful
> protection for the more frequent marking of immediate-ECN is also true.

We will try to come up with different text that captures this in a few words.


> Section 3:
> 
> This section confuses me. I find no motivation for use beyond the two use-
> cases.
> 
> It seems to be a list of two use-cases, I assume these are *examples* of
> use, rather than required use-cases to consider?
> 
> It speaks of DCTCP (which I think is currently only defined for controlled
> environments) and Conex (specified for a specific domain).

As discussed during the DCTCP presentation, the mechanism could have more widespread (Internet-wide) applicability if done right (sender reaction, receiver behavior, ECN semantics, network AQM settings *AND* the TCP ECN feedback signal).

> Reiterating (point 1), I'd hoped these were examples, but that the actual
> WG goal was now to define a general purpose method for feedback.
> Or is it?

That was our aim: to eventually have a signaling scheme that can cater for the mentioned examples (we need to describe them as such), while also enabling future algorithms that we don't know about today.


 
> Ordering of examples: How much deployment experience do we have  of Conex?
> I'm not arguing to  remove this use-case, but wonder whether this is best
> to be placed first in the examples of use?

The ordering was mostly based on what has been done in the IETF so far: ConEx has received much more formal attention than DCTCP. Alternatively, I could claim that the list is alphabetical, to make clear that the order carries no further significance... :) (but that isn't really how the order came about)

 
> Is there an implied use-case of standard TCP and a possible use case of an
> updated ECN for TCP? - I think this is only hinted.

Definitely; we just didn't mention it explicitly.


 
> Section 4:
> 
> Usage of RFC 2119: Personally, I really do not like use of RFC2119
> keywords in this way: "This leads to the following requirements, which
> MUST be discussed for any proposed more accurate ECN feedback scheme:" -
> Mandating discussion is not helpful.

Slightly reworded.

> (point 2 above): If we have requirements as the title suggests, then I
> personally would encourage you to list them as requirements using RFC
> 2119 language - but I don't mind seeing no RFC 2119 keywords if the WG
> wants this, as long as I can see the relative importance of each topic.
> At the moment I can't see this.

We definitely need more WG discussion around this, and also around the importance of each aspect of the various requirements. I'm trying to do something along those lines in the next version, to allow comparison.

 
> Comments on specific topics:
> 
> Resilience
> 
> "Moreover, delayed ACK are mostly used with TCP.  That means in most cases
> only every second data packets triggers an ACK."
> - Isn't it one ACK for every 2 full-sized MSS of data? The ACK rate per
> segment can be much less if the sender disables the Nagle algorithm. I'd
> really like a requirement that this mechanism is robust for use with
> "thin" streams that disable Nagle - and to understand what this means, and
> if we simply require interop in this case, or whether we require Acc ECN
> to work accurately.

My recollection of both the Linux and BSD stacks is that they actually ACK every other data segment, not when MSS+1 bytes are received. Checking... yup, it still looks that way in FreeBSD :) This is more restrictive than RFC 1122, section 4.2.3.2, though, which allows what you state.
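
To make the difference between the two delayed-ACK policies concrete, here is a minimal Python sketch. It is not taken from any stack: the function names and the MSS of 1460 bytes are made up for illustration, and the delayed-ACK timer is deliberately left out, so the numbers only show the relative ACK rate, not what a real stack would emit.

```python
def acks_by_segment_count(segment_sizes, every=2):
    """ACK every `every`-th data segment, regardless of its size.

    Simplified model of the per-segment behaviour recalled above
    (assumption: no delayed-ACK timer, no loss, no reordering).
    """
    acks = 0
    pending = 0
    for _ in segment_sizes:
        pending += 1
        if pending == every:
            acks += 1
            pending = 0
    return acks


def acks_by_byte_count(segment_sizes, mss=1460):
    """ACK once at least 2*MSS bytes of unacknowledged data have arrived.

    Simplified model of the byte-based reading (same assumptions as above).
    """
    acks = 0
    pending_bytes = 0
    for size in segment_sizes:
        pending_bytes += size
        if pending_bytes >= 2 * mss:
            acks += 1
            pending_bytes = 0
    return acks


# A "thin" stream: 100 small segments of 100 bytes each (Nagle disabled).
thin = [100] * 100
# A bulk stream: 100 full-sized segments.
bulk = [1460] * 100

thin_seg = acks_by_segment_count(thin)
thin_byte = acks_by_byte_count(thin)
bulk_seg = acks_by_segment_count(bulk)
bulk_byte = acks_by_byte_count(bulk)
```

For the bulk stream both policies produce the same number of ACKs; for the thin stream the byte-based policy produces far fewer, which is the case where a feedback scheme has to cope with one ACK covering many data segments.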


> - Some reference is probably needed to explain the variety of TCP ACK
> mangling - e.g. RFC 3449 illustrates at the least the breadth of
> mechanisms out there.
> 
> "Also, a more accurate feedback protocol should still work if delayed ACKs
> covered more than two packets."
> - In IETF understanding is this really a "should" - I see it is lower
> case, but it is best not to be ambiguous? does this mean it may not work,
> or may not be accurate or may cause the TCP receiver to reset (I hope
> not)?

I think the phrase to tighten here is "should still work", to mean "should still provide more accurate feedback than classic ECN ...", right?


 
> - Is this resilience to loss?

Correct. Apparently we are missing an initial sentence here.


> - Are there implied ordering requirements for the path? - i.e. I'd assume
> we want a system that does is not impacted by reordering of ACKs, even
> when the ACK'ed sequence number does not increase, or do we assume an
> ordered path?

Good catch; the (my) implied assumption was that the ECN signal is interpreted *after* the segment acceptance check. For pure ACKs, that means out-of-order ACKs are not processed.
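
A rough Python sketch of that assumption, purely illustrative: the state layout, the names `process_ack` and `ecn_counter`, and the absence of sequence-number wraparound handling are all my simplifications, not part of any proposal.

```python
def process_ack(state, ack_no, ecn_counter):
    """Apply ECN feedback from a pure ACK only if the ACK is acceptable.

    Returns True if the ECN feedback was used, False if the ACK was ignored.
    Assumption: plain integer comparison, no 32-bit wraparound.
    """
    # Acceptance check: ignore ACKs older than what is already acknowledged,
    # and ACKs for data that was never sent.
    if ack_no < state["snd_una"] or ack_no > state["snd_nxt"]:
        return False
    # Only now is the ECN signal interpreted.
    state["snd_una"] = ack_no
    state["ecn_counter"] = ecn_counter
    return True


state = {"snd_una": 1000, "snd_nxt": 5000, "ecn_counter": 0}
used_first = process_ack(state, 3000, 2)   # in order: feedback is used
used_late = process_ack(state, 2000, 1)    # overtaken ACK: feedback is ignored
```

Note that this check cannot detect reordering among ACKs that carry the same acknowledgment number, so that case would need separate treatment.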

 
> - I think it worth also noting a  *separate topic* for the need for
> robustness to known middle boxes as a desirable feature? ... I think this is
> added in "Backward and forward compatibility" - to me such a section is
> usually expect to see as a protocol evolution issue, rather than a middle
> box issue.
> 
> Timeliness
> 
> "A CE mark can be induced by a network node on the transmission path and
> is then echoed by the receiver in the TCP ACK. "
> - I think CE can also be set by the sending host, please update.

Will do.
 
> "Thus when this information arrives at the sender, it is naturally already
> about one RTT old."
> - not really, it is at least 1/2 RTT old - depending on where loss was
> encountered.

Well, if we want really precise wording, would this sound better: "about one backward delay plus a fraction of the forward delay old"? The backward delay can be vastly different from the forward delay (e.g. asymmetric satellite links with a ground return link), so "1/2 RTT" is as skewed as saying "about one RTT"...
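
As a back-of-the-envelope illustration of that wording, here is a small Python sketch. The function name, the `mark_position` parameter (the fraction of the forward delay that has elapsed when the packet gets marked), and the example delays are all assumptions for illustration; ACK delay at the receiver is ignored.

```python
def feedback_age_ms(forward_delay_ms, backward_delay_ms, mark_position):
    """Age of a CE mark when its echo reaches the sender.

    mark_position: 0.0 means marked at the sender, 1.0 means marked just
    before the receiver (assumed fraction of the forward delay elapsed).
    """
    remaining_forward = (1.0 - mark_position) * forward_delay_ms
    return remaining_forward + backward_delay_ms


# Assumed asymmetric path: 300 ms forward (satellite), 50 ms backward (ground).
forward, backward = 300, 50
rtt = forward + backward

age_mid = feedback_age_ms(forward, backward, 0.5)   # marked halfway along the path
age_src = feedback_age_ms(forward, backward, 0.0)   # marked at the sending host
```

With these numbers the mark is 200 ms old when marked mid-path, which is neither RTT/2 (175 ms) nor one RTT (350 ms); only a mark set at the sending host is a full RTT old.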

 
> "A RFC5681 TCP sender without ConEx:"
>   ^^
> /A/An/ - is it better
> 

Thanks,


> Integrity
> - This starts "it should be possible" and later mentions requirements, is
> this "a sender SHOULD implement..." (required) or "senders MAY implement ..."
> (can) or "a SENDER should implement and MAY enable"
> (required feature, optional deployment) - please clarify and avoid
> possible, etc.
> 
> "Whether a sender should enforce when it detects wrong feedback
> information, and what kind of enforcement it should apply, are policy
> issues that need not be specified as part of more accurate ECN feedback
> scheme."
> - To me this seems like a clear recommendation not to do this - are these
> really policy decisions, are you saying that the IETF doesn't need to do
> this. I am not sure I agree, if the sender state machine can detect and
> react to bogus feedback this maybe should be considered, and I think it
> would be desirable to know that such mechanisms can be used with any
> selected approach.

The exact reaction was also left out of RFC3540, unfortunately. This document focuses on the requirements of the signaling protocol, whereas the policy would be enforced by e.g. a congestion control algorithm.

> 
> - If there is policy, be clear what sort of policy - per stack, per
> socket, per interface, per provisioning domain...

I would lean towards a per-stack granularity, i.e. limiting cwnd, reacting overly conservatively to loss, etc. I think this discussion belongs in a CC draft that makes use of the signals coming out of a more accurate ECN feedback scheme. We could perhaps define these signals in the requirements draft, though.

> 
> Accuracy
> 
> "However, assuming the sender marks all data packets as ECN-capable and
> uses the default setting of ECT(0),"
> This may just be wording: - Is use of only ECT(0) assumed somewhere in the
> RFC series, (mate in PCN?) ? RFC 4774 is a BCP specifies different
> semantics can be used for the two ECT code points for a different DSCP, to
> me this indicates that transports should at least be robust to this, and
> I'd expect that TCP could be used in this case. Is this a requirement? (I
> think it should be).

RFC3168 states: "When only one ECT codepoint
   is needed by a sender for all packets sent on a TCP connection,
   ECT(0) SHOULD be used."


The point this paragraph tries to make is that the sender knows which packets were sent as non-ECT, ECT(0), ECT(1) and CE. So if one of ECT(0) or ECT(1) is the default, the receiver only needs to reflect the other two for the sender to account accurately for the state in which the ECT segments were received. A scheme could therefore swap ECT(0)/ECT(1) here. However, having yet-different semantics between ECT(0) and ECT(1) would be detrimental to the case for breaking the linkage of CE mark == loss.

I added the reference to RFC 3168 here, though.






> The clause  continues to say feedback of CE and ECT(1) is sufficient.
> So wouldn't this also be the case if both ECT(0) and ECT(1) were used?
> i.e. maybe it works fine for other DSCP semantics, by feeding back CE and
> ECT(1). Please clarify.

Let me try again: the sender knows the total number of segments it sent, #total = (#ECT0 + #ECT1 + #CE). To deduce these three values as seen at the receiver, the signal must convey at least two of them back to the sender, which can then calculate the third locally.

How about:

          As the sender can keep account of the transmitted segments 
          with any of the three ECN codepoints, conveying any two 
          of these back to the sender is sufficient for it to 
          reconstruct the third as observed by the receiver.
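
The arithmetic behind that sentence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: every sent segment is ECN-capable, none are lost, and the function and key names (`reconstruct_third`, `ect0`, `ect1`, `ce`) are made up here.

```python
def reconstruct_third(sent_total, fed_back):
    """Derive the missing codepoint count from the two that were fed back.

    fed_back: dict holding exactly two of the keys 'ect0', 'ect1', 'ce',
    each mapped to the count observed by the receiver.
    Assumption: all `sent_total` segments arrived and all were ECN-capable.
    """
    names = {"ect0", "ect1", "ce"}
    if len(fed_back) != 2 or not set(fed_back) <= names:
        raise ValueError("need exactly two of 'ect0', 'ect1', 'ce'")
    missing = (names - set(fed_back)).pop()
    value = sent_total - sum(fed_back.values())
    if value < 0:
        raise ValueError("feedback exceeds the number of segments sent")
    result = dict(fed_back)
    result[missing] = value
    return result


# The sender transmitted 100 segments, all as ECT(0); the receiver reports
# that it saw 7 CE-marked and 0 ECT(1) segments.
observed = reconstruct_third(100, {"ce": 7, "ect1": 0})
```

Here the sender concludes that 93 segments arrived still carrying ECT(0). With loss in the picture, the total would have to be the number of segments known to be delivered rather than the number sent.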
          


> Complexity:
> 
> Maybe this is in the wrong section: "Furthermore, the receiver should not
> make assumptions about the mechanism that was used to set the markings nor
> about any interpretation or reaction to the congestion signal. The
> receiver only needs to faithfully reflect congestion information back to
> the sender."
> - This does not seem like a complexity requirement to me. It seems like an
> actual requirement. I'd have expected stronger language against the
> network not interpreting the meaning of the reaction. Could this be "MUST
> NOT"? (maybe needs to be in a middle box section).

Moved.

> Backward and forward compatibility
> 
> /it should to be/it should be/
> 
> Middleboxes (see earlier)
> 
> "A more accurate ECN feedback extension should aim to be able to traverse
> most existing middleboxes."
> - should ... aim .. most ... existing ... This seems full of questions: Why not
> "should aim to traverse most middle boxes"? ... but see note above, is this
> really backwards compatibility?
> 
> I'd love to see more here that motivates why the methods in "draft-
> kuehlewind-tcpm-ecn-fallback" need to be considered. I think this it is
> really important that any ECN method works through commonly deployed
> middle boxes and also that there are at least tools to verify that this
> working is correct, even if we do not directly mandate methods to probe
> and validate the path (which I also think we should seriously consider).
> 
> I have found a  separate section on middle boxes that uses the words
> "firewall and NAT" is generally helpful to ensure that people from the
> community actually find advice and know it is directed to them. If they
> don't like these requirements they should tell us before we do the work.


I've placed these keywords in the relevant paragraph, so that a grep spots them...


> 
> Section 5:
> 
> "In case of " - insert "the"
> 
> "highly vulnerable to ACK loss." - how is this different to being simply
> vulnerable? - can you explain "highly"?
> 
> "still highly ambiguous" - as above, it seems ambiguous?
> 
> "A couple of coding schemes"
> - is this a couple as in a linked set or as in two, please clarify
> language.
> 
> "Urgent Pointer field" - please refer to the recent TCPM RFC on use of
> URG.

Added a reference to RFC 6093.

> "but still not ideal." - I don't see ideal as a goal, consider rewording?
> 
> I found the last para of 5.1 is a hard read, unless the reader knows what
> is being said already.


Which may be a good thing :) Will try to get this paragraph split and straightened.

> Sect 5.2:
> 
> "Alternatively, the receiver could use bits in the Urgent Pointer field to
> signal more bits of its congestion signal counter, but only whenever it
> does not set the Urgent Flag."
> - I'd urge rewording this, far too often I see sentences from RFCs
> replicated in other places. In place of this can we say that this is NOT
> permitted, but a new method could standardise use of these bits...

Reworked.

 
> Sect 5.3:
> 
> "and SCTP counts the number" - probably this needs to be cited as a
> "proposal", since this draft has not to date been progressed.
> 
> "this option would need to be carried by most or all ACKs" - can you
> explain why? even in times of non-congestion?
> 
> Sect 8
> 
> I would expect to see some discussion that states that ECN feedback should
> only be used if the other information indicates the congestion was on-path
> - i.e. feedback MUST use normal TCP sequence number check techniques to
> verify the CE-marked packet was a part of the current flow, and similarly
> ECN marking feedback is only accepted on valid ACKs.
> 
> I'd really like this section to point to the mechanisms that may be used
> to validate that the remote endpoint responds appropriately to ECN. It is
> relatively easy to do sender-side changes, but knowing the receiver
> "correctly" implements a function is much harder.
> 
> 
> I am happy to review again, this is heading in a good direction.
> 


Lots of good feedback, thank you very much!

Richard Scheffenegger