Re: [tcpm] Some comments on: draft-kuehlewind-tcpm-accecn-reqs

gorry@erg.abdn.ac.uk Sun, 09 March 2014 08:26 UTC

In-Reply-To: <012C3117EDDB3C4781FD802A8C27DD4F260D7F08@SACEXCMBX02-PRD.hq.netapp.com>
References: <531AF534.6050209@erg.abdn.ac.uk> <012C3117EDDB3C4781FD802A8C27DD4F260D7F08@SACEXCMBX02-PRD.hq.netapp.com>
Date: Sun, 09 Mar 2014 08:26:35 -0000
From: gorry@erg.abdn.ac.uk
To: "Scheffenegger, Richard" <rs@netapp.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/tcpm/99Px8zZoOvUHg6IupTcLL9tCjlE
Cc: "draft-kuehlewind-tcpm-accecn-reqs@tools.ietf.org" <draft-kuehlewind-tcpm-accecn-reqs@tools.ietf.org>, "tcpm@ietf.org" <tcpm@ietf.org>, Brian Trammell <trammell@tik.ee.ethz.ch>
Subject: Re: [tcpm] Some comments on: draft-kuehlewind-tcpm-accecn-reqs

A few responses in-line. It sounds like the new revision will address
most/all of my comments.


> Hi Gorry,
>
> Thanks for the review!
>
>
>> (1) I am not clear on the intended scope - is this proposing an update
>> for the internet - for cones for the DC environment, all?
>
> The scope was to include all scenarios (even though I can not decipher
> what you refer to with "cones", I assume you mean "core [internet]" vs.
> "the [edge and core] internet"...
>
Good, I agree with this.
/cones/ was a typo.

> The requirements really are mostly a list of what we (authors) could come
> up with, what would have some applicability somewhere. It is up to a
> specific mechanism that is supposed to be deployed anywhere to accommodate
> these requirements as well as it can.
>
Excellent, but then the intro and examples should state this.

>> (2) I found it hard to use the requirements listed in section 4 to figure
>> out what is needed, desirable or just nice to have. The document could
>> use requirements language if needed, but even if not, I really think it
>> should call out more clearly what is required.
>
> If we use 2119 language in there, we (WG) need to agree upon a set of
> *minimum* requirements that would be necessary. And yes, that list will
> probably be much shorter and more concise than what is now in section
> 4.
>
I agree, the choice of using RFC 2119 requirements language is up to the WG.

>> Detailed comments follow:
>>
>> Scope:
>>
>> (above point 1) I see the abstract says "Recent new TCP mechanisms (like
>> ConEx or DCTCP) need more accurate ECN feedback in the case where more
>> than one marking is received in one RTT." - which is fine, but this
>> leaves open whether the methods proposed are for use in controlled
>> environments, within domains such as Conex, or the general Internet.
>> (This seems to be a perhaps unintentional change from what I looked at
>> in rev -03).
>>
>> (above point 1) I really think the work should target methods that *can*
>> be deployed in the general Internet. I'd suggest this actually becomes a
>> requirement.
>
> Agreed. We probably want to put something very much like that sentence in
> the abstract.
>
> The new scheme should work at least as well as what we have, but ideally
> much better.
>
OK
>
>> Section1:
>>
>> (above point 1), I read "This document lists requirements for a robust
>> and interoperable more accurate TCP/ECN feedback protocol that all
>> implementations of new TCP extensions, like ConEx and/or DCTCP, can use."
>> - but from the current I-D, I read the scope as usable by these two,
>> whereas I think it needs to be applicable to general Internet deployment.
>> Please clarify the scope here also.
>>
>> Section 2:
>>
>> Disagree with text: "However, as the ECN Nonce is a separate extension
>> to ECN, even if a sender tries to protect itself with the ECN Nonce, any
>> receiver wishing to conceal marked packets only has to pretend not to
>> support the ECN Nonce and simply does not provide any nonce sum
>> feedback."
>>
>> - To me this argument is ridiculous (and I have said so), and I think we
>> should not perpetuate this argument!
>>
>> [[Here is why:  if a sender were to require a NS, then a receiver would
>> need to provide that or expect the sender would not continue use of ECN,
>> or at least limit its trust of ECN.  So the wording could be improved,
>> but RFC3540 says:  "If the receiver has never sent a non-zero nonce sum,
>> the sender can infer that the receiver does not understand the nonce, and
>> rate limit the connection, place it in a lower-priority queue, or cease
>> setting ECT in outgoing segments."]]
>>
>> - Saying the nonce sum has a second problem in that it only works if it
>> is deployed and actually used seems obvious.
>>
>> Other points that may help the argument: The WG should look for sound
>> reasons to obsolete specs; saying the NS was not significantly (or not
>> known to be) deployed does, however, seem true to me. Saying there are
>> equivalent methods may be true, and I think this is presented. Saying
>> that the NS only validates non-congestion and hence provides much less
>> useful protection for the more frequent marking of immediate-ECN is also
>> true.
>
> We will try to come up with different text that essentially captures that
> in a few words.
>
OK

>
>> Section 3:
>>
>> This section confuses me. I find no motivation for use beyond the two
>> use-cases.
>>
>> It seems to be a list of two use-cases, I assume these are *examples* of
>> use, rather than required use-cases to consider?
>>
>> It speaks of DCTCP (which I think is currently only defined for
>> controlled environments) and Conex (specified for a specific domain).
>
> As discussed during the DCTCP presentation, the mechanism could have more
> widespread (Internet-wide) applicability, if done right (sender reaction,
> receiver behavior, ecn semantic, network aqm settings *AND* TCP ECN
> feedback signal).
>
Agree, text saying this would be good.

>> Reiterating (point 1), I'd hoped these were examples, but that the
>> actual WG goal was now to define a general purpose method for feedback.
>> Or is it?
>
> That was our aim, to eventually have a signaling scheme that could cater
> for the mentioned examples (we need to describe them as that), but also
> enabling future algorithms which we don't know about today.
>
>
OK
>
>> Ordering of examples: How much deployment experience do we have of
>> Conex? I'm not arguing to remove this use-case, but wonder whether this
>> is best placed first in the examples of use?
>
> The ordering was mostly based on what has been done in IETF so far; Conex
> received much more formal attention than DCTCP so far; alternatively, I
> could claim that the list was done alphabetically to make clear that the
> order of these doesn't have any more significance... :) (but that isn't
> really how the order happened to end up)
>
OK - I may not have noted this if the section said here are 3 example use
cases: Conex, DCTCP, and a future to-be-defined IETF mechanism for the
general Internet.

>
>> Is there an implied use-case of standard TCP and a possible use case of
>> an updated ECN for TCP? - I think this is only hinted at.
>
> Definitely, we didn't explicitly mention it though.
>
>
>
>> Section 4:
>>
>> Usage of RFC 2119: Personally, I really do not like use of RFC2119
>> keywords in this way: "This leads to the following requirements, which
>> MUST be discussed for any proposed more accurate ECN feedback scheme:" -
>> Mandating discussion is not helpful.
>
> Slightly reworded.
>
>> (point 2 above): If we have requirements as the title suggests, then I
>> personally would encourage you to list them as requirements using RFC
>> 2119 language - but I don't mind seeing no RFC 2119 keywords if the WG
>> wants this, as long as I can see the relative importance of each topic.
>> At the moment I can't see this.
>
> We definitely need more WG discussion around this, and also around the
> importance of each aspect of the various requirements. I'm trying to do
> something along those lines, in the next version to allow comparison.
>
>
>> Comments on specific topics:
>>
>> Resilience
>>
>> "Moreover, delayed ACK are mostly used with TCP.  That means in most
>> cases only every second data packets triggers an ACK."
>> - Isn't it one ACK for every 2 full-sized MSS of data? The ACK rate per
>> segment can be much less if the sender disables the Nagle algorithm. I'd
>> really like a requirement that this mechanism is robust for use with
>> "thin" streams that disable Nagle - and to understand what this means,
>> and if we simply require interop in this case, or whether we require
>> Acc ECN to work accurately.
>
> My recollection of both Linux and BSD stacks is, that they actually
> operate on every other data segment, not when MSS+1 bytes are received.
> Checking - Yupp, looks that way still in freebsd :) This is more
> restrictive than RFC1122 / 4.2.3.2 though, which allows what you state.
>
And sending one ACK every other segment may be good for this proposed
mechanism, but then what happens if you have a receiver that uses the
definition in the RFC series? Should this new mechanism *always* ACK at
least every other segment when using AccECN, or what is
required/recommended?
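
To make the difference concrete, here is a minimal sketch comparing the two
delayed-ACK triggers discussed above: one ACK per two data segments (the
behaviour described for the Linux/FreeBSD stacks) versus one ACK per two
full-sized MSS of data. The function names, the 1460-byte MSS and the
100-byte "thin stream" segments are illustrative assumptions, and the
delayed-ACK timer is deliberately left out.

```python
def acks_by_segment_count(segment_sizes, every=2):
    """Count ACKs when the receiver ACKs every `every`-th data segment,
    regardless of segment size. The delayed-ACK timer is ignored."""
    return len(segment_sizes) // every


def acks_by_byte_threshold(segment_sizes, mss):
    """Count ACKs when the receiver ACKs once at least 2*MSS bytes of
    not-yet-acknowledged data have arrived. The delayed-ACK timer is ignored."""
    acks = 0
    pending = 0
    for size in segment_sizes:
        pending += size
        if pending >= 2 * mss:
            acks += 1
            pending = 0
    return acks


MSS = 1460                      # assumed MSS, for illustration only
bulk = [MSS] * 20               # full-sized segments
thin = [100] * 20               # small segments, e.g. Nagle disabled

print(acks_by_segment_count(bulk), acks_by_byte_threshold(bulk, MSS))
print(acks_by_segment_count(thin), acks_by_byte_threshold(thin, MSS))
```

For full-sized segments the two rules agree. For the thin stream the
byte-threshold rule never triggers on its own in this sketch, so each
(timer-driven) ACK would have to carry feedback for many more packets.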

>
>> - Some reference is probably needed to explain the variety of TCP ACK
>> mangling - e.g. RFC 3449 illustrates at the least the breadth of
>> mechanisms out there.
>>
>> "Also, a more accurate feedback protocol should still work if delayed
>> ACKs covered more than two packets."
>> - In IETF understanding is this really a "should" - I see it is lower
>> case, but it is best not to be ambiguous? Does this mean it may not
>> work, or may not be accurate, or may cause the TCP receiver to reset (I
>> hope not)?
>
> I think the phrase to tighten here is "should still work" to mean "should
> still provide more accurate feedback than classic ECN ...", right?
>
That works for me, require a specific receiver behaviour.
>
>
>> - Is this resilience to loss?
>
> Correct. Apparently we are missing an initial sentence here.
>
OK
>
>> - Are there implied ordering requirements for the path? - i.e. I'd
>> assume we want a system that is not impacted by reordering of ACKs, even
>> when the ACK'ed sequence number does not increase, or do we assume an
>> ordered path?
>
> Good catch; the (my) implied assumption was that the ECN signal is
> interpreted *after* the segment acceptance check. For pure ACKs that
> means that out-of-order ACKs are not processed.
>
>
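
One way to read the assumption above is sketched below: ECN feedback carried
on an ACK is only counted once the ACK passes the RFC 793 acceptability test
(SND.UNA < SEG.ACK =< SND.NXT, in 32-bit sequence space), so a reordered,
older ACK is ignored. The class, its fields and the `ce_delta` parameter are
hypothetical names for illustration, and a real design would still need to
decide how duplicate ACKs are treated.

```python
SEQ_MOD = 1 << 32


def seq_lt(a, b):
    """True if sequence number a is strictly before b (32-bit wraparound)."""
    return a != b and ((b - a) % SEQ_MOD) < (1 << 31)


def seq_leq(a, b):
    """True if sequence number a is before or equal to b."""
    return a == b or seq_lt(a, b)


def ack_is_acceptable(snd_una, seg_ack, snd_nxt):
    """RFC 793 test for an ACK that acknowledges new data."""
    return seq_lt(snd_una, seg_ack) and seq_leq(seg_ack, snd_nxt)


class SenderState:
    """Hypothetical sender-side bookkeeping, for illustration only."""

    def __init__(self, snd_una, snd_nxt):
        self.snd_una = snd_una
        self.snd_nxt = snd_nxt
        self.ce_seen = 0

    def on_ack(self, seg_ack, ce_delta):
        """Count the ECN feedback only if the ACK is acceptable."""
        if not ack_is_acceptable(self.snd_una, seg_ack, self.snd_nxt):
            return False          # old, duplicate or out-of-window ACK
        self.snd_una = seg_ack
        self.ce_seen += ce_delta
        return True


state = SenderState(snd_una=1000, snd_nxt=3000)
print(state.on_ack(2000, ce_delta=2), state.ce_seen)
print(state.on_ack(1500, ce_delta=5), state.ce_seen)   # reordered, older ACK
```

The second call shows the consequence raised in the question: whatever
feedback the reordered ACK carried is dropped, so the encoding would need to
let a later ACK recover it.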
>> - I think it worth also noting a *separate topic* for the need for
>> robustness to known middle boxes as a desirable feature? ... I think
>> this is added in "Backward and forward compatibility" - to me such a
>> section is one I'd usually expect to see as a protocol evolution issue,
>> rather than a middle box issue.
>>
>> Timeliness
>>
>> "A CE mark can be induced by a network node on the transmission path and
>> is then echoed by the receiver in the TCP ACK. "
>> - I think CE can also be set by the sending host, please update.
>
> Will do.
>
OK

>> "Thus when this information arrives at the sender, it is naturally
>> already about one RTT old."
>> - not really, it is at least 1/2 RTT old - depending on where loss was
>> encountered.
>
> Well, if we want to have acute precise wording, would that sound better:
> "about one backward plus a fraction of the forward delay old"? The
> backward delay can be vastly different from the forward delay (asymmetric
> satellite links with ground return link), thus 1/2 RTT is as skewed as
> saying about one RTT...
>
>
Indeed, this is much better (is "return path" better than "backward"?).
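
As a rough worked example of that wording, the sketch below computes the age
of a CE mark when its echo reaches the sender as the return-path delay plus
the part of the forward delay that lay beyond the marking point. The function
name, the `mark_position` parameter, the assumption that delay accrues
evenly along the forward path, and the delay values are all illustrative.

```python
def feedback_age_ms(forward_delay_ms, return_delay_ms, mark_position):
    """Age of a CE mark when its echo reaches the sender.

    mark_position is the fraction of the forward path already traversed
    when the packet was marked: 0.0 at the sender, 1.0 at the receiver.
    Assumes (for illustration) that delay accrues evenly along the path.
    """
    if not 0.0 <= mark_position <= 1.0:
        raise ValueError("mark_position must be between 0.0 and 1.0")
    return return_delay_ms + (1.0 - mark_position) * forward_delay_ms


# Asymmetric path: long forward (satellite) leg, short ground return leg.
forward, ret = 270.0, 30.0
for position in (0.0, 0.5, 1.0):
    print(position, feedback_age_ms(forward, ret, position))
```

On such an asymmetric path neither "one RTT" nor "1/2 RTT" describes the age
well unless the marking point happens to sit in the right place.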

>> "A RFC5681 TCP sender without ConEx:"
>>   ^^
>> /A/An/ - is it better
>>
>
> Thanks,
>
>
>> Integrity
>> - This starts "it should be possible" and later mentions requirements,
>> is this "a sender SHOULD implement..." (required) or "senders MAY
>> implement ..." (can) or "a SENDER should implement and MAY enable"
>> (required feature, optional deployment) - please clarify and avoid
>> possible, etc.
>>
>> "Whether a sender should enforce when it detects wrong feedback
>> information, and what kind of enforcement it should apply, are policy
>> issues that need not be specified as part of more accurate ECN feedback
>> scheme."
>> - To me this seems like a clear recommendation not to do this - are
>> these really policy decisions, or are you saying that the IETF doesn't
>> need to do this? I am not sure I agree, if the sender state machine can
>> detect and react to bogus feedback this maybe should be considered, and
>> I think it would be desirable to know that such mechanisms can be used
>> with any selected approach.
>
> The exact reaction was also left out of RFC3540, unfortunately. This
> document focuses on the requirements of the signaling protocol, whereas
> the policy would be enforced by e.g. a congestion control algorithm.
>
I wonder what we should do here?

>>
>> - If there is policy, be clear what sort of policy - per stack, per
>> socket, per interface, per provisioning domain...
>
> I would lean towards a per stack granularity, ie. limiting cwnd, reacting
> overly conservatively to loss, etc. I think this discussion would belong
> in a CC-draft, that makes use of the signals coming out of a more accurate
> ECN feedback scheme. We could perhaps define these signals in the
> requirements draft though.
>
I think getting agreement in this draft saves discussion later (if we do
happen to make a mistake, we can correct this - but if we have WG
agreement on this sort of thing, it can make the next step easier).
>>
>> Accuracy
>>
>> "However, assuming the sender marks all data packets as ECN-capable and
>> uses the default setting of ECT(0),"
>> This may just be wording: - Is use of only ECT(0) assumed somewhere in
>> the RFC series (maybe in PCN?)? RFC 4774 is a BCP that specifies that
>> different semantics can be used for the two ECT code points for a
>> different DSCP; to me this indicates that transports should at least be
>> robust to this, and I'd expect that TCP could be used in this case. Is
>> this a requirement? (I think it should be).
>
> RFC3168 states: "When only one ECT codepoint
>    is needed by a sender for all packets sent on a TCP connection,
>    ECT(0) SHOULD be used."
>
>
> The point this paragraph tries to make is, that as the sender does know
> which packets get sent with non-ECT, ECT(0), ECT(1) and CE, if one of
> ECT(0) or ECT(1) is the default, the receiver needs only reflect the other
> two so that the sender can make accurate accounting in what state the ECT
> segments were received... So, a scheme could swap ECT(0)/ECT(1) here.
> However, having yet-different semantics between ECT(0) and ECT(1) would be
> detrimental to the case of breaking the linkage of CE mark == loss.
>
> I added the reference to 3168 here, though.
>
>
I think you argue ECT(1) and CE feedback are sufficient, if so, I'd agree.
>
>
>
>
>> The clause continues to say feedback of CE and ECT(1) is sufficient.
>> So wouldn't this also be the case if both ECT(0) and ECT(1) were used?
>> i.e. maybe it works fine for other DSCP semantics, by feeding back CE
>> and ECT(1). Please clarify.
>
> I try again: We have some known entities at the sender #total = (# ECT0 +
> # ECT1 + # CE) segments sent; in order to deduce these three values at the
> receiver side, the signal must convey at least two back to the sender,
> which can then calculate the third of the set locally.
>
> How about:
>
>           As the sender can keep account of the transmitted segments
>           with any of the three ECN codepoints, conveying any two
>           of these back to the sender is sufficient for it to
>           reconstruct the third as observed by the receiver.
>
>
/As/Because/ and that would be very clear.
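
A minimal sketch of that accounting, assuming the sender knows how many of
its ECN-capable segments were delivered and the receiver feeds back counters
for two of the three codepoints (here ECT(1) and CE). The names
`ReceivedCounts`, `reconstruct` and `delivered_total` are hypothetical, and
loss is assumed to have been accounted for separately.

```python
from dataclasses import dataclass


@dataclass
class ReceivedCounts:
    ect0: int
    ect1: int
    ce: int


def reconstruct(delivered_total, ect1_feedback, ce_feedback):
    """Derive the ECT(0) count from the total and the two fed-back counters."""
    ect0 = delivered_total - ect1_feedback - ce_feedback
    if min(ect0, ect1_feedback, ce_feedback) < 0:
        raise ValueError("feedback is inconsistent with the delivered total")
    return ReceivedCounts(ect0=ect0, ect1=ect1_feedback, ce=ce_feedback)


# Sender delivered 10 ECN-capable segments; receiver reports 0 ECT(1), 3 CE.
print(reconstruct(delivered_total=10, ect1_feedback=0, ce_feedback=3))
```

The same subtraction works whichever two counters are fed back, which is why
a scheme could swap the roles of ECT(0) and ECT(1).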
>
>> Complexity:
>>
>> Maybe this is in the wrong section: "Furthermore, the receiver should
>> not make assumptions about the mechanism that was used to set the
>> markings nor about any interpretation or reaction to the congestion
>> signal. The receiver only needs to faithfully reflect congestion
>> information back to the sender."
>> - This does not seem like a complexity requirement to me. It seems like
>> an actual requirement. I'd have expected stronger language against the
>> network not interpreting the meaning of the reaction. Could this be
>> "MUST NOT"? (maybe needs to be in a middle box section).
>
> Moved.
>
>> Backward and forward compatibility
>>
>> /it should to be/it should be/
>>
>> Middleboxes (see earlier)
>>
>> "A more accurate ECN feedback extension should aim to be able to
>> traverse most existing middleboxes."
>> - should ... aim .. most ... existing ... This seems full of questions:
>> Why not "should aim to traverse most middle boxes"? ... but see note
>> above, is this really backwards compatibility?
>>
>> I'd love to see more here that motivates why the methods in
>> "draft-kuehlewind-tcpm-ecn-fallback" need to be considered. I think it
>> is really important that any ECN method works through commonly deployed
>> middle boxes and also that there are at least tools to verify that this
>> working is correct, even if we do not directly mandate methods to probe
>> and validate the path (which I also think we should seriously consider).
>>
>> I have found a  separate section on middle boxes that uses the words
>> "firewall and NAT" is generally helpful to ensure that people from the
>> community actually find advice and know it is directed to them. If they
>> don't like these requirements they should tell us before we do the work.
>
>
> I've placed these keywords in the relevant paragraph, so that a grep spots
> them...
>
Thanks
>
>>
>> Section 5:
>>
>> "In case of " - insert "the"
>>
>> "highly vulnerable to ACK loss." - how is this different to being simply
>> vulnerable? - can you explain "highly"?
>>
>> "still highly ambiguous" - as above, it seems ambiguous?
>>
>> "A couple of coding schemes"
>> - is this a couple as in a linked set or as in two, please clarify
>> language.
>>
>> "Urgent Pointer field" - please refer to the recent TCPM RFC on use of
>> URG.
>
> Added reference to rfc6093.
>
>> "but still not ideal." - I don't see ideal as a goal, consider
>> rewording?
>>
>> I found the last para of 5.1 is a hard read, unless the reader knows
>> what is being said already.
>
>
> Which may be a good thing :) Will try to get this paragraph split and
> straightened.
>
>> Sect 5.2:
>>
>> "Alternatively, the receiver could use bits in the Urgent Pointer field
>> to signal more bits of its congestion signal counter, but only whenever it
>> does not set the Urgent Flag."
>> - I'd urge rewording this, far too often I see sentences from RFCs
>> replicated in other places. In place of this can we say that this is NOT
>> permitted, but a new method could standardise use of these bits...
>
> Reworked.
>
>
>> Sect 5.3:
>>
>> "and SCTP counts the number" - probably this needs to be cited as a
>> "proposal", since this draft has not to date been progressed.
>>
>> "this option would need to be carried by most or all ACKs" - can you
>> explain why? even in times of non-congestion?
>>
>> Sect 8
>>
>> I would expect to see some discussion that states that ECN feedback
>> should only be used if the other information indicates the congestion
>> was on-path - i.e. feedback MUST use normal TCP sequence number check
>> techniques to verify the CE-marked packet was a part of the current
>> flow, and similarly ECN marking feedback is only accepted on valid ACKs.
>>
>> I'd really like this section to point to the mechanisms that may be used
>> to validate that the remote endpoint responds appropriately to ECN. It
>> is relatively easy to do sender-side changes, but knowing the receiver
>> "correctly" implements a function is much harder.
>>
>>
>> I am happy to review again, this is heading in a good direction.
>>
>
>
> Lots of good feedback, thank you very much!
>
> Richard Scheffenegger
>

Gorry