[tcpm] Some comments on: draft-kuehlewind-tcpm-accecn-reqs

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Sat, 08 March 2014 10:47 UTC


draft-ietf-tcpm-accecn-reqs-05

I have comments on this draft. They seem long and mix important points
with editorial-like comments. I am sorry for being late in returning
this; I had intended to send something before the meeting in London.

Gorry

---

Overall - this ID seems a valuable list of things to be considered, but
I am not sure the requirements are clear. Two top level comments are:

(1) I am not clear on the intended scope - is this proposing an update
for the Internet, for ConEx, for the DC environment, or for all of these?

(2) I found it hard to figure out from the requirements listed in
section 4 what is needed, desirable or just nice to have. The document
could use requirements language if needed, but even if not, I really
think it should call out more clearly what is required.

---

Detailed comments follow:

Scope:

(above point 1) I see the abstract says "Recent new TCP mechanisms (like
ConEx or DCTCP) need more accurate ECN feedback in the case where more
than one marking is received in one RTT." - which is fine, but this
leaves open whether the methods proposed are for use in controlled
environments, within domains such as ConEx, or in the general Internet.
(This seems to be a perhaps unintentional change from what I looked at
in rev -03).

(above point 1) I really think the work should target methods that *can* 
be deployed in the general Internet. I'd suggest this actually becomes a 
requirement.

Section1:

(above point 1), I read "This document lists requirements for a robust
and interoperable more accurate TCP/ECN feedback protocol that all
implementations of new TCP extensions, like ConEx and/or DCTCP, can use."
- but from the current I-D, I read the scope as usable by these two,
whereas I think it needs to be applicable to general Internet
deployment. Please clarify the scope here also.

Section 2:

Disagree with text: "However, as the ECN Nonce is a separate extension 
to ECN, even if a sender tries to protect itself with the ECN Nonce, any 
receiver wishing to conceal marked packets only has to pretend not to 
support the ECN Nonce and simply does not provide any nonce sum feedback."
- To me this argument is ridiculous (and I have said so), and I think we 
should not perpetuate this argument!
[[Here is why: if a sender were to require a NS, then a receiver would
need to provide that or expect the sender would not continue use of ECN,
or at least would limit its trust of ECN. So the wording could be
improved, but RFC3540 says: "If the receiver has never sent a non-zero
nonce sum, the sender can infer that the receiver does not understand
the nonce, and rate limit the connection, place it in a lower-priority
queue, or cease setting ECT in outgoing segments."]]
- Saying the nonce sum has a second problem in that it only works if it 
is deployed and actually used seems obvious.
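
The sender-side reaction in the RFC 3540 text quoted above could be sketched roughly as follows. This is only an illustration: the names `NonceState`, `on_ack` and `should_keep_setting_ect` and the `probe_acks` threshold are hypothetical, and of the reactions RFC 3540 lists (rate limit, lower-priority queue, cease setting ECT) the sketch arbitrarily picks the last.

```python
from dataclasses import dataclass


@dataclass
class NonceState:
    """Per-connection record of the nonce sum (NS) feedback seen so far."""
    acks_seen: int = 0
    nonzero_ns_seen: bool = False


def on_ack(state: NonceState, ns_bit: int) -> None:
    """Record the NS bit carried by one incoming ACK."""
    state.acks_seen += 1
    if ns_bit:
        state.nonzero_ns_seen = True


def should_keep_setting_ect(state: NonceState, probe_acks: int = 32) -> bool:
    """Decide whether the sender keeps marking outgoing segments as ECT.

    Assumption (not from RFC 3540): the sender waits for `probe_acks` ACKs
    before concluding that the receiver does not understand the nonce.
    """
    if state.acks_seen < probe_acks:
        return True  # too early to judge
    return state.nonzero_ns_seen
```

With such a policy, a receiver that pretends not to support the nonce does not keep the benefits of ECN, which is the point made above.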

Other points that may help the argument: The WG should look for sound
reasons to obsolete specs. Saying the NS was not significantly deployed
(or is not known to be), however, does seem true to me. Saying there are
equivalent methods may be true, and I think this is presented. Saying
that the NS only validates non-congestion and hence provides much less
useful protection for the more frequent marking of immediate-ECN is also
true.

Section 3:

This section confuses me. I find no motivation for use beyond the two 
use-cases.

It seems to be a list of two use-cases, I assume these are *examples* of 
use, rather than required use-cases to consider?

It speaks of DCTCP (which I think is currently only defined for
controlled environments) and ConEx (specified for a specific domain).
Reiterating (point 1), I'd hoped these were examples, but that the
actual WG goal was now to define a general purpose method for feedback.
Or is it?

Ordering of examples: How much deployment experience do we have of
ConEx? I'm not arguing to remove this use-case, but wonder whether it
is best placed first in the examples of use?

Is there an implied use-case of standard TCP and a possible use case of 
an updated ECN for TCP? - I think this is only hinted.

Section 4:

Usage of RFC 2119: Personally, I really do not like use of RFC2119 
keywords in this way: "This leads to the following requirements, which 
MUST be discussed for any proposed more accurate ECN feedback scheme:" - 
Mandating discussion is not helpful.

(point 2 above): If we have requirements as the title suggests, then I 
personally would encourage you to list them as requirements using RFC 
2119 language - but I don't mind seeing no RFC 2119 keywords if the WG 
wants this, as long as I can see the relative importance of each topic. 
At the moment I can't see this.

Comments on specific topics:

Resilience

"Moreover, delayed ACK are mostly used with TCP.  That means in most 
cases only every second data packets triggers an ACK."
- Isn't it one ACK for every 2 full-sized MSS of data? The ACK rate per 
segment can be much less if the sender disables the Nagle algorithm. I'd 
really like a requirement that this mechanism is robust for use with 
"thin" streams that disable Nagle - and to understand what this means, 
and if we simply require interop in this case, or whether we require Acc 
ECN to work accurately.

- Some reference is probably needed to explain the variety of TCP ACK 
mangling - e.g. RFC 3449 illustrates at the least the breadth of 
mechanisms out there.

"Also, a more accurate feedback protocol should still work if delayed 
ACKs covered more than two packets."
- In IETF understanding is this really a "should"? I see it is lower
case, but it is best not to be ambiguous. Does this mean it may not
work, or may not be accurate, or may cause the TCP receiver to reset (I
hope not)?

- Is this resilience to loss?

- Are there implied ordering requirements for the path? - i.e. I'd
assume we want a system that is not impacted by reordering of ACKs,
even when the ACK'ed sequence number does not increase, or do we assume
an ordered path?

- I think it worth also noting a *separate topic* for the need for
robustness to known middle boxes as a desirable feature? … I think this
is added in "Backward and forward compatibility" - to me such a section
is usually expected to cover protocol evolution issues, rather than
middle box issues.
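
To make the Resilience questions about delayed ACKs and ACK loss more concrete, here is a minimal sketch of a counter-based echo of the kind the draft's coding-scheme discussion alludes to. The 3-bit counter width, the `ack_every` parameter and all function names are assumptions for illustration, not something taken from the draft.

```python
def receiver_acks(ce_marks, ack_every=2, bits=3):
    """Return the counter value echoed in each ACK.

    ce_marks: iterable of 0/1, one entry per received data packet.
    An ACK is sent for every `ack_every` packets (delayed ACKs) and
    carries the low `bits` bits of a cumulative CE counter.
    """
    counter = 0
    acks = []
    for i, marked in enumerate(ce_marks, start=1):
        counter += int(marked)
        if i % ack_every == 0:
            acks.append(counter % (1 << bits))
    return acks


def sender_total(acks, lost=(), bits=3):
    """Reconstruct the number of CE marks from the ACKs that arrive.

    lost: indices of ACKs dropped on the reverse path.
    The sender adds the modular difference between consecutive echoes.
    """
    total = 0
    prev = 0
    for idx, echo in enumerate(acks):
        if idx in lost:
            continue
        total += (echo - prev) % (1 << bits)
        prev = echo
    return total
```

For example, `receiver_acks([1, 0, 1, 1, 0, 0, 1, 1])` gives `[1, 3, 3, 5]`, and `sender_total` recovers 5 marks even if the two middle ACKs are lost. If 8 or more marks accumulate between two ACKs that actually arrive, the 3-bit counter wraps and the sender undercounts, which is the kind of limit a resilience requirement would need to state.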

Timeliness

"A CE mark can be induced by a network node on the transmission path and 
is then echoed by the receiver in the TCP ACK. "
- I think CE can also be set by the sending host, please update.

"Thus when this information arrives at the sender, it is naturally 
already about one RTT old."
- not really, it is at least 1/2 RTT old - depending on where loss was 
encountered.

"A RFC5681 TCP sender without ConEx:"
  ^^
/A/An/ - is it better
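
The point above about the age of the feedback can be illustrated with a small calculation. This is a sketch under simplifying assumptions that are mine, not the draft's: the delay accumulated on the forward path is proportional to `mark_position`, and the function and parameter names are hypothetical.

```python
def feedback_age(one_way_fwd, one_way_rev, mark_position, ack_delay=0.0):
    """Age of a CE mark when its echo reaches the sender.

    one_way_fwd, one_way_rev: one-way delays (same time unit).
    mark_position: fraction of the forward path already traversed when
        the packet was marked (0.0 = at the sender, 1.0 = at the receiver).
    ack_delay: extra time the receiver holds the ACK (e.g. delayed ACKs).
    """
    if not 0.0 <= mark_position <= 1.0:
        raise ValueError("mark_position must be within [0, 1]")
    remaining_fwd = (1.0 - mark_position) * one_way_fwd
    return remaining_fwd + ack_delay + one_way_rev
```

On a symmetric path with 50 ms each way, a mark set close to the sender is about one RTT old on arrival (100 ms), while a mark set just before the receiver is only about half an RTT old (50 ms), plus any ACK delay.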

Integrity
- This starts "it should be possible" and later mentions requirements,
is this "a sender SHOULD implement…" (required) or "senders MAY
implement …" (can) or "a sender SHOULD implement and MAY enable"
(required feature, optional deployment) - please clarify and avoid
"possible", etc.

"Whether a sender should enforce when it detects wrong feedback 
information, and what kind of enforcement it should apply, are policy 
issues that need not be specified as part of more accurate ECN feedback 
scheme."
- To me this seems like a clear recommendation not to do this - are
these really policy decisions, and are you saying that the IETF doesn't
need to do this? I am not sure I agree. If the sender state machine can
detect and react to bogus feedback, this maybe should be considered, and
I think it would be desirable to know that such mechanisms can be used
with any selected approach.

- If there is policy, be clear what sort of policy - per stack, per 
socket, per interface, per provisioning domain…

Accuracy

"However, assuming the sender marks all data packets as ECN-capable and 
uses the default setting of ECT(0),"
This may just be wording: - Is use of only ECT(0) assumed somewhere in
the RFC series (maybe in PCN?)? RFC 4774 is a BCP that specifies that
different semantics can be used for the two ECT code points for a
different DSCP. To me this indicates that transports should at least be
robust to this, and I'd expect that TCP could be used in this case. Is
this a requirement? (I think it should be).

The clause continues to say feedback of CE and ECT(1) is sufficient.
So wouldn't this also be the case if both ECT(0) and ECT(1) were used? 
i.e. maybe it works fine for other DSCP semantics, by feeding back CE 
and ECT(1). Please clarify.
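
As a rough illustration of a receiver that reflects codepoints without interpreting them, the sketch below counts each ECN codepoint seen in the low two bits of the IP TOS / Traffic Class byte (codepoint values as defined in RFC 3168). The function names are hypothetical, and how the counts would be encoded in TCP feedback is deliberately left out.

```python
from collections import Counter

# ECN field values from RFC 3168 (low two bits of the TOS / Traffic Class byte).
ECN_CODEPOINTS = {
    0b00: "Not-ECT",
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",
}


def ecn_codepoint(tos_byte: int) -> str:
    """Extract the ECN codepoint name, ignoring the DSCP bits."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def count_codepoints(tos_bytes) -> Counter:
    """Count arrivals per codepoint without assuming what each one means."""
    return Counter(ecn_codepoint(b) for b in tos_bytes)
```

A receiver that keeps such per-codepoint counts could report CE and ECT(1) arrivals whether the sender uses only ECT(0) or both ECT codepoints, which is the case asked about above.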

Complexity:

Maybe this is in the wrong section: "Furthermore, the receiver should 
not make assumptions about the mechanism that was used to set the 
markings nor about any interpretation or reaction to the congestion 
signal. The receiver only needs to faithfully reflect congestion 
information back to the sender."
- This does not seem like a complexity requirement to me. It seems like 
an actual requirement. I'd have expected stronger language against the 
network not interpreting the meaning of the reaction. Could this be 
"MUST NOT"? (maybe needs to be in a middle box section).

Backward and forward compatibility

/it should to be/it should be/

Middleboxes (see earlier)

"A more accurate ECN feedback extension should aim to be able to 
traverse most existing middleboxes."
- should … aim .. most … existing … This seems full of questions: Why 
not "should aim to traverse most middle boxes"? … but see note above, is 
this really backwards compatibility?

I'd love to see more here that motivates why the methods in
"draft-kuehlewind-tcpm-ecn-fallback" need to be considered. I think it
is really important that any ECN method works through commonly
deployed middle boxes and also that there are at least tools to verify
that this is working correctly, even if we do not directly mandate
methods to probe and validate the path (which I also think we should
seriously consider).

I have found that a separate section on middle boxes that uses the words
"firewall and NAT" is generally helpful to ensure that people from the
community actually find advice and know it is directed to them. If they
don't like these requirements they should tell us before we do the work.

Section 5:

"In case of " - insert "the"

"highly vulnerable to ACK loss." - how is this different to being simply 
vulnerable? - can you explain "highly"?

"still highly ambiguous" - as above, it seems ambiguous?

"A couple of coding schemes"
- is this a couple as in a linked set or as in two, please clarify language.

"Urgent Pointer field" - please refer to the recent TCPM RFC on use of URG.

"but still not ideal." - I don't see ideal as a goal, consider rewording?

I found the last para of 5.1 is a hard read, unless the reader knows 
what is being said already.

Sect 5.2:

"Alternatively, the receiver could use bits in the Urgent Pointer field 
to signal more bits of its congestion signal counter, but only whenever 
it does not set the Urgent Flag."
- I'd urge rewording this, far too often I see sentences from RFCs 
replicated in other places. In place of this can we say that this is NOT 
permitted, but a new method could standardise use of these bits…

Sect 5.3:

"and SCTP counts the number" - probably this needs to be cited as a 
"proposal", since this draft has not to date been progressed.

"this option would need to be carried by most or all ACKs" - can you 
explain why? even in times of non-congestion?

Sect 8

I would expect to see some discussion that states that ECN feedback 
should only be used if the other information indicates the congestion 
was on-path - i.e. feedback MUST use normal TCP sequence number check 
techniques to verify the CE-marked packet was a part of the current 
flow, and similarly ECN marking feedback is only accepted on valid ACKs.
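
A minimal sketch of the kind of check meant here, assuming the sender only trusts ECN feedback carried on an ACK whose acknowledgment number lies between SND.UNA and SND.NXT in 32-bit sequence space. The function names are hypothetical, and a real stack would apply its full segment acceptability tests (including the segment's own sequence number), which are omitted.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers wrap at 2^32


def ack_in_window(snd_una: int, snd_nxt: int, ack: int) -> bool:
    """True if snd_una <= ack <= snd_nxt, allowing for sequence wraparound."""
    outstanding = (snd_nxt - snd_una) % SEQ_MOD
    offset = (ack - snd_una) % SEQ_MOD
    return offset <= outstanding


def accept_ecn_feedback(snd_una: int, snd_nxt: int, ack: int, ece: bool) -> bool:
    """Only act on an ECN echo that arrives on an ACK valid for this flow."""
    return ece and ack_in_window(snd_una, snd_nxt, ack)
```

For instance, with SND.UNA = 1000 and SND.NXT = 2000, an echo on an ACK for 1500 would be accepted while one on an ACK for 2500 would be ignored.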

I'd really like this section to point to the mechanisms that may be used 
to validate that the remote endpoint responds appropriately to ECN. It 
is relatively easy to do sender-side changes, but knowing the receiver 
"correctly" implements a function is much harder.


I am happy to review again, this is heading in a good direction.