[tcpm] AD review: draft-ietf-tcpm-tcp-lcd-01

Lars Eggert <lars.eggert@nokia.com> Tue, 20 July 2010 14:07 UTC

Hi,

Below is my AD review of draft-ietf-tcpm-tcp-lcd-01. Basically, it's ready to go forward, but there is one question I'd like to briefly discuss: the draft says "SHOULD do X" in a number of places where, at least to me, it isn't clear why it doesn't say "MUST do X". Remember that, ideally, each use of "SHOULD" explains under which conditions it is appropriate not to follow the recommendation. That explanation is missing here, and the conditions aren't obvious to me. See the detailed review below.

(The detailed review below also has a longish list of language/grammar nits. The authors can simply decide whether to incorporate them; there is no need to discuss them.)

Lars



  Note: Most comments marked as "nits" below have been automatically
  flagged by review scripts - there may be some false positives in there.


Section 1., paragraph 2:
>    their retransmission timer.  In this document the terms

  Nit: s/document /document, /


Section 1., paragraph 5:
>    For the purposes of this specification we define the term "timeout-

  Nit: s/specification /specification, /


Section 1., paragraph 6:
>    based loss recovery" that refers to the state, which a TCP sender

  Nit: s/state, which/state that/


Section 1., paragraph 7:
>    example the NewReno modification to TCP's Fast Recovery algorithm

  Nit: s/example /example, /


Section 2., paragraph 1:
>    frequency of connectivity disruptions depends on the property of the

  Nit: s/property/properties/


Section 2., paragraph 2:
>    disruptions can occur in traditional wired networks too, e.g., caused

  Nit: s/ too//


Section 2., paragraph 3:
>    by an unplugged network cable, the likelihood of occurrence is

  Nit: s/of occurrence/of their occurrence/


Section 2., paragraph 5:
>    Depending on their duration connectivity disruptions can be

  Nit: s/duration /duration, /


Section 2., paragraph 7:
>    disruptions".  In particular, it focuses on the period "prior" to the

  Nit: s/"prior"/prior/ (don't get what the quotes are for)


Section 2., paragraph 8:
>    peer node.  The document does not describe any modifications of TCP's

  Nit: s/of/to/


Section 2., paragraph 9:
>    behavior and its congestion control mechanisms [RFC5681] "after"

  Nit: s/"after"/after/ (don't get what the quotes are for)


Section 2., paragraph 11:
>    When a long connectivity disruption occurs on a TCP connection the

  Nit: s/connection /connection, /


Section 2., paragraph 12:
>    each retransmission attempt.  However, the RTO's growth may be

  Nit: s/the RTO's growth/RTO growth/


Section 3., paragraph 1:
>    If the queue of an intermediate router experiencing a link outage can

  Nit: s/router experiencing/router that is experiencing/


Section 3., paragraph 4:
>    Provided that no other route to the specific destination exists the

  Nit: s/exists /exists, /


Section 3., paragraph 6:
>    hard errors are of no use for the proposed scheme, since TCP should

  Nit: s/the proposed/this/


Section 3., paragraph 8:
>    two peculiarities of ICMP messages.  Firstly, they do not necessarily

  Nit: s/Firstly/First/


Section 3., paragraph 9:
>    elicited them.  When a router drops a packet due to a missing route

  Nit: s/route/route,/


Section 3., paragraph 10:
>    but will rather queue it for later delivery.  Secondly, ICMP messages

  Nit: s/Secondly/Second/


Section 3., paragraph 11:
>    window of data due to a link outage, it will hardly send as many ICMP

  Nit: s/it will hardly/it is unlikely to/


Section 3., paragraph 12:
>    unreachable messages as it dropped TCP segments.  Depending on the

  Nit: s/as it/as/


Section 3., paragraph 13:
>    load of the router it may even send no ICMP unreachable messages at

  Nit: s/router /router, / and s/may even send no/may not even send any/


Section 3., paragraph 15:
>    the sending host to match the ICMP error message to the transport

  Nit: s/transport/transport connection/


Section 3., paragraph 16:
>    that elicited it.  RFC 1812 [RFC1812] augments the requirements and

  Nit: s/the/these/


Section 3., paragraph 17:
>    the received ICMP message and to identify the faulty connection.

  Nit: s/faulty/affected/


Section 3., paragraph 19:
>    evidence that the segment was not dropped due to congestion, but was
>    successfully delivered to the temporary end-point of the employed
>    path, i.e., the reporting router.  It therefore did not witness any

  Nit: s/to the temporary end-point of the employed path, i.e.,/as far
  as the/


Section 4.1., paragraph 1:
>    whenever an ICMP unreachable message reports on the sequence number

  Nit: s/reports on the/is received that contains a segment with a/


Section 4.1., paragraph 3:
>    rate in case of congestion.  If either the retransmission itself, or

  Nit: s/itself,/itself/


Section 4.1., paragraph 4:
>    the corresponding ICMP message, is dropped the previously performed

  Nit: s/message,/message/


Section 4.2., paragraph 1:
>    A TCP sender using RFC 2988 [RFC2988] to compute TCP's retransmission

  Nit: s/using/that uses/


Section 4.2., paragraph 2:
>    timer MAY employ the following scheme to avoid over-conservative
>    retransmission timer backoffs in case of long connectivity
>    disruptions.  If a TCP sender does implement the following steps, the
>    algorithm MUST be initiated upon the first timeout of the oldest
>    outstanding segment (SND.UNA) and MUST be stopped upon the arrival of
>    the first acceptable ACK.  The algorithm MUST NOT be re-initiated
>    upon subsequent timeouts for the same segment.  The scheme SHOULD NOT
>    be used in SYN-SENT or SYN-RECEIVED states [RFC0793] (i.e., during
>    connection establishment).

  Why SHOULD NOT? Why not MUST NOT? When would it be useful to do this?


Section 4.2., paragraph 3:
>    A TCP sender that does not employ RFC 2988 [RFC2988] to compute TCP's
>    retransmission timer SHOULD NOT use TCP-LCD.  We envision that the
>    scheme could be easily adapted to algorithms others than RFC 2988.
>    However, we leave this as future work.

  Then this should be a MUST NOT, no?


Section 4.2., paragraph 4:
>    In rule (2.5) RFC 2988 [RFC2988] provides the option to place a

  Nit: s/(2.5)/(2.5),/


Section 4.2., paragraph 5:
>    maximum value on the RTO.  When a TCP implements this rule to provide
>    an upper bound for the RTO, it SHOULD also be used in the following
>    algorithm.  In particular, if the RTO is bounded by an upper limit
>    (maximum RTO), the "MAX_RTO" variable used in this scheme SHOULD be
>    initialized with this upper limit.  Otherwise, if the RTO is
>    unbounded, the "MAX_RTO" variable SHOULD be set to infinity.

  Why are these all SHOULDs and not MUSTs? (Remember that each SHOULD
  should come with an explanation of when it makes sense to not follow
  the recommendation.)
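
  To make the quoted rule concrete, here is a minimal sketch of the
  "MAX_RTO" initialization it describes. The function names and the use
  of None to mean "no upper bound configured" are my own assumptions for
  illustration; they are not taken from the draft.

```python
import math


def init_max_rto(rto_upper_bound=None):
    """Initial value of MAX_RTO, in seconds.

    rto_upper_bound is the optional maximum RTO from RFC 2988 rule (2.5);
    None (an assumption of this sketch) means the RTO is unbounded.
    """
    if rto_upper_bound is None:
        return math.inf
    return float(rto_upper_bound)


def back_off(rto, max_rto):
    """One exponential backoff of the retransmission timer, capped at max_rto."""
    return min(rto * 2.0, max_rto)
```

  With a 60 second bound, back_off(40.0, init_max_rto(60)) yields 60.0;
  without a bound, the same call with init_max_rto() yields 80.0.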


Section 4.2., paragraph 33:
>    retransmission timer it enters the timeout-based loss recovery and

  Nit: s/timer /timer, /


Section 4.2., paragraph 34:
>    account for the expiration of the retransmission timer the TCP sender

  Nit: s/timer /timer, /


Section 4.2., paragraph 36:
>    In case the retransmission timer expires again (step 3a) a TCP will

  Nit: s/(step 3a) a /(step 3a), /


Section 4.2., paragraph 37:
>    back off the retransmission timer once more (step R) [RFC2988] as

  Nit: s/[RFC2988] /[RFC2988], /


Section 4.2., paragraph 39:
>    recovery are ignored since the ACK clock is already restarting due to

  Nit: s/ignored /ignored, /


Section 4.2., paragraph 41:
>    step (4) permits it, a TCP SHOULD undo one backoff for each ICMP

  Nit: s/a TCP/TCP/


Section 4.2., paragraph 42:
>    unreachable message reporting an error on a retransmission.  To
>    decide if an ICMP unreachable message reports on a retransmission,
>    the sequence number therein is exploited (step 5, step 6).  The undo

  Complicated sentence. Why not: To decide if an ICMP unreachable
  message was elicited by a retransmission, the sequence number it
  contains is inspected (step 5, step 6).
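
  For illustration only, a sketch of the check as I read it: an ICMP
  unreachable message counts as elicited by a retransmission when the
  sequence number it quotes equals SND.UNA, and each such message undoes
  at most one backoff. The names below and the idea of tracking a backoff
  counter are assumptions of this sketch; the draft's exact steps and
  variables are not reproduced here.

```python
def rto_after_backoffs(base_rto, backoffs, max_rto=float("inf")):
    """RTO in seconds after `backoffs` exponential backoffs of base_rto."""
    return min(base_rto * (2 ** backoffs), max_rto)


def on_icmp_unreachable(icmp_seq, snd_una, backoffs):
    """Return the backoff count after processing one ICMP unreachable message.

    icmp_seq is the TCP sequence number quoted in the ICMP payload.
    A backoff is undone only if the message refers to the retransmitted
    segment (SND.UNA) and there is a backoff left to undo.
    """
    if icmp_seq == snd_una and backoffs > 0:
        return backoffs - 1
    return backoffs
```

  For example, with three backoffs applied and a matching message,
  on_icmp_unreachable(1000, 1000, 3) returns 2, so a base RTO of 1 second
  drops from 8 seconds to 4 seconds.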


Section 4.2., paragraph 44:
>    one backoff there is the possibility that the shortened

  Nit: s/backoff/backoff,/


Section 4.2., paragraph 45:
>    retransmission timer has already expired (step 8).  Then, a TCP
>    SHOULD retransmit immediately, i.e., an ICMP message clocked
>    retransmission.  In case the shortened retransmission timer has not

  I don't understand what the "i.e." clause means.


Section 5., paragraph 1:
>    indications in form of ICMP unreachable messages during timeout-based

  Nit: s/in form/in the form/


Section 5., paragraph 2:
>    acceptable ACK.  Thus, by defintion the algorithm triggers only in

  Nit: s/defintion/definition,/


Section 5., paragraph 3:
>    case of long connectivity disruptions.

  Nit: s/in case/in the case/


Section 5., paragraph 4:
>    that the retransmissions were not dropped due to congestion but were

  Nit: s/congestion/congestion,/


Section 5., paragraph 5:
>    successfully delivered to the temporary end-point of the employed
>    path, i.e., the reporting router.  In other words, there is no

  Nit: s/temporary end-point of the employed path, i.e., the//

>    evidence for any congestion at least on that very part of the path
>    that was traveled by both, the TCP segment eliciting the ICMP

  Nit: s/traveled/traversed/ and s/both,/both/


Section 5., paragraph 7:
>    can happen, albeit the received ICMP unreachable message reports on

  Nit: s/albeit/even when/ and s/reports on/contains/


Section 5., paragraph 8:
>    the segment number of a retransmission (SND.UNA) because the TCP

  Nit: s/(SND.UNA)/(SND.UNA),/


Section 5., paragraph 10:
>    describes the motivation for not reacting on ICMP unreachable

  Nit: s/on/to/


Section 5.1., paragraph 2:
>    The revert strategy of the given algorithm suffers from a form of

  Nit: s/revert/reversion/


Section 5.1., paragraph 3:
>    is, there is an ambiguity which TCP segment an ICMP unreachable

  Nit: s/which/with regard to which/


Section 5.1., paragraph 5:
>    However, for the algorithm this ambiguity is not considered to be a
>    problem.  The assumption that a received ICMP message provides

  Nit: s/algorithm/algorithm,/ (or move "for the algorithm" to the end
  of the sentence)


Section 5.2., paragraph 1:
>    there is another source of ambiguity about the TCP sequence numbers

  Nit: s/about/related to/


Section 5.2., paragraph 2:
>    contained in ICMP unreachable messages.  For high bandwidth paths
>    like modern gigabit links the sequence space may wrap rather quickly,
>    thereby allowing the possibility that delayed ICMP unreachable
>    messages - a router dropping packets due to a link outage is not
>    obliged to send ICMP unreachable messages in a timely manner
>    [RFC1812] - may coincidentally fit as valid input in the proposed
>    scheme.

  Complex sentence. Why not: For high bandwidth paths, the sequence
  number space can wrap quickly. This might cause the sequence numbers
  contained in delayed ICMP unreachable messages to incorrectly match
  the sender's current SND.UNA.
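
  A rough back-of-the-envelope calculation shows how quickly the wrap can
  happen. The sketch assumes a single connection that fills the link with
  TCP payload at the full line rate, which overstates what real traffic
  achieves; the link rates are illustrative.

```python
SEQ_SPACE_BYTES = 2 ** 32  # TCP sequence numbers are 32 bits and count bytes


def seq_wrap_time_seconds(link_bits_per_second):
    """Seconds until the sequence space wraps at the given payload rate."""
    bytes_per_second = link_bits_per_second / 8
    return SEQ_SPACE_BYTES / bytes_per_second


for rate in (100e6, 1e9, 10e9):
    print(f"{rate / 1e9:g} Gbit/s: {seq_wrap_time_seconds(rate):.1f} s")
```

  At 1 Gbit/s the sequence space wraps in roughly 34 seconds under these
  assumptions, so an ICMP message delayed by that long could carry a
  sequence number that matches again.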


Section 5.2., paragraph 4:
>    backoffs the retransmission timer normally without any undoing.  At

  Nit: s/backoffs the retransmission timer/backs the retransmission timer
  off/


Section 5.2., paragraph 6:
>    connection with n segments in-flight will be disrupted at some point

  Nit: s/in-flight/in flight/


Section 5.2., paragraph 7:
>    due to a link outage by an intermediate router R. For each segment

  Nit: s/by an/at an/


Section 5.2., paragraph 8:
>    in-flight, router R may generate an ICMP unreachable message.

  Nit: s/in-flight/in flight/


Section 5.2., paragraph 9:
>    emits the delayed ICMP unreachable messages now, one spurious undoing

  Nit: s/one spurious undoing...is possible/spurious undoing...is
  possible once/


Section 5.3., paragraph 1:
>    recovery than it actually has sent timeout-based retransmissions.

  Nit: s/it actually has//


Section 5.6., paragraph 2:
>    multi-path routing even the receipt of a legitimate ICMP unreachable

  Nit: s/routing/routing,/


Section 5.6., paragraph 3:
>    message cannot be exploited accurately because there is the option

  Nit: s/accurately/accurately,/ and s/option/possibility/


Section 6., paragraph 0:
> 6.  Dissolving Ambiguity Issues (the Safe Variant)

  Not sure if "the Safe Variant" is really capturing the essence of this
  alternative. Maybe: Improved Resolution of Ambiguity Issues for
  Connections Using Timestamps?


Section 6., paragraph 1:
>    Given that the TCP Timestamps option [RFC1323] is enabled for a

  Nit: s/Given that/If/


Section 6., paragraph 2:
>    connection, a TCP sender MAY use the following algorithm to dissolve

  MAY? If we believe this algorithm is better, shouldn't this be a
  SHOULD? (And the reason to not do the SHOULD is likely implementation
  complexity.)


Section 6., paragraph 20:
>    The downside of the safe variant is twofold.  Firstly, the
>    modifications come at a cost: the TCP sender is required to store the
>    timestamps of all retransmissions sent during one timeout-based loss
>    recovery.  Second, the safe variant can only undo a retransmission
>    timer backoff if the intermediate router experiencing the link outage
>    implements [RFC1812] and chooses to include as many more than the
>    first 64 bits of the payload of the triggering datagram, as are
>    needed to include the TCP Timestamps option in the ICMP unreachable
>    message.

  See above, wouldn't call this "safe" - it's not like the other variant
  is unsafe.
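
  To illustrate the second downside in the quoted text: the first 64 bits
  of the quoted TCP header carry only the ports and the sequence number,
  while the Timestamps option sits behind the 20-byte fixed header. The
  sketch below parses whatever part of the TCP header an ICMP message
  quotes. The function name and return shape are assumptions for
  illustration, and IP header handling is left out.

```python
import struct


def parse_quoted_tcp(quoted):
    """Return (src_port, dst_port, seq, tsval) from a quoted TCP header.

    tsval is None when the quoted bytes do not reach a Timestamps option.
    """
    if len(quoted) < 8:
        raise ValueError("need at least the first 64 bits of the TCP header")
    src_port, dst_port, seq = struct.unpack("!HHI", quoted[:8])
    tsval = None
    if len(quoted) >= 20:
        header_len = (quoted[12] >> 4) * 4  # data offset, in bytes
        options = quoted[20:min(header_len, len(quoted))]
        i = 0
        while i < len(options):
            kind = options[i]
            if kind == 0:      # End of Option List
                break
            if kind == 1:      # No-Operation, single byte
                i += 1
                continue
            if i + 1 >= len(options):
                break          # option length byte was truncated
            length = options[i + 1]
            if length < 2 or i + length > len(options):
                break          # malformed or truncated option
            if kind == 8 and length == 10:  # Timestamps: TSval, TSecr
                tsval = struct.unpack("!I", options[i + 2:i + 6])[0]
                break
            i += length
    return src_port, dst_port, seq, tsval
```

  With only eight quoted bytes the function still returns the sequence
  number, but tsval stays None, which is the case where the variant using
  timestamps has nothing to match against.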


Section 7.1., paragraph 1:
>    retransmissions.  RFC 1122 [RFC1122] states in Section 4.2.3.5 that a

  Nit: s/RFC 1122 [RFC1122]/[RFC1122]/


Section 7.1., paragraph 2:
>    two thresholds R1 and R2 measuring the number of retransmissions that

  Nit: s/measuring/that measure/


Section 7.1., paragraph 4:
>    Due to TCP-LCD's revert strategy of the retransmission timer, the

  Nit: s/revert/reversion/


Section 7.1., paragraph 5:
>    assumption that a certain number of retransmissions corresponds to a
>    specific time interval no longer holds, as additional retransmissions
>    may be performed during timeout-based-loss recovery to detect the end
>    of the connectivity disruption.  Therefore, a TCP employing TCP-LCD
>    either SHOULD measure the thresholds R1 and R2 in time units or, in
>    case R1 and R2 are counters of retransmissions, SHOULD convert them
>    into time intervals, which correspond to the time an unmodified TCP
>    would need to reach the specified number of retransmissions.

  Why not MUST? When is it appropriate to not use time units and still
  do LCD?
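
  For reference, the conversion the quoted text asks for could look like
  the sketch below. It assumes that an unmodified TCP doubles the RTO on
  every timeout starting from some initial RTO, optionally capped by a
  maximum; the function name and parameters are my own, not the draft's.

```python
def retransmissions_to_seconds(threshold, initial_rto, max_rto=float("inf")):
    """Time in seconds until an unmodified TCP sends its threshold-th
    retransmission, measured from the original transmission."""
    elapsed = 0.0
    rto = initial_rto
    for _ in range(threshold):
        elapsed += rto               # wait for the current timer to expire
        rto = min(rto * 2, max_rto)  # exponential backoff for the next one
    return elapsed
```

  With an initial RTO of 1 second, a counter threshold of 3 corresponds
  to 1 + 2 + 4 = 7 seconds; with a 60 second cap, a threshold of 8
  corresponds to 183 seconds.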


Section 7.2., paragraph 1:
>    By the use of Explicit Congestion Notification (ECN) [RFC3168] ECN-

  Nit: s/By the use of/With/ and s/[RFC3168]/[RFC3168],/


Section 7.2., paragraph 2:
>    congestion indication.  Instead, they can set the Congestion

  Nit: s/as congestion indication/to indicate congestion/


Section 7.2., paragraph 4:
>    With TCP-LCD it may happen that during a connectivity disruption a

  Nit: s/LCD/LCD,/ and s/disruption/disruption,/


Section 7.2., paragraph 5:
>    received ICMP unreachable message has been elicited by a timeout-
>    based retransmission that was marked with the CE codepoint before
>    reaching the router experiencing the link outage.  In such a case, we
>    suggest that the TCP sender SHOULD additionally reset the
>    retransmission timer in case the algorithm undoes a retransmission
>    timer backoff.

  Why not MUST? When is it appropriate to not follow the SHOULD?


Section 7.3., paragraph 1:
>    available for TCP-LCD as in the case of IPv4.

  Nit: s/as/than/


Section 7.4., paragraph 2:
>    If, for example, end-to-end tunnels like IPSec in transport mode

  Nit: s/IPSec/IPsec/


Section 7.4., paragraph 3:
>    contains enough information, i.e., SEQ.SEG is extractable, these
>    information MAY still be used as a valid input for the proposed
>    algorithm.

  Nit: s/these information/this information/ and s/MAY/can/ (not sure an
  RFC 2119 term makes sense here)


Section 7.4., paragraph 4:
>    replayed ICMP unreachable messages MAY still be used in TCP-LCD.

  Nit: s/MAY/can/ (not sure an RFC 2119 term makes sense here)


Section 8., paragraph 0:
> 8.  Related Work

  Maybe make this an appendix? Or move it up after the introduction?


Section 8., paragraph 1:
>    example [SM03] introduces a "smart link layer", which buffers one

  Nit: s/example/example,/


Section 8., paragraph 4:
>    case of a link failure the TCP sender stops sending segments and

  Nit: s/failure/failure,/


Section 8., paragraph 6:
>    zero window probing with a exponential backoff.  ICMP destination

  Nit: s/a exponential/an exponential/


Section 8., paragraph 7:
>    combined.  However, due to security considerations it does not seem

  Nit: s/considerations/considerations,/


Section 8., paragraph 8:

>    appropriate to adopt ATCP's reaction as discussed in Section 5.6.

  Nit: s/reaction/reaction,/


Section 8., paragraph 9:
>    Schuetz et al. describe, in [I-D.schuetz-tcpm-tcp-rlci], a set of TCP

  Nit: s/describe, in
  [I-D.schuetz-tcpm-tcp-rlci],/[I-D.schuetz-tcpm-tcp-rlci]/


Section 10., paragraph 1:
>    TCP-LCD to flood the network just by sending forged ICMP unreachable

  Nit: s/to flood/flood/


Section 12.2., paragraph 4:
>    [I-D.ietf-tcpm-icmp-attacks]
>               Gont, F., "ICMP attacks against TCP",
>               draft-ietf-tcpm-icmp-attacks-12 (work in progress),
>               March 2010.

  Nit: Outdated reference: draft-ietf-tcpm-icmp-attacks has been
  published as RFC 5927


Appendix A., paragraph 0:
> Appendix A.  Changes from previous versions of the draft

  Please add a Note to the RFC Editor to remove this appendix upon
  publication.