Re: [tcpm] poll for adoption of long connectivity disruptions draft

Alexander Zimmermann <Alexander.Zimmermann@nets.rwth-aachen.de> Tue, 15 September 2009 17:59 UTC

From: Alexander Zimmermann <Alexander.Zimmermann@nets.rwth-aachen.de>
To: Joe Touch <touch@ISI.EDU>
Date: Tue, 15 Sep 2009 19:59:24 +0200
Cc: "tcpm@ietf.org" <tcpm@ietf.org>
Subject: Re: [tcpm] poll for adoption of long connectivity disruptions draft

Hi Joe,

first of all, thank you very much for your valuable comments! Since I'm not
sure I understand you correctly, I have broken our problem down into a concrete
example that we can play with.

* If I understand you correctly in this thread as well as in the thread
"WG Last Call for ICMP Attacks", the "possibility" and the "impact" of a false
positive are two different things, and both should be included in our I-D.
Until now, only the "possibility" is included. I will fix that.

* Whenever I had the problem "RTO-based loss recovery AND router can delay
sending/generating ICMPs AND sequence space can wrap" in mind, I meant problem
"B.4.2". However, I have the feeling you mean problem "2.4.2". Right?

Alex

---

Situation:

S (Source) ---------------- R (Router) ---------------- D (Destination)

- At time t0, there is a steady-state TCP flow with packets p1,...,pn in flight
   between a source S and a destination D. Packets are routed by an
   intermediate router R.

- At time t1 a connectivity disruption occurs at router R.

- At time t2, S starts to send RTO-based retransmissions r1,...,rm and backs
   off and undoes according to the algorithm.

- At time t3, the connection is re-established

- At time t4, the sequence space wraps

- At time t5, the connection is in RTO-based loss recovery again


The following scenarios are possible when ICMP DUs for the packets p1,...,pn
are delayed for a longer time (or are not generated at all):

1. R does not generate ICMP DUs for p1,...,pn (congestion)
    => Backoff according to RFC 2988, no undo by design.

2. R generates ICMP DUs for p1,...,pn
   2.1 S receives ICMP DUs BEFORE t2 (standard)
       => Impact: None. ICMP DUs are ignored by design.

   2.2 S receives ICMP DUs AFTER t2, but BEFORE t3
       => Impact: Retransmission ambiguity. Considered not harmful.
       (described in I-D section 4.3, para 5)

   2.3 S receives ICMP DUs AFTER t3, but BEFORE t5
       => Impact: None. ICMP DUs are ignored by design.

   2.4 S receives ICMP DUs AFTER t5
     2.4.1 RTO due to connectivity disruption
       => Impact: No false positives since we count backoffs.

     2.4.2 RTO due to congestion
       => Impact: one false positive possible
       => Possibility: at most n/2^32
       (not described in I-D yet!!!)
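
To play with the case distinction above, here is a minimal Python sketch that
maps the arrival time of an ICMP DU for one of the original packets p1,...,pn
to the scenario labels 2.1 to 2.4.2. The function name, the string labels for
the RTO cause, and the treatment of arrivals exactly at t2, t3 or t5 are
assumptions made for illustration; the list above does not specify them.

```python
def classify_du_for_original_packet(arrival, t2, t3, t5, rto_cause="congestion"):
    """Return the scenario label for an ICMP DU that refers to p1..pn.

    arrival    -- time at which S receives the ICMP DU
    t2, t3, t5 -- event times from the situation description (t2 < t3 < t5)
    rto_cause  -- "disruption" or "congestion"; only relevant after t5

    Assumption: an arrival exactly at a boundary counts as "after" it.
    """
    if arrival < t2:
        return "2.1"    # ignored by design
    if arrival < t3:
        return "2.2"    # retransmission ambiguity
    if arrival < t5:
        return "2.3"    # ignored by design
    if rto_cause == "disruption":
        return "2.4.1"  # no false positive, backoffs are counted
    return "2.4.2"      # one false positive possible


if __name__ == "__main__":
    for when in (1, 25, 45, 70):
        print(when, classify_du_for_original_packet(when, t2=20, t3=40, t5=60))
```

Only the last branch (2.4.2) corresponds to a possible false positive, which
is why it is the case that still needs text in the I-D.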


The following scenarios are possible when ICMP DUs for the retransmissions
r1,...,rm are delayed for a longer time (or are not generated at all):

A. R does not generate ICMP DUs for r1,...,rm (congestion)
    => Backoff according to RFC 2988, no undo by design.

B. R generates ICMP DUs FOR ALL r1,...,rm
   B.1 S receives ICMP DUs BEFORE t3
       => ICMP DUs are exploited by design. => Undo

   B.2 S receives ICMP DUs AFTER t3, but BEFORE t5
       => Impact: None. ICMP DUs are ignored by design.

   B.3 S receives ICMP DUs AFTER t5
     B.4.1 RTO due to connectivity disruption
       => Impact: No false positive since we count backoffs.

     B.4.2 RTO due to congestion
       => Impact: at most m false positives
       => Possibility: at most 1/2^32
       (described (not completely correct) in I-D section 4.3, para 6)
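
For a feeling of the magnitudes, the two upper bounds quoted above (n/2^32 for
case 2.4.2 and 1/2^32 for case B.4.2) can be evaluated directly. This is only
a sketch of the arithmetic; the function names and the example value n = 10
are assumptions, not taken from the I-D.

```python
from fractions import Fraction

SEQ_SPACE = 2**32  # size of the 32-bit TCP sequence number space


def bound_original_packets(n):
    """Upper bound for case 2.4.2: n packets p1..pn were in flight."""
    return Fraction(n, SEQ_SPACE)


def bound_retransmissions():
    """Upper bound for case B.4.2: all r1..rm carry the same SND.UNA."""
    return Fraction(1, SEQ_SPACE)


if __name__ == "__main__":
    n = 10  # hypothetical number of packets in flight
    print("2.4.2:", float(bound_original_packets(n)))
    print("B.4.2:", float(bound_retransmissions()))
```

The bounds differ by the factor n, while the impact differs the other way
round (one false positive versus up to m), so neither case dominates the
other.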


B.1) A steady-state TCP flow between a sender S and a destination D is
disrupted at some router R. The first few RTO retransmissions trigger ICMP DUs
at R, but these are delayed by it. Then, while S is still in the same loss
recovery phase, the accumulated ICMPs are emitted by R and received by S,
triggering multiple reversions. This is safe, as the result is the same as if
the ICMPs had been emitted instantly.

B.2) A steady-state TCP flow between a sender S and a destination D is
disrupted at some router R. The first few RTO retransmissions trigger ICMP DUs
at R, but these are delayed by it. Then, connectivity is re-established by a
successful retransmission and S leaves the loss state. If the delayed ICMPs are
emitted by R now, they will have no effect on S.

2.4.2) A steady-state TCP flow between a sender S and a destination D is
disrupted at some router R. Each packet p[i] that is in flight from S to D
passes R and may (subject to ICMP rate limiting) trigger a DU message, which is
delayed at R. Afterwards, when the link outage is over, the connection is
re-established, and S slow-starts the connection. When the sender's TCP
performs a sequence number wrap and approaches the sequence numbers of p[i],
an RTO occurs and S enters RTO-based loss recovery. Now, R emits the delayed
ICMPs for the p[i] of the last sequence number cycle, which may trigger one
false reversion. In the case of congestion, this leads to one false
retransmission, while in the case of another link outage, the additional
reversion is harmless. However, it is at most one false reversion, as long
as we assume only one sequence number cycle of delayed ICMPs. The probability
of this happening depends on the number of packets that are in flight towards
R.
B.4.2) A steady-state TCP flow between a sender S and a destination D is
disrupted at some router R. The sender performs multiple consecutive RTO
retransmissions, but the corresponding ICMPs are delayed by R. Then, after
several more retransmissions, connectivity is re-established and S leaves
the loss state. A sequence space wrap later, the path is disrupted again,
exactly at the time at which the current SND.UNA matches the SND.UNA from the
previous cycle. If the router R emits the delayed ICMPs now, but is currently
congested, S will falsely undo the multiple backoffs. The probability of this
should be at most 1 in 4 billion (1/2^32).
Given sufficiently many RTO retransmissions in the first loss phase, the
corresponding ICMPs can pull the RTO in the second loss phase from MAX_RTO
to the initial RTO. However, once the ICMPs are depleted, standard exponential
backoff will be performed. Roughly speaking, the standard congestion
response will be delayed by a few false retransmissions, and after
log2(MAX_RTO/MIN_RTO) of those retransmissions, MAX_RTO is reached again.

//
// Dipl.-Inform. Alexander Zimmermann
// Department of Computer Science, Informatik 4
// RWTH Aachen University
// Ahornstr. 55, 52056 Aachen, Germany
// phone: (49-241) 80-21422, fax: (49-241) 80-22220
// email: zimmermann@cs.rwth-aachen.de
// web: http://www.umic-mesh.net
//