Re: [tcpm] Detect Lost Retransmit with SACK

"Scheffenegger, Richard" <rs@netapp.com> Tue, 10 November 2009 14:34 UTC

Date: Tue, 10 Nov 2009 14:34:12 -0000
From: "Scheffenegger, Richard" <rs@netapp.com>
To: Alexander Zimmermann <alexander.zimmermann@nets.rwth-aachen.de>
Cc: tcpm@ietf.org
Subject: Re: [tcpm] Detect Lost Retransmit with SACK

Thanks Alex,

I will try to give my example in another form: (RWND = 40000; 
the timing might not be perfectly depicted).

     ACK                          Transmitted  Received     ACK Sent
     Received                     Segment      Segment      (Including SACK Blocks)

                                      0-  999
                                   1000- 1999
                                                   0-  999
                                                             1000
      1000                                      1000- 1999
                                   2000- 2999                2000
      2000                         3000- 3999
                                   4000- 4999   2000- 2999
                                   5000- 5999                3000
      3000                         6000- 6999   3000- 3999
                                   7000- 7999                4000
      4000                         8000- 8999   4000- 4999
                                   9000- 9999                5000
      5000                        10000-10999   5000- 5999
                                  11000-11999                6000
      6000                        12000-12999   6000- 6999
                                  13000-13999                7000
      7000                                      7000- 7999
                                  14000-14999                8000
      8000                                      8000- 8999
                                  15000-15999                9000
      9000                                      9000- 9999
                                  16000-16999               10000
     10000                                     (dropped)
                                  17000-17999
                                               (dropped)
                                  .
                                               (dropped)
                                  .
                                               (dropped)
                                  .
                                               (dropped)
                                  .
                                               15000-15999
                                                            10000, SACK=15k-16k
     10000, SACK=15k-16k                       16000-16999
                                  18000-18999               10000, SACK=15k-17k
     10000, SACK=15k-17k                       17000-17999
                                  19000-19999               10000, SACK=15k-18k
     10000, SACK=15k-18k                       .
                                  10000-10999
                                               18000-18999
                                                            10000, SACK=15k-19k
     10000, SACK=15k-19k                       19000-19999
                                  11000-11999               10000, SACK=15k-20k
     10000, SACK=15k-20k                       (dropped)
                                  12000-12999
                                               11000-11999
                                                            10000, SACK=11k-12k;15k-20k
     10000, SACK=11k-12k;15k-20k               12000-12999
                                  13000-13999               10000, SACK=11k-13k;15k-20k
     10000, SACK=11k-13k;15k-20k
                                  14000-14999  13000-13999
                                                            10000, SACK=11k-14k;15k-20k
     10000, SACK=11k-14k;15k-20k               14000-14999
!*A                               20000-20999               10000, SACK=11k-20k
!    10000, SACK=11k-20k
!                                 21000-21999  20000-20999
!                                 22000-22999               10000, SACK=11k-21k
!    10000, SACK=11k-21k
!                                 23000-23999  21000-21999
!                                 24000-24999               10000, SACK=11k-22k
!    10000, SACK=11k-22k                       22000-22999
!                                 25000-25999               10000, SACK=11k-23k
!    10000, SACK=11k-23k                       23000-23999
!*B                               26000-26999               10000, SACK=11k-24k
!    ::                           ::           ::           ::
!RWND Full:                                                 10000, SACK=11k-50k
!    10000, SACK=11k-50k
!
!    ::                           ::           ::           ::
!
!RTO:
!                                 10000-10999
!                                                           50000
!Slow-Start


All the lines marked with ! indicate current best-practice behaviour
(RFCs only, no drafts), if I'm not mistaken. I left out Limited Transmit
for simplicity's sake.
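
To make the "ACK Sent" column easier to check, here is a minimal Python
sketch of how a receiver could derive the cumulative ACK and the SACK
blocks from the byte ranges it has received so far. The function name
receiver_ack and the assumption that the stream starts at sequence 0 are
mine, for illustration only. The sketch lists blocks in sequence order,
as the diagram does; a real receiver orders and limits its SACK blocks
as described in RFC 2018, which is not modelled here.

```python
def receiver_ack(received):
    """Return (cumulative_ack, sack_blocks) for received byte ranges.

    `received` is a list of (start, end) tuples, end exclusive.
    Assumption for this sketch: the stream starts at sequence 0.
    """
    merged = []
    for start, end in sorted(received):
        if merged and start <= merged[-1][1]:
            # Range touches or overlaps the previous one: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    cum_ack = 0
    if merged and merged[0][0] == 0:
        # The first contiguous run from 0 is covered by the cumulative ACK.
        cum_ack = merged.pop(0)[1]
    return cum_ack, [tuple(block) for block in merged]


# Replay part of the diagram: 0-9999 arrived, 10000-14999 were dropped.
got = [(0, 10000), (15000, 20000)]
print(receiver_ack(got))   # (10000, [(15000, 20000)])
got.append((11000, 12000))
print(receiver_ack(got))   # (10000, [(11000, 12000), (15000, 20000)])
got.append((12000, 15000))
print(receiver_ack(got))   # (10000, [(11000, 20000)])
```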

At point *A (or one ACK later), the sender would have the earliest
possibility to detect a lost retransmission, taking into account the
usual 3-ACK reordering hold-down. In this example, this coincides with
the sender finishing fast retransmission (it would then go into fast
recovery, restoring cwnd, etc.). That coincidence is NOT a requirement:
the cwnd is likely to be on the order of hundreds of segments, and most
of the time the sender will have finished the retransmission episode
before one RTT is up.

At point *B, the sender could detect with 100% certainty (2*RTT) that 
one retransmitted segment was lost.

However, current practice (excluding Linux with FACK for now, as
that's not in the RFCs) is to continue sending from SND.NXT until RWND
is full or the RTO expires...

Note that the suggested algorithm will never trigger if no
retransmitted packet is lost; it would behave exactly the same as
currently.

Only when the conditions at *A or *B are detected would the DUPACK
detection logic re-arm (a more complex implementation could re-arm at
the first sign of retransmission loss, and disarm if an ACK within
DupAck distance ACKs the segment for which it armed).
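
One way to express the *A condition in code is the rough Python sketch
below. It is only an illustration under assumptions of mine, not a
worked-out specification: the sender numbers every transmission and
retransmission in send order, and declares a retransmission lost once
DUP_THRESH segments that were sent after it have been SACKed while the
retransmitted segment itself is still missing. The names Sender,
Segment, send_order and DUP_THRESH are hypothetical.

```python
from dataclasses import dataclass, field

DUP_THRESH = 3  # usual reordering hold-down, counted in segments


@dataclass
class Segment:
    start: int
    end: int                 # exclusive
    send_order: int          # index of the most recent (re)transmission
    retransmitted: bool = False
    sacked: bool = False


@dataclass
class Sender:
    segments: dict = field(default_factory=dict)  # start seq -> Segment
    next_order: int = 0

    def transmit(self, start, end):
        """Record a transmission; a second call for the same start is a retransmission."""
        seg = self.segments.get(start)
        if seg is None:
            self.segments[start] = Segment(start, end, self.next_order)
        else:
            seg.send_order = self.next_order
            seg.retransmitted = True
        self.next_order += 1

    def on_ack(self, cum_ack, sack_blocks):
        """Process one ACK; return start seqs of retransmissions deemed lost."""
        # Forget everything covered by the cumulative ACK.
        for start in [s for s, seg in self.segments.items() if seg.end <= cum_ack]:
            del self.segments[start]
        # Mark segments fully covered by a SACK block.
        for seg in self.segments.values():
            if any(lo <= seg.start and seg.end <= hi for lo, hi in sack_blocks):
                seg.sacked = True
        sacked_orders = [s.send_order for s in self.segments.values() if s.sacked]
        lost = []
        for seg in self.segments.values():
            if seg.retransmitted and not seg.sacked:
                later = sum(1 for order in sacked_orders if order > seg.send_order)
                if later >= DUP_THRESH:
                    lost.append(seg.start)
        return sorted(lost)


# Replay of the example: 10000-19999 sent, 10000-14999 dropped.
s = Sender()
for seq in range(10000, 20000, 1000):
    s.transmit(seq, seq + 1000)
s.on_ack(10000, [(15000, 20000)])
for seq in range(10000, 15000, 1000):      # fast retransmit of the hole
    s.transmit(seq, seq + 1000)
print(s.on_ack(10000, [(11000, 12000), (15000, 20000)]))  # []
print(s.on_ack(10000, [(11000, 13000), (15000, 20000)]))  # []
print(s.on_ack(10000, [(11000, 14000), (15000, 20000)]))  # [10000]
```

In this replay the third ACK is the one received at *A. The sketch keeps
reporting the segment until it is transmitted again, which resets its
send order.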

Thus, a lost retransmission would be retried close to the earliest
possibility, instead of waiting until the RTO (a bit like what FACK
seems to try to do, but apparently with low reliability).

Multiple burst-loss events, each losing a different segment in
one cwnd, would be handled by SACK.

If there are extended periods of time during which no communication is
possible (the same segment is lost not just twice, but multiple
times), the RTO would eventually fire and, using the RTO backoff
algorithm, retry at ever-increasing intervals at a very low rate
(1-2 segments)...
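
As a small illustration of "ever-increasing intervals", the Python
sketch below lists when the RTO retries would happen if the timer simply
doubles on every expiry. The function name rto_retry_times, the 1-second
starting RTO, and the 60-second cap are assumptions chosen for the
example, not values taken from any particular stack.

```python
def rto_retry_times(initial_rto, retries, max_rto=60.0):
    """Return the elapsed time (seconds) at which each RTO retry fires.

    The RTO doubles after every expiry, up to max_rto (assumed cap).
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto
        times.append(elapsed)
        rto = min(rto * 2, max_rto)
    return times


print(rto_retry_times(1.0, 5))  # [1.0, 3.0, 7.0, 15.0, 31.0]
```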


Do you have a test bed where you can deliberately drop the same
segment twice (or n times) and check the TCP behaviour for yourself?

Best regards,



Richard Scheffenegger
Field Escalation Engineer
NetApp Global Support 
NetApp
+43 1 3676811 3146 Office (2143 3146 - internal)
+43 676 654 3146 Mobile
www.netapp.com 
Franz-Klein-Gasse 5
1190 Wien 


-----Original Message-----
From: Alexander Zimmermann
[mailto:alexander.zimmermann@nets.rwth-aachen.de] 
Sent: Tuesday, 10 November 2009 13:51
To: Scheffenegger, Richard
Cc: tcpm@ietf.org
Subject: Re: Detect Lost Retransmit with SACK

Hi Richard,

I discussed your example with Arnd (he is a line-of-sight colleague).
Your "algorithm" may work when you have only one burst loss per cwnd.
If you have multiple non-burst losses (e.g. WLAN), IMHO, it doesn't
work.

On 09.11.2009 at 18:27, Scheffenegger, Richard wrote:

> 
> Hi Alexander,
> 
> Thanks for the welcome :)
> 
> I fork another thread with the LimitedTransport||FastRecovery / ABC
> interaction...
> 
> 
> I will try to sketch up an example to demonstrate what problem I'm
> trying to address:
> 
> 
> Let's assume the cwnd is already open for at least 7 segments, before
> the segment with sequence number 10000 is the first one to be dropped
> by the network.
> 
> Also, let's assume that FastRetransmit runs from the left edge of the
> leftmost hole (SND.UNA) upwards, and that per ACK only a single
> segment is sent.
> 
> 
>             Triggering    ACK      Left     Right    Left     Right
>             Segment                Edge 1   Edge 1   Edge 2   Edge 2
> 
>              9000          9000
>             10000  (lost)       *
>             11000  (lost)
>             12000  (lost)
>             13000  (lost)
>             14000  (lost)
>             15000         10000    15000    16000
>             16000         10000    15000    17000
>             17000         10000    15000    18000

Ok, I count 9 segments ;-) Anyway, your example is a little bit strange.
You assume you send 9 segments in a burst. Then your ACK for 9000 will
trigger the segment 18000. Then the 2 DUPACKs for 15000 and 16000 will
trigger 19000 and 20000 respectively (Limited Transmit). The third
DUPACK will trigger the Fast Retransmit of 10000. Since NextSeg() and
pipe allow it, you also retransmit 11000 and 12000.

At this point you have to wait, since the pipe is full (if my quick
pipe calculation is correct). One RTT later you will get 3 DUPACKs,
even if the 10000 is not lost.

OK, I think you can adjust your example so that it works for the given
case, however in other scenarios (multiple loss, reordering, packet
duplication,...) it will be much more complicated.

I suggest writing examples as Ilpo does in
http://tools.ietf.org/html/draft-ietf-tcpm-sack-recovery-entry-00.
It's much easier to read.

> 
> 3 ACKs trigger fast retransmit
> 
>             10000  (lost again)
>             11000         10000    11000    12000    15000    18000
>             12000         10000    11000    13000    15000    18000
>             13000         10000    11000    14000    15000    18000
> -> here we have again 3 ACKs indicating another loss of one of the
> retransmitted packets. The leftmost hole did not change, while the
> overall number of SACKed octets did decrease for 3 consecutive ACKs
> (4; 3 and 2 segments marked by SACK).
> 

Alex

//
// Dipl.-Inform. Alexander Zimmermann
// Department of Computer Science, Informatik 4
// RWTH Aachen University
// Ahornstr. 55, 52056 Aachen, Germany
// phone: (49-241) 80-21422, fax: (49-241) 80-22220
// email: zimmermann@cs.rwth-aachen.de
// web: http://www.umic-mesh.net
//