Re: [tcpm] flow control and fast recovery

Yoshifumi Nishida <nishida@sfc.wide.ad.jp> Wed, 14 August 2013 08:19 UTC

To: Yuchung Cheng <ycheng@google.com>
Cc: tcpm <tcpm@ietf.org>

Hi Yuchung,

I tried to point out how much buffer space is required to guarantee
that fast retransmit and fast recovery work properly.
But I'm not very sure this plot is a proper example for detailed
discussions on standards, because the sender's behavior looks a bit
strange to me.
It seems to do rate halving (or PRR) at the beginning of recovery, but
it speeds up after a while even though it is still in the recovery
phase.
This aggressive increase will require more buffer space at the
receiver, but I don't think it is standard behavior.
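
For reference, here is a minimal sketch of the per-ACK send quota from
PRR (RFC 6937); the variable names are illustrative, not the sender's
actual code. A sender following this would not speed up mid-recovery
the way the plot shows:

    import math

    # prr_delivered: bytes delivered to the receiver since recovery began
    # prr_out:       bytes sent since recovery began
    # recover_fs:    flight size when recovery was entered
    def prr_sndcnt(prr_delivered, prr_out, pipe, ssthresh, recover_fs,
                   delivered_now, mss):
        if pipe > ssthresh:
            # proportional part: pace ssthresh out over roughly one RTT
            sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
        else:
            # slow-start reduction bound: grow back toward ssthresh
            limit = max(prr_delivered - prr_out, delivered_now) + mss
            sndcnt = min(ssthresh - pipe, limit)
        return max(sndcnt, 0)
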
Thanks,
--
Yoshifumi


On Tue, Aug 13, 2013 at 8:56 AM, Yuchung Cheng <ycheng@google.com> wrote:
> On Mon, Aug 12, 2013 at 10:39 PM, Yoshifumi Nishida
> <nishida@sfc.wide.ad.jp> wrote:
>> Hi Alejandro,
>>
>> Thanks for the dump file.
>> Please correct me if I miss something.
>>
>> During the first fast retransmit and fast recovery, the sender can
>> transmit (sender's cwnd)/2 - MSS bytes of new data (because 1 MSS is
>> consumed by the retransmission).
>> This means in the worst case, you will need a receiver buffer of 3/2
>> of the sender's cwnd to hold all data transmitted during this period.
>> But the worst case can only happen when the segment sent by fast
>> retransmit arrives after all the new data has arrived.
>> (Another example would be a case where the receiver's application
>> suddenly becomes slow to read data during loss recovery.)
>>
>> You're right that if we want to prepare for the worst case, the
>> receiver will need a buffer 1.5 times larger than the sender's.
>> But I'm not sure this is a problem, because I'm not convinced TCP
>> needs to guarantee full performance when the sender's buffer size
>> equals the receiver's buffer size.
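>>
>> To make the arithmetic concrete, a small sketch (the cwnd and MSS
>> values are illustrative, not taken from the dump):
>>
>>     cwnd = 64 * 1024            # sender's congestion window before the loss
>>     mss = 1460
>>     in_flight = cwnd            # old window, buffered out of order behind the hole
>>     new_data = cwnd // 2 - mss  # new data sent in recovery; 1 MSS is the retransmit
>>     worst_case = in_flight + new_data
>>     print(worst_case / cwnd)    # ~1.5: receiver may need 3/2 of the sender's cwnd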
> It's not about sender's buffer size.
>
> It'll be easier to explain with a time-sequence graph, but I doubt
> this list allows binary attachments, so try this:
>
> tcptrace -CSzxy <pcap> && xplot.org b2a_tsg.xpl
>
> During the recovery, the rwin remains stale and the yellow line is horizontal.
>
> The fundamental problem is that the receiver (likely a Linux box) does
> not account for received OOO packets when adjusting RWIN.
> But congestion control, or cwnd, does account for SACKed packets and
> retransmits.
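>
> A hedged sketch of the mismatch (names and numbers are illustrative,
> not from this trace):
>
>     mss = 1460
>     cwnd = 220 * mss
>     flight = 200 * mss                  # unacked bytes outstanding
>     sacked = 80 * mss                   # SACKed, held out of order at the receiver
>     pipe = flight - sacked              # RFC 6675: SACKed data leaves the pipe
>     rwnd = 64 * 1024                    # stale advertised window
>     can_send_cc = max(cwnd - pipe, 0)   # congestion control still has room
>     can_send_fc = max(rwnd - flight, 0) # 0: flow control stalls the sender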
>
> When the receiver has too many OOO packets, it unintentionally
> thwarts the sender, creating a bubble in the data pipeline at 5s.
>
> Due to bufferbloat, by the time the loss happens the network has
> buffered a ton. By the time the fast retransmit finally arrives to
> repair the losses, the rwin opens up the floodgates, hence the big
> jump of the yellow line. Interestingly, the sender is probably
> application-limited, so it didn't send out a burst.
>
> So the receiver is unintentionally throttling the sender as it tries
> to make forward progress during recovery. Not a good idea unless the
> receiver really can't afford the extra 100KB.
>
>
>>
>> In your tcpdump, the first segment sent by fast retransmit seems to
>> have been lost, hence you got extra dup acks.
>> But I think this is not a situation that the fast retransmit logic
>> expects.
>>
>> Thanks,
>> --
>> Yoshifumi
>>
>>
>> On Sun, Aug 11, 2013 at 11:07 PM, Alejandro Popovsky <apopov@palermo.edu> wrote:
>>> Hi Christoph,
>>>
>>> I left another example where the receiver is tuning its receive
>>> window, but not dynamically enough to prevent a sender stall during
>>> fast recovery:
>>>
>>> http://www.palermo.edu/ingenieria/comm/flowCtrlFastRecovery2.pdf
>>>
>>> http://www.palermo.edu/ingenieria/comm/exampleDumpFlowCtrlFastRecovery2.pcap
>>>
>>> Best regards, Alejandro.
>>>
>>>
>>>
>>> On 11/08/13 04:05 PM, Christoph Paasch wrote:
>>>>
>>>> Hello,
>>>>
>>>> On 11/08/13 - 14:19:50, apopov@palermo.edu wrote:
>>>>>
>>>>> I have just left an example connection in:
>>>>>
>>>>> http://www.palermo.edu/ingenieria/comm/exampleDumpFlowCtrlFastRecovery.pcap
>>>>
>>>> the trace looks rather like the receiver has its window capped at
>>>> 64K, e.g. through a socket option, because from the beginning the
>>>> announced window is at 64K and it never changes.
>>>>
>>>> If the client did not cap the window, autotuning should do its job
>>>> and adjust the window to 2*BDP, thus allowing full speed - even
>>>> during recovery.
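>>>>
>>>> For example, at 10 Mbit/s and 100 ms RTT the BDP is about 125 KB,
>>>> so autotuning would aim for roughly 250 KB - well above the 64K
>>>> seen here. (Illustrative numbers; the path's actual rate and RTT
>>>> may differ.)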
>>>>
>>>>
>>>> Cheers,
>>>> Christoph
>>>>
>>>>> I am also leaving an analysis of the connection, showing the flow
>>>>> control limitation reached during fast recovery, here:
>>>>> http://www.palermo.edu/ingenieria/comm/flowCtrlFastRecovery.pdf
>>>>>
>>>>> Let me know if you want some other examples.
>>>>>
>>>>> Best regards, Alejandro.
>>>>>
>>>>>
>>>>>
>>>>> On 11/08/13 05:44 AM, Yoshifumi Nishida wrote:
>>>>>>
>>>>>> Hi Alejandro,
>>>>>> Is it possible to see tcpdump files for this? It would be better
>>>>>> if we could discuss with real data.
>>>>>> --
>>>>>> Yoshifumi
>>>>>>
>>>>>> On Fri, Aug 9, 2013 at 1:43 PM, Alejandro Popovsky wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have been observing that many connections become limited
>>>>>>> by the receiver window during fast recovery, even when this window
>>>>>>> is above RTT*maximumPathRate (and window scaling is in use).
>>>>>>>
>>>>>>> This is because during fast recovery the congestion window is
>>>>>>> artificially inflated on each duplicate ack (after the third), and
>>>>>>> the number of unacked bytes may grow to double RTT*maximumPathRate.
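>>>>>>>
>>>>>>> As a rough sketch of that inflation arithmetic (simplified: one
>>>>>>> loss, no ack loss, unbounded receiver window; RFC 5681 accounting):
>>>>>>>
>>>>>>>     flight = 100                   # segments in flight at loss detection
>>>>>>>     ssthresh = flight // 2
>>>>>>>     cwnd = ssthresh + 3            # set on the third duplicate ack
>>>>>>>     sent_new = 0
>>>>>>>     for _ in range(flight - 1 - 3):    # remaining dup acks
>>>>>>>         cwnd += 1                      # artificial inflation, 1 MSS each
>>>>>>>         if cwnd > flight + sent_new:   # window opens past outstanding data
>>>>>>>             sent_new += 1
>>>>>>>     print(sent_new)  # ~flight/2: unacked data nears 1.5x the old window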
>>>>>>>
>>>>>>> To prevent this, the receiver could grow its receive window up to
>>>>>>> double its size while generating duplicate acks.
>>>>>>>
>>>>>>>
>>>>>>> I observed this in the traffic of service providers that had a
>>>>>>> significant percentage of their traffic limited by flow control
>>>>>>> (most traffic is generally limited by the network, or by the data
>>>>>>> generation rate at the source).
>>>>>>>
>>>>>>>
>>>>>>> Best regards, Alejandro Popovsky.
>>>>>>>