Re: [tcpm] I-D Action: draft-ietf-tcpm-newcwv-03.txt

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Thu, 17 October 2013 18:14 UTC

Message-ID: <52602907.5040605@erg.abdn.ac.uk>
Date: Thu, 17 Oct 2013 19:14:31 +0100
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Organization: The University of Aberdeen is a charity registered in Scotland, No SC013683.
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:24.0) Gecko/20100101 Thunderbird/24.0.1
MIME-Version: 1.0
To: Yuchung Cheng <ycheng@google.com>, "Karen E. Egede Nielsen" <karen.nielsen@tieto.com>
References: <20131010032507.15919.27025.idtracker@ietfa.amsl.com> <067d205b8f1f18269c6375d7612d3638@mail.gmail.com> <FD2F17B9B55D72489D521ADC634E4628A2A6C3@pwsvl-excmbx-05.internal.cacheflow.com> <CAK6E8=f89TEWDTRXs6m=n9Rb8iMPVJMGLU1=9Os=3SoP1ZJv-A@mail.gmail.com> <52583A7D.4070108@erg.abdn.ac.uk> <CAK6E8=eHtu4-Rso2SAGRXZNF3cjMe15HgeyYftaC4-B=x+QgnQ@mail.gmail.com> <b2027e06a4dab568ce2f5d28caf5daea@mail.gmail.com> <CAK6E8=eTi4t_z_WA8VH83_1TYWCnfuo3kDFvbD4qZ304NdfJtg@mail.gmail.com>
In-Reply-To: <CAK6E8=eTi4t_z_WA8VH83_1TYWCnfuo3kDFvbD4qZ304NdfJtg@mail.gmail.com>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>
Subject: Re: [tcpm] I-D Action: draft-ietf-tcpm-newcwv-03.txt

This all helps to work out how much still needs to be done. Please see my
comments in-line.

Gorry

On 15/10/2013 15:50, Yuchung Cheng wrote:
> On Mon, Oct 14, 2013 at 5:10 AM, Karen E. Egede Nielsen <
> karen.nielsen@tieto.com> wrote:
>
>> Hi,
>>
>>>>
>>>> I see the problem with pinning cwnd to a low ssthresh - and I'd be
>>>> happy to collaborate to write a draft on fixing this. I offered to
>>>> collaborate to write something, that's still open. It can be a big
>>>> issue for apps that are not even rate-limited in any way.
>>>>
>>>> However, it's not the same problem as CWV - although CWV would likely
>>>> benefit from this.
>>>>
>> Yes. But it may define the boundaries within which CWV applies, assuming
>> that this is kept outside of CWV.
>> Meaning that if the ssthresh value (relative to CWND) puts the connection
>> in CA and CWND <= IW, then
>> after a certain idle time, RTO?, the connection should really not be worse
>> off than a new connection.
>>
GF: I'm intrigued by the idea of "resetting" ssthresh when there has been
no negative information for long enough to know that the network is
stable again. This would be a big change to TCP. I don't think an RTO is
long enough, i.e. we would need to wait much longer after the last
loss event than a single RTO.

There is also a possibility that, if the loss occurs at a shared
bottleneck, this may lead to synchronisation and instability at some
predictable time after the flows increase ssthresh. Increasing ssthresh
may not be easy.
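
A minimal Python sketch of the kind of guard this implies, for
illustration only. The function name, the quiet_factor parameter and its
value of 100 are hypothetical and do not come from the draft or from
RFC 5681; the only point shown is that the loss-free period is required to
be a large multiple of the RTO rather than a single RTO. It does nothing
about the synchronisation concern.

```python
import math

INFINITE_SSTHRESH = math.inf  # stands in for an "arbitrarily high" ssthresh


def maybe_reset_ssthresh(ssthresh, now, last_loss_time, rto, quiet_factor=100):
    """Return the ssthresh to use after an idle period.

    ssthresh is only reset when no loss has been seen for at least
    quiet_factor * rto seconds. quiet_factor is a hypothetical knob,
    chosen here only to show "much longer than one RTO".
    """
    loss_free = now - last_loss_time
    if loss_free >= quiet_factor * rto:
        return INFINITE_SSTHRESH
    return ssthresh  # too soon after the last loss: keep the old value
```

With rto = 1 s, a loss one second ago leaves ssthresh untouched, while a
loss 150 seconds ago allows the reset.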

>>>> Pacing is always going to help in these cases from the network
>>>> perspective, and would probably be a major performance gain for apps.
>>>> If I knew more about pacing we could certainly write more in the CWV
>>>> draft.
>>>>
>>>> However, I don't think it's a solution on its own - I also think that
>>>> simply letting cwnd grow without check seems illogical - especially
>>>> when there are significant changes in available path capacity. That's
>>>> where I really think CWV is needed.
>> Agree.
>>> Absolutely. I completely agree that we need that part of cwv-draft.
>>> but the second half about reducing cwnd by X after idling Y always
>>> sounds shamanism. To me there are just two choices.
>>>
>>> 1) conservative: revert CC to as in initial slow start (cwnd=RW,
>>> ssthresh=inf)
>>> 2) keep cwnd as-is but pace (if available)
>>
>> Perhaps the solution shall consist of both of these with qualification on
>> idle time on
>> whether the resulting ssthresh and usable CWND from option 2) puts the
>> connection in a worse position than 1).
>>>
>>> the 2nd option follows the same rationale as RFC 2140 on persisting cwnd
>>> over connections. Yes it might be wrong at times, but the immediate
>>> threat is burst, not high cwnd. I am not sure about any good
>>> justification for a third option.
>>>
GF: I'll "revive" the thinking behind new-cwv:

The primary goal of new-cwv is to stop cwnd growing arbitrarily when the 
application rate is reached, and to allow it to be kept at this "safe" 
rate for future use.

The issue though, which is where the "decay" comes in, is when an app
does not use a reasonable fraction of the capacity associated with cwnd
for a *long* time (the NVP), and then resumes at what TCP still
perceives as the "safe" rate. [This is more to deal with issues such as
path changes, re-routes, and apps that are idle for a long period and
then wake up simultaneously on some trigger.] I think these are all
corner cases, but we need to be robust.

So... we currently use an NVP of 5 minutes to cover this corner case.

If the WG now decides to reset ssthresh high after a period of time,
this would affect the choice of the NVP and the final value of
ssthresh. We should ask people how comfortable they would be with this
change.
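
As a rough illustration of the decay step, a short Python sketch follows.
The ssthresh rule is the one quoted later in this thread from the draft
(ssthresh = max(ssthresh, 3*cwnd/4)) and the 5 minute NVP is the value
given above. The function name, the argument names and the cwnd reduction
to max(cwnd/2, IW) are assumptions made for the example and should be
checked against the draft text.

```python
NVP_SECONDS = 5 * 60  # Non-Validated Period of 5 minutes, as stated above


def end_of_nvp(cwnd, ssthresh, iw, time_in_nonvalidated):
    """Return (cwnd, ssthresh) in bytes after checking whether the NVP expired.

    time_in_nonvalidated is the number of seconds the sender has spent
    without using a reasonable fraction of cwnd.
    """
    if time_in_nonvalidated < NVP_SECONDS:
        # Inside the NVP: cwnd is preserved for future use.
        return cwnd, ssthresh
    # ssthresh rule as quoted in this thread from the newcwv draft.
    new_ssthresh = max(ssthresh, 3 * cwnd // 4)
    # Assumed decay step: halve cwnd, but never drop below the initial window.
    new_cwnd = max(cwnd // 2, iw)
    return new_cwnd, new_ssthresh
```

For example, with cwnd = 40000, ssthresh = 10000 and IW = 14600 bytes, the
values are unchanged after 100 seconds, and become cwnd = 20000,
ssthresh = 30000 once 300 seconds have passed, so the sender resumes in
slow start below the new ssthresh.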

>>> note 1) may still burst if the app writes within an RTO. that happens
>>> all the time on video transfers when the receiver abuses the
>>> receive-window to throttle the sender.
>>>
>> Yes, and such may happen even at start of a new connection. This issue is
>> thus not particular to the situation after idle.
>>
>> In SCTP the solution to such bursting has been to introduce a max.burst
>> parameter. I think that Randy S. also
>> gave this comment at the tcpm session in summer.
>
> Thanks for the reminder, there is indeed a third option and is widely
> implemented in BSD, according to Randy. max-burst is a middle-road between
> cwnd/pacing and iw/ss. but what if the app sends max.burst bytes per-write
> frequently enough to form a big burst? it seems possible with TSO
> (deferral). So I guess the implementation has some kind of rate-limiting,
> similar to pacing, to deal with that.
>
Max-burst may be what we seek in terms of trying to restore pacing for an
ACK stream, and it is a really valuable method. I think the reported
experience with SCTP is all positive, and it is easier to implement than
pacing.
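
For readers unfamiliar with it, the SCTP rule (RFC 4960, section 6.1)
clamps cwnd just before sending so that no more than Max.Burst packets
leave back-to-back. A minimal Python sketch, with hypothetical function and
argument names, and a default of 4 assumed from the value suggested in
RFC 4960:

```python
def apply_max_burst(cwnd, flight_size, mtu, max_burst=4):
    """Return cwnd limited so at most max_burst MTUs can be sent at once.

    All sizes are in bytes. flight_size is the amount of data currently
    outstanding; after an idle period it is typically 0.
    """
    limit = flight_size + max_burst * mtu
    return min(cwnd, limit)
```

After an idle period (flight_size = 0) with cwnd = 60000 and an MTU of
1500 bytes, the sender may emit only 6000 bytes before it has to wait for
ACKs.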

However, one of the features of the apps governed by new-cwv is that in
many cases there is no ACK clock. Max-burst therefore could not be used
when an app transmits in bursts and goes idle in between. This varying
application behaviour is why we have proposed the methods in the draft,
and why some form of pacing may be very desirable (or may at least
reassure sceptics that the new methods really are safe).

I'm still hoping someone will tell us that pacing can be turned on at
times when TCP would benefit. If so, we should require transmission
during the NVP to be paced.
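
To show what "paced" could mean here, a small Python sketch that spreads
one cwnd of data evenly over one RTT instead of relying on an ACK clock.
This is one common way to pace, not something specified in the draft; the
function names and the choice of one-cwnd-per-RTT are assumptions for the
example.

```python
def pacing_interval(cwnd_bytes, rtt_seconds, mss_bytes):
    """Seconds to wait between segments so one cwnd is spread over one RTT."""
    if cwnd_bytes <= 0 or mss_bytes <= 0:
        raise ValueError("cwnd and MSS must be positive")
    segments_per_rtt = cwnd_bytes / mss_bytes
    return rtt_seconds / segments_per_rtt


def paced_send_times(n_segments, start, interval):
    """Departure times for n_segments sent one interval apart from start."""
    return [start + i * interval for i in range(n_segments)]
```

With cwnd = 14600 bytes, MSS = 1460 bytes and an RTT of 100 ms, segments
would leave roughly every 10 ms rather than as a single 10-segment burst.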

>
>>
>> BR, Karen
>>
>>>>
>>>> Gorry
>>>>
>>>>
>>>>
>>>> On 11/10/2013 18:12, Yuchung Cheng wrote:
>>>>>
>>>>> On Fri, Oct 11, 2013 at 9:50 AM, McAlpine, Gary
>>>>> <gary.mcalpine@bluecoat.com> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I agree with Karen. I have been investigating cases where a single
>>>>>> packet drop ultimately results in ssthresh getting set to 2*MSS and
>>>>>> the connection getting severely penalized by immediately going into
>>>>>> CA at the beginning of a slow-start. One case occurs due to a
>>>>>> feature intended to limit the severity of a DOS attack by a
>>>>>> malicious TCP that ignores the receiver's window size (i.e. limiting
>>>>>> the max number of segments in a reassembly buffer). Unfortunately,
>>>>>> if the average segment size is small (as can happen when data is
>>>>>> being compressed on the fly by the sender) and the RTT long enough,
>>>>>> then a single packet drop in the network can be followed by a tail
>>>>>> drop at the receiver, followed by slow-start with ssthresh = 2*MSS
>>>>>> (we are working on a solution to avoid this occurrence on perfectly
>>>>>> valid connections with valid traffic). We are also investigating a
>>>>>> different case that also results in ssthresh getting set to 2*MSS
>>>>>> after a single packet drop by the network.
>>>>>>
>>>>>> The problem is, once ssthresh gets set to a value too far below the
>>>>>> actual loss flight size, then it can take a very long time to
>>>>>> recover (and may never recover as long as that connection is
>>>>>> established). That would be a good thing on real DOS attacks, but
>>>>>> not so good on valid connections and traffic.
>>>>>
>>>>> I have definitely seen this problem (w/ cubic). Although I believe
>>>>> the root problem is loss is often not correlated with congestion
>>>>> these days (but due to burst), I second the idea to reset ssthresh
>>>>> after a long idle. For cubic the hystart will avoid ss overshoot so
>>>>> it's safer.
>>>>>
>>>>> imo newcwv is better than RFC2861 but I prefer to just keep cwnd
>>>>> as-is and enable pacing. After idle TCP will burst and that's the
>>>>> real issue. Any cwnd moderation helps to lower burst and reduce loss,
>>>>> so it might appear the magic factor, be 3/4, 1/2, 0.1322, is good.
>>>>> But it's papering the flaw of window-based ack-clocked design.
>>>>>
>>>>>>
>>>>>> I think the problem with ssthresh stems from it being set as a
>>>>>> function of CWND (which is required to be set to a small value in
>>>>>> any slow-start situation). I would suggest setting it as a function
>>>>>> of pipeACK and/or LossFlightSize, which should be better indicators
>>>>>> of a burst size that can be used without loss.
>>>>>>
>>>>>> Thanks,
>>>>>> Gary
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: tcpm-bounces@ietf.org [mailto:tcpm-bounces@ietf.org] On Behalf
>>>>>> Of Karen E. Egede Nielsen
>>>>>> Sent: Friday, October 11, 2013 3:01 AM
>>>>>> To: gorry@erg.abdn.ac.uk
>>>>>> Cc: tcpm@ietf.org
>>>>>> Subject: Re: [tcpm] I-D Action: draft-ietf-tcpm-newcwv-03.txt
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> RFC5681 (section 4.1) stipulates that the TCP should start in slow
>>>>>> start with IW after an idle period of RTO.
>>>>>> The associated adjustment of ssthresh, however, has been left
>>>>>> under-specified. I.e., should the ssthresh go to the standard
>>>>>> restart condition, which would then mean setting ssthresh to
>>>>>> infinity, or should the ssthresh be kept where it was left?
>>>>>>
>>>>>> This question is very significant for a TCP connection where the
>>>>>> ssthresh is low due to the occurrence of a prior
>>>>>> retransmission-timeout.
>>>>>> And it is even more critical if a sequence of 2 retransmission
>>>>>> time-outs has occurred, as the ssthresh would then be left as low as
>>>>>> 2 MTUs (RFC5681).
>>>>>>
>>>>>> Newcwv (section 4.4.2) suggests an adjustment of ssthresh as
>>>>>> ssthresh = max(ssthresh, 3*cwnd/4) after the non-validated phase.
>>>>>> This proposal will result in a severe disadvantage, compared to a
>>>>>> clean restart of the connection [ssthresh = infinite], when
>>>>>> resuming usage after an idle period on a TCP connection following a
>>>>>> retransmission-timeout.
>>>>>>
>>>>>> I wonder if there are any thoughts in TCPM, possibly in newcwv,
>>>>>> possibly in some other work item, on clarifying the ssthresh
>>>>>> handling when the adjustment ssthresh = max(ssthresh, 3*cwnd/4)
>>>>>> would bring the connection to start in congestion avoidance.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> BR, Karen
>>>>>>
>>>>>>
>>>>>>
>>>>>> Resumption of traffic in a TCP context after one (or multiple)
>>>>>> retransmission-timeouts is perhaps a special case for many single TCP
>>>>>> connection use-case scenarios (I don't know).  At least it is clear
>>>>>> why one would like to overcome such a congestion avoidance phase by
>>>>>> performing a clean restart.
>>>>>> But for SCTP multi-homed (read multiple path) scenarios, and
>>>>>> possibly (?) for MPTCP scenarios as well, temporarily leaving and
>>>>>> subsequently resuming paths where retransmission-timeouts have
>>>>>> occurred is part of the standard failure recovery operation of the
>>>>>> protocol.
>>>>>> The above issue is thus, apart from being generally relevant for
>>>>>> TCP I suppose, very relevant for ongoing work in tsvwg on CC during
>>>>>> path failovers in SCTP (Quick Failover).
>>>>>>
>>>>>> -----Original Message-----
>>>>>> From: tcpm-bounces@ietf.org [mailto:tcpm-bounces@ietf.org] On Behalf
>>>>>> Of internet-drafts@ietf.org
>>>>>> Sent: 10. oktober 2013 05:25
>>>>>> To: i-d-announce@ietf.org
>>>>>> Cc: tcpm@ietf.org
>>>>>> Subject: [tcpm] I-D Action: draft-ietf-tcpm-newcwv-03.txt
>>>>>>
>>>>>>
>>>>>> A New Internet-Draft is available from the on-line Internet-Drafts
>>>>>> directories.
>>>>>>    This draft is a work item of the TCP Maintenance and Minor
>>>>>> Extensions Working Group of the IETF.
>>>>>>
>>>>>>           Title           : Updating TCP to support Rate-Limited Traffic
>>>>>>           Author(s)       : Godred Fairhurst
>>>>>>                             Arjuna Sathiaseelan
>>>>>>                             Raffaello Secchi
>>>>>>           Filename        : draft-ietf-tcpm-newcwv-03.txt
>>>>>>           Pages           : 19
>>>>>>           Date            : 2013-10-09
>>>>>>
>>>>>> Abstract:
>>>>>>      This document proposes an update to RFC 5681 to address issues
>>>>>>      that arise when TCP is used to support traffic that exhibits
>>>>>>      periods where the sending rate is limited by the application
>>>>>>      rather than the congestion window.  It updates TCP to allow a
>>>>>>      TCP sender to restart quickly following either an idle or
>>>>>>      rate-limited interval.  This method is expected to benefit
>>>>>>      applications that send rate-limited traffic using TCP, while
>>>>>>      also providing an appropriate response if congestion is
>>>>>>      experienced.
>>>>>>
>>>>>>      It also evaluates the Experimental specification of TCP
>>>>>>      Congestion Window Validation, CWV, defined in RFC 2861, and
>>>>>>      concludes that RFC 2861 sought to address important issues, but
>>>>>>      failed to deliver a widely used solution.  This document
>>>>>>      therefore recommends that the status of RFC 2861 is moved from
>>>>>>      Experimental to Historic, and that it is replaced by the
>>>>>>      current specification.
>>>>>>
>>>>>>      NOTE: The standards status of this WG document is under review
>>>>>>      for consideration as either Experimental (EXP) or Proposed
>>>>>>      Standard (PS).  This decision will be made later as the
>>>>>>      document is finalised.
>>>>>>
>>>>>>
>>>>>> The IETF datatracker status page for this draft is:
>>>>>> https://datatracker.ietf.org/doc/draft-ietf-tcpm-newcwv
>>>>>>
>>>>>> There's also a htmlized version available at:
>>>>>> http://tools.ietf.org/html/draft-ietf-tcpm-newcwv-03
>>>>>>
>>>>>> A diff from the previous version is available at:
>>>>>> http://www.ietf.org/rfcdiff?url2=draft-ietf-tcpm-newcwv-03
>>>>>>
>>>>>>
>>>>>> Please note that it may take a couple of minutes from the time of
>>>>>> submission until the htmlized version and diff are available at
>>>>>> tools.ietf.org.
>>>>>>
>>>>>> Internet-Drafts are also available by anonymous FTP at:
>>>>>> ftp://ftp.ietf.org/internet-drafts/
>>>>>>
>>>>>> _______________________________________________
>>>>>> tcpm mailing list
>>>>>> tcpm@ietf.org
>>>>>> https://www.ietf.org/mailman/listinfo/tcpm
>>>>>
>>>>>
>>>>
>>
>