[tcpm] alpha_cubic (was: Concluding WGLC for draft-ietf-tcpm-rfc8312bis-03)

Bob Briscoe <ietf@bobbriscoe.net> Tue, 07 September 2021 12:17 UTC

To: Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, Yoshifumi Nishida <nsd.ietf@gmail.com>
Cc: "tcpm@ietf.org Extensions" <tcpm@ietf.org>, tcpm-chairs <tcpm-chairs@ietf.org>
References: <CAAK044SjMmBnO8xdn2ogWMZTcecXoET1dmZqd6Dt3WzOUi359A@mail.gmail.com> <alpine.DEB.2.21.2108300740560.5845@hp8x-60.cs.helsinki.fi>
From: Bob Briscoe <ietf@bobbriscoe.net>
Message-ID: <ccfc4dc2-0570-1ba2-66a5-b5e199f11359@bobbriscoe.net>
Date: Tue, 07 Sep 2021 13:17:34 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.21.2108300740560.5845@hp8x-60.cs.helsinki.fi>
Content-Type: multipart/alternative; boundary="------------7C614BF2CF38B9B3E619668A"
Content-Language: en-GB
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/aEh4OSRnL6BCw2aiARcWJ_q4OAo>
Subject: [tcpm] alpha_cubic (was: Concluding WGLC for draft-ietf-tcpm-rfc8312bis-03)
Precedence: list

Yoshi,

You recently asked Markku for an explanation of his point #4.

Coincidentally, a few weeks ago, I was checking the formula in RFC8312 
for Cubic in TCP-Friendly mode (called "C-Reno" in this email for 
brevity), and I also checked back on that preliminary paper that Markku 
refers to. I think Markku has a different way of explaining the flaw in 
the equation, but below I'll try to explain what I think the problems 
are. See [BB] inline...

If you don't read HTML email, I'm afraid the maths is probably going to 
look like gobbledygook.

On 30/08/2021 17:33, Markku Kojo wrote:
> Hi Yoshi, all,
>
> On Wed, 18 Aug 2021, Yoshifumi Nishida wrote:
>
> 4. Fairness to AIMD congestion control
>
>    The equation on page 12 to derive increase factor α_cubic that
>    intends to achieve the same average window as AIMD TCP seems to
>    have its origins in a preliminary paper that states that the
>    authors do not have an explanation to the discrepancy between
>    their AIMD model and experimental results, which clearly deviate.
>    It seems to have gone unnoticed that the equation assumes equal
>    drop probability for the different values of the increase factor
>    and multiplicative decrease factor but the drop probability
>    changes when these factors change. 

[BB] The source of eqn (4) in RFC8312 is eqn (4) in [FHP00]. This paper 
has been cited 250 times but it is not peer reviewed.

Equations (2) & (3) of [FHP00] that lead up to (4) use an RTT (R) as if 
it doesn't vary. But it does, because the sawteeth vary the queue. But 
the real problem is that the model (silently) assumes that R is an 
average RTT sitting part-way up the sawteeth, whatever the link rate or 
base RTT. This overlooks the fact that, if the tips of the sawteeth are 
always at the same queue depth (as in a tail-drop buffer) the average R 
will be higher if the sawteeth have smaller amplitude. This sounds 
picky, but the assumption on whether sawteeth vary about their bottom, 
top or middle greatly alters the outcome:

I've taken three possible alternative assumptions, with their resulting 
additive increase factors for theoretical equality with Reno (I've given 
all the derivations at the end of this email):
#1) All sawteeth have the same R_min : α ~= (1/β^2 - 1) / 3;        
Example: β = 0.7; α ~= 0.35
#2) All sawteeth have the same R_max : α ~= 4(1-β^2 ) / 3;            
Example: β = 0.7; α ~= 0.68
#3) All sawteeth have the same Ravg: α ~= 3(1-β) / (1+β); Example: β = 
0.7; α ~= 0.53    <== used in [FHP00] and therefore in RFC8312

Which assumption is most applicable?
#2 models a tail-drop queue.
#3 (or somewhere between #3 and #2) models a probabilistic AQM, such as 
PIE or RED.
#1 is not applicable to the current Internet, but it would be if a magic 
AQM was invented that minimized the standing queue.
I don't think any of the assumptions are good models of C-Reno's 
interaction with CoDel.

So which one should draft-ietf-tcpm-rfc8312bis recommend?
You might think #2, given one suspects the vast majority of queues are 
still tail drop...
However, it may be 'none of the above', 'cos the paper also says it 
assumes deterministic marking; i.e. that that each flow experiences just 
one drop or mark at the top of each sawtooth. If that applies at all, it 
only applies to AQMs, not to tail-drop buffers that tend to drop more 
than one packet at a time. Nonetheless, for the purpose of rate 
comparison, all the flows will share the same bottleneck. So if flow 
rates are equal under a certain AQM, they might be equal under tail-drop 
too (see {Note 1} at the very end).

I would still like to say 'none of the above' because Linux C-Reno has 
been widely used with α=1, β=0.7 for many years now without the sky 
falling. So it's questionable whether friendliness should be defined 
relative to Reno. Otherwise, from day 1, RFC8312bis will be stating the 
forlorn hope that Linux C-Reno should regress to become less aggressive 
than its former self, just to be friendly to Reno, which apparently 
barely exists any more (according to the Great TCP Congestion Control 
Census of 2019 [MSJ+19]).

Nonetheless, we need to bear in mind that BSD variants of Cubic (e.g. 
Apple's) are also widely used and I believe they comply with the α for 
C-Reno defined in RFC8312.

(FAQ: Why does C-Reno add <1 segment per RTT to be friendly to Reno, 
which adds 1 segment per RTT?
C-Reno in the RFC multiplicatively decreases cwnd to 0.7*cwnd on a loss 
(multiplicative decrease factor β=0.7), whereas Reno uses β=0.5. So, 
because C-Reno reduces less, it's also meant to increase less per round 
so that the sawteeth don't reach the max window again more quickly than 
Reno would. That is, it's meant to use a smaller AI factor (α) than 1 
segment. Using α<1 doesn't necessarily slow down Cubic's acceleration 
into unused capacity, because it can switch to a convex (hockey stick) 
Cubic curve once its slow linear increase has made the queue large 
enough to come out of its 'TCP-Friendly' mode.)

Whatever, rfc8312bis certainly shouldn't refer solely to [FHP00] without 
caveats.

Then we have the problem that none of the above formulae are verified 
against reality.
Even in the [FHP00] paper, which uses assumption #3, the simulations are 
out by about 2x. The paper used β=7/8. So, from the theory, for each of 
the above three assumptions, the additive increase factor would have had 
to be:
α_1 = 0.10
α_2 = 0.31
α_3 = 0.20
They used a RED AQM (which should be somewhere between assumption #2 & #3).
But they had to configure C-Reno with α=0.4 to get close to equal flow 
rates with Reno
Using my 'finger in the air' modification to the formulae in {Note 1}, 
α_3 = sqrt(0.20) = 0.44, which would have been about right (perhaps only 
coincidentally).

The question over what α to put in the RFC could run and run. So here's 
a suggested path through this quagmire:
These alternatives throw doubt on the wisdom of using assumption#3. So 
we could at least pick a new assumption for a theoretical α. 
Assumption#2 is probably the 'most correct', and also happens to give 
the highest value of α for the 'TCP-Friendly' region (α=0.68 when β=0.7).
(By my unsubstantiated formula in {Note 1}, α_2 = sqrt(0.68) = 0.82.)

If we look for an empirical value of α that gives reasonably equal flow 
rates against Reno, I suspect we will get all sorts of different answers 
depending on the conditions, e.g. AQM or not; high or low multiplexing; 
high or low RTT (shared by all flows); pacing, burstiness, etc,etc. If 
you want to take this path, feel free, but if you're still on it in a 
year's time, don't say I didn't warn you.

Derivations of the above three formulae follow at the end.

> The equations for the drop
>    probability / the # of packets in one congestion epoch
>    are available in the original paper and one can easily verify
>    this. Therefore, the equations used in CUBIC are not correct
>    and seem to underestimate _W_est_ for AIMD TCP, resulting in
>    moving away from AIMD-Friendly region too early. This gives
>    CUBIC unjustified advantage over AIMD TCP particularly in
>    environments with low level of statistical multiplexing. With
>    high level of multiplexing, drop probability goes higher and
>    differences in the drop probablilities tend to get small. On the
>    other hand, with such high level of competition, the theoretical
>    equations may not be that valid anymore.
>
> /Markku

[BB] Derivations of the formulae for α under the three different 
assumptions follow. It's possible/likely that this is all already in a 
paper somewhere, but I haven't been able to find one.

      0) Preparatory material

Terminology:
The subscript for C-Reno is omitted given it would be on every term.
The recovery time of an AIMD congestion control is the average time for 
additive increase to recover the window reduction after a multiplicative 
decrease.

The overall approach is:
A) Find a formula in terms of the AI and MD factors (α & β), for the 
recovery time of C-Reno.
B) Equate this to the recovery time of Reno, which will produce a 
formula for the additive increase factor, α.

The recovery time formula for C-Reno is derived by:
1) summing up all the round trips in additive increase, which become 
increasingly large as the queue grows. This sum is stated in terms of 
the number of rounds per cycle, J.
2) Then J is found by writing two formulae;
   2a) one for the average sawtooth decrease
   2b) and the other for the increase.
Then by assuming a steady state the two can be equated.
Here goes...

The following formula gives the recovery time, T_r , from one max window 
to the next, in C-Reno with:
* additive increase (AI) factor α [segment / RTT]
* and packet-rate r [packet / s]
assuming that the minimum window of each sawtooth still fully utilizes 
the link.
     T_r = Σ^J-1 _j=0     (R_min + jα/r)
where j is the index of each round, and J is the number of round trips 
of additive increase per sawtooth. The rationale for the second term is 
that the queue grows by α fractional packets in each round, and the 
queue delay per packet is 1/r (seconds per packet is reciprocal of 
packets per second). The sum of all the second terms forms an arithmetic 
progression, the sum of which is given by the well-known formula, as 
follows:
          = J.R_min + J(J-1)α/2r
        ~= J.R_min + J^2 α/2r.                     (1)
The approximation holds as long as J>>1. In some implementations (e.g. 
Linux), cwnd increases continuously, so strictly the limits should be 
j=1/2 to J-1/2 (the average half way through the first and last rounds), 
but details like this are insignificant compared to the approximation.

The difference in RTT due to the additional queue delay between the top 
and bottom of the sawtooth can be stated in two ways.
Either as the sum of all the queue delays of each additive increase:
     (R_max - R_min ) = Jα/r,              (2a)
Or by the multiplicative relationship between the max and min RTTs, by 
the definition of β: R_min = β.R_max .
     (R_max - R_min ) = (R_min / β) - R_min
                                = R_min (1-β)/β    (2b)
Equating (2a) and (2b):
     J = r.R_min (1-β) / (αβ)           (2)

Substituting for J from (2) in (1):
     T_r = r.R_min ^2 (1-β) / (βα) + r.R_min ^2 (1-β)^2 /(2β^2 α)
          = r.R_min ^2 (1-β) / (βα) * (1 + (1-β)/2β )
          = r.R_min ^2 (1-β)(1+β) / (2β^2 α)
          = r.R_min ^2 (1-β^2 ) / (2β^2 α)                             
     (3)
And, because R_min = β.R_max .
     T_r = r.R_max ^2 (1-β^2 ) / (2α)                            
              (4)

The well-known steady-state Reno formula from [FF99] is:
     r_Reno ~= (1/R_avg ) sqrt(3 / 2p)                                 
         (5)
assuming deterministic marking (which is also assumed for the above 
model of C-Reno).

Assuming a Reno sawtooth is approximately linear, the average RTT of a 
Reno flow,
     R_avg ~= (R_max + R_min )/2
and
     R_min = R_max /2
Therefore
     r_Reno ~= (1/R_min ) sqrt(2 / 3p)                                 
         (6)
     r_Reno ~= (1/R_max ) sqrt(8 / 3p)                                 
         (7)

With the preparatory material done, next we take each of the three 
assumptions, one at a time.

      #1) All sawteeth have the same R_min

 From Eqn (3):
     T_r = r.R_min ^2 (1-β^2 ) / (2β^2 α)
Total packets sent over period T_r is r.T_r during which time assume 1 
packet is lost (or marked) at the sawtooth peak {Note 1}. So loss 
probability:
     p = 1 / r.T_r
Substituting for T_r and rearranging :
     r^2 = 2β^2 α / R_min ^2 (1-β^2 )p
     r = (β/R_min ) sqrt(2α / (1-β^2 )p)
Equating the Reno packet rate, r_Reno , from eqn (6) to the C-Reno 
packet rate, r, for all α and β:
     r = r_Reno
        = (1/R_min ) sqrt(2 / 3p).
Therefore,
     (1/R_min ) sqrt(2 / 3p) ~= (β/R_min ) sqrt(2α / (1-β^2 )p)
Rearranging
     2/3 ~= 2αβ^2 / (1-β^2 )
     α ~= (1-β^2 )/3β^2
        ~= (1/β^2 - 1) / 3                                             
     (8)

      #2) All sawteeth have the same R_max = R_min / β

 From Eqn (4):
     T_r = r.R_max ^2 (1-β^2 ) / (2α)
As before, after substituting for Tr and rearranging:
     p = 1 / r.T_r
     r = (1/R_max ) sqrt(2α / (1-β^2 )p)
Equating to r_Reno from eqn (7) for all α and β,
       ~= (1/R_max ) sqrt(8 / 3p).
Rearranging
     8/3 ~= 2α / (1-β^2 )
     α ~= 4(1-β^2 ) / 3                                             (9)

      #3) All sawteeth have the same R_avg

R_avg = (R_max + βR_max )/2
          = R_max (1+β)/2
 From Eqn (4):
     T_r = r.R_max ^2 (1-β^2 ) / (2α)
          = 2r.R_avg ^2 (1-β) / α(1+β)
As before, after substituting for Tr and rearranging:
     p = 1 / r.T_r
     r = (1/R_avg ) sqrt(α (1+β) / 2(1-β)p)
Equating to r_Reno from eqn (5) for all α and β,
       ~= (1/R_avg ) sqrt(3 / 2p).
Rearranging
     3/2 ~= α(1+β) / 2(1-β)
     α ~= 3(1-β) / (1+β)                             (10)

This last formula is same as the equation used for C-Reno in RFC8312 and 
in the RFC8312bis draft.

{Note 1}: The assumption in [FHP00] of deterministic marking is suspect 
for the Internet (meaning that the spacing between drops or marks is 
even and there is always 1 packet dropped or marked at the top of each 
sawtooth). The actual average dropped (or marked) per sawtooth will 
depend on whether the buffer uses tail-drop or an AQM, and if so which 
AQM. It is perhaps better not to assume the actual average number 
dropped (or marked) is known, but instead assume the ratio of the 
average drops or marks between C-Reno and Reno will be roughly 
proportional to their AI factors. This would modify the formula for 
C-Reno's drop probability to:
     p = α_Reno / α_C-Reno .r.T_r
Substituting α_Reno = 1:
     p = 1 / α_C-Reno .r.T_r
This would modify all the results to:

#1) All sawteeth have the same R_min : α ~= sqrt( (1/β^2 - 1) / 3 );    
     Example: β = 0.7; α ~= 0.59
#2) All sawteeth have the same R_max : α ~= sqrt( 4(1-β^2 ) / 3 );    
         Example: β = 0.7; α ~= 0.82
#3) All sawteeth have the same Ravg: α ~= sqrt( 3(1-β) / (1+β) );        
Example: β = 0.7; α ~= 0.73

However, these would then be wrong where there really is deterministic 
marking.

References
=========
[FHP00]    Floyd, S., Handley, M., and J. Padhye, "A Comparison of 
Equation-Based and AIMD Congestion Control", May 2000, 
<https://www.icir.org/tfrc/aimd.pdf>

[MSJ+19] AyushMishra,XiangpengSun,AtishyaJain, 
SameerPande,RajJoshi,andBenLeong.The 
GreatInternetTCPCongestionControlCensus. Proc. ACM on Measurement and 
Analysis of Computing Systems, 3(3), December 2019

[FF99] Sally Floyd and Kevin Fall, "Promoting the Use of End-to-End 
Congestion Control in the Internet", IEEE/ACM ToN (1999)

Bob

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/

[tcpm] Concluding WGLC for draft-ietf-tcpm-rfc831… Yoshifumi Nishida
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Yoshifumi Nishida
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Lars Eggert
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Vidhi Goel
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Vidhi Goel
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Lisong Xu
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Sangtae Ha
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Markku Kojo
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Lars Eggert
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Lars Eggert
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Markku Kojo
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Yoshifumi Nishida
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Lars Eggert
[tcpm] alpha_cubic (was: Concluding WGLC for draf… Bob Briscoe
Re: [tcpm] alpha_cubic (was: Concluding WGLC for … Yoshifumi Nishida
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Markku Kojo
Re: [tcpm] alpha_cubic (was: Concluding WGLC for … Yoshifumi Nishida
Re: [tcpm] alpha_cubic (was: Concluding WGLC for … Jonathan Morton
Re: [tcpm] alpha_cubic Bob Briscoe
Re: [tcpm] Concluding WGLC for draft-ietf-tcpm-rf… Bob Briscoe
Re: [tcpm] alpha_cubic Yoshifumi Nishida
Re: [tcpm] alpha_cubic Bob Briscoe
Re: [tcpm] alpha_cubic Neal Cardwell
Re: [tcpm] alpha_cubic Bob Briscoe
Re: [tcpm] alpha_cubic Bob Briscoe
Re: [tcpm] alpha_cubic Markku Kojo
Re: [tcpm] alpha_cubic Yoshifumi Nishida
Re: [tcpm] alpha_cubic (Issue 1) Bob Briscoe
Re: [tcpm] alpha_cubic (Issue 1) Bob Briscoe