Re: [tcpm] alpha_cubic

Markku Kojo <kojo@cs.helsinki.fi> Tue, 22 March 2022 16:48 UTC

Date: Tue, 22 Mar 2022 18:47:04 +0200
From: Markku Kojo <kojo@cs.helsinki.fi>
To: Bob Briscoe <ietf@bobbriscoe.net>
cc: Yoshifumi Nishida <nsd.ietf@gmail.com>, Markku Kojo <kojo=40cs.helsinki.fi@dmarc.ietf.org>, "tcpm@ietf.org Extensions" <tcpm@ietf.org>

Hi Yoshi, Bob, all,

below please find my understanding of and feedback on Bob's report
(https://raw.githubusercontent.com/bbriscoe/cubic-reno/main/creno_tr.pdf). 
This hopefully also clarifies what I meant when saying that the equation 
[in the draft] assumes equal drop probability for the different values of 
alpha and beta but the drop probability actually changes when these 
values are changed from alpha=1, beta=0.5. I have also thought about this 
a bit more after reading Bob's report and include my understanding of why 
the performance of C-Reno and Reno differs in the AQM case.

In brief, the model for the steady-state behavior of the Reno-friendly 
region in a tail-drop queue is indeed incorrect when an additive 
increase factor alpha < 1 is applied for an alternate AIMD(a, b) TCP. 
This is because of the packet dynamics and behavior of a TCP sender, 
which injects more packets only when cwnd has increased by at least one 
MSS. This differs from the assumptions of the model and from what Bob has 
used in his analysis (see more below). Also, the AQM case remains 
unvalidated; my reasoning differs from, or somewhat extends, the 
reasoning that Bob gave in his report.

The math itself in the original paper [FHP00], on which the CUBIC 
TCP-friendly region is based, seems correct, as does the math in Bob's 
report and in Yoshi's explanation below:

>  Yoshifumi Nishida <nsd.ietf@gmail.com> wrote:
> 
> Also, in one congestion epoch, (α=1.0,  β =0.5) increases 0.5 W1 while
> (α=X,  β =0.7) increases 0.3 W2
> Now, if we define the congestion epoch as e RTTs, e = 0.5 W1 / 1.0  and 
> also e = 0.3 W2 / X
> This means, X should satisfy
>  0.5 W1 /1.0  = 0.3 * 1.5/1.7 W1/ X
> then we get X = 0.529.

> This will mean if we launch (α=1.0,  β =0.5) and (α=0.529,  β =0.7)
> at the same time and reduce them when the sum of their windows
> becomes Wmax, the throughput of them will mostly be the same.
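
(As a quick cross-check, Yoshi's algebra above reduces to the 
alpha_cubic = 3*(1-beta)/(1+beta) formula used in the draft; e.g. in Python:)

# Quick cross-check of the algebra quoted above (beta = 0.7)
beta = 0.7
W1 = 1.0                              # Reno's max window (normalized)
W2 = (1 + 0.5) / (1 + beta) * W1      # equal avg window => W2 = 1.5/1.7 * W1
X = 0.3 * W2 / (0.5 * W1)             # equal epoch length: 0.5*W1/1.0 = 0.3*W2/X
print(X, 3 * (1 - beta) / (1 + beta)) # both print ~0.529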

I'll try to explain the problems with the model separately for a tail-drop 
and AQM queue.

1. Tail-drop queue (*)

The problem with the use of the formula in the original paper, as well as 
in Bob's and Yoshi's analyses, is, as so often happens, that a 
theoretical model that has been derived in one context using certain 
assumptions is reused in another context where the assumptions do not 
hold, i.e., the model does not match reality.

In this case, the correct use of the formula presumes that the drop 
probability is the same (on average) for a standard TCP = AIMD(1, 1/2) 
and AIMD(a,b). That is implicitly assumed also in Bob's report for the 
synchronized (tail-drop) case, as Bob's math is derived using the same 
number of packets over a cycle (= in between drops, which is the 
reciprocal of the drop rate). (Btw, involving RTT in the calculations is 
unnecessary, as all competing flows see the same queue on every RTT; the 
RTT is therefore the same for all flows sharing the same queue, assuming 
their base RTT is the same.) Also, Yoshi's calculations above effectively 
make the same assumption that both flows reduce cwnd at the same time 
("when the sum of their windows becomes Wmax").

To understand the difference, we need to look at the packet dynamics of 
flows competing in a tail-drop queue, not just the math. The assumption 
of equal drop probability holds for any two (or n) AIMD(1, 1/2) TCPs 
competing on a shared bottleneck: when the bottleneck tail-drop buffer 
becomes full towards the end of a cycle (on RTT n), both TCPs increase 
their cwnd by 1 for RTT n+1 and inject two back-to-back pkts, resulting 
in a drop of 1 pkt each (the latter of the two back-to-back pkts, i.e. 
the excess pkt according to the packet conservation rule, gets dropped). 
When an AIMD(1, 1/2) TCP competes with an AIMD(a=0.53, b=0.7) TCP 
(=CUBIC) and the tail-drop buffer becomes full on RTT n, the AIMD(1, 1/2) 
TCP increases its cwnd by 1 on RTT n+1 and gets 1 pkt dropped. However, 
the AIMD(0.53, 0.7) TCP increases its cwnd by 0.53 and injects an excess 
packet on RTT n+1 only roughly every second cycle, because a TCP sender 
does not inject an additional packet until cwnd has increased by at least 
1 MSS.
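
To put a rough number on "roughly every second cycle": with alpha = 0.53 
the integer, packet-granularity cwnd grows, and hence the excess packet 
is injected, on only about half of the RTTs. A trivial illustration of 
this (my own sketch, not from Bob's report):

# With alpha = 0.53, how often does the *integer* cwnd actually grow,
# i.e. how often is the excess packet injected?
cwnd, crossings, rtts = 30.0, 0, 1000
for _ in range(rtts):
    before = int(cwnd)
    cwnd += 0.53                # additive increase per RTT
    if int(cwnd) > before:      # cwnd crossed a segment boundary
        crossings += 1
print(crossings / rtts)         # ~0.53: an excess pkt only about every 2nd RTT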

Hence, in the synchronized case the AIMD(0.53, 0.7) TCP avoids a packet 
drop and cwnd reduction roughly every second cycle and is able to extend 
its cycle, because the competing AIMD(1, 1/2) TCP reduced its cwnd on RTT 
n+1 and the bottleneck buffer is no longer full on RTT n+2 when the 
AIMD(0.53, 0.7) TCP injects its additional packet (which would encounter 
a pkt drop only if the queue were full).
Because the cycle is systematically extended in this way for 
AIMD(0.53, 0.7) (=CUBIC in Reno-friendly mode), the average number of 
pkts between drops increases and thereby the drop probability of CUBIC 
becomes lower than that of the competing AIMD(1, 1/2) TCP. These packet 
dynamics break the model.
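
To make the mechanism concrete, below is a toy RTT-granularity sketch of 
the synchronized tail-drop cycle as described above. The simplifications 
are mine (a fixed pipe of PIPE packets, and on an overflow RTT a drop 
only for a flow whose integer cwnd grew that RTT, i.e. the flow that 
injected the excess packet); it is meant only to illustrate the 
integer-granularity effect on the drop probabilities, not to be a 
validated simulation:

# Toy RTT-granularity model of the synchronized tail-drop cycle described
# above. Illustration only, not a validated simulation.
import math

PIPE = 100            # packets the path can hold (BDP + buffer)
RTTS = 200_000

flows = [
    {"name": "Reno   AIMD(1, 0.5)",    "a": 1.0,  "b": 0.5, "cwnd": 30.0},
    {"name": "C-Reno AIMD(0.53, 0.7)", "a": 0.53, "b": 0.7, "cwnd": 30.0},
]
sent = [0, 0]
drops = [0, 0]

for _ in range(RTTS):
    old_int = [math.floor(f["cwnd"]) for f in flows]
    for f in flows:
        f["cwnd"] += f["a"]                  # additive increase per RTT
    new_int = [math.floor(f["cwnd"]) for f in flows]
    for i in range(len(flows)):
        sent[i] += new_int[i]                # packets sent this RTT
    if sum(new_int) > PIPE:                  # tail-drop buffer overflows
        for i, f in enumerate(flows):
            if new_int[i] > old_int[i]:      # this flow sent the excess pkt
                drops[i] += 1
                f["cwnd"] *= f["b"]          # multiplicative decrease

for i, f in enumerate(flows):
    print(f'{f["name"]}: drop probability {drops[i] / sent[i]:.2e}, '
          f'share of packets {sent[i] / sum(sent):.3f}')

Running something like this long enough should show the AIMD(0.53, 0.7) 
flow seeing the lower per-packet drop probability, as argued above.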

So, the basic problem is that the model used for the Reno-friendly 
region is obviously not valid. The results in the original paper [FHP00] 
do not support the validity of the model either, and I am not aware of 
the model having been validated by anyone.

A good, unresolved question is why the simulation results in the 
original paper [FHP00] gave lower throughput for AIMD(1/5, 7/8) 
(denoted AIMD(1/5, 1/8) in the original paper) in the AQM case.
Note that the paper did not show lower throughput for AIMD(1/5, 1/8) with 
a tail-drop bottleneck, as the results in the paper were shown only for 
an AQM bottleneck; the paper just said the results for tail drop are 
"similar", which is a vague expression. It might even mean that the 
results were reversed with a similar difference in throughput. Anyway, 
the results obviously did not validate the model.

Bob's explanation of why C-Reno's packet rate decreases is correct, but 
what he says would happen if the buffer was not deep enough and drained 
out does not seem entirely correct. The packet rate is the relative share 
of a shared bottleneck link with fixed capacity. The cwnd of the C-Reno 
flow is larger than that of Reno at the beginning of the cycle, becomes 
equal in the middle, and falls increasingly behind for the rest of the 
cycle. Therefore, C-Reno's *proportional* share decreases during the 
cycle; the overall throughput is the same all the time and bounded by the 
link capacity.
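
As a quick numeric illustration of this (using the model's own assumption 
of equal per-cycle throughput, i.e. C-Reno's Wmax = 1.5/1.7 of Reno's 
Wmax; the numbers are only illustrative):

# cwnd of Reno vs C-Reno over one synchronized cycle, under the model's
# own assumption of equal average windows (W2 = 1.5/1.7 * W1).
W1 = 100.0                     # Reno's max window (arbitrary units)
W2 = 1.5 / 1.7 * W1            # C-Reno's max window under the model
rounds = int(0.5 * W1)         # cycle length in RTTs (Reno adds 1 per RTT)

for label, t in [("start", 0), ("middle", rounds // 2), ("end", rounds)]:
    reno = 0.5 * W1 + 1.0 * t        # Reno grows from beta*Wmax by 1/RTT
    creno = 0.7 * W2 + 0.529 * t     # C-Reno grows from beta*Wmax by 0.529/RTT
    print(f"{label:>6}: Reno cwnd {reno:5.1f}, C-Reno cwnd {creno:5.1f}")
# Prints: C-Reno ahead at the start, roughly equal in the middle,
# and increasingly behind towards the end of the cycle.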

If the buffer was not deep enough and drained out, the relative packet 
rate between the two flows would still behave the same, even if there is 
no queue and the link is underutilized for the beginning of the cycle. 
When the link is underutilized, the pkt rate increases for both flows 
until the link becomes fully utilized (note that the RTT does not change 
while the link is underutilized). The pkt rate of C-Reno would increase 
more slowly, but that does not change its proportional share, i.e., 
C-Reno will not suffer more, unlike Bob suggests. The overall average 
rate is lower in such a case because the link is underutilized for the 
beginning of the cycle.

2. AQM queue

Bob's explanation for the AQM case seems to be pretty much on the right 
track.

In the AQM case, desynchronization of the flows is more or less 
guaranteed even at a low level of multiplexing. I agree with Bob that 
modelling the AQM case is hard and that it also depends on the AQM in 
use. I also agree with Bob's analysis that towards the end of a cycle a 
probabilistic AQM is somewhat more likely to hit Reno than C-Reno. On the 
other hand, with desynchronized flows the AQM may hit any flow 
irrespective of its phase. Therefore, the AQM is somewhat more likely to 
hit C-Reno towards the beginning of its cycle than Reno, because C-Reno's 
pkt rate is then higher than that of Reno, and a hit at the beginning of 
the cycle hurts performance much more than one at the end of the cycle. 
A hit at the beginning of the cycle results in a cwnd value that is 
significantly lower than at the beginning of the previous cycle, while a 
hit towards the end of the cycle results in a cwnd value much closer to 
that at the beginning of the previous cycle (the behaviour then resembles 
the regular steady-state cycle). This might explain, at least in part, 
the lower throughput for C-Reno in the AQM case.

I think that an additional possible explanation for the discrepancy in 
the original paper is that something may have been wrong with the 
simulation implementation, or perhaps with the simulation parameters. Not 
all parameters were exposed in the paper. The results also do not seem to 
involve any replications, so we cannot expect statistical validity from 
them; part of the story may be explained simply by random effects.

In addition, the reasons for the discrepancy speculated in the original 
paper seem likely to be correct:

  "One factor could be the assumption of regular, deterministic
   drops in the deterministic model; another factor could be the
   role played by retransmit timeouts for TCP in regimes with
   high packet drop rates."

The first factor above refers to the incorrect assumption that the drop 
probabilities are the same. Moreover, the role of RTOs is definitely a 
potential reason with a high number of competing flows, as the model 
considers only the regular steady state.

So, in summary, the AQM case clearly remains unvalidated as well, or the 
validation in the original paper actually shows that the model is not 
correct.

An additional problem with the incorrect model is that it tends to result 
in overaggressive CUBIC behaviour regardless of whether the model yields 
too low or too high a throughput (cwnd) for the CUBIC sender in the 
Reno-friendly region: if the model yields too low a throughput, the CUBIC 
sender leaves the Reno-friendly region too early and becomes (much) more 
aggressive than an actual Reno sender would; otherwise, if the model 
yields too high a throughput, the CUBIC sender is too aggressive 
throughout the Reno-friendly region.

(*) Note:

The tail-drop case is not always synchronized for competing Reno CC 
flows either. Flows get more or less randomly desynchronized from time to 
time because the total sum of the cwnds of the competing flows at the end 
of the cycle does not necessarily match the buffering capacity of the path.
This, however, is not a big problem for the model, because the competing 
flows alternate such that each flow encounters a longer cycle more or 
less at random. A flow having a longer cycle is then later pushed back 
towards the correct avg cwnd. In the long run the cycle lengths (= drop 
probability) average out to be the same and, very importantly, there is 
no systematic bias.

Thanks,

/Markku

Bob Briscoe wrote:

> Markku, Yoshi,
> 
> As promised, I've tried to develop the maths to derive the additive increase factor for the
> Reno-Friendly mode of Cubic. I did it for two cases:
> * synchronized (tail-drop)
> * desynchronized (AQM)
> 
> I've written a short LaTeX report and uploaded here:
>     https://raw.githubusercontent.com/bbriscoe/cubic-reno/main/creno_tr.pdf
> If Markku and others review it, I'll update it with any corrections, and once ready, I can publish
> it on ArXiv, so that we have an archival reference.
> Here's a brief summary, but pls read the report before criticizing anything in the summary.
> 
> Tail Drop case
>  *  This turned out to be straightforward, and ended up with the same formula as in RFC8312 (which
>     came from [FHP00] which assumed /non/-tail-drop!).
>  *  I believe my derivation is straightforward and rigorous and it doesn't rely on loss probability,
>     unlike [FHP00], which was Markku's main concern.
>  *  I've also included a possible explanation for the result with tail-drop in [FHP00] where the
>     throughput of CUBIC in Reno-friendly mode was lower, not the same as Reno:
>      +  I've explained why C-Reno would suffer more than Reno if the buffer was not deep enough to
>         take the combined synchronized decreases without draining out completely.
>      +  but I haven't proved this empirically by trying with a deeper buffer (and I don't intend to
>         - someone else can do that if they want).
> Tail-drop summary: I believe we now have some solid theory for the formula in RFC8312, at least for
> tail drop scenarios, which are (sadly) still likely to be prevalent.
> 
> 
> AQM case
>  *  Too hard for me to produce even an approximate derivation of the AI factor.
>  *  I've included a discussion of why the formula in [FHP00] is incorrect for the AQM case, which is
>     more deeply thought-through than my earlier emails:
>      +  my critique seems similar to Markku's and it is possible it's the same, but there's not
>         enough explanation in Markku's emails to know for sure.
>      +  @Markku, perhaps you can see if what I've written describes what you meant?
>  *  However, in the AQM case, the simulations in the Floyd report give the opposite results to what
>     I would expect.
> AQM summary: A formula for the AI factor in the AQM case is still beyond me, and the results in
> [FHP00] are still the opposite of what I'd expect.
> 
> 
> Bob
> 
> On 17/09/2021 10:06, Yoshifumi Nishida wrote:
>       Hi Bob,
> 
> Thanks for the comments.
> I at least think we both can agree that the draft uses [FHP00] as the basis of their AIMD or
> NewReno model.
> If you have any suggestions to update the texts to address your concerns, please share. (but,
> in this case, using github might be straightforward.)
>  
> Unfortunately, I'm not sure about the empirical evidence to support the formula. 
> But, given that cubic has been widely deployed for long time and we don't see strong voices to
> point out problems for it, I am guessing the issues might not be too obvious at least.
> If you have any specific concerns in the formula, could you share? 
> 
> Thanks,
> --
> Yoshi 
> 
> 
> On Mon, Sep 13, 2021 at 4:17 PM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>       Yoshi,
>
>       On 11/09/2021 22:02, Yoshifumi Nishida wrote:
>       Hi Bob,
> 
> Thanks for the very detailed explanation. 
> I agree with some of your points. But, I have two comments on this.
> 
> The first point is about RTT variation. 
> I think you have a point there, but I personally think the paper presumes very
> shallow queue so that the model can be very simple.
> In this case, we don't have to think about min or max of RTT as they will be
> mostly the same.
> 
> 
> [BB] The paper gets the same answer as I got under assumption #3 (all sawtooth vary
> around a set point half-way between max and min).
> I didn't assume shallow queues, so it can't be assuming shallow queues.
>
>       I think we can probably argue the accuracy of the model, but I think it is
>       too broad as a scope of this document. It looks a very deep research topic
>       to me.
> I think the model is basically derived from Equation-based Congestion Control
> paper in sigcomm 2000. If we want to update or renew it, we might need another
> RFC5348.
> 
> 
> [BB] I didn't use Equation-based CC at all and, at least under one of my three
> assumptions, I got the same result as the [FHP00] paper (with one further assumptions
> about how many drops there are at the top of different sawteeth).
> 
> I just wanted to check the outcome wouldn't be different if I used a model of varying
> RTT, 'cos I'd already recently done this maths for another reason - to derive the
> recovery time after a loss. This proved that, given [FHP00] got the same result as one
> of my cases just using basic AIMD sawtooth geometry, at least in this one case, the
> varying RTT doesn't make any difference.
> 
> Nonetheless, I think you're right that deriving the correct formula would involve deeper
> research. I don't think publication of the RFC should block on that research.
>
>       So, I think it would be better for this doc to use the model as it is and
>       leave its enhancements for future discussions.
> 
> 
> [BB] That's up to the WG. But I would not refer to that paper, at least not alone. And I
> don't think the draft should claim that the formula for Reno-friendly Cubic that it uses
> from that paper is correct, when it clearly doesn't produce the desired result, even in
> the paper that proposed it.
> 
> 
> The second point is α in the draft might be too big.
> 
> 
> [BB] How so? The paper's own simulations (exploring a very limited space) conclude that
> α from the model is more than 2x too small.
>
>       During my WGLC review, I read [FHP00] once and I had similar thoughts.
> But, after I re-read the paper several times, I start having a different
> interpretation. 
> 
> First off, it is very clear that β=0.7 is more aggressive than β =0.5 when there's
> no other traffic.
> As cwnd growth is linear, if there's enough long time for the data transfer, the
> average cwnd for β =0.5 will be (1.0 + 0.5)/2 = 0.75 Wmax and it will be (1.0 +
> 0.7)/2 = 0.85 Wmax for β=0.7.
> 
> 
> [BB] Correct
>
>       So, it is obvious that this model does not aim for the cases where there's
>       no other traffic.
> 
> 
> [BB] It is certainly obvious that comparing flow rates is not about single flows.
> 
> But, by the same token, does it mean anything to say a flow is 'more aggressive when
> there's no other traffic'?
> You can have different cwnd for the same throughput, if the avg queue delay is higher
> and therefore RTT is higher.  But I don't think that says anything about aggression. If
> anything, aggression is about shorter recovery time.
> 
> Yes, the analysis I used was for each flow on its own. When they are not on their own,
> but competing in the same queue, (obviously) there cannot be two different average queue
> delays. But you can think of them as each progressing in parallel, each taking losses
> and each contributing to the queue. I'm not saying my analysis is the final word by any
> stretch of the imagination. However, if on their own they have different recovery times,
> then when together A will be pulling B away from its natural recovery time and towards
> A's recovery time, and vice versa.
> 
> 
> I believe the choice of (α,  β) in [FHP00] is designed to be more or less fair
> only when it competes with (α=1.0,  β =0.5)
> Hence, I think we should not apply the loss rate for no-other-traffic situation to
> the formula, which can lead to α3 = 0.20
> Instead, I am thinking we can think in this way..
> 
> (α=1.0,  β =0.5) model oscillates between 0.5 W1 and 1.0 W1 in a congestion epoch.
> (α=X,  β =0.7) model oscillates between 0.7 W2 and 1.0 W2 in the same congestion
> epoch.
>   where W1 is max window for (α=1.0,  β =0.5), W2 is max window for (α=X,  β =0.7)
> When these two models have the same loss ratio, it should satisfy (1.0 + 0.5) W1 =
> (1.0 + 0.7) W2   
> 
> Also, in one congestion epoch, (α=1.0,  β =0.5) increases 0.5 W1 while (α=X,  β
> =0.7) increases 0.3 W2
> Now, if we define the congestion epoch as e RTTs, e = 0.5 W1 / 1.0  and also e =
> 0.3 W2 / X
> This means, X should satisfy
>   0.5 W1 /1.0  = 0.3 * 1.5/1.7 W1/ X
> then we get X = 0.529.
> 
> 
> [BB] This is the same equation as in [FHP00], but with numbers not variables. it's also
> the same as the formula in the current Cubic RFC. And it gives the same answer you've
> got when you plug in the numbers. Because you've used the same technique to compare the
> geometry of the sawteeth.
> 
> But is there empirical evidence to support this formula, given the simulations in the
> paper itself don't support it? That's the question.
> 
> Also, I'd like to hear Markku's response to your original question to him. I was only
> trying to second-guess what he was trying to say.
> 
> 
> Bob
> 
> 
> 
> This will mean if we launch (α=1.0,  β =0.5) and (α=0.529,  β =0.7) at the same
> time and reduce them when the sum of their windows becomes Wmax, the throughput of
> them will mostly be the same.
> --
> Yoshi
>  
> On Tue, Sep 7, 2021 at 5:17 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>       Yoshi,
>
>       You recently asked Markku for an explanation of his point #4.
>
>       Coincidentally, a few weeks ago, I was checking the formula in RFC8312
>       for Cubic in TCP-Friendly mode (called "C-Reno" in this email for
>       brevity), and I also checked back on that preliminary paper that
>       Markku refers to. I think Markku has a different way of explaining the
>       flaw in the equation, but below I'll try to explain what I think the
>       problems are. See [BB] inline...
>
>       If you don't read HTML email, I'm afraid the maths is probably going
>       to look like gobbledygook.
>
>       On 30/08/2021 17:33, Markku Kojo wrote:
>       Hi Yoshi, all,
>
>       On Wed, 18 Aug 2021, Yoshifumi Nishida wrote:
>
>       4. Fairness to AIMD congestion control
>
>          The equation on page 12 to derive increase factor α_cubic that
>          intends to achieve the same average window as AIMD TCP seems to
>          have its origins in a preliminary paper that states that the
>          authors do not have an explanation to the discrepancy between
>          their AIMD model and experimental results, which clearly deviate.
>          It seems to have gone unnoticed that the equation assumes equal
>          drop probability for the different values of the increase factor
>          and multiplicative decrease factor but the drop probability
>          changes when these factors change.
> 
> 
> [BB] The source of eqn (4) in RFC8312 is eqn (4) in [FHP00]. This paper has
> been cited 250 times but it is not peer reviewed.
> 
> Equations (2) & (3) of [FHP00] that lead up to (4) use an RTT (R) as if it
> doesn't vary. But it does, because the sawteeth vary the queue. But the real
> problem is that the model (silently) assumes that R is an average RTT
> sitting part-way up the sawteeth, whatever the link rate or base RTT. This
> overlooks the fact that, if the tips of the sawteeth are always at the same
> queue depth (as in a tail-drop buffer) the average R will be higher if the
> sawteeth have smaller amplitude. This sounds picky, but the assumption on
> whether sawteeth vary about their bottom, top or middle greatly alters the
> outcome:
> 
> I've taken three possible alternative assumptions, with their resulting
> additive increase factors for theoretical equality with Reno (I've given all
> the derivations at the end of this email):
> #1) All sawteeth have the same Rmin: α ~= (1/β^2 - 1) / 3;   Example: β = 0.7; α ~= 0.35
> #2) All sawteeth have the same Rmax: α ~= 4(1-β^2) / 3;      Example: β = 0.7; α ~= 0.68
> #3) All sawteeth have the same Ravg: α ~= 3(1-β) / (1+β);    Example: β = 0.7; α ~= 0.53    <== used in [FHP00] and therefore in RFC8312
> 
> 
> Which assumption is most applicable?
> #2 models a tail-drop queue.
> #3 (or somewhere between #3 and #2) models a probabilistic AQM, such as PIE
> or RED.
> #1 is not applicable to the current Internet, but it would be if a magic AQM
> was invented that minimized the standing queue.
> I don't think any of the assumptions are good models of C-Reno's interaction
> with CoDel.
> 
> So which one should draft-ietf-tcpm-rfc8312bis recommend?
> You might think #2, given one suspects the vast majority of queues are still
> tail drop...
> However, it may be 'none of the above', 'cos the paper also says it assumes
> deterministic marking; i.e. that that each flow experiences just one drop or
> mark at the top of each sawtooth. If that applies at all, it only applies to
> AQMs, not to tail-drop buffers that tend to drop more than one packet at a
> time. Nonetheless, for the purpose of rate comparison, all the flows will
> share the same bottleneck. So if flow rates are equal under a certain AQM,
> they might be equal under tail-drop too (see {Note 1} at the very end).
> 
> I would still like to say 'none of the above' because Linux C-Reno has been
> widely used with α=1, β=0.7 for many years now without the sky falling. So
> it's questionable whether friendliness should be defined relative to Reno.
> Otherwise, from day 1, RFC8312bis will be stating the forlorn hope that
> Linux C-Reno should regress to become less aggressive than its former self,
> just to be friendly to Reno, which apparently barely exists any more
> (according to the Great TCP Congestion Control Census of 2019 [MSJ+19]).
> 
> Nonetheless, we need to bear in mind that BSD variants of Cubic (e.g.
> Apple's) are also widely used and I believe they comply with the α for
> C-Reno defined in RFC8312.
> 
> (FAQ: Why does C-Reno add <1 segment per RTT to be friendly to Reno, which
> adds 1 segment per RTT?
> C-Reno in the RFC multiplicatively decreases cwnd to 0.7*cwnd on a loss
> (multiplicative decrease factor β=0.7), whereas Reno uses β=0.5. So, because
> C-Reno reduces less, it's also meant to increase less per round so that the
> sawteeth don't reach the max window again more quickly than Reno would. That
> is, it's meant to use a smaller AI factor (α) than 1 segment. Using α<1
> doesn't necessarily slow down Cubic's acceleration into unused capacity,
> because it can switch to a convex (hockey stick) Cubic curve once its slow
> linear increase has made the queue large enough to come out of its
> 'TCP-Friendly' mode.)
> 
> Whatever, rfc8312bis certainly shouldn't refer solely to [FHP00] without
> caveats.
> 
> Then we have the problem that none of the above formulae are verified
> against reality.
> Even in the [FHP00] paper, which uses assumption #3, the simulations are out
> by about 2x. The paper used β=7/8. So, from the theory, for each of the
> above three assumptions, the additive increase factor would have had to be:
> α_1 = 0.10
> α_2 = 0.31
> α_3 = 0.20
> They used a RED AQM (which should be somewhere between assumption #2 & #3).
> But they had to configure C-Reno with α=0.4 to get close to equal flow rates
> with Reno.
> Using my 'finger in the air' modification to the formulae in {Note 1},
> α_3 = sqrt(0.20) = 0.44, which would have been about right (perhaps only
> coincidentally).
> 
> The question over what α to put in the RFC could run and run. So here's a
> suggested path through this quagmire:
> These alternatives throw doubt on the wisdom of using assumption#3. So we
> could at least pick a new assumption for a theoretical α. Assumption#2 is
> probably the 'most correct', and also happens to give the highest value of α
> for the 'TCP-Friendly' region (α=0.68 when β=0.7).
> (By my unsubstantiated formula in {Note 1}, α_2 = sqrt(0.68) = 0.82.)
> 
> If we look for an empirical value of α that gives reasonably equal flow
> rates against Reno, I suspect we will get all sorts of different answers
> depending on the conditions, e.g. AQM or not; high or low multiplexing; high
> or low RTT (shared by all flows); pacing, burstiness, etc,etc. If you want
> to take this path, feel free, but if you're still on it in a year's time,
> don't say I didn't warn you.
> 
> 
> Derivations of the above three formulae follow at the end.
>
>       The equations for the drop
>          probability / the # of packets in one congestion epoch
>          are available in the original paper and one can easily verify
>          this. Therefore, the equations used in CUBIC are not correct
>          and seem to underestimate _W_est_ for AIMD TCP, resulting in
>          moving away from AIMD-Friendly region too early. This gives
>          CUBIC unjustified advantage over AIMD TCP particularly in
>          environments with low level of statistical multiplexing. With
>          high level of multiplexing, drop probability goes higher and
>          differences in the drop probabilities tend to get small. On the
>          other hand, with such high level of competition, the theoretical
>          equations may not be that valid anymore.
>
>       /Markku
> 
> 
> [BB] Derivations of the formulae for α under the three different assumptions
> follow. It's possible/likely that this is all already in a paper somewhere,
> but I haven't been able to find one.
>
>   0) Preparatory material
> 
> Terminology:
> The subscript for C-Reno is omitted given it would be on every term.
> The recovery time of an AIMD congestion control is the average time for
> additive increase to recover the window reduction after a multiplicative
> decrease.
> 
> The overall approach is:
> A) Find a formula in terms of the AI and MD factors (α & β), for the
> recovery time of C-Reno.
> B) Equate this to the recovery time of Reno, which will produce a formula
> for the additive increase factor, α.
> 
> The recovery time formula for C-Reno is derived by:
> 1) summing up all the round trips in additive increase, which become
> increasingly large as the queue grows. This sum is stated in terms of the
> number of rounds per cycle, J.
> 2) Then J is found by writing two formulae;
>   2a) one for the average sawtooth decrease
>   2b) and the other for the increase.
> Then by assuming a steady state the two can be equated.
> Here goes...
> 
> The following formula gives the recovery time, Tr, from one max window to
> the next, in C-Reno with:
> * additive increase (AI) factor α [segment / RTT]
> * and packet-rate r [packet / s]
> assuming that the minimum window of each sawtooth still fully utilizes the
> link.
>     Tr = Σ_{j=0}^{J-1} (Rmin + jα/r)
> where j is the index of each round, and J is the number of round trips of
> additive increase per sawtooth. The rationale for the second term is that
> the queue grows by α fractional packets in each round, and the queue delay
> per packet is 1/r (seconds per packet is reciprocal of packets per second).
> The sum of all the second terms forms an arithmetic progression, the sum of
> which is given by the well-known formula, as follows:
>          = J.Rmin + J(J-1)α/2r
>        ~= J.Rmin + J^2 α/2r.                                (1)
> The approximation holds as long as J>>1. In some implementations (e.g.
> Linux), cwnd increases continuously, so strictly the limits should be j=1/2
> to J-1/2 (the average half way through the first and last rounds), but
> details like this are insignificant compared to the approximation.
> 
> The difference in RTT due to the additional queue delay between the top and
> bottom of the sawtooth can be stated in two ways.
> Either as the sum of all the queue delays of each additive increase:
>     (Rmax - Rmin) = Jα/r,                                (2a)
> Or by the multiplicative relationship between the max and min RTTs, by the
> definition of β: Rmin = β.Rmax.
>     (Rmax - Rmin) = (Rmin / β) - Rmin
>                                = Rmin(1-β)/β                 (2b)
> Equating (2a) and (2b):
>     J = r.Rmin(1-β) / (αβ)                                  (2)
> 
> Substituting for J from (2) in (1):
>     Tr = r.Rmin^2 (1-β) / (βα) + r.Rmin^2 (1-β)^2 / (2β^2 α)
>          = r.Rmin^2 (1-β) / (βα) * (1 + (1-β)/2β )
>          = r.Rmin^2 (1-β)(1+β) / (2β^2 α)
>          = r.Rmin^2 (1-β^2) / (2β^2 α)                                  (3)
> And, because Rmin = β.Rmax,
>     Tr = r.Rmax^2 (1-β^2) / (2α)                                        (4)
> 
> The well-known steady-state Reno formula from [FF99] is:
>     r_Reno ~= (1/Ravg) sqrt(3 / 2p)                                     (5)
> assuming deterministic marking (which is also assumed for the above model of
> C-Reno).
> 
> Assuming a Reno sawtooth is approximately linear, the average RTT of a Reno
> flow,
>     Ravg ~= (Rmax + Rmin)/2
> and
>     Rmin = Rmax/2
> Therefore
>     r_Reno ~= (1/Rmin) sqrt(2 / 3p)                                     (6)
>     r_Reno ~= (1/Rmax) sqrt(8 / 3p)                                     (7)
> 
> With the preparatory material done, next we take each of the three
> assumptions, one at a time.
>
>   #1) All sawteeth have the same Rmin 
> 
> From Eqn (3):
>     Tr = r.Rmin^2 (1-β^2) / (2β^2 α)
> Total packets sent over period Tr is r.Tr during which time assume 1 packet
> is lost (or marked) at the sawtooth peak {Note 1}. So loss probability:
>     p = 1 / r.Tr
> Substituting for Tr and rearranging:
>     r^2 = 2β^2 α / Rmin^2 (1-β^2)p
>     r = (β/Rmin) sqrt(2α / (1-β^2)p)
> Equating the Reno packet rate, r_Reno, from eqn (6) to the C-Reno packet
> rate, r, for all α and β:
>     r = r_Reno
>        = (1/Rmin) sqrt(2 / 3p).
> Therefore,
>     (1/Rmin) sqrt(2 / 3p) ~= (β/Rmin) sqrt(2α / (1-β^2)p)
> Rearranging
>     2/3 ~= 2αβ^2 / (1-β^2)
>     α ~= (1-β^2)/3β^2
>        ~= (1/β^2 - 1) / 3                                               (8)
>
>   #2) All sawteeth have the same Rmax = Rmin / β
> 
> From Eqn (4):
>     Tr = r.Rmax^2 (1-β^2) / (2α)
> As before, after substituting for Tr and rearranging:
>     p = 1 / r.Tr
>     r = (1/Rmax) sqrt(2α / (1-β^2)p)
> Equating to r_Reno from eqn (7) for all α and β,
>       ~= (1/Rmax) sqrt(8 / 3p).
> Rearranging
>     8/3 ~= 2α / (1-β^2)
>     α ~= 4(1-β^2) / 3                                                   (9)
>
>   #3) All sawteeth have the same Ravg
> 
> Ravg = (Rmax + βRmax )/2
>          = Rmax (1+β)/2
> From Eqn (4):
>     Tr = r.Rmax^2 (1-β^2) / (2α)
>          = 2r.Ravg^2 (1-β) / α(1+β)
> As before, after substituting for Tr and rearranging:
>     p = 1 / r.Tr
>     r = (1/Ravg) sqrt(α (1+β) / 2(1-β)p)
> Equating to r_Reno from eqn (5) for all α and β,
>       ~= (1/Ravg) sqrt(3 / 2p).
> Rearranging
>     3/2 ~= α(1+β) / 2(1-β)
>     α ~= 3(1-β) / (1+β)                                                 (10)
> 
> This last formula is the same as the equation used for C-Reno in RFC8312 and in
> the RFC8312bis draft.
> 
> 
> {Note 1}: The assumption in [FHP00] of deterministic marking is suspect for
> the Internet (meaning that the spacing between drops or marks is even and
> there is always 1 packet dropped or marked at the top of each sawtooth). The
> actual average dropped (or marked) per sawtooth will depend on whether the
> buffer uses tail-drop or an AQM, and if so which AQM. It is perhaps better
> not to assume the actual average number dropped (or marked) is known, but
> instead assume the ratio of the average drops or marks between C-Reno and
> Reno will be roughly proportional to their AI factors. This would modify the
> formula for C-Reno's drop probability to:
>     p = α_Reno / (α_C-Reno . r.Tr)
> Substituting α_Reno = 1:
>     p = 1 / (α_C-Reno . r.Tr)
> This would modify all the results to:
> 
> #1) All sawteeth have the same Rmin: α ~= sqrt( (1/β^2 - 1) / 3 );   Example: β = 0.7; α ~= 0.59
> #2) All sawteeth have the same Rmax: α ~= sqrt( 4(1-β^2) / 3 );      Example: β = 0.7; α ~= 0.82
> #3) All sawteeth have the same Ravg: α ~= sqrt( 3(1-β) / (1+β) );    Example: β = 0.7; α ~= 0.73
> 
> However, these would then be wrong where there really is deterministic
> marking.
> 
> References
> =========
> [FHP00]    Floyd, S., Handley, M., and J. Padhye, "A Comparison of
> Equation-Based and AIMD Congestion Control", May 2000, 
> <https://www.icir.org/tfrc/aimd.pdf>
> 
> [MSJ+19] Ayush Mishra, Xiangpeng Sun, Atishya Jain, Sameer Pande, Raj Joshi,
> and Ben Leong. The Great Internet TCP Congestion Control Census. Proc. ACM
> on Measurement and Analysis of Computing Systems, 3(3), December 2019
> 
> [FF99] Sally Floyd and Kevin Fall, "Promoting the Use of End-to-End
> Congestion Control in the Internet", IEEE/ACM ToN (1999)
> 
> 
> Bob
> 
> -- 
> ________________________________________________________________
> Bob Briscoe                               http://bobbriscoe.net/
> 