[tcpm] CUBIC rfc8312bis / WGLC Issue 1

Markku Kojo <kojo@cs.helsinki.fi> Tue, 14 June 2022 14:38 UTC

Date: Tue, 14 Jun 2022 17:37:53 +0300
From: Markku Kojo <kojo@cs.helsinki.fi>
To: tcpm@ietf.org
Message-ID: <alpine.DEB.2.21.2206061517230.7292@hp8x-60.cs.helsinki.fi>
User-Agent: Alpine 2.21 (DEB 202 2017-01-01)
MIME-Version: 1.0
Content-Type: text/plain; format="flowed"; charset="US-ASCII"
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/pN1fARPVzqcDlt--xiMoDm_loAM>
Subject: [tcpm] CUBIC rfc8312bis / WGLC Issue 1

Hi all,

This thread starts the discussion on Issue 1: the incorrect model
for determining CUBIC alpha for the congestion avoidance (CA) phase
(Issue 1 a) and the inadequate validation of a proper constant C for the
CUBIC window increase function (Issue 1 b).


Issue 1 a)
----------

The model that CUBIC uses to be fair to Reno CC (in the Reno-friendly
region) is unvalidated and actually incorrect.

A more detailed description of the issue:

The original paper manuscript on which CUBIC bases its behaviour in the
Reno-friendly region made a preliminary attempt to validate the model but
failed (and the paper was never published). This is the only known
attempt to validate the model, and even this failed attempt was quite
light: it covered only a couple of network settings and evidently used
no replications for the results shown in the paper. Hence, even the
statistical validity of the results remains questionable. Results were
shown only for a setting with AQM enabled at the bottleneck router;
results for a tail-drop case are missing from the manuscript.

The report that Bob wrote (creno.pdf; the email linked below points to
the doc) provides some explanation of why the model does not give
correct results and why the behaviour presented in the original paper
therefore deviates notably from that of Reno CC. The email that I
wrote to the wg list

  https://mailarchive.ietf.org/arch/msg/tcpm/bds-h_a6-NliTjx-ZqUSaFpSSnA/

complements Bob's explanation for the AQM case and corrects Bob's
analysis for the tail-drop case, explaining why the model is incorrect
for the traditional tail-drop router case that still prevails today.

Consequently, the use of the incorrect model results in unknown behaviour
of CUBIC in the Reno-friendly region. Moreover, the behaviour is quite
likely to differ between AQM implementations at the bottleneck, making
it even less predictable. This alone is very problematic, and it becomes
more so when considering how moving out of the Reno-friendly region is
specified: when the genuine CUBIC formula gives a larger cwnd than the
Reno-friendly model does, CUBIC moves to the genuine CUBIC mode, which
is significantly more aggressive than Reno CC.
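
To spell out the switching rule, here is a minimal Python sketch. It
assumes the closed-form window formulas and default constants of RFC 8312
(C = 0.4, beta = 0.7, alpha = 3*(1-beta)/(1+beta)); the bis draft may
express the W_est update differently. The function names are invented
for illustration and are not taken from any implementation.

```python
# Sketch only: RFC 8312 closed-form windows (units: segments and seconds).
C = 0.4      # assumed default scaling constant of the cubic function
BETA = 0.7   # assumed default multiplicative decrease factor


def alpha_aimd(beta=BETA):
    """Increase factor that the model claims matches Reno's average rate."""
    return 3.0 * (1.0 - beta) / (1.0 + beta)


def w_cubic(t, w_max, c=C, beta=BETA):
    """Genuine CUBIC window t seconds after the last congestion event."""
    k = (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)
    return c * (t - k) ** 3 + w_max


def w_est(t, w_max, rtt, beta=BETA):
    """Window that the Reno-friendly model attributes to Reno CC."""
    return w_max * beta + alpha_aimd(beta) * (t / rtt)


def region(t, w_max, rtt):
    """CUBIC stays Reno-friendly only while W_cubic is below W_est."""
    if w_cubic(t, w_max) < w_est(t, w_max, rtt):
        return "reno-friendly"
    return "cubic"


def cwnd_target(t, w_max, rtt):
    """Whichever of the two windows is larger ends up driving cwnd."""
    return max(w_cubic(t, w_max), w_est(t, w_max, rtt))
```

With w_max = 100 segments and t = 1 s, region() gives "reno-friendly" for
rtt = 0.01 s and "cubic" for rtt = 1.0 s. Everything in the Reno-friendly
branch hinges on W_est tracking what Reno CC would really do.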

Therefore, if the incorrect model gives too low a cwnd for the mimicked
Reno CC, CUBIC moves to the genuine CUBIC mode too early and becomes too
aggressive too early, even though it should behave as aggressively as
Reno CC. On the other hand, if the incorrect model gives too large a
cwnd, CUBIC is too aggressive throughout the Reno-friendly region.

In summary, if the model is not correct, it results in more aggressive
behaviour than Reno CC no matter in which direction the model fails.
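
The two failure directions can be illustrated with a small Python sketch.
It again assumes the RFC 8312 closed-form formulas and defaults. The
"scale" parameter is a purely hypothetical error factor on the model's
increase rate, introduced here only to show the mechanism; it is not a
measured value, and the numbers it produces are not evaluation results.

```python
# Sketch only: effect of a hypothetical error in the Reno-friendly model.
C, BETA = 0.4, 0.7                           # assumed RFC 8312 defaults
ALPHA = 3.0 * (1.0 - BETA) / (1.0 + BETA)    # nominal alpha of the model


def w_cubic(t, w_max):
    k = (w_max * (1.0 - BETA) / C) ** (1.0 / 3.0)
    return C * (t - k) ** 3 + w_max


def w_est(t, w_max, rtt, scale=1.0):
    # scale < 1: model yields too low a cwnd; scale > 1: too large a cwnd.
    return w_max * BETA + scale * ALPHA * (t / rtt)


def exit_time(w_max, rtt, scale=1.0, dt=0.01, horizon=120.0):
    """First time (s) at which W_cubic >= W_est, i.e. when CUBIC leaves
    the Reno-friendly region. Returns None if that never happens within
    the horizon."""
    for i in range(1, int(horizon / dt) + 1):
        t = i * dt
        if w_cubic(t, w_max) >= w_est(t, w_max, rtt, scale):
            return t
    return None


if __name__ == "__main__":
    for scale in (0.5, 1.0, 2.0):
        print(scale, exit_time(100, 0.01, scale), w_est(5.0, 100, 0.01, scale))
```

With w_max = 100 segments and rtt = 0.01 s, a scale below 1 makes CUBIC
leave the Reno-friendly region earlier than the nominal model does, while
a scale above 1 delays the exit but gives a larger window at every
instant before it.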

And very importantly: some people have suggested that CUBIC should
replace the current stds track CC algos and become the default. The
behaviour of Reno CC is very thoroughly studied and very well understood.
If we replace it with *unknown* behaviour, how can we any longer specify
what the correct and allowed aggressiveness is for any upcoming CC when
the behaviour of the new default itself is unknown? That makes
comparative analysis of other CCs against CUBIC in the Reno-friendly
region very difficult. The behaviour is assumed to be the same as that
of Reno CC, but the actual behaviour is unpredictable: it may be 2 times
or 8 times more aggressive than Reno, for example.


Issue 1 b)
----------

Another issue related to operating in the Reno-friendly region is the
question of when CUBIC should operate in the Reno-friendly region and
when it may move out of it. Obviously, CUBIC should stay in the
Reno-friendly region whenever Reno CC would be able to fully utilize the
available network capacity. In practice, this is determined by the value
selected for the constant C in the formula that is used to determine
cwnd in the "genuine" CUBIC mode. However, the selection of a proper
value for C has not been validated in a wide range of environments as
required in RFC 5033.
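
To make the role of C concrete, the following Python sketch computes K,
the time the cubic function needs to climb back to W_max after a loss,
using the RFC 8312 formula K = cubic_root(W_max*(1-beta)/C). Only C = 0.4
is the RFC 8312 default; the other values of C are arbitrary and chosen
for illustration, as are the function names.

```python
# Sketch only: how the constant C shapes the genuine CUBIC window curve.
BETA = 0.7   # assumed RFC 8312 multiplicative decrease factor


def plateau_time(w_max, c, beta=BETA):
    """K: seconds for W_cubic to return to w_max after a congestion event."""
    return (w_max * (1.0 - beta) / c) ** (1.0 / 3.0)


def w_cubic(t, w_max, c, beta=BETA):
    """Genuine CUBIC window t seconds after the congestion event."""
    return c * (t - plateau_time(w_max, c, beta)) ** 3 + w_max


if __name__ == "__main__":
    for c in (0.04, 0.4, 4.0):   # 0.4 is the default; the rest are arbitrary
        print(f"C={c}: K={plateau_time(100, c):.2f} s")
```

A larger C shortens K, so the cubic curve tends to overtake the
Reno-friendly window sooner and the Reno-friendly region shrinks.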

Preliminary validation of constant C was done for the original CUBIC
paper. That is good enough for a scientific paper but not adequate for an
IETF stds track algo. There seems to have been no additional evaluation
since the CUBIC paper was published around 15 years ago. In particular,
there seems to be no evaluation with AQM at the bottleneck router or
with a buffer-bloated bottleneck router, not to mention many other
network environments. Nor is there any data available for a
non-SACK TCP sender.

The evaluation of 1 a) and 1 b) must be done separately. Otherwise, it
is very hard to tell whether any deviations are due to the incorrect
model or to an incorrect value of C. The original CUBIC paper and some
other papers show that CUBIC is not fair to Reno CC in certain network
conditions where Reno CC has no problems in utilizing the available
network capacity; instead, CUBIC steals capacity from Reno CC.

Thanks,

/Markku