Re: [iccrg] LEDBAT++, rLEDBAT, and slowdowns

marcelo bagnulo braun <marcelo@it.uc3m.es> Wed, 04 September 2019 10:15 UTC

To: Neal Cardwell <ncardwell@google.com>
Cc: iccrg IRTF list <iccrg@irtf.org>, Praveen Balasubramanian <pravb@microsoft.com>, Yuchung Cheng <ycheng@google.com>
References: <CADVnQynJY8xkqkghhCWhPpbF+4Ev_3c7OZf_tDEb_J5xr0FV9A@mail.gmail.com> <c4b76af5-abbe-3184-24ce-03a2c0b9544b@it.uc3m.es> <CADVnQy=3FqEjqipX6thgjcjN8YPTOqiduYKU2GccXHwS+a3wVA@mail.gmail.com> <be98e323-506f-bdc8-a128-72c9f4aa5ead@it.uc3m.es>
From: marcelo bagnulo braun <marcelo@it.uc3m.es>
Message-ID: <3862e86e-e588-0cca-38e8-4ea23ef2b4c7@it.uc3m.es>
Date: Wed, 04 Sep 2019 12:15:38 +0200
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/CaJHLJxdbKwRZmhQ-1i-dGOTHis>
Subject: Re: [iccrg] LEDBAT++, rLEDBAT, and slowdowns

Apologies for replying to my own email, but I would like to take
something back; see below.

On 03/09/19 at 17:57, marcelo bagnulo braun wrote:
> I guess what I'm trying to get at is that:
>>
>> (1) AFAICT the LEDBAT++ and rLEDBAT algorithms so far do not seem to
>> provide a bound on "the additional queueing delay added by the LEDBAT
>> flow", in the case where there are several  LEDBAT++/rLEDBAT flows and
>> the fair share of in-flight data is less than the queue depth.
>> Instead, the group dynamics would be such that the queue would
>> gradually but steadily ratchet up until the buffer is full. This does
>> not seem to meet the goals expressed in RFC6817 ("limiting the
>> consequent increase in queueing delay on that path...avoiding
>> interference with the network performance of competing flows") or the
>> LEDBAT++/rLEDBAT drafts.
>
>
> right, the additional queueing delay added by a single flow is 
> bounded, but if you keep on adding LEDBAT flows one after the other, 
> the total added queuing delay is not bounded indeed.

Let me take this back.

So, let's consider how the situation you describe would happen.

One scenario I have been able to identify where this would happen
is the following.

We have two LEDBAT++ flows (let's assume there is no other traffic for now).

They have managed to accurately measure the base delay and thus they are 
fairly sharing the bottleneck link and inducing half of the target delay 
each.

After n minutes, one of the flows (f1) forgets the oldest minute of
history, the one that contained the accurate measurement of the base
delay.

This implies that it will now record as base delay the minimum delay 
observed in the AIMD process, which is the one caused by both flows 
having half of the maximum window (I am assuming both flows are 
synchronized).
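A minimal sketch of this expiry, assuming the base delay is kept as a
list of per-minute minima as in RFC 6817. The class name, the 3-minute
history, and the delay values are hypothetical, chosen only to
illustrate the effect:

```python
from collections import deque

class BaseDelayHistory:
    """Per-minute minima of the observed delay, kept for `minutes` minutes."""

    def __init__(self, minutes):
        # deque(maxlen=...) silently drops the oldest minute when full
        self.minima = deque(maxlen=minutes)

    def start_new_minute(self):
        self.minima.append(float("inf"))

    def add_sample(self, delay_ms):
        if not self.minima:
            self.start_new_minute()
        self.minima[-1] = min(self.minima[-1], delay_ms)

    def base_delay(self):
        return min(self.minima)

TRUE_BASE_MS = 20.0       # hypothetical real base delay
STANDING_QUEUE_MS = 30.0  # hypothetical minimum queueing the two flows sustain

history = BaseDelayHistory(minutes=3)
history.add_sample(TRUE_BASE_MS)  # minute 0: the empty queue was observed once
estimates = [history.base_delay()]

# In the following minutes the queue never drains below the standing level.
for _ in range(3):
    history.start_new_minute()
    history.add_sample(TRUE_BASE_MS + STANDING_QUEUE_MS)
    estimates.append(history.base_delay())

print(estimates)
```

The estimate stays at the true base delay only while minute 0 is still
in the history; once that minute is dropped, the minimum becomes the
base delay plus the standing queue.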

The first consequence of this is that flow f1 will take a larger share 
of the bottleneck capacity.

In any case, when the queuing delay measured by f1 is larger than the
target, f1 will reduce its window to half. Depending on the relationship
between the target delay and the real base delay, it is possible that
when f1 halves its window, the resulting delay is still larger than
its current minimum, resulting in the increase of the queueing delay
that you mentioned. Since this happens every n minutes, this can
cause the queueing delay to grow unbounded.
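The effect of an inflated base delay on the back-off decision can be
sketched as follows. This is only an illustration: the function name
and all the numbers (base delay, target, size of the error) are
assumptions, not values taken from the drafts.

```python
TRUE_BASE_MS = 20.0   # hypothetical real base delay
TARGET_MS = 60.0      # hypothetical target queueing delay
BASE_ERROR_MS = 30.0  # hypothetical overestimate of the base delay by f1

def backs_off(true_queue_ms, base_estimate_ms, target_ms=TARGET_MS):
    """True if a flow with this base estimate sees queueing above target."""
    measured_ms = TRUE_BASE_MS + true_queue_ms
    estimated_queue_ms = measured_ms - base_estimate_ms
    return estimated_queue_ms > target_ms

accurate_base = TRUE_BASE_MS
inflated_base = TRUE_BASE_MS + BASE_ERROR_MS

# With 80 ms of real queueing, a flow with an accurate base estimate
# backs off, but f1 still believes it is below the target.
queue_ms = 80.0
accurate_backs_off = backs_off(queue_ms, accurate_base)
f1_backs_off = backs_off(queue_ms, inflated_base)

# f1 reacts only once the real queue exceeds target + error.
f1_threshold_ms = TARGET_MS + BASE_ERROR_MS
print(accurate_backs_off, f1_backs_off, f1_threshold_ms)
```

Each time the history expires and the error grows, the real queueing
delay that f1 tolerates before backing off grows by the same amount.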

Is this the type of scenario you had in mind?

However, if we apply the slowdown described in LEDBAT++
(unsynchronized), then f1 would reduce its window, and because f1 is the
flow pushing the other flows out, this would allow f1 to obtain a more
accurate measurement of the base delay, enabling the AIMD fairness to
kick in and preventing the increase of the queueing delay.

In summary, errors in estimating the queuing delay cause one flow to
grow more than the others. When that particular flow slows down, it
obtains a more accurate measurement of the base delay, which enables
the AIMD fairness mechanisms to reconverge to a fair split.
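A rough sketch of why the slowdown helps the dominant flow most. It
assumes a deliberately simplified model in which each flow contributes a
fixed amount of queueing delay, and a flow in slowdown removes only its
own contribution. The flow names and numbers are hypothetical.

```python
TRUE_BASE_MS = 20.0  # hypothetical real base delay

# Hypothetical per-flow contributions to the standing queue, in ms.
# f1 has the inflated base estimate and has taken the larger share.
queue_share_ms = {"f1": 50.0, "f2": 10.0}

def delay_seen_during_slowdown(flow):
    """Delay sample taken while `flow` has drained its own packets."""
    remaining_ms = sum(ms for name, ms in queue_share_ms.items() if name != flow)
    return TRUE_BASE_MS + remaining_ms

f1_sample = delay_seen_during_slowdown("f1")
f2_sample = delay_seen_during_slowdown("f2")

# Residual error in the base delay each flow would record from its slowdown.
f1_error = f1_sample - TRUE_BASE_MS
f2_error = f2_sample - TRUE_BASE_MS
print(f1_error, f2_error)
```

In this model the flow holding most of the queue gets the sample
closest to the real base delay when it slows down, which is the
property the argument above relies on.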

So, I am not convinced that the slowdown proposed in LEDBAT++ is
insufficient.

I am not sure this is true for all combinations of parameter values; I
need to look into this.

Regards, marcelo




>
> this is also true for LEDBAT, i guess.
>
>
>> (2) With minor tweaks to coordinate slowdowns, the LEDBAT++/rLEDBAT
>> algorithms could be tweaked to avoid this problem in common multi-flow
>> scenarios.
>
>
> I guess the obvious way to synchronize this would be to generate a 
> packet loss so that all the flows would slow down and it would be 
> possible to measure the base delay. But i guess you have something 
> more subtle in mind?
>
> Regards, marcelo
>
>
>
>> best,
>> neal
>>
>