Re: [iccrg] LEDBAT++, rLEDBAT, and slowdowns

Matt Mathis <mattmathis@google.com> Sat, 07 September 2019 00:06 UTC

Return-Path: <mattmathis@google.com>
X-Original-To: iccrg@ietfa.amsl.com
Delivered-To: iccrg@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 56AC6120DFA for <iccrg@ietfa.amsl.com>; Fri, 6 Sep 2019 17:06:44 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -17.499
X-Spam-Level:
X-Spam-Status: No, score=-17.499 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Hd0uMSEEs6Bq for <iccrg@ietfa.amsl.com>; Fri, 6 Sep 2019 17:06:41 -0700 (PDT)
Received: from mail-yb1-xb36.google.com (mail-yb1-xb36.google.com [IPv6:2607:f8b0:4864:20::b36]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 38D531200DE for <iccrg@irtf.org>; Fri, 6 Sep 2019 17:06:41 -0700 (PDT)
Received: by mail-yb1-xb36.google.com with SMTP id o80so35827ybc.6 for <iccrg@irtf.org>; Fri, 06 Sep 2019 17:06:41 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Pm0RlWsOpXsXQFyGRbYWMjjRxA4o0LsNGoNovUc36bI=; b=fOrxVVhJ0Y1bcNZrBRVUk94t8w+zvmyhwrWpdoNiBq+KXVVsA+7+fi4P7Uxb78a+mg Gmq0/vFo6IlQfC1lQFBmW0rAuLcrBVJn4pQ0Nmf1J0oAPlT1qi2qmj9a3BZcO9AbzlWZ zbX84YdP8qw2kG5dd//QplrfMwOKzDNBOg+UVrV5y+iXnKH9aPE4LeBxaBKPlInObjxc 0gD9SMpTogvvIdzG5ApS/rDCsom+7gQfRiZuTmvUQ5yfEib6Xt1YTllwbwyhkbZzJPdj iL12DNcePUOpEMS4K64y7m0CuhenvhAr8WT5KKQW0g3B90LXjespPeANtMTbcufFYzyI sU1w==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Pm0RlWsOpXsXQFyGRbYWMjjRxA4o0LsNGoNovUc36bI=; b=rWwxHUwkdRSuJmpL5tNDlwPahsKkR4OKFxsUltzdpP1kPUxae8qdGiIexIQFIOD6x1 yh3D5mrVZ8nXH2oNrH5V6g9VP3yScuOZw+/CvKuBmqmKOo/imjGn8E1EE4zarroTn9u8 vDdrmKv2Mpg2iFs+Igqj9PwsG3DH4rvfuJSosDCbh+2qmUMrndbCWJFuN95+KByFH4et fC8tOCZzQAQD2nA+ZJyXjE+01WgsUiOuDC4Hn3oAQHasF9zATMQ84k7iKrcfNqLea2pk 7LValxwcIOcltGjqg8IrBFYzh88ECfgbeQcHqTEaT6MeosUtYRuZHqiL/bbHAavGLIce 8GpA==
X-Gm-Message-State: APjAAAW4CuQCMsfSVrf68odN9K1g2jUEun1Il0u4YbFUnT8H8FrgJ0Iq myk0Z25IM5cD14Ggc8ZuauoSuFMgPz05zLjWoHbFIQ==
X-Google-Smtp-Source: APXvYqw+1IlcbQS2d5YA8VppqdNvIzO9uh1rniwDSUGEMKxo3UqyAtJ4EnsalWWfICt/70z7B7wr2M0r5vF5syJLI0k=
X-Received: by 2002:a25:324b:: with SMTP id y72mr7897855yby.361.1567814799900; Fri, 06 Sep 2019 17:06:39 -0700 (PDT)
MIME-Version: 1.0
References: <CADVnQynJY8xkqkghhCWhPpbF+4Ev_3c7OZf_tDEb_J5xr0FV9A@mail.gmail.com> <c4b76af5-abbe-3184-24ce-03a2c0b9544b@it.uc3m.es> <CADVnQy=3FqEjqipX6thgjcjN8YPTOqiduYKU2GccXHwS+a3wVA@mail.gmail.com> <be98e323-506f-bdc8-a128-72c9f4aa5ead@it.uc3m.es> <3862e86e-e588-0cca-38e8-4ea23ef2b4c7@it.uc3m.es> <CADVnQymqHHg4cF94fu81opaBibmShVrsZ3cLyVPcHir0p=Kv3Q@mail.gmail.com>
In-Reply-To: <CADVnQymqHHg4cF94fu81opaBibmShVrsZ3cLyVPcHir0p=Kv3Q@mail.gmail.com>
From: Matt Mathis <mattmathis@google.com>
Date: Fri, 06 Sep 2019 17:06:28 -0700
Message-ID: <CAH56bmDKNVAwx=mF7uW6O_vuVdHvPGnw160EAYPaBRth=vE1DA@mail.gmail.com>
To: marcelo bagnulo braun <marcelo@it.uc3m.es>, Praveen Balasubramanian <pravb@microsoft.com>
Cc: iccrg IRTF list <iccrg@irtf.org>, Neal Cardwell <ncardwell=40google.com@dmarc.ietf.org>, Yuchung Cheng <ycheng@google.com>
Content-Type: multipart/alternative; boundary="000000000000394cbc0591eb54a2"
Archived-At: <https://mailarchive.ietf.org/arch/msg/iccrg/tAUId9nn_RlYDZ4TqjsPN1IWABc>
Subject: Re: [iccrg] LEDBAT++, rLEDBAT, and slowdowns
X-BeenThere: iccrg@irtf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Discussions of Internet Congestion Control Research Group \(ICCRG\)" <iccrg.irtf.org>
List-Unsubscribe: <https://www.irtf.org/mailman/options/iccrg>, <mailto:iccrg-request@irtf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/iccrg/>
List-Post: <mailto:iccrg@irtf.org>
List-Help: <mailto:iccrg-request@irtf.org?subject=help>
List-Subscribe: <https://www.irtf.org/mailman/listinfo/iccrg>, <mailto:iccrg-request@irtf.org?subject=subscribe>
X-List-Received-Date: Sat, 07 Sep 2019 00:06:44 -0000

Adding to Neal's points: a robust distributed algorithm for each flow to
compute is own minRTT is a known hard problem, going back a long long way.
I believe that it is the essence of the proof in "Flow Control Power is
Nondecentralizable" [1] because no flow can tell if its observed min RTT
includes other flows' standing queue.   BBR minimizes this problem
by explicitly trying to globally synchronize minRTT measurements, which is
demonstrably good enough for known cases.  If you can do better, please
share and we can consider porting your algorithm to BBR, if you can't do
better, please consider adopting the exact algorithm in BBR, so we can all
synchronize the minRTT measurements.

BTW To the extent that the proof in [1] applies to BBR, it implies that the
min RTT algorithm in BBR is also subject to pathological behavior under
some unknown conditions.  Yes I do fret about this from time to time, but
no I haven't found an example.  "Global synchronization" is hard, so the
most likely failure modes look like strong regional synchronizations in
different phases, and flows that traverse multiple regions.  This clearly
might fail optimality, but I can't imagine how to cause divergent or
latchup behaviors.  Under authentic workloads this is not a worry at all.

[1] JEFFREY M. JAFFE, 1981, Flow Control Power is Nondecentralizable, IEEE
TRANSACTIONS ON COMMUNICATIONS, VOL. COM-29, NO. 9,

Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured:
            too strong would be hypocritical and risks spiraling out of
control;
            too weak risks being mistaken for tacit approval.


On Fri, Sep 6, 2019 at 8:26 AM Neal Cardwell <ncardwell=
40google.com@dmarc.ietf.org> wrote:

> On Wed, Sep 4, 2019 at 6:15 AM marcelo bagnulo braun <marcelo@it.uc3m.es>
> wrote:
> >
> > Apologies for replying to my own email, but i would like to take
> > something back, see below...
> >
> > El 03/09/19 a las 17:57, marcelo bagnulo braun escribió:
> > > I guess what I'm trying to get at is that:
> > >>
> > >> (1) AFAICT the LEDBAT++ and rLEDBAT algorithms so far do not seem to
> > >> provide a bound on "the additional queueing delay added by the LEDBAT
> > >> flow", in the case where there are several  LEDBAT++/rLEDBAT flows and
> > >> the fair share of in-flight data is less than the queue depth.
> > >> Instead, the group dynamics would be such that the queue would
> > >> gradually but steadily ratchet up until the buffer is full. This does
> > >> not seem to meet the goals expressed in RFC6817 ("limiting the
> > >> consequent increase in queueing delay on that path...avoiding
> > >> interference with the network performance of competing flows") or the
> > >> LEDBAT++/rLEDBAT drafts.
> > >
> > >
> > > right, the additional queueing delay added by a single flow is
> > > bounded, but if you keep on adding LEDBAT flows one after the other,
> > > the total added queuing delay is not bounded indeed.
> >
> > Let me take this back.
> >
> > So, let's consider how would this situation you describe would happen.
> >
> > The scenario that I have been able to identify where this would happen
> > would be for example the following.
> >
> > We have two LEDBAT++ flows (let's assume there is no other traffic for
> now).
> >
> > They have managed to accurately measure the base delay and thus they are
> > fairly sharing the bottleneck link and inducing half of the target delay
> > each.
> >
> > After n minutes, there is one of the flows (f1) that forgets the oldest
> > minute of history, the one that contained the accurate measurement of
> > the base delay.
> >
> > This implies that it will now record as base delay the minimum delay
> > observed in the AIMD process, which is the one caused by both flows
> > having half of the maximum window (I am assuming both flows are
> > synchronized).
> >
> > The first consequence of this is that flow f1 will take a larger share
> > of the bottleneck capacity.
> >
> > In any case, when the queuing delay measured by f1 is larger than the
> > target, f1 will reduce its window to half. Depending on the relationship
> > between the target delay and the real base delay, it is possible that
> > when f1 reduces it window in half, the resulting delay is still larger
> > than its current minimum, resulting in the increase of the queueing
> > delay that you mentioned. Since this happens every n minutes,this can
> > cause the queueing delay to grow unbounded.
> >
> > Is this the type of scenario you had in mind?
>
> Yes, that is the type of scenario I had in mind.
>
> Basically, if the fair share of inflight data is less than the standing
> queue,
> then having a single flow cut its inflight will not drain the queue, and
> thus
> no single flow can reveal the true two-way propagation delay by itself.
>
> So an algorithm that relies on a windowed min_rtt estimate and individual
> flows cutting by themselves will gradually ratchet up the level of queuing,
> causing the queuing delay to grow without bounds.
>
> > However, if we apply the slowdown described in LEDBAT++
> > (unsynchronized), then f1 would reduce its window and because f1 is the
> > one expelling the other flows out, this would allow f1 to have a more
> > accurate measure of the base delay, enabling the AIMD fairness to kick
> > in and preventing the increase of the queueing delay.
>
> As noted above, if the fair share of inflight data is less than the
> standing queue, when f1 reduces its window it will not drain the queue
> fully, so its base delay estimate will ratchet up to a value that
> includes the queueing delay from all the other flows queued at
> the bottleneck, even if all the others are LEDBAT++ flows.
>
>
> > In summary, because errors in estimating the queuing delay causes one
> > flow to grow more than the others, when this particular flow slows down,
> > will allow a more accurate measure of the base delay, enabling AIMD
> > fairness mechanisms to reconverge to a fair split.
> >
> > So, I am not convinced that it is not enough with the slow down proposed
> > in LEDBAT++.
>
> When you say a slow-down "will allow a more accurate measure of the
> base delay," I think a key question is: more accurate than what, exactly?
> An uncoordinated slow-down will allow a more accurate measure
> of the base delay than the most recent RTT samples, but it will not
> allow a measure that is more accurate than (or even matching)
> the current base delay estimate.That's because uncoordinated
> slow-downs can cause feedback loops, where flows measure
> ever-increasing RTTs from the queues created by other flows
> that are also unable to "anchor" their base delay estimates, so
> are also allowing ever-increasing amounts of data in flight.
>
> > I am not sure this is true for all combinations of parameter values, i
> > need to look into this.
>
> Yes, I think further testing of LEDBAT++/rLEDBAT in multi-flow scenarios
> would be worthwhile.
>
> A simple scenario that should illustrate this issue:
>  o RTT of 100ms
>  o bottleneck with 10*BDP of buffer space (allowing 1 sec of queuing)
>  o 3 bulk LEDBAT++/rLEDBAT flows
>
> Per the dynamics discussed above, my theory is that in such a scenario
> the queuing delay will grow until the buffer is oscillating near full and
> there is ~1sec of queuing in the steady state. That is, rather than
> "Low Extra Delay" there will be 1sec of extra delay. :-)
>
> best,
> neal
>
> _______________________________________________
> iccrg mailing list
> iccrg@irtf.org
> https://www.irtf.org/mailman/listinfo/iccrg
>