Re: [tsvwg] [Ecn-sane] per-flow scheduling

Kyle Rose <krose@krose.org> Sat, 27 July 2019 15:35 UTC

From: Kyle Rose <krose@krose.org>
Date: Sat, 27 Jul 2019 11:35:28 -0400
Message-ID: <CAJU8_nVrC93VJhYxN2GGfVDDXz2sS0aKwHABfc2YxfeY5uFBJQ@mail.gmail.com>
To: "Holland, Jake" <jholland@akamai.com>
Cc: Bob Briscoe <ietf@bobbriscoe.net>, "ecn-sane@lists.bufferbloat.net" <ecn-sane@lists.bufferbloat.net>, tsvwg IETF list <tsvwg@ietf.org>, "David P. Reed" <dpreed@deepplum.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/Vr0x8m4JZCr6xXhh40DsYMMQpnk>
Subject: Re: [tsvwg] [Ecn-sane] per-flow scheduling

Right, I understand that under RFC 3168 behavior the sender would react
differently to ECE markings than L4S flows would, but I guess I don't
understand why a sender willing to misclassify traffic with ECT(1) wouldn't
also choose to react non-normatively to ECE markings. On the rest, I think
we agree.

Kyle

On Thu, Jul 25, 2019 at 3:26 PM Holland, Jake <jholland@akamai.com> wrote:

> Hi Kyle,
>
>
>
> I almost agree, except that the concern is not about classic flows.
>
>
>
> I agree (with caveats) with what Bob and Greg have said before: ordinary
> classic flows don’t have an incentive to mis-mark if they’ll be responding
> normally to CE, because a classic flow will back off too aggressively and
> starve itself if it’s getting CE marks from the LL queue.
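
To make the "back off too aggressively" point concrete, here is a minimal sketch, not anyone's actual stack. It assumes a constant fraction of packets is CE-marked in every round trip, that the classic sender halves its window once per marked round trip (RFC 3168 style), and that the scalable sender cuts by half the marked fraction (DCTCP style, with the marked fraction standing in for its smoothed estimate). The function name and parameters are hypothetical.

```python
def simulate(rounds, start_cwnd, marked_fraction):
    """Return (classic_cwnd, scalable_cwnd) after `rounds` round trips.

    Assumption: the same fraction of packets is CE-marked in every round trip.
    """
    classic = scalable = float(start_cwnd)
    for _ in range(rounds):
        # Classic response: any mark in the round trip halves the window;
        # an unmarked round trip grows it by one segment.
        if marked_fraction > 0:
            classic = max(classic / 2.0, 1.0)
        else:
            classic += 1.0
        # Scalable response: cut in proportion to the marked fraction,
        # then grow by one segment.
        scalable = max(scalable * (1.0 - marked_fraction / 2.0), 1.0) + 1.0
    return classic, scalable


classic, scalable = simulate(rounds=20, start_cwnd=100, marked_fraction=0.1)
print(f"classic={classic:.1f} scalable={scalable:.1f}")
```

Under these assumptions the classic window collapses to the one-segment floor while the scalable window settles well above it, which is why a classic flow that keeps responding normally gains nothing by marking itself ECT(1).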
>
>
>
> That said, I had a message where I tried to express something similar to
> the concerns I think you just raised, with regard to a different category
> of flow:
>
> https://mailarchive.ietf.org/arch/msg/tsvwg/bUu7pLmQo6BhR1mE2suJPPluW3Q
>
>
>
> So I agree with the concerns you’ve raised here, and I want to +1 that
> aspect of it while also correcting that I don’t think these apply for
> ordinary classic flows, but rather for flows that use application-level
> quality metrics to change bit-rates instead of responding at the transport
> level.
>
>
>
> For those flows (which seem to include some of today’s video conferencing
> traffic), I expect they really would see an advantage by mis-marking
> themselves, and will require policing that imposes a policy decision.
> Given that, I agree that I don’t see a simple alternative to FQ for flows
> originating outside the policer’s trust domain when the network is fully
> utilized.
>
>
>
> I hope that makes at least a little sense.
>
>
>
> Best regards,
>
> Jake
>
>
>
> *From: *Kyle Rose <krose@krose.org>
> *Date: *2019-07-23 at 11:13
> *To: *Bob Briscoe <ietf@bobbriscoe.net>
> *Cc: *"ecn-sane@lists.bufferbloat.net" <ecn-sane@lists.bufferbloat.net>,
> tsvwg IETF list <tsvwg@ietf.org>, "David P. Reed" <dpreed@deepplum.com>
> *Subject: *Re: [tsvwg] [Ecn-sane] per-flow scheduling
>
>
>
> On Mon, Jul 22, 2019 at 9:44 AM Bob Briscoe <ietf@bobbriscoe.net> wrote:
>
> Folks,
>
> As promised, I've pulled together and uploaded the main architectural
> arguments about per-flow scheduling that cause concern:
>
> Per-Flow Scheduling and the End-to-End Argument
> <http://bobbriscoe.net/projects/latency/per-flow_tr.pdf>
>
>
> It runs to 6 pages of reading. But I tried to make the time readers will
> have to spend worth it.
>
>
>
> Before reading the other responses (poisoning my own thinking), I wanted
> to offer my own reaction. In the discussion of figure 1, you seem to imply
> that there's some obvious choice of bin packing for the flows involved, but
> that can't be right. What if the dark green flow has deadlines? Why should
> that be the one that gets only leftover bandwidth? I'll return to this
> point in a bit.
>
>
>
> The tl;dr summary of the paper seems to be that the L4S approach leaves
> the allocation of limited bandwidth up to the endpoints, while FQ
> arbitrarily enforces equality in the presence of limited bandwidth; but in
> reality the bottleneck device needs to make *some* choice when there's a
> shortage and flows don't respond. That requires some choice of policy.
>
>
>
> In FQ, the chosen policy is to make sure every flow has the ability to get
> low latency for itself and, in the absence of some other kind of trusted
> signaling, to allocate an equal proportion of the available bandwidth to
> each flow. ISTM this is the best you can do in an adversarial environment,
> because anything else can be gamed to get a more than equal share (and
> depending on how "flow" is defined, even this can be gamed by opening up
> more flows; but this is not a problem unique to FQ).
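
As a rough illustration of the "equal proportion" policy, here is a sketch of max-min fair allocation (water-filling), which is the usual idealization of what fair queuing converges to. It is not taken from any particular FQ implementation; the function name, flow names, and rates are made up for the example.

```python
def max_min_share(capacity, demands):
    """Water-filling: flows that want less than an equal share keep their
    demand; the remaining flows split what is left equally."""
    alloc = {}
    remaining = float(capacity)
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        satisfied = {f: d for f, d in pending.items() if d <= share}
        if not satisfied:
            # Every remaining flow wants more than an equal share.
            for f in pending:
                alloc[f] = share
            break
        for f, d in satisfied.items():
            alloc[f] = float(d)
            remaining -= d
            del pending[f]
    return alloc


# Two greedy flows and one small flow on a 300-unit bottleneck.
print(max_min_share(300, {"vr1": 200, "vr2": 200, "web": 5}))
```

The small flow gets everything it asks for and the two greedy flows split the rest. The gaming caveat shows up directly: a sender that splits one greedy flow into two gets two shares, e.g. max_min_share(300, {"a": 200, "b1": 200, "b2": 200}) gives each flow 100, so "b" ends up with 200 in total.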
>
>
>
> In L4S, the policy is to assume one queue is well-behaved and one not, and
> to use the ECT(1) codepoint as a classifier to get into one or the other.
> But policy choice doesn't end there: in an uncooperative or adversarial
> environment, you can easily get into a situation in which the bottleneck
> has to apply policy to several unresponsive flows in the supposedly
> well-behaved queue. Note that this doesn't even have to involve bad actors
> misclassifying on purpose: it could be two uncooperative 200 Mb/s VR flows
> competing for 300 Mb/s of bandwidth. In this case, L4S falls back to
> classic, which with DualQ means every flow, not just the uncooperative
> ones, suffers. As a user, I don't want my small, responsive flows to suffer
> when uncooperative actors decide to exceed the bottleneck bandwidth (BBW).
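
The arithmetic behind the two-VR-flow example, as a crude sketch. It assumes none of the offered traffic backs off and that a shared queue has to shed the excess without regard to which flow a packet belongs to; the function name is hypothetical and the model ignores queuing delay entirely.

```python
def overload_shed_fraction(capacity, offered_rates):
    """Fraction of arriving traffic a shared queue must drop (or mark)
    when unresponsive flows offer more than the link can carry."""
    total = sum(offered_rates)
    if total <= capacity:
        return 0.0
    return 1.0 - capacity / total


# Two unresponsive 200 Mb/s flows on a 300 Mb/s bottleneck.
print(overload_shed_fraction(300, [200, 200]))
```

In this model a quarter of all arriving traffic has to go, and without per-flow state the queue has no way to aim that quarter at the two flows causing the overload, so a small flow sharing the queue sees the same drop or mark probability.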
>
>
>
> Getting back to figure 1, how do you choose the right allocation? With the
> proposed use of ECT(1) as classifier, you have exactly one bit available to
> decide which queue, and therefore which policy, applies to a flow. Should
> all the classic flows get assigned whatever is left after the L4S flows are
> allocated bandwidth? That hardly seems fair to classic flows. But let's say
> this policy is implemented. It then escapes me how this is any different
> from the trust problems facing end-to-end DSCP/QoS: why wouldn't everyone
> just classify their classic flows as L4S, forcing everything to be treated
> as classic and getting access to a (greater) share of the overall BBW? Then
> we're left both with a spent ECT(1) codepoint and a need for FQ or some
> other queuing policy to arbitrate between flows, without any bits with
> which to implement the high-fidelity congestion signal required to achieve
> low latency without getting squeezed out.
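
For reference, a sketch of what "ECT(1) as classifier" amounts to at the bottleneck. The codepoint values are the RFC 3168 ones; sending CE-marked packets to the low-latency queue as well is my understanding of the DualQ design rather than something stated in this thread, and the function names are made up.

```python
# ECN codepoints (low two bits of the IPv4 TOS / IPv6 Traffic Class byte).
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11


def ecn_field(tos_byte):
    """Extract the two ECN bits from the TOS / Traffic Class byte."""
    return tos_byte & 0b11


def pick_queue(tos_byte):
    """Return 'L' for the low-latency queue, 'C' for the classic queue.

    Assumption: ECT(1) and CE go to 'L'; Not-ECT and ECT(0) go to 'C'.
    """
    return "L" if ecn_field(tos_byte) in (ECT_1, CE) else "C"


print(pick_queue(0x01), pick_queue(0x02), pick_queue(0x00))
```

The decision depends only on bits the sender sets, which is the trust problem: nothing in the classifier can tell a flow that will respond to the high-fidelity signal from one that merely claims it will.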
>
>
>
> The bottom line is that I see no way to escape the necessity of something
> FQ-like at bottlenecks outside of the sender's trust domain. If FQ can't be
> done in backbone-grade hardware, then the only real answer is pipes in the
> core big enough to force the bottleneck to live somewhere closer to the
> edge, where FQ does scale.
>
>
>
> Note that, in a perfect world, FQ wouldn't trigger at all because there
> would always be enough bandwidth for everything users wanted to do, but in
> the real world it seems like the best you can possibly do in the absence of
> trusted information about how to prioritize traffic. IMO, best to think of
> FQ as a last-ditch measure indicating to the operator that they're gonna
> need a bigger pipe, rather than as a steady-state bandwidth allocator.
>
>
>
> Kyle
>
>
>