Re: [tsvwg] Requesting TSVWG adoption of SCE draft-morton-tsvwg-sce

Sebastian Moeller <moeller0@gmx.de> Mon, 18 November 2019 08:44 UTC

From: Sebastian Moeller <moeller0@gmx.de>
In-Reply-To: <B99441C2-B57F-41A8-8E2C-AD80BC59F84C@cablelabs.com>
Date: Mon, 18 Nov 2019 09:43:25 +0100
Cc: "gorry@erg.abdn.ac.uk" <gorry@erg.abdn.ac.uk>, Ingemar Johansson S <ingemar.s.johansson@ericsson.com>
Message-Id: <91209DC2-699E-4E96-86B3-DB7F4A284BE6@gmx.de>
References: <HE1PR07MB4425A6B56F769A5925FF5AA0C2720@HE1PR07MB4425.eurprd07.prod.outlook.com> <0F5F9FA9-FC09-4679-8A6A-45F93A6A6ED5@akamai.com> <B99441C2-B57F-41A8-8E2C-AD80BC59F84C@cablelabs.com>
To: tsvwg IETF list <tsvwg@ietf.org>, Greg White <g.white@CableLabs.com>, "Holland, Jake" <jholland@akamai.com>, Ingemar Johansson S <ingemar.s.johansson=40ericsson.com@dmarc.ietf.org>, "tsvwg@ietf.org" <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/YZHL4YCUGCV8Q5SM-En3wMiMJu0>
Subject: Re: [tsvwg] Requesting TSVWG adoption of SCE draft-morton-tsvwg-sce

Dear Greg, others,

Please see my comments in-line below.


On November 18, 2019 2:46:41 AM GMT+01:00, Greg White <g.white@CableLabs.com> wrote:
> Hi Jake,
> 
> A couple of comments.
> 
> Please don't confuse L4S with TCP Prague.  TCP Prague is just one
> congestion controller that is L4S compatible.

	[SM] As far as I can tell, TCP Prague is the reference implementation of an L4S-compatible congestion controller. I do not believe it is overly onerous to expect that it be demonstrated to work robustly and reliably, even under adversarial conditions, while meeting the requirements and claims of the L4S architecture I-D. This is your chance to demonstrate that L4S works; please use it.

>  There are others, and
> there will likely be more in the future.  

	[SM] Once/if that is the case, I am certain these will also be tested against and compared with the declared L4S goals, but let's focus on what is available now, please.

> The bug you are referring to
> was not with L4S congestion signaling, it was with the TCP Prague
> implementation, and a part of that implementation (exit from slow
> start) where there is active research (paced chirping, etc.) and
> opportunity for continued evolution even after L4S gets deployed.   

	[SM] I consider a bug in a non-esoteric feature like slow start to be at least an indication that, given the impressively fast implementation period of TCP Prague, a few things can still be improved, such as testing of core functionality. Paced chirping (PC) is a great concept, but it is currently not part of TCP Prague, so it seems orthogonal to this discussion, unless your claim is that PC is the solution to the issue at hand. In that case, I think it should first be implemented in TCP Prague and demonstrate its robustness and reliability in realistic test scenarios.

> 
> The second issue you are referring to is not similar to the first.  It
> does not affect any flows other than the TCP Prague flow itself, and it
> seems to be restricted to these existing IQRouter/CAKE implementations.

	[SM] It is, though, a demonstration of TCP Prague not meeting its design goals...


> If we can consider improvements in the IQRouter/CAKE implementations (a
> not unreasonable fix for this sort of issue) a fairly minor change to
> fq_codel (low CE threshold marking for ECT(1) packets) would fix it. 
> Also (and this is speculation) a more responsive AQM such as PIE
> instead of CoDel probably would as well.

	[SM] The design goal needs to be compatibility with the existing internet, as changing all of the internet seems like a tall order. So IMHO we do not get to pick and choose the internet that L4S/SCE needs to be compatible with. I strongly believe that any newcomer needs to play fairly with the existing traffic and, if in doubt, scale itself back instead of just steamrolling over standards-compliant traffic. But I am not an engineer, so maybe I am too naive in that regard.
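
	To make the fq_codel change suggested in the quoted text above a bit more concrete, here is a minimal sketch of what "low CE threshold marking for ECT(1) packets" could look like at dequeue time. This is not the actual fq_codel code: the names (Packet, dequeue_action), the threshold value, and the simplified "codel_signal" input are all assumptions made for illustration. Only the two-bit ECN codepoint values are taken from RFC 3168.

```python
from dataclasses import dataclass

# ECN codepoints of the two-bit IP header field (RFC 3168).
NOT_ECT, ECT1, ECT0, CE = 0b00, 0b01, 0b10, 0b11

# Hypothetical shallow marking threshold, chosen for illustration only.
CE_THRESHOLD_MS = 1.0


@dataclass
class Packet:
    ecn: int            # ECN codepoint carried by the packet
    sojourn_ms: float   # time the packet spent in the queue


def dequeue_action(pkt: Packet, codel_signal: bool) -> str:
    """Return "mark", "drop", or "pass" for a dequeued packet.

    codel_signal stands in for the regular CoDel control law deciding
    that a congestion signal is due; its computation is not modelled.
    """
    # Assumed extra rule: ECT(1) packets get CE-marked as soon as their
    # sojourn time exceeds the low threshold, independent of CoDel.
    if pkt.ecn == ECT1 and pkt.sojourn_ms > CE_THRESHOLD_MS:
        return "mark"
    # Regular behaviour: mark ECN-capable packets, drop the others.
    if codel_signal:
        return "drop" if pkt.ecn == NOT_ECT else "mark"
    return "pass"
```

	With such a rule an ECT(1) packet that waited 2 ms would be marked even while CoDel is idle, whereas ECT(0) and Not-ECT traffic would see no change. Note that this still requires touching every deployed queue, which is exactly the concern raised above.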


Best Regards
	Sebastian


> 
> Greg
> 
> On 11/18/19, 8:45 AM, "tsvwg on behalf of Holland, Jake"
> <tsvwg-bounces@ietf.org on behalf of jholland@akamai.com> wrote:
> 
>   Hi Ingemar,
> 
>   If fragmenting the space will prevent other SDOs from prematurely
>   adopting the unproven L4S technology, that seems like exactly the
>   right thing to do at this stage.
> 
>   I think we've seen strong evidence that L4S may still contain
>   show-stopping problems.  Also that we have not yet seen strong
>   evidence that the problems stemming from the ambiguity in the
>   L4S signaling design can be fixed.
> 
>   This carries a demonstrated potential for breaking existing
>   ECN deployments by under-responding to the already widely-deployed
>   congestion feedback systems.
> 
>   Certainly L4S's implementation was demonstrated to contain an
>   issue that would have wrecked the latency of existing ECN
>   deployments, and it had not previously been detected, despite the
>   years of lab evaluation and repeated requests from reviewers to
>   test such scenarios earlier.
> 
>   Although a fix was found for the specific initially-demonstrated
>   case, no fix has yet been demonstrated for what looks to be a very
>   similar issue occurring with staggered flow startup, which can't
>   be attributed to a wrong alpha starting value:
>   https://trac.ietf.org/trac/tsvwg/ticket/17#comment:8
> 
>   The proposed pseudocode fix (with up to 5 tuning parameters, IIRC)
>   may or may not be able to address this for specific cases, and it
>   may or may not be possible to discover a set of tuning values that
>   can address a wide range of conditions, but it seems appropriate
>   to have some skepticism, at least until demonstration of successful
>   operation under a wide range of conditions, given the history of
>   such proposals.  This suggests that we do the opposite of
>   encouraging other SDOs to move broadly forward with L4S at this time.
> 
>   I share your concern that we might lose the codepoint (and the low
>   latency functionality), and I acknowledge that a persistently
>   fragmented space introduces a risk that it never happens, or takes
>   an extra decade.
> 
>   But the risk that concerns me even more is if L4S gets rolled out
>   and then these kinds of issues are discovered in production, after
>   other SDOs have prematurely standardized on this experiment, and it
>   therefore gets shut off with prejudice against future solutions.
>   That outcome also would lose the use of the codepoint, probably
>   even more permanently.
> 
>   (Or even worse: if it does not get shut off in spite of the problems
>   it causes, which loses even the low-ish latency solutions we already
>   have, and adds to the congestion control aggression arms race.)
> 
>   IMO, those would be even worse outcomes than a somewhat delayed
>   adoption of a fully vetted system (or at least one that can't break
>   existing deployed networks).
> 
>   Best regards,
>   Jake
> 
>   PS: I still don't understand why the gains available through the
>   use of regular AQM (especially with ECN) have not been more widely
>   adopted by the other SDOs that would want to make use of L4S.
> 
>   It seems possible already to reduce the application-visible delay
>   spikes from ~200ms to ~20ms (provided that no overly aggressive
>   competing traffic improperly ignores the feedback, or that
>   flow-queuing or other queue protection mechanisms are more widely
>   deployed to prevent excessive damage from aggressive flows to less
>   aggressive competing flows).
> 
>   I wonder if whatever would drive SDOs to start using L4S maybe
>   could instead be leveraged to drive adoption of the much more well-
>   proven existing ECN solutions, which at least already have a lot
>   of endpoint support deployed.
> 
>   The endpoint support is a critical component to making this useful,
>   and I see no reason to believe it'll be any quicker than the existing
>   regular ECN was.  I'd even expect less so, since the behavior is much
>   more complicated and hard to test.
> 
>   PPS: I agree it would be interesting to see paced chirping solutions
>   to help do better than slow start, and to quickly grow when new
>   capacity opens on-path.  But I'll point out that's not specific to L4S,
>   but rather should have application for any CC that can avoid pushing
>   the network until queue overflow, which to me likely includes regular
>   ECN-enabled Reno or Cubic, as well as BBR.
> 
>   However, as yet another unproven TBD, I'll suggest it's not very
>   useful as a strong influence on this debate, in spite of the early
>   demos using L4S.  Regardless of the ultimate low latency solution,
>   that part will need further development and might not work.
> 
> 
>