Re: [tsvwg] links to Canary methods for roll-out of new transport features

Martin Duke <martin.h.duke@gmail.com> Fri, 13 August 2021 17:14 UTC

References: <AF731D2C-B796-4B20-973D-6DB496DB1228@akamai.com> <232F9BFA-0D05-48C5-807E-FA2A7904754A@erg.abdn.ac.uk> <eg5mzk.qx1zf8.0-qmf@smtp.gmail.com> <de1017ec-d437-4c61-9f9c-7d237eee8fcb@erg.abdn.ac.uk>
In-Reply-To: <de1017ec-d437-4c61-9f9c-7d237eee8fcb@erg.abdn.ac.uk>
From: Martin Duke <martin.h.duke@gmail.com>
Date: Fri, 13 Aug 2021 10:13:47 -0700
Message-ID: <CAM4esxSf9F86dYKW5jg8AW-m7aa8bkcgkfANzwvtVBeAdYDAbA@mail.gmail.com>
To: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Cc: Jonathan Morton <chromatix99@gmail.com>, "Holland, Jake" <jholland=40akamai.com@dmarc.ietf.org>, tsvwg <tsvwg@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/sW4m-GdJUDVYmjlNuJOWoBpnyJQ>

(As an individual)

The "control" part of the canary experiment is how you identify collateral
damage. The L4S hypotheses, informally, are that
(1) L4S flows will experience lower latency, etc.
(2) Not-ECT flows, and RFC 3168 flows that traverse most queueing
configurations, will not suffer significant degradation.
(3) RFC 3168 flows will only suffer starvation in queue configurations that are
not deployed at scale in the Internet.

So the proper design of a canary deployment, IMO, is in two steps:
(a) Have servers choose between Not-ECT and ECT(0) to establish a baseline.
(b) Have servers choose between Not-ECT, ECT(0), and ECT(1) and compare the
results.
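
As a rough illustration of the two steps, here is a minimal Python sketch.
Everything in it beyond the two-bit ECN codepoint values from RFC 3168 is an
assumption for illustration: the names choose_arm and median_by_arm, the
hash-based assignment of connections to arms, and the use of a per-arm
median are hypothetical, not something proposed in this thread.

```python
import hashlib
from statistics import median

# Two-bit ECN field values as defined in RFC 3168.
NOT_ECT, ECT_1, ECT_0 = 0b00, 0b01, 0b10

# Step (a) establishes a baseline; step (b) adds ECT(1) as the canary arm.
PHASE_ARMS = {
    "a": (NOT_ECT, ECT_0),
    "b": (NOT_ECT, ECT_0, ECT_1),
}


def choose_arm(conn_id: str, phase: str) -> int:
    """Pick an ECN codepoint for a connection (hypothetical helper).

    Hashing the connection identifier gives a stable, roughly uniform
    split across the arms of the given phase.
    """
    arms = PHASE_ARMS[phase]
    digest = hashlib.sha256(f"{phase}:{conn_id}".encode()).digest()
    return arms[int.from_bytes(digest[:8], "big") % len(arms)]


def median_by_arm(samples):
    """Group (codepoint, metric) samples by arm and return each median."""
    grouped = {}
    for codepoint, value in samples:
        grouped.setdefault(codepoint, []).append(value)
    return {codepoint: median(values) for codepoint, values in grouped.items()}
```

Under these assumptions, one way to look at hypotheses (2) and (3) would be
to compare the Not-ECT and ECT(0) medians from step (b) against the same
arms in step (a), rather than looking only at the ECT(1) arm.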

On Fri, Jul 30, 2021 at 4:24 AM Gorry Fairhurst <gorry@erg.abdn.ac.uk>
wrote:

>
> On 30/07/2021 11:21, Jonathan Morton wrote:
> > On Friday, 30 July 2021, Gorry (erg) wrote:
> >> I am not sure though, I think many CC- related topics can have
> potential collateral damage, and we have managed to deploy these gradually
> and improve the transport where necessary. So, what is different here to
> exploring methods such as larger Initial Window, BBR, Hystart, etc.?
> > Hystart makes a transport strictly less aggressive, by exiting the
> exponential growth phase early and continuing with much slower linear or
> polynomial growth.  There is no possibility of collateral damage except in
> case of implementation bugs.  This is an excellent use case for canary
> testing, and I would endorse its use with Hystart++.
> >
> > Large IWs do have potential for collateral damage, but the conditions
> that would trigger it (small buffers) result in effects (high loss to IW)
> that are easily noticed by the transport employing it.  This is therefore
> also suitable for canary testing.
> >
> > BBR is not so clear a case.  I am pleased that Google is taking a
> relatively cautious approach to deploying it in an Internet facing context,
> has designed it with standards-track CC coexistence in mind, and seeks to
> improve it when problems are reported and verified.  However, the ECN
> response introduced with BBRv2 is not standards-track compatible, and the
> likely collateral damage when operating in a shared AQM bottleneck is not
> easily noticed by the BBR transport itself, especially when those
> circumstances arise only infrequently.  Canary testing is thus an
> incomplete solution there.
> >
> > The entire problem here is that L4S is likely to cause *externalised*
> collateral damage which is not easily noticed by the L4S transport itself.
> Unless great care is taken to watch for such problems, canary testing will
> therefore fail to find them.
> >
> > What is more, canary testing has as a prerequisite the confidence of
> correct operation and design gained through lab testing.  Lab testing of L4S
> has not given any of that confidence, thus progression to canary testing
> would be inappropriate.
> >
> >   - Jonathan Morton
>
> I was responding to a request to provide references to canary approaches.
>
> Gorry
>
>