Re: [tsvwg] links to Canary methods for roll-out of new transport features

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Fri, 30 July 2021 11:24 UTC

To: Jonathan Morton <chromatix99@gmail.com>, jholland=40akamai.com@dmarc.ietf.org
Cc: tsvwg@ietf.org
References: <AF731D2C-B796-4B20-973D-6DB496DB1228@akamai.com> <232F9BFA-0D05-48C5-807E-FA2A7904754A@erg.abdn.ac.uk> <eg5mzk.qx1zf8.0-qmf@smtp.gmail.com>
Message-ID: <de1017ec-d437-4c61-9f9c-7d237eee8fcb@erg.abdn.ac.uk>
Date: Fri, 30 Jul 2021 12:23:40 +0100
In-Reply-To: <eg5mzk.qx1zf8.0-qmf@smtp.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/LrErTnCiEwoZUqYclTcQxpbpBQI>

On 30/07/2021 11:21, Jonathan Morton wrote:
> On Friday, 30 July 2021, Gorry (erg) wrote:
>> I am not sure, though. I think many CC-related topics can have potential collateral damage, and we have managed to deploy these gradually and improve the transport where necessary. So, what is different here to exploring methods such as larger Initial Window, BBR, Hystart, etc.?
> Hystart makes a transport strictly less aggressive, by exiting the exponential growth phase early and continuing with much slower linear or polynomial growth.  There is no possibility of collateral damage except in case of implementation bugs.  This is an excellent use case for canary testing, and I would endorse its use with Hystart++.
>
> Large IWs do have potential for collateral damage, but the conditions that would trigger it (small buffers) result in effects (high loss to IW) that are easily noticed by the transport employing it.  This is therefore also suitable for canary testing.
>
> BBR is not so clear a case.  I am pleased that Google is taking a relatively cautious approach to deploying it in an Internet-facing context, has designed it with standards-track CC coexistence in mind, and seeks to improve it when problems are reported and verified.  However, the ECN response introduced with BBRv2 is not standards-track compatible, and the likely collateral damage when operating in a shared AQM bottleneck is not easily noticed by the BBR transport itself, especially when those circumstances arise only infrequently.  Canary testing is thus an incomplete solution there.
>
> The entire problem here is that L4S is likely to cause *externalised* collateral damage which is not easily noticed by the L4S transport itself.  Unless great care is taken to watch for such problems, canary testing will therefore fail to find them.
>
> What is more, canary testing has as a prerequisite the confidence of correct operation and design gained through lab testing.  Lab testing of L4S has not given any of that confidence, thus progression to canary testing would be inappropriate.
>
>   - Jonathan Morton

I was responding to a request to provide references to canary approaches.

Gorry
