Re: [tsvwg] Switch testing at 25G with ECN --SCE Draft

Gorry Fairhurst <gorry@erg.abdn.ac.uk> Wed, 21 August 2019 08:03 UTC

Message-ID: <5D5CFAD5.9050404@erg.abdn.ac.uk>
Date: Wed, 21 Aug 2019 09:03:33 +0100
From: Gorry Fairhurst <gorry@erg.abdn.ac.uk>
Reply-To: gorry@erg.abdn.ac.uk
Organization: University of Aberdeen
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:12.0) Gecko/20120428 Thunderbird/12.0.1
MIME-Version: 1.0
To: rgrimes@freebsd.org
CC: "Rodney W. Grimes" <freebsd@gndrsh.dnsmgr.net>, "Scaglione, Giuseppe" <giuseppe.scaglione@hpe.com>, "tsvwg@ietf.org" <tsvwg@ietf.org>
References: <201908131711.x7DHBZ8W017255@gndrsh.dnsmgr.net>
In-Reply-To: <201908131711.x7DHBZ8W017255@gndrsh.dnsmgr.net>
Content-Type: text/plain; charset="ISO-8859-1"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsvwg/fO337vUrfxLl8GRRLXlgg0dNkkc>
Subject: Re: [tsvwg] Switch testing at 25G with ECN --SCE Draft

Sorry - That helps explain a little.

My email wanted to note that I had seen results comparing the new 
methods with the TCP behaviour specified in RFC3168, whereas RFC8511 was 
the current specification for the transport behaviour for CE marking of 
ECT(0).  RFC8511 was published as an experimental RFC to collect
experience with deployment and confirm acceptable safety.
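
As a rough illustration of the difference between the two responses, here is a minimal Python sketch of the congestion window after one CE-marked round trip. It assumes a Reno-style sender, the multiplier of 0.8 that RFC8511 recommends for CE, and a floor of 2 segments; the function and constant names are made up for this example and do not come from any stack.

```python
# Illustrative sketch only: cwnd (in segments) after one CE-marked round trip.
# Names and the 2-segment floor are assumptions made for this example.

BETA_LOSS = 0.5  # RFC3168: react to CE as to a single loss (halve cwnd)
BETA_ECN = 0.8   # RFC8511 (ABE): recommended multiplier for a CE mark


def cwnd_after_ce(cwnd, abe=False, min_cwnd=2.0):
    """Return the reduced congestion window after a CE mark.

    abe=False models the RFC3168 response, abe=True the RFC8511 response.
    """
    beta = BETA_ECN if abe else BETA_LOSS
    return max(cwnd * beta, min_cwnd)


if __name__ == "__main__":
    for cwnd in (10.0, 100.0, 1000.0):
        print(cwnd, cwnd_after_ce(cwnd), cwnd_after_ce(cwnd, abe=True))
```

With shallow marking thresholds the smaller reduction leaves less of the link idle after each mark, which is the effect a comparison against RFC3168 alone would not show.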

Gorry


On 13/08/2019, 18:11, Rodney W. Grimes wrote:
> Gorry,
> I am not certain which "you" you are asking about RFC8511, but if it is
> the SCE developers, that is much of what this test was about: it looked
> at an SCE-type response (similar to RFC8511, but not the same) on very
> high-bandwidth, very short-delay paths.
>
> Within SCE we have tried different responses to ESCE in the sender,
> and at present I believe the published code is using a -sqrt(ackbytes)
> as the backoff, which is what these test results are from.
>
> The work in these areas is ongoing, and we have had several different
> functions here (the very first was simply -(0.5 * ackbytes), but it was
> found to be a bit over-responsive).
>
> As SCE gathers more data it is probable that adjustments to both
> the marking function and the response function are going to need
> to be made; in fact, iirc, there are 3 existing marking functions
> (codel, fq_codel, pie) and 2 new ones (LFQ, NCQ) being worked on.
>
> But it is unlikely until wide scale experimentation that a standard
> set of functions can be identified.  This test data shows probably
> the simplest of possible implementations, and it was pleasing
> to see positive test results without any iterative design process,
> or tweaking for this test.
>
> It is also possible that you are suggesting that the non-SCE
> tests might be repeated using an RFC8511 ABE-type response in the
> DCTCP used; I too believe that would be a meaningful data point.
>
>> It seems you are comparing with RFC3168, but have you also looked at the
>> updated response in RFC8511? If you are using shallow marking, this
>> would seem to give you the differentiation between buffer overflow
>> (loss) and queue building (CE) that you mention?
>>
>> Gorry
>>
>> On 12/08/2019, 16:28, Scaglione, Giuseppe wrote:
>>> Hi Greg,
>>>
>>> My humble opinion, based on the testing presented, is that with SCE remarking purely a function of the queue depth, the TCP stack has a notion of the congestion present in the TCP connection that is independent of the buffering capability of the switch queue. The TCP stack sees, for example, "5%" of packets remarked, which means that somewhere in the network some queue is seeing "some" period of oversubscription.
>>>
>>> With RFC3168 CE, by contrast, remarking typically only starts after X% of queue depth (user configured). That X is a function of the queue depth and the memory available, and if X is not 'tuned' you get either suboptimal link utilization or packet drops.
>>>
>>> Basically, it seems like having an 'early warning' marking makes the SW implementation of the TCP algo easier and more predictable, instead of relying on a switch configuration of X.
>>>
>>> Yes -- more testing is required.
>>>
>>> Best Regards,
>>> Giuseppe
>>>
>>> -----Original Message-----
>>> From: Greg White [mailto:g.white@CableLabs.com]
>>> Sent: Friday, August 9, 2019 2:54 PM
>>> To: Scaglione, Giuseppe<giuseppe.scaglione@hpe.com>; Jonathan Morton<chromatix99@gmail.com>
>>> Cc: rgrimes@freebsd.org; tsvwg@ietf.org
>>> Subject: Re: [tsvwg] Switch testing at 25G with ECN --SCE Draft
>>>
>>> Agreed.  I would not expect any different result either, which begs the question why does SCE need two different signals (ECT(1) and CE) in a datacenter environment?
>>>
>>> -Greg
>>>
>>>
>>>
>>> On 8/9/19, 3:43 PM, "Scaglione, Giuseppe"<giuseppe.scaglione@hpe.com>   wrote:
>>>
>>>       Greg,
>>>
>>>       We are working -- and in beta -- with having the switch natively set SCE bits instead of CE and removing the iptable configuration on the target server.  Yet, I do not expect any different result since the TCP-SCE stack would react the same.
>>>
>>>       Regards,
>>>       Giuseppe Scaglione
>>>
>>>
>>>       -----Original Message-----
>>>       From: Greg White [mailto:g.white@CableLabs.com]
>>>       Sent: Friday, August 9, 2019 2:36 PM
>>>       To: Jonathan Morton<chromatix99@gmail.com>; Scaglione, Giuseppe<giuseppe.scaglione@hpe.com>
>>>       Cc: rgrimes@freebsd.org; tsvwg@ietf.org
>>>       Subject: Re: [tsvwg] Switch testing at 25G with ECN --SCE Draft
>>>
>>>       Right.  Per the SCE method, the switch would mark ECT(1) using a ramp, and then when the ramp would exceed 100% marking, it would change to using CE.
>>>
>>>       The implementation discussed here marks CE using a ramp, and when the ramp would exceed 100% marking, it changes to packet drop.  This, as you said, is simply RFC3168 ECN marking, not SCE ECN marking.
>>>
>>>       I was just trying to set the record straight.  There was a claim made that a switch vendor had implemented SCE-style packet marking in hardware at 25Gbps, which wasn't accurate.
>>>
>>>       -Greg
>>>
>>>
>>>
>>>       On 8/9/19, 2:58 PM, "Jonathan Morton"<chromatix99@gmail.com>   wrote:
>>>
>>>           >   On 9 Aug, 2019, at 11:44 pm, Scaglione, Giuseppe<giuseppe.scaglione@hpe.com>   wrote:
>>>           >
>>>           >>>   Just to be super clear, this isn't a hardware implementation of SCE running at 25Gbps.
>>>           >
>>>           >   I am not sure I follow. The Test setup section of the paper clearly describes the hardware used -- servers, switch, cables.
>>>           >   What I cannot disclose at this point is the exact model and characteristics of the HPE Aruba Switch used. Yet, it is a "real" Ethernet Switch, providing 25Gbps connectivity, configured to do bridging across the four ports and implementing RFC3168 with the ECN remarking configuration described in my previous email and in the paper.
>>>
>>>           I think the issue is with the way the switch itself marks with CE, not ECT(1).  It's a limitation I think is worth acknowledging and, hopefully, finding a way to remove.
>>>
>>>            - Jonathan Morton
>>>
>>>
>>>
>>
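
The two marking behaviours contrasted in the quoted thread (a CE ramp that turns into drop, versus an ECT(1) ramp that turns into CE) can be sketched roughly as below. This is a simplified illustration, not the switch's or the SCE draft's algorithm: the linear ramp, the threshold parameters, and all function names are assumptions made for the example.

```python
# Simplified illustration of the two marking schemes discussed in the thread.
# The linear ramp and the min_th/max_th thresholds are assumptions, not
# taken from the switch configuration or from the SCE draft.


def ramp(queue_depth, min_th, max_th):
    """Marking probability rising linearly from 0 at min_th to 1 at max_th."""
    if queue_depth <= min_th:
        return 0.0
    if queue_depth >= max_th:
        return 1.0
    return (queue_depth - min_th) / (max_th - min_th)


def rfc3168_style(queue_depth, min_th, max_th):
    """CE marks on a ramp; once the ramp saturates, packets are dropped."""
    p = ramp(queue_depth, min_th, max_th)
    if p >= 1.0:
        return ("drop", 1.0)
    return ("CE", p)


def sce_style(queue_depth, min_th, max_th):
    """ECT(1) marks on a ramp; once the ramp saturates, switch to CE."""
    p = ramp(queue_depth, min_th, max_th)
    if p >= 1.0:
        return ("CE", 1.0)
    return ("ECT(1)", p)


if __name__ == "__main__":
    for depth in (10, 30, 50):
        print(depth, rfc3168_style(depth, 20, 40), sce_style(depth, 20, 40))
```

In this sketch the only difference is which signal the ramp carries and what happens at saturation, which is the distinction Greg draws between the tested switch behaviour and SCE-style marking.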