Re: [tsvwg] thoughts on operational queue settings (Re: CC/bleaching thoughts for draft-ietf-tsvwg-le-phb-04)

Date: Thu, 12 Apr 2018 10:05:15 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: Mikael Abrahamsson <swmike@swm.pp.se>
Cc: Brian E Carpenter <brian.e.carpenter@gmail.com>, tsvwg@ietf.org

Thanks Mikael, interesting data.

We have done some fairly advanced work in the past with different
drop profiles in single queues (see e.g. draft-lai-tsvwg-normalizer),
and I came out of that experience thinking that these drop profiles
are very much magic-parameter based tools: results vary widely with
the congestion control in use and the short-term burstiness of the
flows. RED profiles will also not meet the requirement currently
written into the draft to not send LE packets while non-LE packets
are queued; to really achieve that, you would need strict priority
queuing with a lowest-priority queue for LE, which I think is very
uncommon on routers.
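
To make that requirement concrete, here is a minimal Python sketch
(hypothetical names, obviously not a router API) of the strict-priority
rule: an LE packet leaves only when no non-LE packet is queued.

  from collections import deque

  queues = {"non_le": deque(), "le": deque()}

  def enqueue(pkt, is_le=False):
      queues["le" if is_le else "non_le"].append(pkt)

  def dequeue():
      if queues["non_le"]:               # any non-LE packet waiting?
          return queues["non_le"].popleft()
      if queues["le"]:                   # LE only gets truly idle slots
          return queues["le"].popleft()
      return None                        # link is idle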

I would hope that today we could get a separate WFQ queue for LE
on all platforms, which looks to me like the much easier and more
predictable solution. If you tell me that's not the case, my
interest in delving into RED profiles again will rise, *sigh*.
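
If it helps, here is a minimal deficit-round-robin sketch (DRR being
a common WFQ approximation; the names and the 90/10 quanta are made
up for illustration) of what such a separate LE queue buys you: a
small guaranteed share for LE instead of a tuned drop curve.

  from collections import deque

  # Quanta are bytes each queue may send per round, here roughly a
  # 90/10 split between BE and LE. Packets are modeled as byte strings.
  QUANTUM = {"be": 9000, "le": 1000}
  queues  = {"be": deque(), "le": deque()}
  deficit = {"be": 0, "le": 0}

  def drr_round(send):
      for q in ("be", "le"):
          if not queues[q]:
              deficit[q] = 0             # idle queues accumulate no credit
              continue
          deficit[q] += QUANTUM[q]
          while queues[q] and len(queues[q][0]) <= deficit[q]:
              pkt = queues[q].popleft()
              deficit[q] -= len(pkt)
              send(pkt)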

Wrt giving LE more than a 0% bandwidth share: in another mail I
made the argument that the percentage you assign becomes yet
another magic parameter where it's unclear what the best value is,
and in the worst case we'd figure out it needs to be consistent
across paths.

Therefore IMHO it would be better if we could persuade apps to use
BE/LE in a hybrid fashion: preferring LE, but resorting to BE to
guarantee meeting deadlines - especially since LE may give you
actually zero bandwidth for longer periods. That way we keep a very
simple, non-magic parameter in the network: LE really is only
leftover capacity.
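
As a rough Python sketch of such a hybrid app (the deadline logic and
the fallback_rate parameter are made up for illustration; it assumes
the stack lets you re-mark a live IPv4 connection via IP_TOS, with LE
as DSCP 1 as used elsewhere in this thread):

  import socket, time

  LE_DSCP, BE_DSCP = 1, 0

  def set_dscp(sock, dscp):
      # the DSCP sits in the upper six bits of the IP_TOS byte
      sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp << 2)

  def send_with_deadline(sock, data, deadline, fallback_rate):
      """deadline: absolute time.monotonic() timestamp;
      fallback_rate: bytes/s the app trusts BE to sustain."""
      set_dscp(sock, LE_DSCP)            # prefer LE: leftover capacity
      sent = 0
      while sent < len(data):
          remaining = deadline - time.monotonic()
          needed = (len(data) - sent) / max(remaining, 1e-9)
          if needed > fallback_rate:     # LE can no longer make the deadline
              set_dscp(sock, BE_DSCP)    # fall back to BE
          sent += sock.send(data[sent:sent + 65536])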

Admittedly this approach puts some more complexity into the app,
so I am not saying you're wrong; I am just saying that N > 0 makes
LE more complex for the network operator, and I wouldn't know a
simple recipe for how to optimize N.

And both approaches would of course be complementary.

Cheers
   Toerless

On Thu, Apr 12, 2018 at 08:39:25AM +0200, Mikael Abrahamsson wrote:
> On Thu, 12 Apr 2018, Brian E Carpenter wrote:
> 
> > BE and LE PHBs should talk about queueing and dropping behaviour, not
> > about capacity share, in any case. It's clear that on a congested link,
> > LE is sacrificed first - peak hour LE throughput might be precisely
> > zero, which is not acceptable for BE.
> 
> I have received questions from operational people asking for
> configuration examples of how to handle LE/BE etc., so I did some
> work in our lab to provide some kind of example.
> 
> So my first goal was to figure out something that does something
> reasonable on a platform that will only do DSCP-based RED (as this is
> typically what's available on platforms going back 15 years). This is
> not optimal, but at least it would be deployable on lots of platforms
> that are currently installed and moving packets for customers.
> 
> The test was performed with 30ms of RTT, 10 parallel TCP sessions per
> diffserv RED curve, and 800 megabit/s access speed (it's really gig,
> but my lab setup has some constraints that meant if I set it to gig I
> might get some uncontrolled packet loss from other equipment sitting
> on the same shared link, so I opted for 800 megabit/s as "close
> enough").
> 
> What I came up with - giving LE ~10% of the access bandwidth compared
> to BE, and a slight advantage to anything that is not BE/LE (the goal
> being to give that traffic a lossless experience) - was this:
> 
> This is a Cisco ASR9k which, without this RED configuration, will
> buffer packets for up to ~90 milliseconds, resulting in a 120ms RTT
> (30ms path RTT plus 90ms of bufferbloat).
> 
>  class class-default
>   shape average 800 mbps
>   random-detect dscp 1,8 1 ms 500 ms
>   random-detect dscp 0,2-7 5 ms 1000 ms
>   random-detect dscp 9-63 10 ms 1000 ms
> 
> This basically says: for LE and CS1, start dropping packets at 1ms of
> buffer fill. Since some applications use CS1 for scavenger traffic,
> it made sense to me to treat CS1 and LE the same.
> 
> For BE (which I made DSCP 0 and 2-7), start dropping packets at 5ms
> of buffer fill, i.e. less aggressively than for LE.
> 
> For the rest, don't start dropping packets until 10ms of buffer fill,
> giving it a slight advantage (the thought being that gaming traffic
> etc. should not see many drops, even though it will see some induced
> RTT because of BE traffic).
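> 
> For illustration, here is a rough Python sketch (my reconstruction,
> not the vendor's actual WRED algorithm, and the maximum drop
> probability is an assumption since the config doesn't set one) of the
> per-DSCP drop curves the configuration above expresses:
> 
>   # dscp -> (min_ms, max_ms): thresholds in ms of buffer fill,
>   # taken from the random-detect lines above.
>   CURVES = {1: (1, 500), 8: (1, 500)}        # LE and CS1
>   for d in (0, 2, 3, 4, 5, 6, 7):
>       CURVES[d] = (5, 1000)                  # BE
>   MAX_P = 1.0  # assumed drop probability at the max threshold
> 
>   def drop_probability(dscp, queue_ms):
>       lo, hi = CURVES.get(dscp, (10, 1000))  # DSCP 9-63: third curve
>       if queue_ms < lo:
>           return 0.0                         # below min: no drops
>       if queue_ms >= hi:
>           return MAX_P                       # past max: tail-drop region
>       return MAX_P * (queue_ms - lo) / (hi - lo)  # linear ramp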
> 
> This typically results in LE using approximately 30-50 megabit/s when
> there are 10 LE TCP sessions and 10 BE TCP sessions all trying to go
> full out. The BE sessions then get ~750 megabit/s. The added buffer
> delay is around 5-10ms, as that's where the BE sessions settle their
> bandwidth usage. The platform unfortunately doesn't support ECN
> marking.
> 
> If I were to spend queues on this traffic instead of using RED, I
> would do this differently. I will do more tests with lower speeds
> etc.; this was just initial testing for one use case, but also to
> give an example of what can be done on currently shipping platforms.
> I know there are much better ways of doing this, but I want this in
> networks NOW, not in 5-10 years. So the simpler the advice, the
> better the chance we get this into production networks.
> 
> I don't think it's a good idea to give CS1/LE no bandwidth at all;
> that might cause failure cases we can't predict. I prefer to give LE
> traffic a big disadvantage, so that it might only get 5-10% or so of
> the bandwidth when there is competing traffic.
> 
> I will do more testing, I have several typical platforms available to me
> that are in wide use.
> 
> -- 
> Mikael Abrahamsson    email: swmike@swm.pp.se

-- 
---
tte@cs.fau.de