Re: [ippm] How should capacity measurement interact with shaping?

"MORTON, ALFRED C (AL)" <acm@research.att.com> Thu, 19 September 2019 22:35 UTC

From: "MORTON, ALFRED C (AL)" <acm@research.att.com>
To: Matt Mathis <mattmathis@google.com>, "Ruediger.Geib@telekom.de" <Ruediger.Geib@telekom.de>
CC: "ippm@ietf.org" <ippm@ietf.org>, "CIAVATTONE, LEN" <lc9892@att.com>
Date: Thu, 19 Sep 2019 22:34:52 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/C9x-pRWh7viWdRS3YrvpZZVJ3vI>
Subject: Re: [ippm] How should capacity measurement interact with shaping?

Thanks Matt!  This is an interesting trace to consider,
and an important discussion to share with the group.

When I look at the equation for BBR:
https://cacm.acm.org/magazines/2017/2/212428-bbr-congestion-based-congestion-control/fulltext

both BBR and the Maximum IP-Layer Capacity Metric seek the
max over some time interval. The window seems smaller for
BBR (6 to 10 RTTs), whereas we’ve been using parameters that
produce a rate measurement once a second and take the max
of the 10 one-second measurements.
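
The max-of-ten-one-second-measurements parameterization above can be
sketched in a few lines. This is a hypothetical illustration, not code
from the draft:

```python
# Hypothetical sketch (not the draft's reference code): bin per-packet
# (arrival_time, bytes) samples into 1-second intervals and report the
# max of the ten 1-second IP-layer rates, as parameterized above.

def max_ip_capacity_mbps(samples, test_duration_s=10):
    """samples: iterable of (arrival_time_s, ip_bytes); times in [0, duration)."""
    bins = [0] * test_duration_s
    for t, nbytes in samples:
        idx = int(t)
        if 0 <= idx < test_duration_s:
            bins[idx] += nbytes
    # bytes delivered per one-second bin -> Mbit/s; the metric is the max bin
    return max(b * 8 / 1e6 for b in bins)
```

With a trace shaped like the one Matt describes (94.5 Mb/s for 4 s,
~75 Mb/s for 1 s, 83 Mb/s after), this reports 94.5 Mb/s: the max
simply lands on the burst.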

We also evaluate several performance metrics when
adjusting load, and those determine how high the sending
rate will go (based on feedback from the receiver).
https://tools.ietf.org/html/draft-morton-ippm-capcity-metric-method-00#section-4.3

So, the MAX delivered rate for the 10-second test, as we
can all see, is 94.5 Mbps. This rate was sustained for more
than a trivial amount of time, too. But if you are concerned that this
rate was somehow inflated by a large buffer and a large
burst tolerance in the shaper – that’s where the additional
metrics and the slightly different sending-rate control
that we described in the draft (and the slides) might help.
https://datatracker.ietf.org/meeting/105/materials/slides-105-ippm-metrics-and-methods-for-ip-capacity-00

IOW, it might well be that the Max IP Capacity, measured as we designed
and parameterized it, is 83 Mbps for this path
(assuming the 94.5 Mbps results from a big overshoot at the sender;
the fluctuating performance afterward seems to support that).

When I was looking for background on BBR, I saw a paper comparing
BBR and CUBIC during drive tests.
http://web.cs.wpi.edu/~claypool/papers/driving-bbr/
One pair of plots seemed to indicate that BBR sent lots of bytes
early on and grew the RTT pretty high before settling down
(Figure 5, a & b).
This looks a bit like the case you described below,
except that your 94.5 Mbps is a received rate – in the drive
test we don’t know what came out of the network, just what
went in and filled a buffer before crashing down.

So, I think I did more investigation than justification
for my answers, but I conclude that parameters like the
individual measurement intervals and the overall time interval
from which the max is drawn, plus the rate-control algorithm
itself, play a big role here.

regards,
Al


From: Matt Mathis [mailto:mattmathis@google.com]
Sent: Thursday, September 19, 2019 5:18 PM
To: MORTON, ALFRED C (AL) <acm@research.att.com>; Ruediger.Geib@telekom.de
Cc: ippm@ietf.org
Subject: Fwd: How should capacity measurement interact with shaping?

Ok, moving the thread to IPPM

Some background, we (Measurement Lab) are testing a new transport (TCP) performance measurement tool, based on BBR-TCP.   I'm not ready to talk about results yet (well ok, it looks pretty good).    (BTW the BBR algorithm just happens to resemble the algorithm described in draft-morton-ippm-capcity-metric-method-00.)

Anyhow, we noticed some interesting performance features for a number of ISPs in the US and Europe, and I wanted to get some input on how these cases should be treated.

One data point: a single trace saw ~94.5 Mbit/s for ~4 seconds, fluctuating performance around ~75 Mb/s for ~1 second, and then stable performance at ~83 Mb/s for the rest of the 10-second test.    If I were to guess, this is probably a policer (shaper?) with a 1 MB token bucket and an ~83 Mb/s token rate (these numbers are not corrected for header overheads, which actually matter with this tool).  What is weird about it is that different ingress interfaces to the ISP (peers or serving locations) exhibit different parameters.
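
As a rough check on the token-bucket guess, the burst-sustain time is
just bucket depth divided by the excess rate. The toy model below uses
the numbers from the trace; all of them are assumptions, uncorrected
for header overheads:

```python
# Toy token-bucket arithmetic for the policer guess above; every number
# here is an assumption taken from the trace, uncorrected for overheads.

def sustain_time_s(bucket_bits, token_rate_bps, offered_rate_bps):
    """How long a full bucket sustains an offered rate above the token rate."""
    excess_bps = offered_rate_bps - token_rate_bps
    return float('inf') if excess_bps <= 0 else bucket_bits / excess_bps

def implied_bucket_bits(burst_duration_s, token_rate_bps, offered_rate_bps):
    """Bucket depth implied by sustaining the offered rate for a given time."""
    return (offered_rate_bps - token_rate_bps) * burst_duration_s

# A 1 MB (8 Mbit) bucket at an 83 Mb/s token rate admits 94.5 Mb/s for
# only about 0.7 s; the observed ~4 s burst would imply a deeper bucket,
# roughly 11.5 Mb/s * 4 s = 46 Mbit (~5.75 MB), before any overhead
# corrections -- which may be part of why the headers matter here.
```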

Now the IPPM measurement question:   Is the bulk transport capacity of this link ~94.5 Mbit/s or ~83 Mb/s?   Justify your answer...

Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured:
            too strong would be hypocritical and risks spiraling out of control;
            too weak risks being mistaken for tacit approval.

Forwarded Conversation
Subject: How should capacity measurement interact with shaping?
------------------------

From: Matt Mathis <mattmathis@google.com>
Date: Thu, Aug 15, 2019 at 8:55 AM
To: MORTON, ALFRED C (AL) <acm@research.att.com>

We are seeing shapers with huge bucket sizes, perhaps as large as or larger than 100 MB.

These are prohibitive to test by default, but can have a huge impact in some common situations.  E.g. downloading software updates.

An unconditional pass is not good, because some buckets are small.  What counts as large enough to be ok, and what "derating" is ok?

Thanks,
--MM--

----------
From: MORTON, ALFRED C (AL) <acm@research.att.com>
Date: Mon, Aug 19, 2019 at 5:08 AM
To: Matt Mathis <mattmathis@google.com>
Cc: CIAVATTONE, LEN <lc9892@att.com>, Ruediger.Geib@telekom.de

Hi Matt, currently cruising between Crete and Malta,
with about 7 days of vacation remaining – adding my friend Len.
You know Rüdiger. It appears I’ve forgotten how to type in 2 weeks,
given the number of typos I’ve fixed so far...

We’ve seen big buffers on a basic DOCSIS cable service (downlink >2 sec),
but:

  *   we have 1-way delay variation or RTT variation limits when
      searching for the max rate, so that not many packets queue
      in the buffer;

  *   we want the status messages that result in rate adjustment to
      return in a reasonable amount of time (50 ms + RTT);

  *   we usually search for 10 seconds, but if we go back and test with
      a fixed rate, we can see the buffer growing if the rate is too high.

There will eventually be a discussion of the thresholds we use
in the search / load-rate-control algorithm. The copy of
Y.1540 I sent you has a simple one; we have moved beyond that now
(see the slides I didn’t get to present at IETF).
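
A minimal sketch of the delay-variation-limited search idea described
above; the 30 ms limit and the step multipliers are invented for
illustration and are not the draft's actual values:

```python
# Hypothetical sketch of a delay-variation-limited rate search; the
# threshold (30 ms) and step multipliers are assumed, not from the draft.

def next_rate_mbps(current_mbps, rtt_samples_ms,
                   delay_var_limit_ms=30.0, step_up=1.1, step_down=0.7):
    """One search step: back off if delay variation signals a growing queue."""
    delay_var_ms = max(rtt_samples_ms) - min(rtt_samples_ms)
    if delay_var_ms > delay_var_limit_ms:
        return current_mbps * step_down   # packets are queuing: reduce load
    return current_mbps * step_up         # no queue signal: probe higher
```

The point of the limit is exactly the behavior described above: the
search never lets many packets sit in the buffer, so even a multi-second
DOCSIS buffer does not inflate the result.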

  There is value in having some of this discussion on IPPM-list,
  so we get some *agenda time at IETF-106*

We measure rate and performance, with some performance limits
built in.  Pass/Fail is another step, and de-rating too (it made
sense with MBM “target_rate”).

Al

----------
From: <Ruediger.Geib@telekom.de>
Date: Mon, Aug 26, 2019 at 12:05 AM
To: <acm@research.att.com>
Cc: <lc9892@att.com>, <mattmathis@google.com>

Hi Al,

thanks for keeping me involved. I don’t have a precise answer, and I doubt there will be a single universal truth.

If the aim is only to determine the IP bandwidth of an access, then we aren’t interested in filling a buffer. Buffering events may occur; some are useful and to be expected, whereas others are not desired:


  *   Sender shaping behavior may matter (is traffic at the source CBR, or is it bursty?)
  *   Random collisions should be tolerated at the access whose bandwidth is to be measured.
  *   Limiting packet drop due to buffer overflow is a design aim, or an important part of the algorithm, I think.
  *   Shared media might create bursts. I’m not an expert in the area, but in some cases there is an “is bandwidth available” check between a central sender using a shared medium and the connected receivers. WiFi, and maybe other wireless equipment, also buffers packets to optimize the use of wireless resources.
  *   It might be an idea to mark some flows with ECN once there is a guess at a sending bitrate at which to expect no or very little packet drop. Today, this is experimental. CE marks from an ECN-capable device should be expected roughly once queuing starts.

Practically, the set-up should be configurable with commodity hard- and software, and all metrics should be measurable at the receiver. Burstiness of the traffic, and a distinction between queuing events which are to be expected and (undesired) queue build-up, are what must be told apart. I hope that can be done with commodity hard- and software. I, at least, am not able to write down a simple metric distinguishing queues that are to be expected from (undesired) queue build-up causing congestion. The hard- and software to be used should be part of the solution, not part of the problem (bursty source traffic, and timestamps with insufficient accuracy to detect queues, are what I’d like to avoid).
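
One possible heuristic for the distinction Rüdiger asks about (an
assumed sketch, not something proposed in the thread) is to look at the
trend of one-way delay within a window: a transient burst moves delay
up and back down, while undesired queue build-up shows a persistent
upward slope:

```python
# Assumed heuristic (not from the thread): classify a window of one-way-
# delay samples by the slope of a least-squares fit. A transient burst
# raises delay briefly; sustained build-up shows a lasting upward trend.

def owd_slope_ms_per_s(times_s, owd_ms):
    """Least-squares slope of one-way delay versus time, in ms per second."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_d = sum(owd_ms) / n
    num = sum((t - mean_t) * (d - mean_d) for t, d in zip(times_s, owd_ms))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return num / den

def queue_building(times_s, owd_ms, slope_limit_ms_per_s=5.0):
    """True when delay grows persistently faster than the (assumed) limit."""
    return owd_slope_ms_per_s(times_s, owd_ms) > slope_limit_ms_per_s
```

This still depends on timestamps accurate enough to see the queue, which
is exactly the commodity hard- and software caveat above.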

I’d suggest moving the discussion to the list.

Regards,

Rüdiger

----------
From: MORTON, ALFRED C (AL) <acm@research.att.com>
Date: Thu, Sep 19, 2019 at 7:01 AM
To: Ruediger.Geib@telekom.de
Cc: CIAVATTONE, LEN <lc9892@att.com>, mattmathis@google.com

I’m catching-up with this thread again, but before I reply:

*** Any objection to moving this discussion to IPPM-list ?? ***

@Matt – this is a question to you at this point...

thanks,
Al
