Re: [ippm] How should capacity measurement interact with shaping?

<Ruediger.Geib@telekom.de> Mon, 23 September 2019 08:20 UTC

From: Ruediger.Geib@telekom.de
To: mattmathis@google.com
CC: ippm@ietf.org, acm@research.att.com
Date: Mon, 23 Sep 2019 08:19:52 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/qHsOM0MWHLeCj6diuwJEDZPwgiU>

Hi Matt,

Thanks. My expertise is mostly in fixed network access; I don’t know how access bandwidth is shared between LTE flows on a mobile access.

On the fixed network side, I’d expect a policy control point to be connected to some access line concentrator (optical or copper based; I can’t speak for coax). To my understanding, the configuration parameters that have an impact at the fixed access are:

  *   The access shaper rate (settable)
  *   The access shaper burst size (likely a default value, depending on the physical line card bandwidth)
  *   The queue depth (settable)
  *   AQM parameters (settable to some extent, if deployed)

A burst no larger than the shaper burst size, paced so that the shaper can replenish its tokens between bursts, will likely see line speed at that bottleneck. That should hold in general.

The shaper kicks in if more than the default shaper burst size worth of packets arrives within the given time frame. The queue limit then comes into play if packets keep arriving faster than the access shaper rate. If the latency added by the growing queue throttles the measurement flow fast enough, loss should be avoided. In that case, I’d expect you to be able to measure the shaper rate.

While I doubt that this matches what you’ve observed, it may still help to design a test that measures either available bandwidth or access bandwidth in the presence of a shaper:
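The interplay of shaper rate, burst size and queue limit described above can be illustrated with a minimal packet-level sketch. It is not taken from any vendor implementation: the function name `shape`, the fixed packet size, the initially full bucket and the assumption that line speed is effectively infinite compared to the shaper rate are all simplifications made for illustration.

```python
from collections import deque

def shape(arrivals, rate_bps, burst_bytes, queue_limit_bytes, pkt_bytes=1500):
    """Return one departure time per packet, or None if the packet is dropped.

    Assumptions (not from the thread): fixed packet size, a bucket that starts
    full, and a line speed so much higher than the shaper rate that a
    conforming packet leaves at its arrival time.
    """
    fill = rate_bps / 8.0        # token refill rate in bytes per second
    tokens = float(burst_bytes)  # tokens left right after the last departure
    clock = 0.0                  # time of the last departure
    queued = deque()             # departure times of packets still waiting
    departures = []
    for t in arrivals:           # arrivals must be sorted in time
        while queued and queued[0] <= t:
            queued.popleft()     # these packets have left the queue by now
        start = max(t, clock)    # FIFO: never overtake the previous packet
        avail = min(float(burst_bytes), tokens + (start - clock) * fill)
        wait = max(0.0, (pkt_bytes - avail) / fill)
        depart = start + wait
        if depart > t and (len(queued) + 1) * pkt_bytes > queue_limit_bytes:
            departures.append(None)  # queue limit exceeded: tail drop
            continue
        tokens = max(0.0, avail - pkt_bytes)
        clock = depart
        if depart > t:
            queued.append(depart)
        departures.append(depart)
    return departures

# Hypothetical numbers: 10 Mbit/s shaper, 10-packet bucket, 10-packet queue.
burst = shape([0.0] * 30, 10e6, 15000, 15000)
print("passed at once:", sum(1 for d in burst if d == 0.0))
print("queued:", sum(1 for d in burst if d is not None and d > 0.0))
print("dropped:", sum(1 for d in burst if d is None))
```

With these hypothetical values, a 30-packet back-to-back burst splits into three groups: packets covered by the bucket, packets delayed in the queue, and packets lost once the queue limit is reached. A stream paced at the shaper rate passes without any added delay.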

  *   An initial burst may pass a shaper at line speed.
  *   A measurement stream of CBR-spaced packets, sent at up to the shaper rate and consisting of more than “default shaper burst size” packets, should see no added delay and be forwarded at the shaper rate. (A stream of up to “default shaper burst size” packets at that rate would of course pass too, but then the measurement doesn’t show a sustainable rate if enough tokens are added between separate measurement streams.)
  *   If the CBR-spaced measurement flow sends more than “default shaper burst size” packets at more than the shaper rate, a queue should build at the shaper.
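The three cases in this list can be told apart with a simple fluid estimate. The sketch below is only an approximation under stated assumptions: the bucket is full when the stream starts, packets have a fixed size, and the names `backlog_after_stream` and `regime` as well as the example numbers are hypothetical.

```python
def backlog_after_stream(n_pkts, send_rate_bps, shaper_rate_bps,
                         burst_bytes, pkt_bytes=1500):
    """Fluid estimate of the bytes queued at the shaper when the stream ends."""
    sent = n_pkts * pkt_bytes
    duration = sent * 8 / send_rate_bps            # seconds the sender is active
    tokens = burst_bytes + shaper_rate_bps / 8 * duration
    return max(0.0, sent - tokens)

def regime(n_pkts, send_rate_bps, shaper_rate_bps, burst_bytes, pkt_bytes=1500):
    """Classify a CBR stream against an (assumed) single token bucket shaper."""
    if n_pkts * pkt_bytes <= burst_bytes:
        return "burst fits the bucket, may pass at line speed"
    if send_rate_bps <= shaper_rate_bps:
        return "longer than the bucket, at or below shaper rate: no added delay"
    if backlog_after_stream(n_pkts, send_rate_bps, shaper_rate_bps,
                            burst_bytes, pkt_bytes) > 0:
        return "longer than the bucket, above shaper rate: queue builds"
    return "above shaper rate, but short enough for the bucket to absorb"

# Hypothetical shaper: 50 Mbit/s with a 125000 byte bucket.
for n, rate in ((50, 1e9), (1000, 50e6), (1000, 100e6)):
    print(n, "packets at", rate / 1e6, "Mbit/s ->", regime(n, rate, 50e6, 125000))
```

The fourth return value covers a case the list does not name explicitly: a stream faster than the shaper rate that ends before the bucket is empty, which a receiver could mistake for a higher sustainable rate.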

You mentioned a token bucket of 1 MB. I’d expect the shaper default burst size to depend on the physical line card bandwidth of the interface where the shaper is configured. I’m not sure whether to expect a single constant value across different implementations. As a strawman, I’d like to describe the default shaper burst size by a “1ms@phys-interface-bandwidth” token bucket. Interfaces may be 1GE, 10GE or 100GE. I would not be surprised, however, if different generations of line cards from the same vendor support different values for the same physical port bandwidth, and different vendors may have different design philosophies.
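To put numbers on the “1ms@phys-interface-bandwidth” strawman, the following sketch converts it into bucket sizes for the three interface types mentioned. The helper name `strawman_burst_bytes` is hypothetical, and nominal interface rates (1, 10 and 100 Gbit/s) are assumed.

```python
def strawman_burst_bytes(interface_bps, window_ms=1):
    """Bytes an interface can send in `window_ms` milliseconds at full rate."""
    return interface_bps * window_ms // 8000   # bits/s * ms / (8 bits * 1000 ms/s)

for name, bps in (("1GE", 10**9), ("10GE", 10**10), ("100GE", 10**11)):
    print(name, strawman_burst_bytes(bps), "bytes")
```

Under this strawman the bucket is 125000 bytes on 1GE, 1250000 bytes on 10GE and 12500000 bytes on 100GE, so the 10GE value lands in the same order of magnitude as a 1 MB bucket.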

Regards,

Ruediger



From: Matt Mathis <mattmathis@google.com>
Sent: Thursday, 19 September 2019 23:18
To: ALFRED C (AL) <acm@research.att.com>; Geib, Rüdiger <Ruediger.Geib@telekom.de>
Cc: ippm@ietf.org
Subject: Fwd: How should capacity measurement interact with shaping?

Ok, moving the thread to IPPM

Some background: we (Measurement Lab) are testing a new transport (TCP) performance measurement tool, based on BBR-TCP. I'm not ready to talk about results yet (well ok, it looks pretty good). (BTW the BBR algorithm just happens to resemble the algorithm described in draft-morton-ippm-capacity-metric-method-00.)

Anyhow, we noticed some interesting performance features for a number of ISPs in the US and Europe, and I wanted to get some input on how these cases should be treated.

One data point: a single trace saw ~94.5 Mbit/s for ~4 seconds, fluctuating performance around ~75 Mbit/s for ~1 second, and then stable performance at ~83 Mbit/s for the rest of the 10 second test. If I were to guess, this is probably a policer (shaper?) with a 1 MB token bucket and a ~83 Mbit/s token rate (these numbers are not corrected for header overheads, which actually matter with this tool). What is weird about it is that different ingress interfaces to the ISP (peers or serving locations) exhibit different parameters.

Now the IPPM measurement question: Is the bulk transport capacity of this link ~94.5 Mbit/s or ~83 Mbit/s? Justify your answer....?
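A rough way to relate bucket size, token rate and the length of the initial high-rate phase is the idealized relation B = (r_in - r_tok) * T / 8 for a single token bucket that starts full. The sketch below applies it to the figures quoted above; the function names are hypothetical, 1 MB is taken as 10^6 bytes, and the rates are used as quoted, i.e. without header-overhead correction.

```python
def drain_time_s(bucket_bytes, offered_bps, token_bps):
    """Seconds until a full bucket is empty when traffic exceeds the token rate."""
    if offered_bps <= token_bps:
        return float("inf")                      # the bucket never runs dry
    return bucket_bytes * 8 / (offered_bps - token_bps)

def implied_bucket_bytes(burst_seconds, offered_bps, token_bps):
    """Bucket size that would sustain `offered_bps` for `burst_seconds`."""
    return (offered_bps - token_bps) * burst_seconds / 8

print(round(drain_time_s(1_000_000, 94.5e6, 83e6), 2), "s for a 1 MB bucket")
print(implied_bucket_bytes(4, 94.5e6, 83e6) / 1e6, "MB for a 4 s phase")
```

In this simple model a 1 MB bucket lasts about 0.7 s at these rates, while a 4 s phase would correspond to roughly 5.75 MB. A real policer or shaper may count tokens differently (for example including lower-layer overhead), so these values are only for orientation.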

Thanks,
--MM--
The best way to predict the future is to create it.  - Alan Kay

We must not tolerate intolerance;
       however our response must be carefully measured:
            too strong would be hypocritical and risks spiraling out of control;
            too weak risks being mistaken for tacit approval.

Forwarded Conversation
Subject: How should capacity measurement interact with shaping?
------------------------

From: Matt Mathis <mattmathis@google.com>
Date: Thu, Aug 15, 2019 at 8:55 AM
To: MORTON, ALFRED C (AL) <acm@research.att.com>

We are seeing shapers with huge bucket sizes, perhaps as large as or larger than 100 MB.

These are prohibitive to test by default, but can have a huge impact in some common situations, e.g. downloading software updates.

An unconditional pass is not good, because some buckets are small.  What counts as large enough to be ok, and what "derating" is ok?

Thanks,
--MM--

----------
From: MORTON, ALFRED C (AL) <acm@research.att.com>
Date: Mon, Aug 19, 2019 at 5:08 AM
To: Matt Mathis <mattmathis@google.com>
Cc: CIAVATTONE, LEN <lc9892@att.com>, <Ruediger.Geib@telekom.de>

Hi Matt, currently cruising between Crete and Malta,
with about 7 days of vacation remaining – Adding my friend Len.
You know Rüdiger. It appears I’ve forgotten how to typs in 2 weeks
given the number of typos I’ve fixed so far...

We’ve seen big buffers on a basic DOCSIS cable service (downlink >2 sec)
but,
  we have 1-way delay variation or RTT variation limits
  when searching for the max rate, that don’t let many packets
  queue in the buffer

  we want the status messages that result in rate adjustment to return
  in a reasonable amount of time (50ms + RTT)

  we usually search for 10 seconds, but if we go back and test with
  a fixed rate, we can see the buffer growing if the rate is too high.

  There will eventually be a discussion on the thresholds we use
  in the search // load rate control algorithm. The copy of
  Y.1540 I sent you has a simple one, we have moved beyond that now
  (see the slides I didn’t get to present at IETF).

  There is value in having some of this discussion on IPPM-list,
  so we get some *agenda time at IETF-106*

We measure rate and performance, with some performance limits
built-in.  Pass/Fail is another step, de-rating too (made sense
with MBM “target_rate”).

Al

----------
From: <Ruediger.Geib@telekom.de>
Date: Mon, Aug 26, 2019 at 12:05 AM
To: <acm@research.att.com>
Cc: <lc9892@att.com>, <mattmathis@google.com>

Hi Al,

thanks for keeping me involved. I don’t have a precise answer and doubt that there will be a single universal truth.

If the aim is only to determine the IP bandwidth of an access, then we aren’t interested in filling a buffer. Buffering events may occur, some of which are useful and to be expected, whereas others are not desired:


  *   Sender shaping behavior may matter (is traffic at the source CBR or is it bursty)
  *   Random collisions should be tolerated at the access whose bandwidth is to be measured.
  *   Limiting packet drop due to buffer overflow is a design aim or an important part of the algorithm, I think.
  *   Shared media might create bursts. I’m not an expert in the area, but in some cases there’s an “is bandwidth available” check between a central sender using a shared medium and the connected receivers. WiFi and maybe other wireless equipment also buffers packets to optimize the use of wireless resources.
  *   It might be an idea to use ECN on some flows once there’s an estimate of a sending bitrate at which no or very little packet drop is expected. Today, this is experimental. CE marks from an ECN-capable device should be expected roughly once queuing starts.

Practically, the set-up should be configurable with commodity hardware and software, and all metrics should be measurable at the receiver. What needs to be distinguished is the burstiness of traffic, and queuing events that are to be expected versus (undesired) queue build-up. I hope that can be done with commodity hardware and software. I, at least, am not able to write down a simple metric distinguishing expected queues from (undesired) queue build-up causing congestion. The hardware and software used should be part of the solution, not part of the problem (bursty source traffic and timestamps with insufficient accuracy to detect queues are what I’d like to avoid).

I’d suggest moving the discussion to the list.

Regards,

Rüdiger

----------
From: MORTON, ALFRED C (AL) <acm@research.att.com>
Date: Thu, Sep 19, 2019 at 7:01 AM
To: <Ruediger.Geib@telekom.de>
Cc: CIAVATTONE, LEN <lc9892@att.com>, <mattmathis@google.com>

I’m catching-up with this thread again, but before I reply:

*** Any objection to moving this discussion to IPPM-list ?? ***

@Matt – this is a question to you at this point...

thanks,
Al
