Re: [ippm] [Rpm] [tsvwg] [Starlink] [M-Lab-Discuss] misery metrics & consequences

rjmcmahon <rjmcmahon@rjmcmahon.com> Tue, 25 October 2022 15:17 UTC

MIME-Version: 1.0
Date: Tue, 25 Oct 2022 08:17:18 -0700
From: rjmcmahon <rjmcmahon@rjmcmahon.com>
To: Neal Cardwell <ncardwell@google.com>
Cc: Christoph Paasch <cpaasch=40apple.com@dmarc.ietf.org>, Dave Taht via Starlink <starlink@lists.bufferbloat.net>, tsvwg IETF list <tsvwg@ietf.org>, IETF IPPM WG <ippm@ietf.org>, Rpm <rpm@lists.bufferbloat.net>, Glenn Fishbine <glenn@breakingpointsolutions.com>, Measurement Analysis and Tools Working Group <mat-wg@ripe.net>, discuss <discuss@measurementlab.net>
In-Reply-To: <CADVnQy=uGOczVGtVQH3pjuD4wGrPq_ZBi-Otih-jUDZ8x0Ceog@mail.gmail.com>
References: <CAA93jw4w27a1EO_QQG7NNkih+C3QQde5=_7OqGeS9xy9nB6wkg@mail.gmail.com> <CABs+J_DhaLPp8nba=hqrg6Z_Db3DBH1__FymBsqEXgSDo+8-5w@mail.gmail.com> <339AB8BC-9628-40E2-9339-77FCFA74488D@gmx.de> <26F033E5-2490-414E-8DFF-4ACD27B74075@apple.com> <E45A11AA-3EAD-4BA7-9D29-B3FDCAC0B5FE@gmx.de> <CAF33DDA-8421-4284-9C22-C86771CA7DF3@apple.com> <CADVnQy=uGOczVGtVQH3pjuD4wGrPq_ZBi-Otih-jUDZ8x0Ceog@mail.gmail.com>
Message-ID: <24d2e4c89cd4c9d0a4736f37cefa1e42@rjmcmahon.com>
X-Sender: rjmcmahon@rjmcmahon.com
Content-Type: multipart/mixed; boundary="=_fd7680ba2f4540fb4a0d03955c45e35f"
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/Id0Wfu-bchf5-7eOmHiR631A2hM>
X-Mailman-Approved-At: Wed, 02 Nov 2022 09:43:48 -0700
Subject: Re: [ippm] [Rpm] [tsvwg] [Starlink] [M-Lab-Discuss] misery metrics & consequences

From an SPC (statistical process control) perspective, one sample per 
subgroup is typically insufficient, e.g. for Shewhart control charts. 
Below are some suggestions:

https://bookdown.org/lawson/an_introduction_to_acceptance_sampling_and_spc_with_r26/shewhart-control-charts-in-phase-i.html

o) Define the subgroup size: Initially, this is a constant number of 4 
or 5 items per subgroup, taken over a short enough interval of time 
that variation among them is due only to common causes.

o) Define the Subgroup Frequency: The subgroups collected should be 
spaced out in time, but collected often enough so that they can 
represent opportunities for the process to change.

o) Define the number of subgroups: Generally 25 or more subgroups are 
necessary to establish the characteristics of a stable process. If some 
subgroups are eliminated before calculating the revised control limits 
due to the discovery of assignable causes, additional subgroups may need 
to be collected so that there are at least 25 subgroups used in 
calculating the revised limits.

Then return the mean and variance, computed per the control chart tables 
for the chosen subgroup size. Also, keep in mind that subgrouping 
normalizes the samples (subgroup means tend toward a normal 
distribution), so information is lost if the underlying distribution is 
not normal. That's why we give the full histogram in iperf 2. One can 
compare it against a normal distribution.

https://en.wikipedia.org/wiki/Control_chart
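
As a rough illustration of the steps above, here is a minimal Python 
sketch that computes X-bar and R chart limits from equal-sized 
subgroups. The function name, the dictionary keys and the synthetic 
latency data are made up for this example and are not taken from iperf 2 
or any other tool; the constants are the usual published Shewhart 
values for subgroup sizes 4 and 5.

```python
import random
from statistics import mean

# Standard Shewhart X-bar/R chart constants, keyed by subgroup size n.
# Only the two sizes suggested above are tabulated in this sketch.
CHART_CONSTANTS = {
    4: {"A2": 0.729, "D3": 0.0, "D4": 2.282, "d2": 2.059},
    5: {"A2": 0.577, "D3": 0.0, "D4": 2.114, "d2": 2.326},
}
MIN_SUBGROUPS = 25  # rule of thumb from the suggestions above


def xbar_r_limits(subgroups):
    """Return center lines and control limits for X-bar and R charts.

    `subgroups` is a list of equal-length lists of measurements
    (hypothetical input format chosen for this sketch).
    """
    n = len(subgroups[0])
    if any(len(g) != n for g in subgroups):
        raise ValueError("all subgroups must have the same size")
    if n not in CHART_CONSTANTS:
        raise ValueError("only subgroup sizes 4 and 5 are tabulated here")
    c = CHART_CONSTANTS[n]

    xbars = [mean(g) for g in subgroups]           # subgroup means
    ranges = [max(g) - min(g) for g in subgroups]  # subgroup ranges
    grand_mean = mean(xbars)
    rbar = mean(ranges)

    return {
        "mean": grand_mean,
        "sigma_estimate": rbar / c["d2"],  # within-subgroup sigma
        "xbar_lcl": grand_mean - c["A2"] * rbar,
        "xbar_ucl": grand_mean + c["A2"] * rbar,
        "r_lcl": c["D3"] * rbar,
        "r_ucl": c["D4"] * rbar,
        "enough_subgroups": len(subgroups) >= MIN_SUBGROUPS,
    }


if __name__ == "__main__":
    # Synthetic latency samples in ms: 25 subgroups of 5 samples each.
    rng = random.Random(0)
    data = [[rng.gauss(20.0, 2.0) for _ in range(5)] for _ in range(25)]
    for key, value in xbar_r_limits(data).items():
        print(key, value)
```

The sketch only estimates limits; it does not remove subgroups with 
assignable causes or check the normality assumption, which is where the 
full histogram comes in.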

Bob

> On Mon, Oct 24, 2022 at 7:44 PM Christoph Paasch
> <cpaasch=40apple.com@dmarc.ietf.org> wrote:
> 
>> On Oct 24, 2022, at 1:57 PM, Sebastian Moeller <moeller0@gmx.de>
>> wrote:
>> Hi Christoph
>> 
>> On Oct 24, 2022, at 22:08, Christoph Paasch <cpaasch@apple.com>
>> wrote:
>> 
>> Hello Sebastian,
>> 
>> On Oct 23, 2022, at 4:57 AM, Sebastian Moeller via Starlink
>> <starlink@lists.bufferbloat.net> wrote:
>> 
>> Hi Glenn,
>> 
>> On Oct 23, 2022, at 02:17, Glenn Fishbine via Rpm
>> <rpm@lists.bufferbloat.net> wrote:
>> 
>> As a classic dyed-in-the-wool empiricist, granted that you can
>> identify "misery" factors, given a population of 1,000 users, how do
>> you propose deriving a misery index for that population?
>> 
>> We can measure download, upload, ping, jitter pretty much without
>> user intervention.  For the measurements you hypothesize, how do
>> you automatically extract those indices without subjective user
>> contamination?
>> 
>> I.e.  my download speed sucks. Measure the download speed.
>> 
>> My isp doesn't fix my problem. Measure what? How?
>> 
>> Human survey technology is 70+ years old and it still has problems
>> figuring out how to correlate opinion with fact.
>> 
>> Without an objective measurement scheme that doesn't require human
>> interaction, the misery index is a cool hypothesis with no way to
>> link to actual data.  What objective measurements can be made?
>> Answer that and the index becomes useful. Otherwise it's just
>> consumer whining.
>> 
>> Not trying to be combative here, in fact I like the concept you
>> support, but I'm hard pressed to see how the concept can lead to
>> data, and the data lead to policy proposals.
>> 
>> [SM] So it seems that outside of seemingly simple-to-test
>> throughput numbers*, the next most important quality number (or the
>> most important depending on subjective ranking) is how latency
>> changes under "load". Absolute latency is also important, albeit
>> static high latency can be worked around within limits, so the change
>> under load seems more relevant.
>> All of flent's RRUL test, Apple's networkQuality/RPM, and iperf2's
>> bounceback test offer methods to assess latency change under load**,
>> as do Waveform's bufferbloat tests and even to a degree Ookla's
>> speedtest.net [1]. IMHO something like latency increase under load
>> or Apple's responsiveness measure RPM (basically the inverse of the
>> latency under load calculated on a per-minute basis, so it scales in
>> the typical higher-numbers-are-better way, unlike raw latency under
>> load numbers where smaller is better).
>> IMHO what networkQuality is missing ATM is to measure and report
>> the unloaded RPM as well as the loaded: the first gives a measure
>> of the static latency, the second of how well things keep working
>> if capacity gets tight. They report the base RTT, which can be
>> converted to RPM. As an example:
>> 
>> macbook:~ user$ networkQuality -v
>> ==== SUMMARY ====
>> 
>> Upload capacity: 24.341 Mbps
>> Download capacity: 91.951 Mbps
>> Upload flows: 20
>> Download flows: 16
>> Responsiveness: High (2123 RPM)
>> Base RTT: 16
>> Start: 10/23/22, 13:44:39
>> End: 10/23/22, 13:44:53
>> OS Version: Version 12.6 (Build 21G115)
> 
> You should update to latest macOS:
> 
> $ networkQuality
> ==== SUMMARY ====
> Uplink capacity: 326.789 Mbps
> Downlink capacity: 446.359 Mbps
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
> 
> ;-)
> 
>  [SM] I wish... just updated to the latest and greatest for this
> hardware (A1398):
> 
> macbook-pro:DPZ smoeller$ networkQuality
> ==== SUMMARY ====
> 
> Upload capacity: 7.478 Mbps
> Download capacity: 2.415 Mbps
> Upload flows: 16
> Download flows: 20
> Responsiveness: Low (90 RPM)
> macbook-pro:DPZ smoeller$ networkQuality -v
> ==== SUMMARY ====
> 
> Upload capacity: 5.830 Mbps
> Download capacity: 6.077 Mbps
> Upload flows: 12
> Download flows: 20
> Responsiveness: Low (56 RPM)
> Base RTT: 134
> Start: 10/24/22, 22:47:48
> End: 10/24/22, 22:48:09
> OS Version: Version 12.6.1 (Build 21G217)
> macbook-pro:DPZ smoeller$
> 
> Still, I only see the "Base RTT" with the -v switch and I am not sure
> whether that is identical to your "Idle Latency".
> 
> I guess I need to convince my employer to exchange that macbook
> (actually because the battery starts bulging and not because I am
> behind with networkQuality versions ;) )
> 
> Yes, you would need macOS Ventura to get the latest and greatest.
> 
>>> But, what I read is: You are suggesting that “Idle Latency”
>>> should be expressed in RPM as well? Or Responsiveness expressed
>>> in milliseconds?
>> 
>> [SM] Yes, I am fine with either (or both); the idea is to make it
>> really easy to see whether/how much "working conditions" deteriorate
>> the responsiveness / increase the latency-under-load. At least in
>> verbose mode it would be sweet if networkQuality could expose that
>> information.
> 
> I see - let me think about that…
> 
> +1 w/ Sebastian's point here. IMHO it would be great if the
> responsiveness under load and when idle were reported:
> 
>   (a) symmetrically, with the same metrics for both cases, and
> 
>   (b) in both RPM and ms terms for both cases
> 
> So instead of:
> 
> Responsiveness: High (2195 RPM)
> Idle Latency: 5.833 milli-seconds
> 
> Perhaps something like:
> 
> Loaded Responsiveness: High (XXXX RPM)
> Loaded Latency: X.XXX milli-seconds
> Idle Responsiveness: High (XXXX RPM)
> Idle Latency: X.XXX milli-seconds
> 
> Having both RPM and ms available for loaded and unloaded cases would
> seem to make it easier to compare loaded and idle performance more
> directly and in a more apples-to-apples way.
> 
> best,
> neal
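
For what it's worth, the RPM/ms conversion discussed above is just a 
reciprocal, so both forms can be derived from either number. A minimal 
Python sketch follows, assuming RPM is simply 60000 divided by the 
latency in milliseconds and that "Base RTT" is reported in ms. The 
function names are made up for this example, and the RPM that 
networkQuality actually reports may aggregate several probe types, so 
treat the result as approximate.

```python
MS_PER_MINUTE = 60_000.0


def ms_to_rpm(latency_ms: float) -> float:
    """Convert a round-trip latency in ms to round trips per minute."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return MS_PER_MINUTE / latency_ms


def rpm_to_ms(rpm: float) -> float:
    """Convert round trips per minute back to a latency in ms."""
    if rpm <= 0:
        raise ValueError("RPM must be positive")
    return MS_PER_MINUTE / rpm


if __name__ == "__main__":
    # Numbers taken from the networkQuality outputs quoted above.
    print("Base RTT 16 ms   ->", ms_to_rpm(16), "RPM")
    print("Idle 5.833 ms    ->", round(ms_to_rpm(5.833)), "RPM")
    print("Loaded 2195 RPM  ->", round(rpm_to_ms(2195), 2), "ms")
```

With these assumptions, the quoted 16 ms base RTT corresponds to 3750 
RPM, which can be set directly against the loaded 2123 RPM from the 
same run.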
> 
> 
> 
> Links:
> ------
> [1] http://speedtest.net