Re: [ippm] [Rpm] lightweight active sensing of bandwidth and buffering

rjmcmahon <rjmcmahon@rjmcmahon.com> Wed, 02 November 2022 20:38 UTC

Date: Wed, 02 Nov 2022 13:37:57 -0700
From: rjmcmahon <rjmcmahon@rjmcmahon.com>
To: Dave Taht <dave.taht@gmail.com>
Cc: Ruediger.Geib@telekom.de, rpm@lists.bufferbloat.net, ippm@ietf.org
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/wqemoPYinjtOuEm5gkyB1enru4U>

I used iperf 2's Little's law calculation to find the buffer sizes 
designed in by our hardware team(s). They were surprised that the 
numbers exactly matched their designs - Little applied - and I never saw 
either the hardware or its design spec.
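
As a rough illustration of the kind of calculation meant here, below is a minimal sketch of Little's law (L = lambda * W) applied to a measured throughput and delay. It is not iperf 2's code; the function names and the numbers are hypothetical and chosen only to show the arithmetic. It also assumes the path is in a standing state over the measurement interval.

```python
def littles_law_bytes(throughput_bps: float, delay_s: float) -> float:
    """Little's law, L = lambda * W.

    lambda is the arrival rate in bytes/s (throughput in bits/s divided by 8)
    and W is the time each byte spends in the system, in seconds.
    The result L is the average number of bytes in the system.
    """
    if throughput_bps < 0 or delay_s < 0:
        raise ValueError("throughput and delay must be non-negative")
    return (throughput_bps / 8.0) * delay_s


def standing_queue_bytes(throughput_bps: float,
                         loaded_delay_s: float,
                         base_delay_s: float) -> float:
    """Bytes attributable to queueing alone.

    Subtracts the unloaded (base) path delay from the delay under load,
    so only the waiting time in the queue is counted.
    """
    queue_delay_s = max(loaded_delay_s - base_delay_s, 0.0)
    return littles_law_bytes(throughput_bps, queue_delay_s)


# Illustrative numbers only (not from any measurement in this thread):
# 100 Mbit/s sustained, 12 ms delay under load, 2 ms delay when idle.
in_flight = littles_law_bytes(100e6, 0.012)
queued = standing_queue_bytes(100e6, 0.012, 0.002)
print(f"in flight: {in_flight:.0f} bytes, queued: {queued:.0f} bytes")
```

If the queued figure stops growing as the offered load rises while loss starts to appear, that plateau is a plausible estimate of the buffer size, subject to the standing-state assumption above.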

It seems reasonable to use something if & when it works and is useful. 
The challenge seems to be knowing the limits of any claims (or 
simulations). I think engineers do much the same when we assume linearity 
over some small interval, for example in finite element analysis of 
structures: 
https://control.com/technical-articles/the-difference-between-linear-and-nonlinear-finite-element-analysis-fea/

Bob
> On Wed, Nov 2, 2022 at 12:29 PM rjmcmahon via Rpm
> <rpm@lists.bufferbloat.net> wrote:
>> 
>> Most people measuring bloat ignore the queue build-up phase and instead
>> start taking measurements after the bottleneck queue is in a standing
>> state.
> 
> +10. It's the slow start transient that is holding things back. If we
> could, for example, open up the 110+ objects and flows web pages
> require all at once, and let 'em rip, instead of 15 at a time, without
> destroying the network, web PLT would get much better.
> 
>> In my opinion, the best units for bloat are packets for UDP or bytes for
>> TCP. Min delay is a proxy measurement.
> 
> bytes, period. bytes = time. Sure, most UDP today is small packets, but
> QUIC and videoconferencing change that.
> 
>> 
>> Little's law allows one to compute this, though it does assume the
>> network is in a stable state over the measurement interval. In the real
>> world, this is probably rarely true. So we, in test & measurement
>> engineering, force the standing state with some sort of measurement
>> co-traffic and call it "working conditions" or equivalent. ;)
> 
> There was an extremely long, nuanced debate about Little's law and
> where it applies, last year, here:
> 
> https://lists.bufferbloat.net/pipermail/cake/2021-July/005540.html
> 
> I don't want to go into it, again.
> 
>> 
>> Bob
>> > Bob, Sebastian,
>> >
>> > not being active on your topic, just to add what I observed on
>> > congestion:
>> > - starts with an increase of jitter, but measured minimum delays still
>> > remain constant. Technically, a queue builds up some of the time, but
>> > it isn't present permanently.
>> > - buffer fill reaches a "steady state", called bufferbloat on access I
>> > think; technically, OWD increases also for the minimum delays, jitter
>> > now decreases (what you've described as "the delay magnitude"
>> > decreases or "minimum CDF shift" respectively, if I'm correct). I'd
>> > expect packet loss to occur, once the buffer fill is on steady state,
>> > but loss might be randomly distributed and could be of a low
>> > percentage.
>> > - a sudden, rather long load burst may cause a jump-start to
>> > "steady-state" buffer fill. The above holds for a slow but steady load
>> > increase (where the measurement frequency determines the timescale
>> > qualifying "slow").
>> > - in the end, max-min delay or delay distribution/jitter likely isn't
>> > an easy to handle single metric to identify congestion.
>> >
>> > Regards,
>> >
>> > Ruediger
>> >
>> >
>> >> On Nov 2, 2022, at 00:39, rjmcmahon via Rpm
>> >> <rpm@lists.bufferbloat.net> wrote:
>> >>
>> >> Bufferbloat shifts the minimum of the latency or OWD CDF.
>> >
>> >       [SM] Thank you for spelling this out explicitly, I only worked on a
>> > vague implicit assumption along those lines. However what I want to
>> > avoid is using delay magnitude itself as classifier between high and
>> > low load condition as that seems statistically uncouth to then show
>> > that the delay differs between the two classes;).
>> >       Yet, your comment convinced me that my current load threshold (at
>> > least for the high load condition) probably is too small, exactly
>> > because the "base" of the high-load CDFs coincides with the base of
>> > the low-load CDFs implying that the high-load class contains too many
>> > samples with decent delay (which after all is one of the goals of the
>> > whole autorate endeavor).
>> >
>> >
>> >> A suggestion is to disable x-axis auto-scaling and start from zero.
>> >
>> >       [SM] Will reconsider. I started with a start at zero, and then switched
>> > to an x-range that starts with the delay corresponding to 0.01% for
>> > the reflector/condition with the lowest such value and stops at 97.5%
>> > for the reflector/condition with the highest delay value. My rationale
>> > is that the base delay/path delay of each reflector is not all that
>> > informative* (and it can still be learned from reading the x-axis),
>> > the long tail > 50% however is where I expect most differences so I
>> > want to emphasize this and finally I wanted to avoid that the actual
>> > "curvy" part gets compressed so much that all lines more or less
>> > coincide. As I said, I will reconsider this.
>> >
>> >
>> > *) We also maintain individual baselines per reflector, so I could
>> > just plot the differences from baseline, but that would essentially
>> > equalize all reflectors, and I think having a plot that easily shows
>> > reflectors with outlying base delay can be informative when selecting
>> > reflector candidates. However, once we actually switch to OWDs, baseline
>> > correction might be required anyway, since clock differences can give
>> > ICMP type 13/14 data massive offsets that are mostly indicative of
>> > unsynced clocks**.
>> >
>> > **) This is why I would prefer to use NTP servers as reflectors with
>> > NTP requests; my expectation is that all of these should be reasonably
>> > synced by default so that offsets should be in the sane range....
>> >
>> >
>> >>
>> >> Bob
>> >>> For about 2 years now the cake w-adaptive bandwidth project has been
>> >>> exploring techniques for lightweight sensing of bandwidth and
>> >>> buffering problems. One of my favorites was their discovery that ICMP
>> >>> type 13 got them working OWD from millions of ipv4 devices!
>> >>> They've also explored leveraging ntp and multiple other methods, and
>> >>> have scripts available that do a good job of compensating for 5g and
>> >>> starlink's misbehaviors.
>> >>> They've also pioneered a whole bunch of new graphing techniques,
>> >>> which I do wish were used more than single number summaries
>> >>> especially in analyzing the behaviors of new metrics like rpm,
>> >>> samknows, ookla, and
>> >>> RFC9097 - to see what is being missed.
>> >>> There are thousands of posts about this research topic, a new post on
>> >>> OWD just went by here.
>> >>> https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/135379/793
>> >>> and of course, I love flent's enormous graphing toolset for
>> >>> simulating and analyzing complex network behaviors.
>> >> _______________________________________________
>> >> Rpm mailing list
>> >> Rpm@lists.bufferbloat.net
>> >> https://lists.bufferbloat.net/listinfo/rpm
>> >
>> > _______________________________________________
>> > ippm mailing list
>> > ippm@ietf.org
>> > https://www.ietf.org/mailman/listinfo/ippm