Re: [ippm] draft-ietf-ippm-responsiveness
Christoph Paasch <cpaasch@apple.com> Fri, 19 January 2024 18:57 UTC
From: Christoph Paasch <cpaasch@apple.com>
Message-id: <628E2A62-5C2F-448A-83B8-08FC4FB57E7C@apple.com>
Content-type: multipart/alternative; boundary="Apple-Mail=_3FB7E0B7-CFB3-4A07-8508-7FB76E6696FD"
MIME-version: 1.0 (Mac OS X Mail 16.0 \(3774.300.61.1.2\))
Date: Fri, 19 Jan 2024 10:57:11 -0800
In-reply-to: <14EC339A-9A84-40C5-AFCC-474DF03C16B6@gmx.de>
Cc: IETF IPPM WG <ippm@ietf.org>, Rpm <rpm@lists.bufferbloat.net>
To: Sebastian Moeller <moeller0@gmx.de>
References: <D7323D41-BA9B-46E5-AA7D-6514636AA44D@gmx.de> <7494CC8D-7BAB-41DB-9FF7-7306747F2DC9@apple.com> <14EC339A-9A84-40C5-AFCC-474DF03C16B6@gmx.de>
X-Mailer: Apple Mail (2.3774.300.61.1.2)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/bhijmfeuLCNVt2xXwTfdFyhm7Ck>
Subject: Re: [ippm] draft-ietf-ippm-responsiveness
Hello,

> On Jan 19, 2024, at 5:14 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>> On 16. Jan 2024, at 20:01, Christoph Paasch <cpaasch@apple.com> wrote:
>>> On Dec 3, 2023, at 10:13 AM, Sebastian Moeller <moeller0@gmx.de> wrote:
>>>
>>> Dear IPPM members,
>>>
>>> On re-reading the current responsiveness draft I stumbled over the following section:
>>>
>>>
>>> Parallel vs Sequential Uplink and Downlink
>>>
>>> Poor responsiveness can be caused by queues in either (or both) the upstream and the downstream direction. Furthermore, both paths may differ significantly due to access link conditions (e.g., 5G downstream and LTE upstream) or routing changes within the ISPs. To measure responsiveness under working conditions, the algorithm must explore both directions.
>>>
>>> One approach could be to measure responsiveness in the uplink and downlink in parallel. It would allow for a shorter test run-time.
>>>
>>> However, a number of caveats come with measuring in parallel:
>>>
>>> • Half-duplex links may not permit simultaneous uplink and downlink traffic. This restriction means the test might not reach the path's capacity in both directions at once and thus not expose all the potential sources of low responsiveness.
>>> • Debuggability of the results becomes harder: During parallel measurement it is impossible to differentiate whether the observed latency happens in the uplink or the downlink direction.
>>> Thus, we recommend testing uplink and downlink sequentially. Parallel testing is considered a future extension.
>>>
>>>
>>> I argue, that this is not the correct diagnosis and hence not the correct decision.
>>> For half-duplex links the given argument is not incorrect, but incomplete, as it is quite likely that when forced to multiplex more bi-directional traffic (all TCP testing is bi-directional, so we only argue about the amount of reverse traffic, not whether it exists, and even if we would switch to QUIC/UDP we would still need a feed-back channel) we will see different "potential sources of low responsiveness", so ignoring any of the two seems ill advised.
>>
>> You are saying that parallel bi-directional traffic exposes different sources of responsiveness issues than uni-directional traffic (up and down) ? What kind of different sources would that expose ? Can you give some examples and maybe a suggestion on how to word things ?
>
> [SM] If the bottleneck is a WiFi link we occasionally see that some OS are more aggressive than others in acquiring airtime, which easily results in differential throughput for the two directions and often higher queueing delay for the direction that is 'slowed' down.
> In theory that should not really happen but in practice it does, e.g. the ISP unhelpfully passes undesired DSCP marks into a home network that then are acted upon by WiFi WMM. To elaborate, Comcast for a long time had an issue where large fractions (IIRC up to 25%) of packets were inadvertently marked as CS1, which in default WMM translates to AC_BK, and if the client sends the upload traffic via the default AC_BE, this differential AC usage can now result in different queueing delay compared to looking at upload and download individually. (If all traffic of a channel uses AC_BK instead of AC_BE this should not affect latency much.)
> Side-note: Comcast after being alerted took notice of the issue and fixed it, but I think this kind of issue can happen to other ISPs as well.

>>> Debuggability is not "rocket science" either, all one needs is a three value timestamp format (similar to what NTP uses) and one can (even without synchronized clocks!) establish baseline OWDs and then under bi-directional load one can see which of these unloaded OWDs actually increases, so I argue that "it is impossible to differentiate whether the observed latency happens in the uplink or the downlink direction" is simply an incorrect assertion... (and we are actually doing this successfully in the existing internet as part of the cake-autorate project [h++ps://github.com/lynxthecat/cake-autorate/tree/master] already, based on ICMP timestamps). The relevant observation here is that we are not necessarily interested in veridical OWDs under idle conditions, but we want to see which OWD(s) increase during working-conditions, and that works with desynchronized clocks and is also robust against slow clock drift.
>>
>> Unfortunately, this would require for the server to add timestamps to the HTTP-response, right ?
>
> [SM] Yes in a sense.... but that could be a small process that simply updates the content of that file every couple of milliseconds, so would not strictly need to be the server process...

Which would kill the in-memory caching the servers do. And some webserver implementations actually have the ability to generate “random” data on-the-fly without a backing file behind it (like “statichit” in ATS https://docs.trafficserver.apache.org/en/9.0.x/release-notes/whats-new.en.html)

>> We opted against this because the “power” of the responsiveness methodology is that it is extremely lightweight on the server-side. And with lightweight I mean not only from an implementation/CPU perspective but also from a deployment perspective. All one needs to do on the server in order to provide a responsiveness-measurement-endpoint is to host 2 files (one very large one and a very small one) and provide an endpoint to “POST” data to. All of these are standard capabilities in every webserver that can easily be configured. And we have seen a rise of endpoints showing up thanks to the simplicity to deploy it.
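
As an illustration of the baseline-relative OWD idea discussed above, here is a minimal Python sketch. It is not taken from cake-autorate or from the draft: the Probe class, its field names and the helper functions are hypothetical, and it assumes the offset between client and server clocks stays roughly constant during the measurement. The raw per-direction differences include that unknown offset, but the offset cancels once the idle baseline is subtracted.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Probe:
    """One timestamped round trip (hypothetical layout, all values in ms)."""
    originate: float  # client clock: request sent
    receive: float    # server clock: request received
    transmit: float   # server clock: reply sent
    arrival: float    # client clock: reply received

    @property
    def raw_up(self) -> float:
        # Uplink delay plus the unknown clock offset.
        return self.receive - self.originate

    @property
    def raw_down(self) -> float:
        # Downlink delay minus the unknown clock offset.
        return self.arrival - self.transmit


def baseline(idle: Sequence[Probe]) -> Tuple[float, float]:
    """Smallest raw value per direction seen while the link is idle."""
    return min(p.raw_up for p in idle), min(p.raw_down for p in idle)


def owd_increase(probe: Probe, base: Tuple[float, float]) -> Tuple[float, float]:
    """Per-direction delay increase over the idle baseline."""
    return probe.raw_up - base[0], probe.raw_down - base[1]


# Made-up numbers: the server clock runs 5000 ms ahead of the client clock.
idle = [Probe(0, 5010, 5011, 23), Probe(100, 5111, 5112, 124)]
base = baseline(idle)

# Under load the uplink delay grows by 50 ms, the downlink stays unchanged.
loaded = Probe(1000, 6060, 6061, 1073)
print(owd_increase(loaded, base))  # (50, 0)
```

With these made-up numbers only the uplink value grows under load, which is the kind of per-direction attribution the argument above relies on. Slow clock drift would show up as a gradual shift in both values and would need a periodically refreshed baseline.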
>>
>> So, it is IMO a balance between “deployability” and “debuggability”. The responsiveness test is clearly aiming towards being deployable and accessible. Thus I think we would prefer keeping things on the server-side simple.
>>
>>
>> Thoughts ?
>
> [SM] I really really would like some way to get OWDs if only optional, but even more than that I think RPM should get as wide a deployment as possible, ubiquity has its own inherent value for measurement platforms, so if this makes deployment harder it would be a no-go.
>
> Now, I get that this is a long shot, but I fear that if the draft does not mention this at all the chance will be gone forever....
> Could we maybe add a description of an optional 'time' payload, so clients could expect a single standardised format for that, if a server would optionally support it?

Well, if we describe the optional ’time’ payload, we also need to specify how the clients are gonna use it and expose that information. Meaning also, having some kind of implementation experience with it, running code, …

Overall, the draft initially started with recommending parallel mode. But the feedback from the WG was that parallel mode is too hard to debug and interpret (which I agree with), so the WG rather favored sequential mode; see the discussion at https://mailarchive.ietf.org/arch/msg/ippm/IZKaxtqQacj3j2ftKxEwCWLQ8cU/ . This was part of the WG adoption of the draft.

>> That being said, I’m not entirely opposed to recommending the parallel mode as well. The interesting bit about the parallel mode is not so much the responsiveness measurement but rather the capacity measurement. Because, surprisingly many modems/… that are supposedly (according to their spec-sheet) able to handle 1 Gbps full-duplex suddenly show their weakness and are no longer able to handle line-rate. So, it is more about capacity than responsiveness IMO.
>
> [SM] True, yet such overload also occasionally affects queuing delay and jitter (sure RPM does not report jitter, but it likely affects the ability of a test to reach the required stability criteria).
>
>> However, as a frequent user of the networkQuality-tool I realize myself that whenever I want to test my network I end up using a sequential test in favor of the parallel test.
>
> [SM] I agree that a full complement of upload, then download, then combined upload & download is a great tool for understanding network behaviour. I also want to applaud Apple's networkQuality for an excellent implementation of the ideas behind this draft, offering a great and well selected set of options:
>
> USAGE: networkQuality [-C <configuration_url>] [-c] [-d] [-f <comma-separated list>] [-h] [-I <network interface name>] [-k] [-p] [-r host] [-S <port>] [-s] [-u] [-v]
> -C: Override Configuration URL or path (with scheme file://)
> -c: Produce computer-readable output
> -d: Do not run a download test (implies -s)
> -f: <comma-separated list>: Enforce Protocol selections. Available options:
>     h1: Force-enable HTTP/1.1
>     h2: Force-enable HTTP/2
>     h3: Force-enable HTTP/3 (QUIC)
>     L4S: Force-enable L4S
>     noL4S: Force-disable L4S
> -h: Show help (this message)
> -I: Bind test to interface (e.g., en0, pdp_ip0,...)
> -k: Disable certificate validation
> -p: Use iCloud Private Relay
> -r: Connect to host or IP, overriding DNS for initial config request
> -S: Start and run server on specified port. Other specified options ignored
> -s: Run tests sequentially instead of parallel upload/download
> -u: Do not run an upload test (implies -s)
> -v: Verbose output
>
> that cover a lot of cases with a relatively small set of control parameters.

Thanks :) Always looking for suggestions on what else to expose! Let us know!

Christoph

>>
>> Christoph
>>
>>>
>>> Given these observations, I ask that we change this design parameter to default requiring both measurement modes and defaulting to parallel testing (or randomly select between both modes, but report which it chose).
>>>
>>> Best Regards
>>> Sebastian
>>> _______________________________________________
>>> ippm mailing list
>>> ippm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/ippm