Re: [ippm] Adoption call for draft-cpaasch-ippm-responsiveness

Hello Matt,

> On Dec 19, 2021, at 7:38 PM, Matt Mathis <mattmathis=40google.com@dmarc.ietf.org> wrote:
> 
> I support this draft, and I have even offered to be a co-author.

Thanks for the support. Indeed, you will be listed on the next version of the draft.

> Thank you, Al, for the extensive comments.  I only have a moment for some high-level responses:
> 
> In some sense this document is the opposite of RFC8337.  Responsiveness is about measuring how well/early the bottleneck signals congestion to the transport without incurring excessive queueing.  This depends not only on the loop behavior of the bottleneck and the measurement stream, but also on the loop behavior, population, and RTTs of the cross traffic.  I do not expect any of the theory to be mature enough to do this as abstractly as IPPM does with most metrics.  At this time I expect a responsiveness metric to require fingerprinting the hardware, OS, and application software, and to include a warning that results may be sensitive to the details of the implementation.

Yes - things can be very implementation-dependent. Different service deployments can have different ways of generating the "working conditions". A service with QUIC deployment using BBR may expose entirely different responsiveness metrics than a TCP-based deployment.

I wonder if we can abstract away the definition of "working conditions" enough so that it becomes independent of the implementation details. It's gonna be a tradeoff between how much we want different implementations to measure similar results vs leaving room for implementations to make their own choices.

> WiFi is a poorly modeled half-duplex shared broadcast channel that is a critical (dominant?) part of the responsiveness for typical Internet users.  Bidirectional traffic matters a lot, but I fear the scope might be too broad.

Agreed. It complicates things quite a bit.

Christoph

> 
> Thanks,
> --MM--
> 
> 
> On Fri, Dec 17, 2021 at 3:51 PM MORTON JR., AL <acmorton@att.com> wrote:
> Hi authors and ippm-chairs,
> 
>  
> 
> Thanks for writing this-up!
> 
>  
> 
> I took one pass through, and have the following comments during Adoption call for draft-cpaasch-ippm-responsiveness:
> 
>  
> 
> TL;DR:
> 
> Many previously undefined terms were used here, and a more direct description using the term “saturation” seems possible, IMO. IPPM has used a template for metric drafts, and use of the hierarchy of singleton, sample, and statistic metrics from RFC 2330 will help with clarity/answer many of my questions.
> 
>  
> 
> regards (I’m off-line for a while now, so enjoy the holidays),
> 
> Al
> 
>  
> 
> From the Abstract:
> 
>  
> 
>    This document specifies the "RPM Test" for measuring responsiveness.
>    It uses common protocols and mechanisms to measure user experience
>    especially when the network is fully loaded ("responsiveness under
>    working conditions".)  The measurement is expressed as "Round-trips
>    Per Minute" (RPM) and should be included with throughput (up and
>    down) and idle latency as critical indicators of network quality.
> 
>  
> 
> “fully loaded” and “working conditions” aren’t necessarily the same, to me.  I’ll be looking for better definitions.
> 
>  
> 
> 3.  Goals <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-3>
>  
>    The algorithm described here defines an RPM Test that serves as a
>    good proxy for user experience.  This means:
>  
>    1.  Today's Internet traffic primarily uses HTTP/2 over TLS.  Thus,
>        the algorithm should use that protocol.
>  
>        As a side note: other types of traffic are gaining in popularity
>        (HTTP/3) and/or <???? UDP ???> are already being used widely (RTP).
>  
> 
> There are many measurement stability challenges when TCP is involved, see section 4 of RFC8337:  https://datatracker.ietf.org/doc/html/rfc8337#section-4
> RFC8337 intentionally broke the TCP control loop to make measurements in the face of these challenges.
> 
>  
> 
> 4.1.  Working Conditions <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-4.1>
>  
>    For the purpose of this methodology, typical "working conditions"
>    represent a state of the network in which the bottleneck node is
>    experiencing ingress and egress flows similar to those created by
>    humans in the typical day-to-day pattern.
>  
>    While a single HTTP transaction might briefly put a network into
>    working conditions, making reliable measurements requires maintaining
>    the state over sufficient time.
>  
>    The algorithm must also detect when the network is in a persistent
>    working condition, also called "saturation".
>  
>    Desired properties of "working condition":
>  
>    o  Should not waste traffic, since the person may be paying for it
>  
>    o  Should finish within a short time to avoid impacting other people
>       on the same network, to avoid varying network conditions, and not
>       try the person's patience.
>  
> 
> These seem like reasonable goals for the traffic that loads the network.
> 
> New terms needing definition were introduced:
> 
> “persistent working condition = saturation”,
> 
> which is different from
> 
> “ingress and egress flows similar to those created by humans in the typical day-to-day pattern”
> 
>  
> 
> Later in 4.1.1, terms like “saturate a path”  and “fill the pipe” appear, and
> 
>  
> 
>    The goal of the RPM Test is to keep the network as busy as possible
>    in a sustained and persistent way.  It uses multiple TCP connections
>    and gradually adds more TCP flows until saturation is reached.
>  
> 
> The terms “busy as possible”, and “typical day-to-day pattern”, or
> 
> “saturation” and “working conditions” indicate different load levels to me.
> 
>  
> 
> @@@@ Suggestion: I think it would help to simplify the terminology in this draft. You intend to measure a saturated path, so just say that. No “typical”, no “working conditions”, etc., in these early sections.
> 
>  
> 
> The sentence beginning “The goal...” should really appear in Section 3. Goals
> 
>  
> 
> Also, you have defined a measurement method in the sentence, “It uses...” above. This method of adding connections has been observed in other measurement systems, but it isn’t typical of user traffic, especially when each connection has an ~infinite amount of data to send during the test.
> 
>  
> 
> 4.1.2.  Parallel vs Sequential Uplink and Downlink <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-4.1.2>
>  
> ...
>    To measure responsiveness under working conditions, the algorithm
>    must saturate both directions.
>  
> Bi-directional saturation is really atypical of usage. I don’t think the benefit of “more data” pays off.
>  
> ...
>  
>    However, a number of caveats come with measuring in parallel:
>  
>    o  Half-duplex links may not permit simultaneous uplink and downlink
>       traffic.  This means the test might not saturate both directions
>       at once.
>  
>    o  Debuggability of the results becomes harder: During parallel
>       measurement it is impossible to differentiate whether the observed
>       latency happens in the uplink or the downlink direction.
>  
>    o  Consequently, the test should have an option for sequential
>       testing.
>  
> 
> @@@@ Suggestion: IMO, tests/results with Downlink saturation OR Uplink saturation would be more straightforward, and can be understood by users (especially those who have tested in the past). Avoid the pitfalls and make Sequential testing the preferred option.
> 
>  
> 
>  
> 
> 4.1.3.  Reaching saturation <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-4.1.3>
>  
>    The RPM Test gradually increases the number of TCP connections and
>    measures "goodput" - the sum of actual data transferred across all
>    connections in a unit of time.  When the goodput stops increasing, it
>    means that saturation has been reached.
> ...
>  
>    Filling buffers at the bottleneck depends on the congestion control
>    deployed on the sender side.  Congestion control algorithms like BBR
>    may reach high throughput without causing queueing because the
>    bandwidth detection portion of BBR effectively seeks the bottleneck
>    capacity.
>  
>    RPM Test clients and servers should use loss-based congestion
>    controls like Cubic to fill queues reliably.
>  
> With the evolution of Congestion control algorithms seeking to avoid filling buffers, does it make sense to require a full buffer at the bottleneck to achieve saturation?
> In fact, the definition above, “When the goodput stops increasing,...” does not require full buffers; it requires maximizing a delivery rate measurement instead.
>  
>  
> In 4.1.4, the final steps of the algorithm were not clear to me:
>  
>       *  Else, network reached saturation for the current flow count.
> @@@@ This wording implies it to be the final step, but there are further conditions to test.
>      Maybe this step is “Else, Candidate for stable saturation”?
>  
>          +  If new flows added and for 4 seconds the moving average
>             throughput did not change: network reached stable saturation
> @@@@ Maybe: 
>          +  If the 4 second moving average of "instantaneous aggregate goodput" with no new 
>             flows added did not change 
>             (defined as: moving average = "previous" moving average +/- 5%),
>             then the network reached stable saturation
> ----------------------------------------------------------------------------------------------------
>  
>          +  Else, add four more flows
> @@@ ??? and return to start?
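One way to read the suggested rewording of these final steps is sketched below in Python. This is only an illustration under stated assumptions, not the draft's algorithm: the function name `detect_stable_saturation`, the window of four samples (one per second), and the replay of pre-recorded goodput samples instead of driving real flows are all hypothetical. Only the +/- 5% tolerance, the initial four flows, and the "add four more flows" step come from the text quoted in this thread.

```python
from collections import deque


def detect_stable_saturation(goodput_samples, window=4, tolerance=0.05,
                             initial_flows=4, flows_per_step=4):
    """Replay per-interval aggregate goodput samples.

    Returns (interval_index, flow_count) at the point where stable
    saturation would be declared, or None if it never is.
    Assumption: one sample per interval; the moving average covers the
    last `window` samples.
    """
    recent = deque(maxlen=window)
    previous_avg = None
    flows = initial_flows
    flows_just_added = True  # the initial flows have just been started

    for index, sample in enumerate(goodput_samples):
        recent.append(sample)
        average = sum(recent) / len(recent)

        if previous_avg is None:
            # First interval: nothing to compare against yet.
            previous_avg = average
            continue

        unchanged = abs(average - previous_avg) <= tolerance * previous_avg
        if unchanged and not flows_just_added:
            # Average held steady with no new flows: stable saturation.
            return index, flows
        if unchanged:
            # Candidate for stable saturation: hold the flow count for
            # one more interval and check again.
            flows_just_added = False
        else:
            # Goodput is still moving: add more flows and keep going.
            flows += flows_per_step
            flows_just_added = True
        previous_avg = average

    return None
```

In this reading, "Else, add four more flows" returns to the top of the loop, which is the behavior the "and return to start?" question asks about.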
>  
> 
>  
> 
> Finally, in 4.1.4, the Note explains:
> 
>  
> 
>    Note: It is tempting to envision an initial base RTT measurement and
>    adjust the intervals as a function of that RTT.  However, experiments
>    have shown that this makes the saturation detection extremely
>    unstable in low RTT environments.  In the situation where the
>    "unloaded" RTT is in the single-digit millisecond range, yet the
>    network's RTT increases under load to more than a hundred
>    milliseconds, the intervals become much too low to accurately drive
>    the algorithm.
>  
> 
> Well, TCP senders/control-loops are involved here, and likely play a
> 
> role in behavior categorized as “difficult to measure”.
> 
>  
> 
> By the time we get to
> 
>  
> 4.2.  Measuring Responsiveness <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-4.2>
>  
>    Once the network is in a consistent working conditions, the RPM Test
>    must "probe" the network multiple times to measure its
>    responsiveness.
>  
>    Each RPM Test probe measures:
>  
> 
> You previously started at least four TCP connections with infinitely large files.
> 
> The “create connection” RPM probes establish additional connections, DNS, TCP, etc.
> 
> Is each new connection an RPM probe? or is the set of connection tests a single probe?
> 
> (later we learn it is the set of
> 
> What if one of the set of connections fails/times-out?
> 
>  
> 
> I take it that the “load-bearing” connections are driving the path to saturation.
> 
> Maybe “load-generating connections” is more clear?
> 
>  
> 
> 4.2.1.  Aggregating the Measurements <https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01#section-4.2.1>
>  
>    The algorithm produces sets of 5 times for each probe, namely: DNS
>    handshake, TCP handshake, TLS handshake, HTTP/2 request/response on
>    separate (idle) connections, HTTP/2 request/response on load bearing
>    connections.  This fine-grained data is useful, but not necessary for
>    creating a useful metric.
>  
> @@@@ So, only ONE of the load-generating connections runs the 1-byte GET ?  (it says connections, and there are at least 4) The various handshakes result in 4 RT measurements.
>  
> But, do you mean *5 repeated measurement sets*? Each set with:
> DNS HS, TCP HS, TLS HS, HTTP/2 idle GET, and the potentially much longer GET on the 
> load generating connections.
>  
>    To create a single "Responsiveness" (e.g., RPM) number, this first
>    iteration of the algorithm gives an equal weight to each of these
>    values.  That is, it sums the five time values for each probe, and
>    divides by the total number of probes to compute an average probe
>    duration.  The reciprocal of this, normalized to 60 seconds, gives
>    the Round-trips Per Minute (RPM).
>  
> @@@@ I’m missing a step, I think:
> Are the “time values for each probe” the sum of handshake or response times for
> DNS HS, TCP HS, TLS HS, HTTP/2 idle GET and load generating connections?
> The processing doesn’t seem to include this preliminary calculation to produce 
> “five time values”.  
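To make the question concrete, here is a minimal Python sketch of the most literal reading of the quoted paragraph: sum the five time values of each probe, average those sums over all probes, and take the reciprocal scaled to one minute. The function name `rpm_from_probes` and the choice of milliseconds as the unit are assumptions made for illustration. Whether the draft intends this reading or an average over the individual values is exactly what is being asked.

```python
def rpm_from_probes(probes_ms):
    """Compute Round-trips Per Minute from a list of probes.

    Each probe is a sequence of five durations in milliseconds, in the
    order listed in the quoted text: DNS, TCP handshake, TLS handshake,
    HTTP/2 on an idle connection, HTTP/2 on a load-bearing connection.
    """
    if not probes_ms:
        raise ValueError("at least one probe is required")
    for probe in probes_ms:
        if len(probe) != 5:
            raise ValueError("each probe must contain five time values")

    # Literal reading: one duration per probe is the sum of its five values.
    probe_durations = [sum(probe) for probe in probes_ms]
    average_ms = sum(probe_durations) / len(probe_durations)

    # Reciprocal of the average duration, normalized to 60 seconds.
    return 60_000 / average_ms


example = [(20, 30, 40, 50, 60), (10, 20, 30, 40, 100)]
print(rpm_from_probes(example))
```

With two probes whose five values each sum to 200 ms, the average probe duration is 200 ms and the result is 300 RPM.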
>  
> 
>  
> 
> In Section 5, “no new protocol is defined”, but
> 
>  
> 
>    The client begins the responsiveness measurement by querying for the
>    JSON configuration.  This supplies the URLs for creating the load
>    bearing connections in the upstream and downstream direction as well
>    as the small object for the latency measurements.
>  
> 
> The client needs to know how the server response will be organized, down to key: value, right?
> 
> Some client and server agreements needed...
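A minimal Python sketch of what such an agreement implies on the client side: the client has to know which keys to expect and has to fail cleanly when they are missing. The key names (`urls`, `large_download_url`, `small_download_url`, `upload_url`) and the `example.com` URLs are placeholders assumed for illustration; they are not taken from the draft text quoted in this thread.

```python
import json

# Hypothetical key names; the real ones would have to be fixed by the spec.
REQUIRED_URL_KEYS = ("large_download_url", "small_download_url", "upload_url")


def parse_config(text):
    """Parse the JSON configuration and return the three URLs as a dict.

    Raises ValueError if the structure is not what the client expects.
    """
    config = json.loads(text)
    urls = config.get("urls") if isinstance(config, dict) else None
    if not isinstance(urls, dict):
        raise ValueError("config must be a JSON object with a 'urls' object")

    missing = [key for key in REQUIRED_URL_KEYS
               if not isinstance(urls.get(key), str)]
    if missing:
        raise ValueError("missing or non-string URL keys: " + ", ".join(missing))

    return {key: urls[key] for key in REQUIRED_URL_KEYS}


SAMPLE = """
{
  "urls": {
    "large_download_url": "https://example.com/large",
    "small_download_url": "https://example.com/small",
    "upload_url": "https://example.com/upload"
  }
}
"""

print(parse_config(SAMPLE)["small_download_url"])
```

Even this small amount of validation only works if both sides agree on the key names and value types, which supports the point that some client and server agreement is being defined.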
> 
>  
> 
>  
> 
> From: ippm <ippm-bounces@ietf.org> On Behalf Of Marcus Ihlar
> Sent: Monday, December 6, 2021 10:53 AM
> To: ippm@ietf.org
> Subject: [ippm] Adoption call for draft-cpaasch-ippm-responsiveness
> 
>  
> 
> Hi IPPM,
> 
>  
> 
> This email starts an adoption call for draft-cpaasch-ippm-responsiveness, "Responsiveness under Working Conditions". This document specifies the "RPM Test" for measuring user experience when the network is fully loaded. The intended status of the document is Experimental.
> 
>  
> 
> https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/
> https://datatracker.ietf.org/doc/html/draft-cpaasch-ippm-responsiveness-01
>  
> 
> This adoption call will last until Monday, December 20. Please review the document, and reply to this email thread to indicate if you think IPPM should adopt this document.
> 
>  
> 
> BR,
> 
> Marcus
> 
>  
> 
> _______________________________________________
> ippm mailing list
> ippm@ietf.org
> https://www.ietf.org/mailman/listinfo/ippm