Re: [ippm] Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-metric-method-06: (with DISCUSS)

"MORTON, ALFRED C (AL)" <acm@research.att.com> Wed, 03 March 2021 13:39 UTC

From: "MORTON, ALFRED C (AL)" <acm@research.att.com>
To: Magnus Westerlund <magnus.westerlund@ericsson.com>, "iesg@ietf.org" <iesg@ietf.org>
CC: "tpauly@apple.com" <tpauly@apple.com>, "ianswett@google.com" <ianswett@google.com>, "draft-ietf-ippm-capacity-metric-method@ietf.org" <draft-ietf-ippm-capacity-metric-method@ietf.org>, "ippm-chairs@ietf.org" <ippm-chairs@ietf.org>, "ippm@ietf.org" <ippm@ietf.org>
Date: Wed, 03 Mar 2021 13:39:28 +0000
Message-ID: <4D7F4AD313D3FC43A053B309F97543CF01476B57E9@njmtexg4.research.att.com>
References: <161426272345.2083.7668347127672505809@ietfa.amsl.com> <4D7F4AD313D3FC43A053B309F97543CF01476A0C0E@njmtexg5.research.att.com> <66f367953ae838c8ba7505c60e51367843117787.camel@ericsson.com> <4D7F4AD313D3FC43A053B309F97543CF01476A0FE3@njmtexg5.research.att.com> <HE1PR0702MB3772A66E2C0409F5A69DC7DA95999@HE1PR0702MB3772.eurprd07.prod.outlook.com>
In-Reply-To: <HE1PR0702MB3772A66E2C0409F5A69DC7DA95999@HE1PR0702MB3772.eurprd07.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/GER0pISM4KMZAQerRGsxciNEZmk>
Subject: Re: [ippm] Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-metric-method-06: (with DISCUSS)

Hi Magnus,

Len and I can meet tomorrow, but I haven't heard from Rudiger.
Also, I somehow failed to press *send* on this message last night ???

We have extracted your comments on the remaining areas for discussion below.

Al 

<Reordering example: >

It was the above type of case I was considering. I was worried that you don't detect some issues if you reset and have no memory of the last received packets. I think one obvious case that doesn't get reported as out of order, for really short feedback intervals, is when zero or one packets are delivered per interval. Then a reordering could look like this:

|| 1   || 3 || 2 || and without a memory of the last received packet this would not be an out-of-order event. Nor would a packet loss like this:

|| 1 ||     ||  3 || 

So when FB is of the same magnitude as the sent packet interval, this measurement method appears to have issues. I have not analyzed whether a memory of the single last-seen packet number would solve most or all issues.

At least it would also solve the issue in this case:

|| 1 2  3 4 || 6 7 8 || which I am also uncertain is detected.

So, I think more details are needed on what is required of the feedback mechanism to detect these types of issues.

@@@@Answer: the text says:
	The accumulated statistics are then reset by the receiver for the next feedback interval.

and nothing about clearing the entire memory of the measurement code.

Also, assuming you are looking for recommended ranges for parameters like FB (= FT in the text), we can certainly provide them. FT is never equal to a single packet send interval, and it is << dt, the duration of the sub-interval.
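
To make the boundary behavior concrete, here is a minimal receiver-side sketch (Python, purely illustrative; not the running code and not normative text) in which the per-interval statistics are reset at each FT expiry while the highest sequence number seen so far is retained across intervals. With that single retained value, both of the short-interval examples above are counted:

    # Illustrative sketch only: per-interval counters reset at FT, but the
    # highest sequence number seen persists across feedback intervals.
    class IntervalStats:
        def __init__(self):
            self.packets = 0
            self.seq_anomalies = 0   # reordered/duplicate arrivals and gaps

    class Receiver:
        def __init__(self):
            self.highest_seq = -1    # retained across feedback intervals
            self.stats = IntervalStats()

        def on_packet(self, seq):
            self.stats.packets += 1
            if seq <= self.highest_seq:
                self.stats.seq_anomalies += 1                           # reordered or duplicate
            elif seq > self.highest_seq + 1:
                self.stats.seq_anomalies += seq - self.highest_seq - 1  # gap (loss)
                self.highest_seq = seq
            else:
                self.highest_seq = seq

        def on_feedback_timer(self):
            report = self.stats
            self.stats = IntervalStats()   # reset the accumulated statistics only
            return report                  # highest_seq is NOT reset

In the || 1 || 3 || 2 || example, packet 3 then registers a gap in its interval and the late packet 2 registers as a reordered arrival in the next, even though each interval delivered only one packet.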

-=-=-=-=-=-=-=-=-=-=-=-=-

> > In addition if the feedback is not reliable it looses the information
> > for that interval.
> [acm]
> That's right, and:
> 1. the sending rate does not increase or decrease without feedback 2. the
> feedback is traveling the reverse path that the test do not congest with
>    test traffic
> 3. the running code has watchdog time-outs that *terminate the
> connection* if
>    either the sender or receiver go quiet
> 
> In essence, the test method is not reliable byte stream transfer like TCP.
> We can shut-down the test traffic very quickly if something is wrong and
> useful measurement is in question.
> 
> 

Al, I hope you understand that my concern here is that the document's description of the control algorithm is incomplete and is missing important protection aspects, as well as its behavior under different parameterizations in various conditions.

@@@@Answer: We can add more details, but we need to agree on the list before we add the parameters, etc., and then we are done. To be clear, we have already added many default settings for parameter values. That's far beyond what IPPM methods have required, and it reflects both the number of parameters involved in this method/algorithm and the extensive testing we have completed (where we haven't broken anything in the 'net).

@@@@Also, it makes no sense to test for equivalent results with different parameter sets. BCP 176 recommends identical parameters (the only exception is when a parameter cannot be configured to identical values).
 

-=-=-=-=-=-=-=-=-=-

> Magnus wrote:
> > And making feedback reliabel could cause worse HOL issues for
> > reacting to later feedback that are recived prior to the lost one.
> >
> [acm]
> So the alternative to un-reliable feedback can be worse?
> Good thing it's not planned.

So let's assume that the receiver measures reordering events, etc., in two feedback intervals in a row. In case the first report is lost, would there then be an advantage, from the sender's perspective, in knowing that both intervals indicated issues at this rate? It at least indicates that the measurement showing one is over capacity is consistent.

@@@@Answer: In the case above, the fast initial sender ramp-up would pause in response to the second report (Rx -= 1, like the single random loss case we discussed) and then resume at Rx += 10 steps. So, if the reordering in your example is caused by congestion, we have continued to send at roughly the rate that caused congestion during the missing status interval, and we likely have a third message/measurement indicating congestion, or two more measurements when the rate increases further.
IOW, if measurement message loss occurs at the onset of congestion, the rate increase will ensure more opportunities to communicate that state. And if measurement message loss is a random event, it just pauses the ramp-up for one FT interval.
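
For illustration only, here is a sketch of the sender-side reaction just described; the one-step decrease, the Rx += 10 ramp-up before congestion is confirmed, and the two-consecutive-interval confirmation come from the text, while the variable names and state handling are assumptions of this sketch, not the normative algorithm:

    MAX_INDEX = 1090   # illustrative: last row of the pre-built rate table

    def adjust_rate(rx, impaired, state):
        # rx: current row index into the rate table
        # impaired: feedback showed sequence anomalies OR delay range above
        #           the upper threshold
        # state: carries a 'consecutive_impaired' count and a 'congested' flag
        if impaired:
            state['consecutive_impaired'] += 1
            if state['consecutive_impaired'] >= 2:   # two in a row: congestion confirmed
                state['congested'] = True
            return max(rx - 1, 0)                    # back off one step
        state['consecutive_impaired'] = 0
        step = 1 if state['congested'] else 10       # fast ramp-up until congestion confirmed
        return min(rx + step, MAX_INDEX)

    # If a feedback message is lost, adjust_rate() is simply not called for
    # that FT period, so the sender holds its current rate (no change).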

-=-=-=-=-=-=-=-=- 

Also, will the sender know that feedback is missing? This all depends on the protocol used for the measurements.

@@@@Answer: Yes, the running code (and a draft protocol proposal) has sequence numbers.
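
As a rough illustration of that point (hypothetical sketch; the running code and the protocol proposal define their own message formats and timer values), a sender could track feedback sequence numbers and a quiet-period watchdog along these lines:

    import time

    class FeedbackMonitor:
        def __init__(self, watchdog_s=3.0):        # watchdog value is an assumption
            self.expected_seq = 0
            self.last_rx_time = time.monotonic()
            self.watchdog_s = watchdog_s

        def on_feedback(self, seq):
            lost = seq - self.expected_seq         # > 0 means feedback messages were missed
            self.expected_seq = seq + 1
            self.last_rx_time = time.monotonic()
            return lost

        def receiver_went_quiet(self):
            # watchdog: the test connection is terminated if the peer goes silent
            return time.monotonic() - self.last_rx_time > self.watchdog_s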

So I think there are two existing protocols into which I could reasonably easily implement this capacity measurement: QUIC and RTP/UDP. By taking off-the-shelf implementations, I would have frameworks that already contain a number of the features that appear to be needed to perform this measurement and receive the necessary feedback.

@@@@Answer: Yes, you could do that. Or, you could help us by adding to the running code and the (currently identical) protocol proposal the minimum security features that would pass IETF/SecDIR review. Both of these paths have a set of unknowns when it comes to the final product, for example: will a CPE sender with limited CPU power still be able to achieve 1 or 2 Gbps?

@@@@Offer: :-)
We now have the draft with the protocol description ready to go. If we can have a submission during the black-out period, then people can look at it before IPPM's last-last session on Friday. Otherwise, we'll wait until next Monday.


-=-=-=-=-=-=-=-=-=-

> [acm]
> Yes, except that the current table stops at 10Gbps. We haven't had the
> opportunity to test >10Gbps.
> 
...
> Our experience is that we avoid large overshoot with fast or slow linear
> increases. It means we've taken some care to keep the network running. 
> We haven't broken any test path yet, and we bail-out quickly and 
> completely if something goes wrong.

I understand that there are several considerations here that are in conflict:

1. Quickly finding roughly the existing capacity.
2. Avoiding significant overshoot that builds up too much queue, where possible.
3. Having sufficiently fine-grained control around the capacity so that the adjustment step finds it.

For example, on the last point, I am worried when you define the step size as 1 Gbps above 10 Gbps. My fear is that 1 Gbps of additional traffic will increase the buffer occupancy so quickly that one alternates between not-filled and over-filled buffers between stepping up and receiving the feedback in the control cycle. We have to remember that with an FB of 50 ms the delay in the control cycle will be up to RTT + 50 ms.

So I am speculating, and you haven't tested that range. If you put in table construction recommendations, then I do think you should be very clear that these are untested.

@@@@Answer:  We added three sentences describing three ranges in response to your comment, IIRC. We can easily edit the sentence in the working text: 
    Above 10 Gbps, increments of 1 Gbps are RECOMMENDED.
	
and say something safer:
    Above 10 Gbps, increments of 100 Mbps are suggested in the absence of testing.
	
or we could delete the sentence entirely!  Your choice.
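
For reference, here is a small sketch (illustrative only, rates in Mbps) of how such a table could be pre-built from the RECOMMENDED increments quoted above, with the >10 Gbps step left as a parameter so that either the current 1 Gbps wording or the safer 100 Mbps suggestion can be generated:

    def build_rate_table(max_mbps=10_000, step_above_10g_mbps=1000):
        table = [0.5]                                  # index zero
        table += list(range(1, 1001))                  # 1 Mbps steps up to 1 Gbps
        table += list(range(1100, 10001, 100))         # 100 Mbps steps up to 10 Gbps
        table += list(range(10000 + step_above_10g_mbps,
                            max_mbps + 1, step_above_10g_mbps))
        return [r for r in table if r <= max_mbps]

    # Example: a table capped at 2.5 Gbps keeps 100 Mbps granularity near
    # capacity, so an overshoot steps back by 100 Mbps rather than 1 Gbps.
    print(len(build_rate_table(2500)))   # 1016 rows: 0.5, 1..1000, 1100..2500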

-=-=-=-=-=-=-=-=-=-=-=-

> Early impressions have been formed on several erroneous assumptions
> regarding algorithm stability (operation above 1Gbps) and
> suitability for purpose (one random loss case).

I agree that there were misunderstandings and misinterpretations of what was defined. And I do have to say that some of these assumptions or guesses are due to the lack of clarity in the specification, which from my perspective strengthens my position that the load algorithm is not yet of standards-track quality.

@@@@Answer: But we are in the phase of development where new readers suggest clarifications to the text, and we are more than willing to clarify anything we wrote. We spent several weeks doing exactly that with Martin before Last Call. To be fair, you did not have much time to review this memo before casting your ballot. 

@@@@Proposal: Would it help, in your view, to make the load adjustment pseudocode in the Appendix a normative supplement to the processes we have described in words in Section 8.1? We added the pseudocode during Martin's review.


-=-=-=-=-=-=-=-=-=-=-

> We believe in rough consensus and running code.
> 

So BBR is frequently discussed in ICCRG, and BBRv1 was found to have some truly horrible stable states, like one where it kept running at 30% packet loss without reacting, as it only considered delay. BBRv2 has addressed this so that it reacts to higher degrees of packet loss.

@@@@Answer: Matt Mathis and I had lunch together at IETF the last time we were in Montreal, where he told me this BBRv1 story. Matt was very excited about what we were proposing because of the similarities to BBR, and after two hours Matt bought my lunch. During the months that followed, we performed several comparison tests between our running code and BBR versions.

So I do believe in running code too, and as long as you are monitoring your impact I think experimentation is the right way here. However, if we put the label of IETF Proposed Standard on the load algorithm, people will assume that it is safe to use. That is my issue here, and I don't think we can give that recommendation in its current form, even with a limited applicability statement.

@@@@Answer: Here's another Big Difference from the usual TCP algorithm deployment criteria, which you still seem to be applying. BBR updates have been able to take advantage of vast server ownership: when the servers are updated to use the next BBR version, the dominant downstream traffic is immediately controlled in the best way. In the running code and the protocol proposal draft, the load control algorithm is a *server-only function*, and the server resides in the network, where the algorithm can more easily be adjusted if necessary and fewer hosts need the updates (as opposed to the many, many more clients that are notoriously resistant to updates). There is a significant measure of safety provided by that strategy alone.

-=-=-=-=-=-=-=-=-=-=-=-=-

I do appreciate your work. However, I still don't think the load control algorithm description is sufficiently well specified and parameterized to be published in a standards track document. Maybe it can be made into that.

@@@@Answer: This is the path we were on, and we prefer to remain on it. Like every other reviewer, you tell us what you want to see clarified, and we will do it.

However, I think this would make an excellent experimental specification with some additional clarifications on parameter ranges and defined responses when feedback is not received in a timely fashion.

@@@@Answer: If this were March 2019, we would agree with you that the specification was experimental. But two additional years of experimentation and rigorous review have occurred since then, and every other review indicates that we're ready for the Proposed Standard phase of metric and method standardization, with further evaluation according to BCP 176.

Also I think you can clarify the protocol requirements for the measurement methodology. 

@@@@Answer: We can do more than that. The protocol requirements draft is ready for submission. It could be an Informative reference in this draft.

Then a crystal clear applicability statement for the load algorithm. 

@@@@Answer: I agree, and you can help us write the load algorithm-specific wording for the applicability section. In any case, that's part of a good compromise.

-=-=-=-=-=-=-=-=-

Here I think we can see another argument for actually splitting out the load algorithm, and it is related to the applicability statement. Do the metric and the load algorithm have the same applicability? As a metric it is not truly restricted; it is just a question of the impact of measuring the metric across the Internet.

@@@@Answer: The IPPM precedent is that we specify both the metric and the method together on the Standards Track. Even when the methods were intentionally minimal, as they were at the beginning of IPPM work more than 20 years ago, the drafts went on the Standards Track, and IPPM was tasked to figure out our own Standards Track advancement criteria (BCP 176). As a result, IPPM has developed a very complete formulation of our working methods over time.
 
-=-=-=-=-=-=-=-=-

Also, I think we should give some consideration to the update and evolution path that will hopefully occur here. I think the load algorithm is likely to benefit from further clarification of its specification in the future. Thus, having it in a separate specification makes it easier to update and maintain.

@@@@Answer: Actually, experience tells me that aspects throughout the memo might benefit from updates after tests according to BCP 176 with additional experience and implementation. For example, the process to develop STD 81 (for delay) and STD 82 (for loss) involved improvements throughout the normative text, not just the methods:
https://tools.ietf.org/html/rfc7679#section-7
https://tools.ietf.org/html/rfc7680#section-6

Also, a load-adjustment algorithm specification on its own is more likely to be used beyond its original intent (and less likely to be found if hidden inside an RFC on measurement, as is the case now).

@@@@Conclusion: So, we have followed all previous IPPM precedents very closely and gone beyond them to provide additional specifications in many ways. We have certainly done more experimentation that was directly accessible to IETF participants (e.g., at the hackathons) than for any other IPPM metric and method. We are open to all suggestions to embellish or clarify the current text. Let us work on that aspect together now.


> -----Original Message-----
> From: Magnus Westerlund [mailto:magnus.westerlund@ericsson.com]
> Sent: Tuesday, March 2, 2021 5:27 AM
> To: MORTON, ALFRED C (AL) <acm@research.att.com>; iesg@ietf.org
> Cc: tpauly@apple.com; ianswett@google.com; draft-ietf-ippm-capacity-
> metric-method@ietf.org; ippm-chairs@ietf.org; ippm@ietf.org
> Subject: RE: Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-
> metric-method-06: (with DISCUSS)
> 
> Hi Al,
> 
> Would it be possible for you and your co-authors to call in tomorrow
> Wednesday at 16:30 CET to the TSV-Office hours to discuss this document?
> 
> Meeting link:
> 
> https://ietf.webex.com/ietf/j.php?MTID=m689aa12ae4b319f4e371988b8330a863
> Meeting number:
>     185 768 2828
> Password:
>     PNvNTuQm823
> 
> 
> Please see inline.
> 
> > > > >
> > > > > ------------------------------------------------------------------
> > > > > ----
> > > > > DISCUSS:
> > > > > ------------------------------------------------------------------
> > > > > ----
> > > > >
> > > > > A) Section 8. Method of Measurement
> > > > >
> > > > > I think the metrics are fine, what makes me quite worried here is
> > > > > the measurement method. My concerns with it are the following.
> > > > >
> > > > > 1. The application of this measurement method is not clearly
> scoped.
> > <snip>
> > [acm] we agreed on text adding "access" applicability to the scope
> section.
> >
> > > > > However in
> > > > > that context I think the definition and protection against severe
> > > > > congestion has significant short comings. The main reason is that
> > > > > the during a configurable time period (default 1 s) the sender
> > > > > will attempt to send at a specified rate by a table independently
> > > > > on what happens during that second.
> > > >
> > > > [acm]
> > > > Not quite, 1 second is the default measurement interval for
> > > > Capacity, but sender rate adjustments occur much faster (and we add
> > > > a default at 50ms). This is a an important point (and one that Ben
> > > > also noticed, regarding variable F in section 8.1). So, I have added
> FT as a
> > parameter in section 4:
> > > >
> > > > o FT, the feedback time interval between status feedback messages
> > > > communicating measurement results, sent from the receiver to control
> > > > the sender. The results are evaluated to determine how to adjust the
> > > > current offered load rate at the sender (default 50ms)
> > > >
> > > > -=-=-=-=-=-=-=-
> > > > Note that variable F in section 8.1 is redundant with parameter F in
> > > Section
> > > > 4,
> > > > the number of flows (in-06). So we changed the section 8.1 variable
> > > > F to
> > > FT
> > > > in the working text.
> > >
> > > Okay, that makes things clearer. With all the equal intervals in the
> > > metrics I had missinterpreted that also the transmission would be
> > > uniform during the measurement intervals.
> > >
> > > However, when rereading Section 8.1 I do have to wonder if the
> > > non-cumaltive feedback actually creates two issues. First, it appears
> > > to loose information for reordering that crosses the time when the FT
> timer
> > fires, due to reset.
> > [acm]
> > I don't understand how the sequence error counting "loses information"
> > when reordered packets cross a measurement feedback boundary. I'm not
> > sure what aspect of measurement you are "resetting", but I assume it is
> >     "The accumulated statistics are then
> >      reset by the receiver for the next feedback interval."
> >
> > Suppose I have two measurement intervals and I receive:
> >
> > ||  1  2  3  5  6 || 4  7  8  9 ...||
> >
> > where || is the measurement feedback boundary.
> >
> > Packet 4 arrives late enough from its original position to span the
> boundary.
> > The 3->5 sequence is one sequence error, and the 4->7 sequence is
> another
> > error.
> > This example produces two sequence errors in different feedback
> intervals,
> > but that's a typical measurement boundary problem. We can't get rid of
> > measurement boundaries, and they affect many measurements.
> >
> > Note that a reordered packet contributes to IP-Layer Capacity, by
> definition.
> >
> > Perhaps you had some other scenario in mind?
> >
> 
> It was the above type of cases I was considering. I was worried that you
> don't detect some cases of issues if you reset and have no memory of last
> received packets. I think one obvious case that doesn't report as out of
> order for really short ones are this case when zero or single packets are
> delivered. Then a reordering could look like this.
> 
> || 1   || 3 || 2 || and without a memory of the last received this would
> not be an out of order event. Nor would a packet loss like this
> 
> || 1 ||     ||  3 ||
> 
> So when FB is on the same magnitude as the sent packet interval this
> measurement method appear to have issues. I have not analyzed if a single
> last seen packet number memory would solve most or all issues.
> 
> At least it would also solve the issue of this
> 
> || 1 2  3 4 || 6 7 8 || that I am also uncertain if that is detected.
> 
> So, I think more details are needed to what is required on the feedback
> mechanism to detect these type of issues.
> 
> >
> > > In addition if the feedback is not reliable it looses the information
> > > for that interval.
> > [acm]
> > That's right, and:
> > 1. the sending rate does not increase or decrease without feedback 2.
> the
> > feedback is traveling the reverse path that the test do not congest with
> >    test traffic
> > 3. the running code has watchdog time-outs that *terminate the
> > connection* if
> >    either the sender or receiver go quiet
> >
> > In essence, the test method is not reliable byte stream transfer like
> TCP.
> > We can shut-down the test traffic very quickly if something is wrong and
> > useful measurement is in question.
> >
> >
> 
> Al, I hope you understand that my concern here is that the documents
> description of the control algorithm is incomplete and are missing
> important protection aspects. As well as its behavior to different
> parameterizations in various conditions.
> 
> > > And making feedback reliabel could cause worse HOL issues for
> > > reacting to later feedback that are recived prior to the lost one.
> > >
> > [acm]
> > So the alternative to un-reliable feedback can be worse?
> > Good thing it's not planned.
> 
> So lets assume that receiver measures two feedback intervals in a row that
> there where reordering events etc. So in case the first report is lost,
> then from the senders perspective would there be an advantage to know that
> both intervals indicated issues at this rate? It at least indicates that
> the measurement that one is over capacity is consistent.
> 
> Also, will the sender know that feedback is missing. This is all depending
> on the protocol used for the measurements.
> 
> So I think there are two exiting protocols that I could reasonably easily
> implement this capacity measurement into. These two are QUIC and RTP/UDP.
> By taking of the shelf implementations I would have frameworks that would
> contain a number of features that appear needed to be able to do this
> measurement and receive feedback needed.
> 
> 
> >
> > >
> > > >
> > > >
> > > > >
> > > > > 2. The algorithm for adjusting rate is table driven but give no
> guidance
> > on
> > > > > how
> > > > > to construct the table and limitations on value changes in the
> table. In
> > > > > addition the algorithm discusses larger steps in the table without
> any
> > > > > reflection of what theses steps sides may represent in offered
> load.
> > > >
> > > > [acm]
> > > > We can add (Len suggested the following text addition):
> > > > OLD
> > > > 8.1. Load Rate Adjustment Algorithm
> > > >
> > > > A table SHALL be pre-built defining all the offered load rates that
> > > > will be supported (R1 through Rn, in ascending order, corresponding
> > > > to indexed rows in the table). Each rate is defined as datagrams
> of...
> > > >
> > > > NEW
> > > > 8.1. Load Rate Adjustment Algorithm
> > > >
> > > > A table SHALL be pre-built defining all the offered load rates that
> > > > will be supported (R1 through Rn, in ascending order, corresponding
> > > > to indexed rows in the table). It is RECOMMENDED that rates begin
> with
> > > > 0.5 Mbps at index zero, use 1 Mbps at index one, and then continue
> in
> > > > 1 Mbps increments to 1 Gbps. Above 1 Gbps, and up to 10 Gbps, it is
> > > > RECOMMENDED that 100 Mbps increments be used. Above 10 Gbps,
> > > > increments of 1 Gbps are RECOMMENDED. Each rate is defined as...
> > >
> > > Is this what you actually used in your test implementation?
> > [acm]
> > Yes, except that the current table stops at 10Gbps. We haven't had the
> > opportunity to test >10Gbps.
> >
> > > At my first glance
> > > this recommendation looks to suffer from rather severe step effects
> and
> > also
> > > make the respond to losses behave strange around the transitions.
> > Wouldn't some
> > > type of logarithmic progression be more appropriate here for initial
> > > probing?
> > [acm]
> > Len and I considered various algorithms for the search.
> >
> > Logarithmic increase typically means more rate overshoot than Linear
> > increases.
> > Unfortunately, a large rate overshoot means that the queues will fill
> and
> > need
> > a longer time to bleed-off, meaning that rate reductions will take you
> far
> > from
> > the "right neighborhood" again.
> >
> > Our experience is that we avoid large overshoot with fast or slow linear
> > increases.
> > It means we've taken some care to keep the network running. We haven't
> > broken
> > any test path yet, and we bail-out quickly and completely if something
> goes
> > wrong.
> 
> I understand that there are several considerations here that are
> contradicting.
> 
> 1. Quickly finding roughly the existing capacity
> 2. Avoid significant overshoot that build up to much queue if that is
> possible
> 3. Have sufficient fine grained control around the capacity so that
> adjustment step finds the capacity.
> 
> For example the last I am worried when you define the step size as 1 GBPS
> above 10 GBPS. My fear is that one 1 GBPS more traffic will increase the
> buffer occupancy so quickly that one alternates between not filled, and
> over filled between stepping up and receiving the feedback in the control
> cycle. We have to remember that with a FB of 50 ms the delay in the
> control cycle will be up to RTT + 50 ms.
> 
> So I am speculating and you haven't tested. So if you put in table
> construction recommendations, then I do think you should be very clear
> that these are untested.
> 
> 
> >
> > >
> > > If I have 1 GBPS line rate, there is a 1000 steps in the table to this
> value.
> > > Even if I increase with the suggested 10 steps until first congestion
> seen, it
> > > will take 100 steps, and with 50 ms feedback interval that is 5
> seconds
> > before
> > > it is in the right ball park.
> > [acm]
> > Your math is correct.
> >
> > Remember that the only assumption we made when building the table of
> > sending rates
> > is that the maximum *somewhere between 500kbps and 10Gbps*. Our lab
> > tests used
> > unknown rates between 50Mbps and 10Gbps, as the "ground truth" that we
> > asked UDP
> > and TCP-based methods to measure correctly. Measurements on production
> > networks
> > encountered many different technologies. Some subscriber rates were 5 to
> > 10 Mbps
> > on upstream.
> >
> > > And I get one random loss at 10 mbps, then its 990
> > > steps, In such a situation the whole measurement period (10 s) would
> be
> > over
> > > before one has reached actual capacity.
> > [acm]
> > I'm sorry, that's not quite correct, assuming delay range meets the
> criteria
> > below, which would be consistent with "one random loss".
> >
> 
> Yes, my mistake, but let me replace one random loss with one congestion
> event due to a single intermittent transaction. So harder to trigger if
> the capacity is large, easier if it is low and then it matters less. So
> this reduced the issue significantly.
> 
> > The text says:
> >   If the feedback indicates that sequence number anomalies were detected
> > OR
> >   the delay range was above the upper threshold, the offered load rate
> is
> >   decreased.  (by one step)
> >
> > But when the next feedback message arrives with no loss, and the
> > "congestion"
> > state has not been declared, the relevant text is:
> >
> >   If the feedback indicates that no sequence number anomalies were
> > detected AND
> >   the delay range was below the lower threshold, the offered load rate
> is
> > increased.
> >   If congestion has not been confirmed up to this point, the offered
> load rate
> > is
> >   increased by more than one rate (e.g., Rx+10).
> >
> > and we return to the high speed increases, because:
> >
> >   Lastly, the method for inferring congestion is that there were
> sequence
> >   number anomalies AND/OR the delay range was above the upper threshold
> > for
> >   *two* consecutive feedback intervals.
> >
> > So, there is a single step back for the single random loss, but then
> > immediately
> > back to Rx+10 increases.
> >
> >
> 
> Ok. Good that this is less than a issue then I thought.
> 
> > >
> > > To me it appears that the probing (slow start) equivalent do need
> > logarithmic
> > > increase to reach likely capacity quickly. Then how big the adjustment
> is
> > > actually dependent on what extra delay one consider the target for the
> > test.
> > [acm]
> > Our delay variation values are low, but need to accommodate the
> relatively
> > high
> > delay variation of some access technologies. We learned this during our
> > testing
> > on production networks.
> >
> > > Having a step size of 1 GBPS if probing a 2.5 GBPS path would likely
> make it
> > > very hard to keep the delay in the intended interval when it would
> > fluxtuate
> > > between 500 mbps to much traffic and then 500 mbps to little. Sure
> with
> > > sufficiently short FT it will likely work in this algorithm. However,
> I wonder
> > > about regulation stability here for differnet RTT, FTs and buffer
> depth
> > > fluxtuations.
> > [acm]
> > I'm sorry, that's not quite correct, the text we proposed to add says:
> >
> >    It is RECOMMENDED that rates begin with 0.5 Mbps at index zero,
> >    use 1 Mbps at index one, and then continue in 1 Mbps increments to 1
> > Gbps.
> >    Above 1 Gbps, and up to 10 Gbps, it is RECOMMENDED that 100 Mbps
> > increments be used.
> > and                                                        ^^^^^^^^
> >    Above 10 Gbps, increments of 1 Gbps are RECOMMENDED.
> >
> > Your example falls in the 1Gbps to 10Gbps range, where table increments
> are
> > 100Mbps.
> >
> > >
> > > From my perspective I think this is a indication that the load rate
> > adjustment
> > > algorithm is not ready to be a standards track specification.
> > [acm]
> > Given several corrections above, the authors ask that you reconsider
> your
> > position.
> > Please read on.
> >
> > >
> > > I would recommend that you actually take out the control algorithm and
> > write a
> > > high level functional description of what needs to happen when
> measuring
> > this
> > > capacity.
> > [acm]
> > We worked for several weeks in December to make the current high-level
> > description
> > an accurate one. The IESG review has resulted added some useful details
> > that
> > I have shared along the way.
> >
> > >
> > > If I understand this correctly the requirements on the measurement are
> > the
> > > following.
> > >
> > > - Need to seek the available capacity so that several measurement
> period
> > are
> > > likely to be done at capacity
> > > - Must not create persistent congestion as the capacity measurement
> > should be
> > > based on traffic capacity that doesn't cause more standing queue that
> X,
> > where X
> > > is some additiona delay in ms compared to minimal one way delay. And X
> is
> > > actually something that is configurable for a measurement campagin as
> > capacity
> > > for a given one way delay and delay variation can be highly relevant
> to
> > > know.
> > [acm]
> > These are not identical to our requirements.  For example:
> >  - Both the metric and method consider a number of measurement
> intervals,
> > and the Maximum IP-Layer Capacity is determined from one (or more) of
> the
> > intervals.
> 
> 
> 
> >
> >
> > >
> > > What else is needed?
> > >
> > > Are synchronized clocks needed or just relative delay changes
> necessary?
> > [acm]
> > Just delay variation, and the safest is RT delay variation.
> 
> So sender side transmission timestamping and the receiver measure delay
> variations based on these transmission times would suffice.
> 
> Good.
> 
> >
> > >
> > > >
> > > > -=-=-=-=-=-=-=-
> > > >
> > > > >
> > > > > 3. Third the algorithms reaction to any sequence number gaps is
> > dependent on
> > > > > delay and how it is related to unspecified delay thresholds. Also
> no text
> > > > > discussion how these thresholds should be configured for safe
> > operation.
> > > >
> > > > [acm]
> > > > We can add some details in the paragraph below:
> > > > OLD
> > > > If the feedback indicates that sequence number anomalies were
> > detected OR
> > > > the delay range was above the upper threshold, the offered load rate
> is
> > > > decreased.
> > > > Also, if congestion is now ...
> > > > NEW
> > > > If the feedback indicates that sequence number anomalies were
> > detected OR
> > > > the delay range was above the upper threshold, the offered load rate
> is
> > decreased.
> > > > The RECOMMENDED values are 0 for sequence number gaps and 30-90
> > ms for lower
> > > > and upper delay thresholds. Also, if congestion is now ...
> > >
> > > Ok, but the delay values as I noted before highly dependent of what my
> > goal with
> > > the capacity metric is. If I want to figure out the capacity for like
> say XR or
> > > cloud gaming applications that maybe have much lower OWD variances and
> > absolute
> > > values so maybe my values are 10-25 ms.
> > [acm]
> > We intend to measure the limit of the access technology, with a set of
> > parameters
> > that work well for all technologies we have tested so far.
> >
> > Notice that I didn't type the word "application" above. Or "user
> experience".
> >
> > Sure, there is sensitivity to the parameters chosen, and we supplied our
> > well-tested defaults to maximize results comparability and technology
> > coverage
> > (with no twiddling).
> >
> >
> > >
> > > How much explaration have you done of the control stability over a
> range
> > > of parameters? Do you have any material about that?
> > [acm]
> > Yes. There are several parameter ranges we examined.
> >
> > If we set the delay thresholds high enough, we see the RTT grow as the
> > queues
> > fill to max and tail-drop finally restricts the rate. We can measure the
> > extent of buffer bloat this way (if it is present). It's Not our goal.
> >
> > We have used lower thresholds of delay variation, which work fine on the
> > PON 1Gbps access services.
> >
> > In the collaborative testing of the Open Broadband Open Source project,
> > one participant contributed tests with a 5G system that exhibited
> systematic
> > low-level loss and reordering in his lab. For this unusual case, Len
> added
> > the features to set a loss threshold above zero, and to tolerate
> reordered
> > and duplicate packets with no penalty in rate adjustment.
> >
> > We have tried a range of test durations (I=20, 30 for example).
> >
> > We have tried different steep-ness of ramp-up slope. Rate += 10 steps
> > works well, even when measuring rates separated by 3 orders of
> magnitude.
> >
> >
> > But for the co-authors, it was more important that the load adjustment
> > search
> > produce the correct Maximum IP-Layer Capacity for each of the lab
> > conditions
> > we created (including challenging conditions with competing traffic,
> long
> > delay
> > etc.), and the many access technologies we tested in production use
> > (where again we encountered similar challenging conditions).
> >
> >
> 
> Thanks for the clarifications.
> 
> > >
> > >
> > > >
> > > > -=-=-=-=-=-=-=-
> > > >
> > > > Please also note many requirements for safe operation in Section 10,
> > > > Security Considerations.
> > > >
> > > > >
> > > > > B) Section 8. Method of Measurement
> > > > >
> > > > > There are no specification of the measurement protocol here that
> > provides
> > > > > sequence numbers, and the feedback channel as well as the control
> > channel.
> > > >
> > > > [acm]
> > > > That is correct. The Scope does not include protocol development.
> > > >
> > > > > Is this intended to use TWAMP?
> > > >
> > > > [acm]
> > > > Maybe, but a lot of extensions would be involved.
> > >
> > >
> > >
> > > >
> > > > >
> > > > > From my perspective this document defines the metrics on standards
> > track
> > > > > level. However, the method for actually running the measurements
> are
> > not
> > > > > specified on a standards track level.
> > > >
> > > > [acm]
> > > > In IPPM work, the methods of measurement are described more broadly
> > than
> > > > the metrics, as actions and operations the Src and Dst hosts perform
> to
> > > > send and receive, and calculate the results.
> > > >
> > > > IPPM Methods of Measurement have not included protocol
> > requirements in
> > > > the past, in any of our Standards Track Metrics RFCs.  In fact, we
> > developed
> > > > a measurement-specific criteria for moving our RFCs along the
> standards
> > track
> > > > that has nothing to do with protocols or interoperability.
> > > > See BCP 176 aka RFC 6576:
> > https://protect2.fireeye.com/v1/url?k=65acc415-3a37fcc7-65ac848e-
> > 866132fe445e-b4323cb4b6e4206f&q=1&e=5fe1792c-f641-4903-9a32-
> > 317350790872&u=https%3A%2F%2Furldefense.com%2Fv3%2F__https%3A%
> > 2F%2Ftools.ietf.org%2Fhtml%2Frfc6576__%3B%21%21BhdT%213
> > RMQMZOAFloQ1FvtC_LIrUsDqBGxQeJZ1F4GFt8TpbQ60w841wT54sLLoKS9-
> > Yc$
> > > > IP Performance Metrics (IPPM) Standard Advancement Testing
> > > >
> > > > > No one can build implementation.
> > > >
> > > > [acm]
> > > > I'm sorry, but that is not correct.  Please see Section 8.4.
> > >
> > > Sorry, that was poorly formulated. I mean that you can't give this
> > specification
> > > to a guy on a island without external communication and have them
> > > implemented it and it will work with someone else implementation.
> > [acm]
> > But then you are asking for protocol-level interoperability, Magnus.
> > That is not our scope, or the scope of any IPPM Metric and Method RFCs.
> > The procedures of BCP 176 tell us when independent implementations
> > produce
> > equivalent results, which is IETF's definition of "works with" for
> metrics
> > and methods.
> >
> >
> > > You have clearly
> > > implemented a solutiont that works for some set of parameters. And I
> am
> > asking how
> > > much of the reasonable parameter space you have tested.
> > [acm]
> > Right. I answered this question qualitatively above, but the co-authors
> > claim that an equally important question is the breadth of access
> > technologies
> > we have tested.
> >
> > The tests conducted over 2+ years used the following production access
> > types:
> >
> > 1. Fixed: DOCSIS 3.0 cable modem with "triple-play" capability and
> embedded
> > WiFi and
> > Wired GigE switch (two manufacturers).
> > 2. Mobile: LTE cellular phone with a Cat 12 modem (600 Mbps Downlink, 50
> > Mbps uplink).
> > 3. Fixed: passive optical network (PON) "F", 1 Gbps service.
> > 4. Fixed: PON "T", 1000 Mbps Service.
> > 5. Fixed: VDSL, service, at various rates <100 Mbps.
> > 6. Fixed: ADSL, 1.5 Mbps.
> > 7. Mobile: LTE enabled router with ETH LAN to client host
> > 8. Fixed: DOCSIS 3.1 cable modem with "triple-play" capability and
> embedded
> > WiFi and
> > Wired GigE switch (two other manufacturers).
> >
> >
> > >
> > > Based on this discussion I don't think I can build an implementation
> that
> > > fulfills the measurement goals, becasue I have questions about them.
> And I
> > > suspect it would take substantial amount of experimentation to get it
> to
> > > work correctly over a broader range of input parameters.
> > [acm]
> > Now we refer you to the references in the memo, particularly Appendix X
> > of Y.1540:
> >
> >    [Y.1540]   Y.1540, I. R., "Internet protocol data communication
> >               service - IP packet transfer and availability performance
> >               parameters", December 2019,
> >               <https://www.itu.int/rec/T-REC-Y.1540-201912-I/en>.
> >
> >    [Y.Sup60]  Morton, A., Rapporteur, "Recommendation Y.Sup60,
> > Interpreting
> >               ITU-T Y.1540 maximum IP-layer capacity measurements", June
> >               2020, <https://www.itu.int/rec/T-REC-Y.Sup60/en>.
> >
> > and Liaisons, were many of the experimental results are summarized:
> >
> >    [LS-SG12-A]
> >               12, I. S., "LS - Harmonization of IP Capacity and Latency
> >               Parameters: Revision of Draft Rec. Y.1540 on IP packet
> >               transfer performance parameters and New Annex A with Lab
> >               Evaluation Plan", May 2019,
> >               <https://datatracker.ietf.org/liaison/1632/>.
> >
> >    [LS-SG12-B]
> >               12, I. S., "LS on harmonization of IP Capacity and Latency
> >               Parameters: Consent of Draft Rec. Y.1540 on IP packet
> >               transfer performance parameters and New Annex A with Lab &
> >               Field Evaluation Plans", March 2019,
> >               <https://datatracker.ietf.org/liaison/1645/>.
> >
> > Also, see our slides from the Hackathons at IETF 105 and 106, and the
> > IPPM WG sessions slides beginning with IETF-105, July 2019.
> > You might also look into the discussions on the mailing list.
> > Some other results are available to those with ITU-TIES accounts.
> >
> > The load adjustment algorithm itself was improved after experimentation,
> > adding the fast ramp-up with rate += 10 when feedback indicates no
> > impairments. The original/current algorithms appear in Y.1540 Annexes A
> > and B, respectively.
> >
> >
> > >
> > >
> > >
> > > >
> > > > > And if the section is
> > > > > intended to provide requirements on a protocol that performs these
> > > > > measurements
> > > > > I think several aspects are missing. There appear several ways
> forward
> > here
> > > > > to
> > > > > resolve this; one is to split out the method of measurement and
> define
> > it
> > > > > separately to standard tracks level using a particular protocol,
> another
> > > > > is to write it purely as requirements on a measurement protocols.
> > > >
> > > > [acm]
> > > > As stated above, connecting a method with a single protocol is not
> IPPM's
> > way.
> > >
> > > That is fine. However, I find the attempt to specify a specific load
> regulator
> > > in the method of measurement to take this specification beyond a
> general
> > method
> > > of measurment. The high level requirement appear to be that to
> correctly
> > find
> > > the capacity, and that requires that one load to the point where
> buffers are
> > > filled sufficiently to introduce extra delay or where AQM starts
> dropping or
> > > marking some of the load. Thus, I am questioning if the described
> algorithm
> > will
> > > adqueately solve that issue over a wider range of parameters.
> > >
> > > So if you have more information to show at least which range it has
> been
> > proven
> > > to do its work and with what input parameters?
> > [acm]
> > Yes. See ~10 references  and replies above.
> >
> > > I hope you understand that I
> > > expect this load control algorithm to get simularly scrutinized to
> congestion
> > > control algorithms that we standardize in IETF.
> > [acm]
> > Yes, although it is a surprising at this point, we certainly understand
> your
> > current position.
> >
> > However, Rüdiger made a relevant point in our discussions (why our
> > algorithm's
> > role is different from Transport Area congestion control algorithms,
> > and need not be subjected to the same scrutiny):
> >
> >     This is a measurement method designed for infrequent and sensible
> > maximum
> >     capacity assessment, instantiated only in an OAM or diagnostic tool.
> >
> >     It is not a blueprint for a congestion control algorithm (CCA) in a
> bulk
> >     transfer protocol that runs by default and is globally deployed by
> >     commodity stacks.
> >
> > We don't want to re-create any TCP CCA: they weren't designed for
> accurate
> > measurement of maximum rate (as the referenced measurements show).
> >
> > It appears that the most recent (2018) standardized and widely used
> > CCA is Cubic (RFC 8312 https://tools.ietf.org/html/rfc8312 ).
> >
> > The great TCP CCA Census (2019)
> > https://datatracker.ietf.org/meeting/109/materials/slides-109-iccrg-the-
> > great-internet-tcp-congestion-control-census-00
> >
> > finds that BBR versions account for greater popularity on Alexa-250
> sites
> > (25.2%) than CUBIC, and more than 40% of downstream traffic on the
> > Internet
> > (slide 14). I found some references to BBR in ICCRG drafts, but no RFC.
> > I would guess that BBR has already provided CCA for more traffic than
> the
> > test traffic complying to this memo ever will.
> >
> > Our overall method works similar to BBR: Received rate per RTT is the
> > feedback to the sender.
> >
> > We added Applicability to the access portion, not the global Internet
> > where standardized transport protocol CCAs must operate.
> >
> > We are not specifying a transport CCA that must support many
> applications.
> > Measurement is the *only* application (for an IP-layer metric).
> >
> > Early impressions have been formed on several erroneous assumptions
> > regarding algorithm stability (operation above 1Gbps) and
> > suitability for purpose (one random loss case).
> 
> I agree to misunderstandings and misinterpretation of what was defined.
> And I do have to say that some of these assumptions or guesses are due to
> the lack of clarity of the specification. Which from my perspective
> strengthens my position that this is not yet standards track quality
> around the load algorithm.
> 
> >
> > Ad-hoc methods resulting in TCP-based under-estimates of Internet Speed
> > are the problem we attack here! Implementation of harmonized industry
> > standards are the solution.
> >
> > We believe in rough consensus and running code.
> >
> 
> So BBR is frequently discussed in ICCRG, and the BBR v1 was found to have
> some truly horrible stable state. Like one where it was running on at 30%
> packet loss without reacting as it only contained delay. BBRv2 has
> addressed this so it will react to higher degrees of packet loss.
> 
> So I do believe in running code to and as long as you are monitoring your
> impact I think experimentation is the right way here. However, if we put
> the label IETF proposed standard on the load algorithm people will assume
> that it is safe to use. That is my issue here and I don't think we can
> give that recommendation even with a limited applicability statement in
> the current form.
> 
> 
> >
> > We also ask that you understand our position, that tests with many
> different
> > access technologies in production, and careful comparison of ad-hoc
> > methods
> > claiming to make the similar measurements in the lab and the field are
> > equally,
> > if not more important than even more parameter investigations at this
> point.
> > I have personally been running lab tests since September 2018 with
> various
> > tools.
> > Len released his first version of the code in Feb 2019,
> > and we immediately focused on tests with his utility instead of UDP
> packet
> > blasters like iPerf, and Trex with my own Binary Search with Loss
> verification
> > algorithm that we use in device benchmarking (cross-over with BMWG).
> >
> >
> > >
> > > I would very much prefer to take out the load algorithm and place it
> in a
> > > seperate document where it can have a tighter scope description and
> more
> > > discussion about about that it does its job.
> > >
> > > I hope this clarifies what my concerns are with this document in its
> > > current form.
> > [acm]
> > Yes, and we have rather exhaustively argued to go ahead here, especially
> > since a much less-frequently-used testing-only algorithm is a different
> > situation
> > than specifying a CCA for global TCP deployment: transport area's usual
> role.
> >
> > We hope you can now appreciate the years of study, experimentation and
> > running code that you apparently first encountered last Thursday,
> > and will look into some more of the supporting background material.
> >
> 
> I do appreciate your work. However, I still don't think the load control
> algorithm description is sufficiently well specified and parameterized to
> be published in a standards track document. Maybe it can be made into
> that. However, I think this would make an excellent experimental
> specification with some additional clarifications on parameters ranges and
> defined responses when feedback is not received in a timely fashion. Also
> I think you can clarify the protocol requirements for the measurement
> methodology. Then a crystal clear applicability statement for the load
> algorithm.
> 
> Here I think we can see another point in actually splitting out the load
> algorithm and that is related to the applicability statement. Does the
> metric and the load algorithm have the same applicability? As a metric it
> is not truly restricted, it is just a question of the impact of measuring
> the metric across the internet.
> 
> Also, I think we should consider somewhat the update and evolution path
> that hopefully will occur here. I think the load algorithm is likely to
> benefit from further clarification in its specification in the future.
> Thus, having it in a separate specification makes it easier to update and
> maintain.
> 
> 
> Cheers
> 
> Magnus Westerlund