Re: [ippm] Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-metric-method-06: (with DISCUSS)

Magnus Westerlund <magnus.westerlund@ericsson.com> Fri, 12 March 2021 17:36 UTC

From: Magnus Westerlund <magnus.westerlund@ericsson.com>
To: "acm@research.att.com" <acm@research.att.com>, "Ruediger.Geib@telekom.de" <Ruediger.Geib@telekom.de>
CC: "tpauly@apple.com" <tpauly@apple.com>, "ianswett@google.com" <ianswett@google.com>, "draft-ietf-ippm-capacity-metric-method@ietf.org" <draft-ietf-ippm-capacity-metric-method@ietf.org>, "ippm-chairs@ietf.org" <ippm-chairs@ietf.org>, "ippm@ietf.org" <ippm@ietf.org>, "iesg@ietf.org" <iesg@ietf.org>
Thread-Topic: Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-metric-method-06: (with DISCUSS)
Date: Fri, 12 Mar 2021 17:35:56 +0000
Message-ID: <HE1PR0702MB37726F61474EDE6742DC5008956F9@HE1PR0702MB3772.eurprd07.prod.outlook.com>
References: <161426272345.2083.7668347127672505809@ietfa.amsl.com> <4D7F4AD313D3FC43A053B309F97543CF01476A0C0E@njmtexg5.research.att.com> <66f367953ae838c8ba7505c60e51367843117787.camel@ericsson.com> <4D7F4AD313D3FC43A053B309F97543CF01476A0FE3@njmtexg5.research.att.com> <HE1PR0702MB3772A66E2C0409F5A69DC7DA95999@HE1PR0702MB3772.eurprd07.prod.outlook.com> <4D7F4AD313D3FC43A053B309F97543CF0147CA50DA@njmtexg5.research.att.com> <HE1PR0702MB377281B141FBB6D63015CC1895969@HE1PR0702MB3772.eurprd07.prod.outlook.com> <FRYP281MB01127EE4544CADF8B6E6E2E19C969@FRYP281MB0112.DEUP281.PROD.OUTLOOK.COM> <HE1PR0702MB37725A93AE2748D0619DB95D95969@HE1PR0702MB3772.eurprd07.prod.outlook.com> <4D7F4AD313D3FC43A053B309F97543CF0147CA565A@njmtexg5.research.att.com> <FRYP281MB01125B1728BCEF1D721B81EE9C939@FRYP281MB0112.DEUP281.PROD.OUTLOOK.COM> <4D7F4AD313D3FC43A053B309F97543CF0147CA9031@njmtexg5.research.att.com> <VI1PR0702MB37757902F5B59F99C5D8F24995909@VI1PR0702MB3775.eurprd07.prod.outlook.com> <FRYP281MB01124DAB19CA73818AE5F3759C909@FRYP281MB0112.DEUP281.PROD.OUTLOOK.COM> <4D7F4AD313D3FC43A053B309F97543CF0147CACA0C@njmtexg5.research.att.com>
In-Reply-To: <4D7F4AD313D3FC43A053B309F97543CF0147CACA0C@njmtexg5.research.att.com>
Accept-Language: sv-SE, en-US
Content-Language: en-US
Content-Type: multipart/signed; protocol="application/x-pkcs7-signature"; micalg="SHA1"; boundary="----=_NextPart_000_056D_01D7176E.89C99590"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/_QCTc3eOkX5RiohPfmIm4VDCwzg>
Subject: Re: [ippm] Magnus Westerlund's Discuss on draft-ietf-ippm-capacity-metric-method-06: (with DISCUSS)

Hi,

I have provided some comments inline.

As I am no longer a member of the IESG list, you will need to make sure that
I am in the To or CC field so that I receive any future emails on this topic
that you want my feedback on.

> >
> > Section 2.
> >
> > I think the scope text is still fairly open. Yes it is clear that the
> > load algorithm is only intended for measurements. However, the usage
> > is not particular limited, especially in regards to my main concern of
> > edge to central nodes across multiple AS in the Internet like the
> > different TCP speed tests often are deployed. The security
> > consideration requirements are good for a number of reasons but put no
> > limitations on that aspect. So I would prefer a more explicit
> > statement here. I think an important aspect here is that any ISP
> > seeing issues from these measurements should know who to talk to. I
> > don't know how to best formulate that.

Okay, I will try again to reword the scope next week.

> >
> >
> > Section 8.1:
> >
> > So I am trying to understand the implication of the load algorithm at
> > higher rates and how recommendations works out in relation to the
> > definition of the rate.
> >
> > So the document says:
> >
> >     Each rate is defined as
> >    datagrams of size ss, sent as a burst of count cc, each time interval
> >    tt (default for tt is 1ms, a likely system tick-interval)
> >
> > So I think this definition is fine for lower rate as the number of
> > packets in each 1 ms burst is fairly small and the buffer it hits will
> > likely be relatively large compared to the increase in load. However
> > at higher rates like beyond 10 GBPS where 1 GBPS steps are
> > recommended. So transmitting at bursts every 1 ms intervals means that
> > one are transmitting 833 packets each burst at 10 GBP rate of 1500
> > bytes size, so likely even higher for more moderate 1200 byte size
> > packets. That is almost 1,3 mb of data. So where pacing may be quite
> > good at lower bit-rates < 1Gbps I wonder if it starts breaking down at
> > higher rates, which appears to in the region where buffers becomes
> > more shallow due to the cost of having large buffers and where good
> > pacing reduces the need for buffering. I would also note that the
> > reaction time for the control can be 1RTT + 50 ms which thus the
> > increase in offered load for a step size becomes 10s of MB during a
> > regulation period.
> >
> > As the load algorithm hasn't been tested beyond 10Gbps and it appears
> > that the numbers can start to become more problematic at these speeds,
> > wouldn't it be better to say that this is not intended beyond 10 Gbps.
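
A minimal sketch of the burst arithmetic quoted above, assuming the rate is sustained purely by one burst per interval tt. The function names are illustrative and not taken from the draft, and the 50 ms RTT in the second calculation is an assumed example value.

```python
def burst_for_rate(rate_bps, packet_bytes, tt_s=0.001):
    """Return (packets per burst, bytes per burst) needed to sustain rate_bps
    when one burst is sent every tt_s seconds."""
    bytes_per_burst = rate_bps * tt_s / 8
    return bytes_per_burst / packet_bytes, bytes_per_burst


def extra_load_bytes(step_bps, rtt_s, ft_s=0.050):
    """Extra bytes offered by one rate step during a reaction time of
    one RTT plus one feedback interval (ft_s)."""
    return step_bps * (rtt_s + ft_s) / 8


pkts_1500, burst_bytes = burst_for_rate(10e9, 1500)
pkts_1200, _ = burst_for_rate(10e9, 1200)
step_bytes = extra_load_bytes(1e9, rtt_s=0.050)  # assumed RTT of 50 ms

print(f"1500-byte packets per 1 ms burst at 10 Gbps: {pkts_1500:.0f}")
print(f"1200-byte packets per 1 ms burst at 10 Gbps: {pkts_1200:.0f}")
print(f"bytes per burst: {burst_bytes / 1e6:.2f} MB")
print(f"extra load of a 1 Gbps step over RTT + FT: {step_bytes / 1e6:.1f} MB")
```

Under these assumptions a 10 Gbps rate needs roughly 833 packets of 1500 bytes (about 1.25 MB) per 1 ms burst, and a single 1 Gbps step adds on the order of 10 MB before the control loop can react.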
> >
> > On the table I have the following comments:
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | Parameter    | Default     | Tested Range | Expected Safe Range   |
> >    |              |             | or values    | (not entirely tested, |
> >    |              |             |              | other values NOT      |
> >    |              |             |              | RECOMMENDED)          |
> >    +--------------+-------------+--------------+-----------------------+
> >    | FT, feedback | 50ms        | 20ms, 100ms  | 5ms <= FT <= 250ms    |
> >    | time         |             |              | Larger values may     |
> >    | interval     |             |              | slow the rate         |
> >    |              |             |              | increase and fail to  |
> >    |              |             |              | find the max          |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > I would note that a FT of 5 ms will have the potential to result in
> > significant fluxtuations in some systems like mobile systems as the
> > scheduler time is actually likely to be longer than 5 ms.
> [acm]
> Then we can increase the low end of the range, what value would you
> prefer??

So, I would prefer to raise this to at least 20 ms. However, I think that
figure is for general robustness when one has no knowledge of the network
being measured.

>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | Feedback     | L*FT, L=10  | L=100 with   | 0.5sec <= L*FT <=     |
> >    | message      | (500ms)     | FT=50ms      | 30sec Upper limit for |
> >    | timeout      |             | (5sec)       | very unreliable test  |
> >    | (stop test)  |             |              | paths only            |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > Even the default means that one looses 10 feedback packets in a row.
> > That is a lot and shows that one have a serious interruption on the
> > return path. Already loosing 3 feedback packets in a row indicates
> > that one have significant outages if this is lost.
> >
> > Secondly, this is formulated only based on  intervals of FT. For
> > startup the RTT is relevant factor. So I think there are several
> > factors here for the timeout that maybe need to be teased apart? So
> > initially one offers a very low load and one may not have a good
> > measurement on base RTT.
> [acm]
> All the timeouts are conducted on inter-packet arrival times at a single
> interface.
> This removes the dependency on RTT.

Yes, I think you can define this as the time since you last received a
feedback message. However, my concern is that the appropriate value actually
depends on the RTT, because the RTT affects the control loop. The amount of
damage that a sequence of lost feedback messages can cause depends on the
total time, since the RTT adds to the reaction time. So clearly, receiving no
feedback message for 10 FT intervals is bad, as it represents a lot of
missing packets.

>
> > Thus, time to first feedback is okay to be fairly large and 500 ms is
> > likely okay but quite longer than expected for an access to local
> > internet exchange measurement. However, when one scale up the rate I
> > think these values are way to long as the total amount of traffic sent
> > without feedback becomes quite significant. Receiving no feedback for
> > more than 10 reporting intervals are already way to long. And to state
> > that 30 seconds would be an acceptable value I can't support even for a
> > measurement tool.
> [acm]
> We are willing to revise values, especially 30 sec, but would like to avoid
> prematurely shutting down measurements due to (what we consider to be)
> short interruptions.

I understand the view that the interruption is short. However, continuing to
transmit for several seconds without any feedback is also a significant
issue. So I think anything beyond 1 second is not really acceptable.
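
To put numbers on this, here is a rough sketch of how much data a sender keeps injecting while no feedback arrives. It assumes the sender holds its current rate for the whole silent period; the function name and the 10 Gbps example rate are illustrative choices, not values mandated by the draft.

```python
def bytes_sent_without_feedback(rate_bps, silence_s):
    """Bytes transmitted while the sender holds rate_bps for silence_s
    seconds without receiving any feedback message."""
    return rate_bps * silence_s / 8


RATE_BPS = 10e9  # assumed example: sender currently at 10 Gbps

cases = {
    "default timeout, L*FT = 10 * 50 ms": 0.5,
    "1 second": 1.0,
    "table upper limit, 30 seconds": 30.0,
}
sent = {name: bytes_sent_without_feedback(RATE_BPS, s) for name, s in cases.items()}

for name, nbytes in sent.items():
    print(f"{name}: {nbytes / 1e9:.3f} GB sent without feedback")
```

At the assumed 10 Gbps, the default 500 ms timeout already corresponds to 0.625 GB of unconfirmed traffic, and the 30 second upper limit to 37.5 GB.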


>
> >
> > The definition of what "Feedback message timeout" and "Load packet
> > Timeout" is not defined. I assume that Feedback message timeout is the
> > time without receiving any feedback messages after starting a
> > measurement.
> [acm]
> That's close:
> Operation: The load packet timeout SHALL be reset to the configured value
> each time a load packet received. If the timeout expires, the receiver SHALL
> be closed and no further feedback sent.

Ok, that needs to be included.

>
>
> > Is the load packet timeout the time the receiver is waiting before
> > using signalling channel to end the measurement without receiving any
> > packets, or for the sender to receive feedback that says that no
> > packets have been received? The roles here are not clear.
> [acm]
> Operation: The feedback message timeout SHALL be reset to the configured
> value each time a feedback message is received. If the timeout expires, the
> sender SHALL be closed and no further load packets sent.

Ok. So I think there are two aspects here: something that backs off the
transmission, and something that terminates the whole measurement. Do you
really want to have these two lumped together?
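
One way to picture the separation is a watchdog with two thresholds. This is only a sketch of the idea, not the draft's procedure: the class name, the "backoff" action and both threshold values are hypothetical, chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class FeedbackWatchdog:
    """Hypothetical two-stage feedback timeout at the sender."""
    backoff_after_s: float = 0.15   # assumption: 3 missed reports at FT = 50 ms
    terminate_after_s: float = 1.0  # assumption: hard stop after 1 s of silence
    last_feedback_s: float = 0.0

    def on_feedback(self, now_s: float) -> None:
        # Any received feedback message resets both stages.
        self.last_feedback_s = now_s

    def action(self, now_s: float) -> str:
        silence = now_s - self.last_feedback_s
        if silence >= self.terminate_after_s:
            return "terminate"  # close the sender, send no further load packets
        if silence >= self.backoff_after_s:
            return "backoff"    # reduce the sending rate but keep the test alive
        return "continue"


wd = FeedbackWatchdog()
wd.on_feedback(10.0)
print(wd.action(10.05), wd.action(10.2), wd.action(11.5))
```

With a split like this, a short interruption would only lower the offered load, while the test itself is closed only after a longer silence.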

>
> >
> > Sending packets for several seconds without seeing any result appears
> > problematic and allowing values beyond several seconds looks broken.
> [acm]
> Then let us make some revisions together. You've seen our proposals.
> Our concern is stopping a test unnecessarily. There can be a happy
> conclusion.

Yes, and as I said above, do you really want the timeout that ends the
measurement to be tied to the timeout that triggers a rate reduction?

>
>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | table index  | 0.5Mbps     | 0.5Mbps      | when testing <=10Gbps |
> >    | 0            |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >    | table index  | 1Mbps       | 1Mbps        | when testing <=10Gbps |
> >    | 1            |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > Why is this value not relevant when testing beyond 10 Gbps, the ramp up
> > time becomes to long with these values or?
> [acm]
> "not relevant" is different from the title of the column, which is:
>
> Expected Safe Range:  when testing <=10Gbps
>
> The parameters above and several others simply determine where a test
> starts.
>
> This parameter:
> +--------------+-------------+--------------+-----------------------+
> | table index  | 1Mbps       | 1Mbps -      | same as tested        |
> | (step) size  |             | 1Gbps        |                       |
> +--------------+-------------+--------------+-----------------------+
>

Okay, I will consider if I have any proposal for how to make this clearer.


> >
> >    | ss, UDP      | none        | <=1222       | Recommend max at      |
> >    | payload      |             |              | largest value that    |
> >    | size, bytes  |             |              | avoids fragmentation  |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > So isn't there a mismatch between the metric and the load algorithm values
> > here? With the rate definition in Section 8.1 being defined as based on
> > "ss" that UDP payload bytes, rather than IP packet sizes that are used?
> [acm]
>
> Not really, UDP is mandatory in the metric definition.

Hmm, I think then there is a mismatch here. Section 6.3 states:

n0 is the total number of IP-layer header and payload bits that
      can be transmitted in standard-formed packets from the Src host
      and correctly received by the Dst host during one contiguous sub-
      interval, dt in length, during the interval [T, T+I],

So the metric appears to be defined based on IP packets, not on the UDP
payload size. Thus, one needs to convert between the value of ss and the
actual packet size sent. I think the rate value in the table will be
misinterpreted: a rate of 560 Mbit/s based on ss = 1210 bytes would in fact
be an IPv6 capacity of 582.2 Mbit/s.
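
The conversion can be sketched as follows, assuming an 8-byte UDP header, a 40-byte IPv6 header without extension headers and a 20-byte IPv4 header without options. The function and constant names are illustrative.

```python
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20  # assumes no IPv4 options
IPV6_HEADER_BYTES = 40  # assumes no IPv6 extension headers


def ip_layer_rate(udp_payload_rate, ss, ip_header_bytes):
    """Scale a rate counted in UDP payload bits to IP-layer bits
    (IP header + UDP header + payload) for payload size ss in bytes."""
    ip_packet_bytes = ss + UDP_HEADER_BYTES + ip_header_bytes
    return udp_payload_rate * ip_packet_bytes / ss


rate_v6 = ip_layer_rate(560.0, ss=1210, ip_header_bytes=IPV6_HEADER_BYTES)
rate_v4 = ip_layer_rate(560.0, ss=1210, ip_header_bytes=IPV4_HEADER_BYTES)
print(f"IPv6 IP-layer rate: {rate_v6:.1f} Mbit/s")
print(f"IPv4 IP-layer rate: {rate_v4:.1f} Mbit/s")
```

For ss = 1210 bytes the IPv6 packet is 1258 bytes, so 560 Mbit/s of UDP payload corresponds to about 582.2 Mbit/s at the IP layer.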

>
> >
> > I understand that one want to ensure that one measure using a size that
> > actually works in the path. However, I think one should be warned that one
> > might run into packet rate limitations rather than byte limits if one
> > would use too small.
> [acm]
> Ok
> "Use of too-small payload size might result in unexpected sender
> limitations."
>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | cc, burst    | none        | 1 - 100      | same as tested        |
> >    | count        |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > So the cc value is dependent on target rate and the value of ss and tt. So
> > should it be included in this table? Especially as 100 is not sufficient
> > for multi-gigabit speeds with a tt of 1 ms.
> [acm]
> We can remove it if the values cause confusion.
>

I think that should be done, unless it has a real purpose here.
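
For reference, a small sketch of why cc = 100 is not enough at multi-gigabit rates with tt = 1 ms. It assumes the rate is counted in UDP payload bits and that exactly one burst is sent per interval; the function names are illustrative.

```python
import math


def max_rate_bps(cc, ss, tt_s=0.001):
    """Highest UDP-payload rate reachable with cc datagrams of ss bytes
    sent once per interval tt_s."""
    return cc * ss * 8 / tt_s


def burst_count_needed(rate_bps, ss, tt_s=0.001):
    """Smallest burst count cc that reaches rate_bps with payload size ss."""
    return math.ceil(rate_bps * tt_s / (ss * 8))


limit = max_rate_bps(cc=100, ss=1222)
cc_1g = burst_count_needed(1e9, ss=1222)
cc_10g = burst_count_needed(10e9, ss=1222)
print(f"cc=100, ss=1222, tt=1 ms tops out at {limit / 1e6:.1f} Mbit/s")
print(f"cc needed for 1 Gbit/s: {cc_1g}, for 10 Gbit/s: {cc_10g}")
```

Under these assumptions the tested range of cc stops just short of 1 Gbit/s, and 10 Gbit/s would need a burst count above 1000.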

> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | low delay    | 30ms        | 5ms, 30ms    | same as tested        |
> >    | range        |             |              |                       |
> >    | threshold    |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > So I think this value is highly dependent on several aspects and maybe
> > should get more discussion. First for a measurement campaign it is
> > relevant what one consider as the target additional latency that is
> > acceptable when finding capacity. Secondly, the jitter in the network
> > technology. For WIFI,  mobile and DOCIS a to low value may be shorter than
> > the scheduling latencies that might occur. It is also a question about how
> > precise the implementation are capable of measuring per packet latency
> > variances.
> [acm]
> As we discussed much earlier in this long thread, we arrived at both the
> delay threshold values after testing with the WIFI, mobile and DOCIS access
> services we could use in production, and many others.

Understood. Did 5 ms really work well in DOCSIS and 4G mobile? Or is it
30 ms that works well?

>
>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | high delay   | 90ms        | 10ms, 90ms   | same as tested        |
> >    | range        |             |              |                       |
> >    | threshold    |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > Also here I wished there was a bit more discussion. So this value clearly
> > must be above expected jitter for the network technology. It also needs to
> > be sufficient large to represent a fair amount of queue to avoid
> > measurement errors. I assume that if one would chose a value larger than
> > available buffer depth one would drive the network into packet loss. And
> > as long as there are some room between low delay range threshold and the
> > actual delay causing loss or this higher one has a chance to regulate to
> > that rate.
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | sequence     | 0           | 0, 100       | same as tested        |
> >    | error        |             |              |                       |
> >    | threshold    |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > What is this value really?
> [acm]
> When loss or reordering occur, initially these impairments appear as missing
> or unexpected sequence numbers in the stream, or sequence errors.
>

So, this is the amount of change beyond the expected next sequence number
that should be used. Did you actually use 100 as the threshold, i.e. that you
need a burst loss of 100 packets, or reordering that moves a packet 100 out
of sequence, for it to be considered an error? What was the purpose of using
such a high value?

>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | consecutive  | 2           | 2            | Use values >1 to      |
> >    | errored      |             |              | avoid misinterpreting |
> >    | status       |             |              | transient loss        |
> >    | report       |             |              |                       |
> >    | threshold    |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > Also here I am uncertain what is the criteria here?
> [acm]
> From the draft, where consecutive status reports are sent at feedback
> intervals:
>
>    Lastly, the method for inferring congestion is that there were
>    sequence number anomalies AND/OR the delay range was above the upper
>    threshold for two consecutive feedback intervals.

Okay I get it.
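
As I read the quoted text, the check could be sketched like this. It is an interpretation, not the draft's algorithm: the class and method names are hypothetical, and treating a report as errored when the sequence error count exceeds the threshold is an assumption. The defaults mirror the table (sequence error threshold 0, high delay range threshold 90 ms, 2 consecutive errored status reports).

```python
class CongestionDetector:
    """Hypothetical sketch: infer congestion from consecutive errored reports."""

    def __init__(self, seq_err_threshold=0, high_delay_ms=90.0, consecutive=2):
        self.seq_err_threshold = seq_err_threshold
        self.high_delay_ms = high_delay_ms
        self.consecutive = consecutive
        self.errored_in_a_row = 0

    def on_status_report(self, seq_errors, delay_range_ms):
        """Process one feedback interval; return True if congestion is inferred."""
        # Assumption: a report is errored if either condition holds.
        errored = (seq_errors > self.seq_err_threshold
                   or delay_range_ms > self.high_delay_ms)
        # A clean report resets the run, so transient loss is not misread.
        self.errored_in_a_row = self.errored_in_a_row + 1 if errored else 0
        return self.errored_in_a_row >= self.consecutive


det = CongestionDetector()
reports = [(0, 20.0), (1, 20.0), (0, 120.0), (0, 10.0)]
results = [det.on_status_report(s, d) for s, d in reports]
print(results)
```

In this sketch a single errored interval does nothing, and congestion is only inferred once a second errored interval follows directly after it.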


>
> >
> >    +--------------+-------------+--------------+-----------------------+
> >    | Fast mode    | 30          | 3 * Fast     | same as tested        |
> >    | decrease, in |             | mode         |                       |
> >    | table index  |             | increase     |                       |
> >    | steps        |             |              |                       |
> >    +--------------+-------------+--------------+-----------------------+
> >
> > So is the recommended value 30 or 3*Fast mode increase? Should they be
> > proportional or not?
> [acm]
> The Default can be 3 * Fast mode increase if you want, that's what we
> tested.

Yes, I think that makes more sense, and it avoids the two values becoming
unrelated to each other.

>
> >
> > The last entry appears to be a summary fact of the parameterization, and
> > is it relevant?
> [acm]
> It might be removed, we thought it was useful info.

It would be good if values that are not input parameters, but rather a
consequence of other parameters, were separated into their own category.

> >
> >
> > What is the goal here in relation to push other congestion controlled
> > traffic out of the way? It appears that it is likely to cause delay based
> > congestion to be pushed out of the way. I am more uncertain how it
> > interacts with loss based ones, as depending on situation it appears that
> > it could avoid going into the loss regim.
> [acm]
> The goal is to measure the true maximum rate during the test duration.
>

So, pushing traffic out of the way during the test period. I think that
should be made more explicit. With that stated explicitly, it is easier to
make clear why this should only be deployed for measurement within
cooperating administrative domains.


> >
> > My conclusion is that some aspect of this do appear more clarifications on
> > what they are
> [acm]
> These ASCII tables don't provide much space for explanation without
> becoming awkward due to row height.  We'll add some definitions elsewhere.

Yes, please do. I would also recommend that you try doing these as XMLv3 
tables so they look much better in the HTML version.

>
> > and further assumptions on how the load algorithm will be
> > deployed spelled out so that its function is more controlled.
> [acm]
> We could use some text suggestions to continue the discussion productively,
> having already tried several times.
>

I am starting to see how we can get this to a point where I personally will
find it acceptable.

I think rewriting the applicability statement and making the intention clear
will be the main part. I also think there should be a paragraph in the
security considerations that makes it clear that deployments should prevent
the metrics from being run by clients outside the intended administrative
domains, so that this traffic cannot be used to interfere with the traffic
of other administrative domains.

Cheers

Magnus Westerlund

Attachment: smime.p7s
