Re: [ippm] draft-shalunov-ippm-reporting-00.txt

"Schmoll, Carsten" <Carsten.Schmoll@fokus.fraunhofer.de> writes:

> I read your draft and I think it is going in the 
> right direction. Some suggestions for the next 
> version from my point of view are:

Carsten,

Thank you for the thoughtful review.  In my response, unless specified
otherwise, unidiff format means changes I made to the document to
accommodate your concern.

> * Motivation: should state some application areas, i.e.
>   "why would a human user want to get those values" - e.g.
>   'to check that IP telephony would be feasible with available QoS'

@@ -68,7 +68,28 @@
 Such a set would enable different tools to produce results that can be
 compared against each other.</t>
 
-<t>The set is meant for human consumption.  It must therefore be
+<t>Existing tools already report statistics about the network.  This is
+done for varying reasons: network testing tools, such as the ping
+program available in UNIX-derived operating systems as well as in
+Microsoft Windows, report statistics with no knowledge of why the user
+is running the program; networked games might report statistics of the
+network connection to the server so users can better understand why
+they get the results they get (e.g., if something is slow, is this
+because of the network or the CPU?), so they can compare their
+statistics to those of others (``you're not lagged any more than I
+am'') or perhaps so that users can decide whether they need to upgrade
+the connection to their home; IP telephony hardware and software might
+report the statistics for similar reasons.  While existing tools
+report statistics all right, the particular set of metrics they choose
+is ad hoc; some metrics are not statistically robust, some are not
+relevant, and some are not easy to understand; more important than
+specific shortcomings, however, is the incompatibility: even if the
+sets of metrics were perfect, they would still be all different, and,
+therefore, metrics reported by different tools would not be
+comparable.</t>
+
+<t>The set of metrics of this document is meant for human consumption.
+It must therefore be
 small.  Anything greater than half-dozen numbers is certainly too
 confusing.</t>

> * please don't be offended, but I disagree on the traffic 
>   statistics you plan to apply to the traffic metrics.
>   To me they seem to be quite unusual. see below:
> 
> * delay: I would suggest to use mean or a high percentile as
>   I have never seen median delay to be used to report QoS

Henk already responded to this.

Mean of delay is undefined, infinite, or unknown (depending on how you
look at it) for most samples.  Reporting invalid statistics can't be a
good strategy.  The only thing mean has going for it is that it's
trivial to compute.  Luckily, percentiles can be computed
approximately with good efficiency, too.  In addition, mean, even when
defined, is not robust.

I fail to see the advantage of a high percentile over the median.  A high
percentile will be more frequently infinite, is less robust, is harder
to understand, and, in general, has no advantages in relevancy,
orthogonality, or ease of computation [1].
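
To make the comparison concrete, here is a minimal sketch, not text from
the draft.  It assumes a lost packet is recorded as a delay of +infinity
and uses a nearest-rank percentile; the helper name `percentile` and the
sample values are made up for illustration.

```python
import math

def percentile(values, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of a non-empty sample."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical one-way delays in seconds; math.inf marks a lost packet.
sample = [0.020, 0.021, 0.019, 0.022, 0.020,
          0.023, 0.021, 0.020, 0.019, math.inf]

mean_delay = sum(sample) / len(sample)   # infinite: one loss is enough
median_delay = percentile(sample, 50)    # finite while loss is below 50%
p95_delay = percentile(sample, 95)       # infinite once loss exceeds 5%

print(mean_delay, median_delay, p95_delay)
```

With 10% loss in this sample, the mean and the 95th percentile are both
infinite, while the median still reports a usable number.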

> * loss: just a typo = per cent -> percent

Actually, it was not a typo, but if it causes confusion, then

-<t>Loss is the fraction, expressed in per cent, of packets that did
+<t>Loss is the fraction, expressed as a percentage, of packets that did

> * jitter: as far as I know the two common definitions are 
>   (high) percentile minus min delay across some interval
>   (ITU approach) or min/max of IP delay variation, taken from
>   the series of differences of consecutive OWD values (IETF approach)

The definition of the document fits into the very general IPDV
framework (with an appropriate selection function, if I remember IPDV
terminology right).

The use of 0th percentile (minimum) in inter-percentile spread is,
indeed, appealing.  However, it is appealing for reasons that might
turn out to be temporary: namely, the main reason to use a min is that
it is the maximum likelihood estimator of propagation delay in network
models where propagation delay is fixed and queuing delay is
distributed so that there's significant clumping at 0 (ddf(x) -->
+infinity as x --> +0 is a necessary condition, but almost certainly
not sufficient; shifted exponential distribution works as an example
where the minimum is, indeed, the maximum likelihood estimate of the
shift parameter).  Maximum likelihood estimators are asymptotically
consistent, and this makes minimum attractive.  However, two
considerations made me reject it:

(1) While one end of the spread starts to mean something, the other is
    just as arbitrary.  Using maximum as the other end is clearly a
    poor choice, so symmetry must be broken.  This makes the metric
    harder to understand.

(2) In today's world, indeed, the minimum delay is a decent
    propagation delay estimate.  We don't know whether it'll remain
    this way.  For example, a world with most traffic going between
    high-speed wireless points (e.g., where a ubiquitous mode of
    connectivity is through low-orbit satellites) has little use for
    minimum delay.  In such a world minimum breaks.  On the other
    hand, interquartile spread is a well-understood metric that is
    robust without strong assumptions about the nature of the
    distribution.
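
The two candidate definitions can be put side by side in a short sketch.
It assumes jitter is the 75th minus the 25th percentile of delay, uses a
nearest-rank percentile, and takes the 95th percentile as the "high"
percentile of the alternative definition; those choices and the sample
values are illustrative assumptions, not taken from the draft.

```python
import math

def percentile(values, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of a non-empty sample."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def interquartile_spread(delays):
    """75th percentile minus 25th percentile of delay."""
    return percentile(delays, 75) - percentile(delays, 25)

def high_percentile_minus_min(delays, p=95):
    """The alternative discussed above: high percentile minus minimum."""
    return percentile(delays, p) - min(delays)

# Hypothetical delays in milliseconds, with one large outlier.
delays_ms = [20, 21, 22, 23, 24, 25, 26, 40]

print(interquartile_spread(delays_ms))       # spread of the middle half
print(high_percentile_minus_min(delays_ms))  # dominated by the outlier
```

In this sample the interquartile spread ignores the single 40 ms value,
while the percentile-minus-minimum figure is determined by it.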

> * duplication: just lacks definition of a default timeout value

 <t>Duplication is the fraction of packets for which more than a single
-copy of the packet was received within the timeout period, expressed
+copy of the packet was received within the timeout period (same
+timeout as in the definition of loss), expressed
 in percentage points.</t>
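
As a rough sketch of how loss and duplication share one timeout, the
following assumes each arrival is a (sequence number, delay) pair and
that a copy arriving after the timeout does not count at all.  The
function name, the data layout, and the 2-second timeout are
hypothetical and are not taken from the draft.

```python
from collections import Counter

def loss_and_duplication(sent_seqs, arrivals, timeout):
    """Return (loss %, duplication %) over the packets in sent_seqs.

    arrivals is an iterable of (seq, delay_seconds) pairs, one per
    received copy.  Copies with delay above timeout are ignored.
    """
    copies = Counter(seq for seq, delay in arrivals if delay <= timeout)
    total = len(sent_seqs)
    lost = sum(1 for seq in sent_seqs if copies[seq] == 0)
    duplicated = sum(1 for seq in sent_seqs if copies[seq] > 1)
    return 100.0 * lost / total, 100.0 * duplicated / total

# Hypothetical run: 10 packets sent; 0..7 arrive once, 3 arrives twice,
# 8 arrives after the timeout, 9 never arrives.
sent = list(range(10))
received = [(seq, 0.02) for seq in range(8)] + [(3, 0.03), (8, 5.0)]

loss_pct, dup_pct = loss_and_duplication(sent, received, timeout=2.0)
print(loss_pct, dup_pct)
```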

> * additionally I think two things will need to be added to the draft:
>   a) mentioning about the size of the interval about which all
>      those metrics are obtained and the fact whether this will be 
>      a sliding window or fixed time slots, e.g. new values each 10sec
>   b) I'd personally like to see some "packet filter" added to the
>      reporting record. Data might relate only to a fraction of all
>      traffic, e.g. only to UDP or only to traffic from videos.fun.org

+<section title="Sample Source">
+
+<t><xref target="sec-metrics"/> describes the metrics to compute on a
+sample of measurements.  The source of the sample is not discussed
+there, and, indeed, the metrics discussed (delay, loss, etc.) are
+simply estimators that could be applied to any sample whatsoever.  For
+the names of the estimators to be applicable, of course, the
+measurements need to come from a packet delivery network.</t>
+
+<t>The data in the samples for the set of metrics discussed in this
+document can come from the following sources: one-way active
+measurement, round-trip measurement, and passive measurement.  There
+is rarely a choice between active and passive measurement, as,
+typically, only one is available; consequently, no preference is given
+to one over the other.  In cases where clocks can be expected to be
+synchronized, in general, one-way measurements are preferred over
+round-trip measurements (as one-way measurements are more
+informative).  When one-way measurements cannot be obtained, or when
+clocks cannot be expected to be synchronized, round-trip measurement
+MAY be used.</t>
+
+<section title="One-Way Active Measurement" anchor="sec-one-way">
+
+<t>The default duration of the measurement interval is 10 seconds.</t>
+
+<t>The default sending schedule is a Poisson stream.</t>
+
+<t>The default sending rate is 10 packets/second on average.  When
+randomized schedules, such as a Poisson stream, are used, the rate
+MUST be set with the distribution parameter(s).</t>
+
+<t>The default packet size is the minimum necessary for the
+measurement.</t>
+
+<t>Values other than the default ones MAY be used; if they are used,
+their use, and specific values used, MUST be reported.</t>
+
+<t>A one-way active measurement is characterized by the source IP
+address, the destination IP address, and the time when the measurement was
+taken.  For the time, the middle of the measurement interval MUST be
+reported.</t>
+
+</section>
 
+<section title="Round-Trip Active Measurement">
+
+<t>The same default parameters and characterization apply to
+round-trip measurement as to one-way measurement (<xref
+target="sec-one-way"/>).</t>
+
+</section>
+
+<section title="Passive Measurement">
+
+<t>Passive measurement uses whatever data is natural to use.  For
+example, an IP telephony application or a networked game would use the
+data that it sends.  An analysis of performance of a link might use
+all the packets that traversed the link in the measurement interval.
+An analysis of performance of an Internet service provider's network
+might use all the packets that traversed the network in the
+measurement interval.  An analysis of performance of a specific
+service from the point of view of a given site might use an
+appropriate filter to select only the relevant packets.</t>
+
+<t>The same default duration applies to passive measurement as to
+one-way active measurement (<xref target="sec-one-way"/>).</t>
+
+<t>When the passive measurement data is reported in real time, a
+sliding window SHOULD be used as a measurement period, so that recent
+data become more quickly reflected.</t>
+
+</section>
+
+</section>
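
For readers unfamiliar with Poisson streams, a sending schedule of this
kind can be sketched as below.  Inter-packet gaps are drawn from an
exponential distribution whose rate parameter is the average sending
rate, which is one way to read "the rate MUST be set with the
distribution parameter(s)".  The function name, the fixed seed, and the
interval length passed in the example are illustrative assumptions.

```python
import random

def poisson_schedule(rate, duration, rng):
    """Send times, in seconds from the start of the interval.

    rate is the average number of packets per second; gaps between
    consecutive packets are exponential with mean 1 / rate.
    """
    times = []
    t = rng.expovariate(rate)
    while t < duration:
        times.append(t)
        t += rng.expovariate(rate)
    return times

rng = random.Random(1)  # fixed seed only to make the sketch repeatable
schedule = poisson_schedule(rate=10.0, duration=10.0, rng=rng)

# The number of packets varies from run to run around rate * duration.
print(len(schedule), schedule[:3])
```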

> * additionally "availability" could be one metric
>   (probably with param = (list of) server(s) or network)

Why aren't existing metrics enough?  When no packets come through,
you'll see

delay = +infinity
loss = 100%
jitter = undefined
duplication = 0%
reordering = 0%

Is that not enough?
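
The outage case falls out of ordinary floating-point arithmetic if lost
packets are recorded as infinite delay.  The sketch below assumes that
representation, a nearest-rank percentile, and jitter as the
interquartile spread; the names are hypothetical, and duplication and
reordering are left out for brevity.

```python
import math

def percentile(values, p):
    """Nearest-rank p-th percentile (0 < p <= 100) of a non-empty sample."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def summarize(delays):
    """Delay, loss, and jitter for a non-empty sample (inf = lost)."""
    lost = sum(1 for d in delays if math.isinf(d))
    return {
        "delay": percentile(delays, 50),
        "loss": 100.0 * lost / len(delays),
        # inf - inf is NaN in IEEE arithmetic, i.e. undefined
        "jitter": percentile(delays, 75) - percentile(delays, 25),
    }

# Hypothetical outage: 20 packets sent, none received.
outage = summarize([math.inf] * 20)
print(outage)
```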

> * a small final note (not specific to the draft): when applying the 
>   ITU definition for jitter, then high percentile of delay and jitter
>   are quite correlated, and just differ by the interval's min-delay.
>   I cannot say if/how they correlate when the IETF definition (IPDV)
>   is used.

The median and the interquartile spread are sufficiently orthogonal
and widely used by statisticians.

> * one question from my side: how would you define/report the
>   measurement parameters? (i.e. the src/dst/network/path to which
>   the reported delay/jitter etc. applies)

I believe this is covered now.

Thanks again,                                           --Stanislav

P.S. I'll send out a version of the draft with these changes rolled in.

[1] Well, OK, the 100th percentile (maximum) *is* easier to compute,
but it is clearly unusable otherwise, as it has all the deficiencies
of mean, only to a much larger extent.

-- 
Stanislav Shalunov		http://www.internet2.edu/~shalunov/

This message is designed to be viewed in boustrophedon.
