[ippm] Benjamin Kaduk's No Objection on draft-ietf-ippm-capacity-metric-method-06: (with COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 24 February 2021 19:54 UTC

MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-ippm-capacity-metric-method@ietf.org, ippm-chairs@ietf.org, ippm@ietf.org, Ian Swett <ianswett@google.com>, tpauly@apple.com
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <161419645471.18083.16706266293896961774@ietfa.amsl.com>
Date: Wed, 24 Feb 2021 11:54:15 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/ippm/Uxft3WysqBJF1efpMkI7e4U5HGM>
Subject: [ippm] Benjamin Kaduk's No Objection on draft-ietf-ippm-capacity-metric-method-06: (with COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-ippm-capacity-metric-method-06: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-ippm-capacity-metric-method/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Section 3

   5.  Less emphasis on ISP gateway measurements, possibly due to less
       traffic crossing ISP gateways in future.

nit: sentence fragment.

Section 5

   This section sets requirements for the following components to
   support the Maximum IP-layer Capacity Metric.

editorial/nit: I don't think I understand what "the following
components" are.  Is this referencing some preexisting template for a
metric definition?  (Same for §6 and §7.)

Section 5.3

   The number of these IP-layer bits is designated n0[dtn,dtn+1] for a
   specific dt.

(editorial) I'm a little confused by this notation for n0.  In Section 4
we say that "dtn" references a specific sub-interval, but the brackets
and two items look like they should be the start and end of an interval,
for which perhaps just the 'n' would be more appropriate (but we'd want
a half-open interval, and would need to worry about 0- vs 1-indexing for
n, etc.).

   Anticipating a Sample of Singletons, the interval dt SHOULD be set to
   a natural number m so that T+I = T + m*dt with dtn+1 - dtn = dt and
   with 1 <= n <= m.

nit: "dt [...] set to m" doesn't make any sense.
nit: this looks like more of a "for 1 <= n <= m" than an "and with".
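
To illustrate what I think the text is trying to say, here is a minimal
Python sketch.  It is not taken from the draft: the function names are
hypothetical, and the 1-indexed, half-open sub-intervals are my own
assumption about the intended convention.  It picks dt so that I is a
whole multiple of it, derives m, and gives the bounds of sub-interval n
for 1 <= n <= m.

```python
from fractions import Fraction


def subinterval_count(interval, dt):
    """Return the natural number m with interval == m * dt.

    Arguments may be ints or decimal strings such as "0.25"; Fraction
    keeps the divisibility check exact. Raises ValueError otherwise.
    """
    interval, dt = Fraction(interval), Fraction(dt)
    if interval <= 0 or dt <= 0:
        raise ValueError("interval and dt must be positive")
    m = interval / dt
    if m.denominator != 1:
        raise ValueError("interval is not a whole multiple of dt")
    return int(m)


def subinterval_bounds(t_start, dt, n):
    """Half-open bounds [start, end) of sub-interval n (assumed 1-indexed)."""
    if n < 1:
        raise ValueError("n starts at 1 in this sketch")
    t_start, dt = Fraction(t_start), Fraction(dt)
    return (t_start + (n - 1) * dt, t_start + n * dt)
```

With I = 10 and dt = 0.5 this gives m = 20, and the last sub-interval
ends exactly at T + I.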

   Mathematically, this definition can be represented as:

                                      ( n0[dtn,dtn+1] )
                      C(T,dt,PM) = -------------------------
                                             dt

(nit) the "n" appears (in 'dtn') only on the right side but there is no
operation applied over it, so this "equation" seems unbalanced without
also specifying 'n'.  I think the introductory text should mention
something about "for each n" or "for a given interval dtn" or similar.

   o  n0 is the total number of IP-layer header and payload bits that
      can be transmitted in Standard Formed packets [RFC8468] from the

(nit) RFC 8468 spells it "standard-formed".

   o  C(T,dt,PM) the IP-Layer Capacity, corresponds to the value of n0
      measured in any sub-interval ending at dtn (meaning T + n*dt),
      divided by the length of sub-interval, dt.

If it's supposed to be "ending at dtn", then why is the "dtn+1" in the
picture at all?
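
A small sketch of how I read the definition, with the "for each n" made
explicit.  The function name and the list-based input are hypothetical;
I am assuming that entry n-1 of the list holds n0 for the sub-interval
that ends at T + n*dt, and that dt is in seconds.

```python
def capacity_singletons(bits_per_subinterval, dt):
    """Return one capacity value (bits per second) per sub-interval.

    bits_per_subinterval[n - 1] stands in for n0 measured in the
    sub-interval ending at T + n*dt (an assumed indexing).
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    return [n0 / dt for n0 in bits_per_subinterval]
```

Written this way, only one boundary per sub-interval is needed to
identify it, which is why the second boundary in the notation puzzles
me.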

   o  PM represents other performance metrics [see section 5.4 below];
      their measurement results SHALL be collected during measurement of
      IP-layer Capacity and associated with the corresponding dtn for
      further evaluation and reporting.

(nit) this seems to duplicate (be a subset of) the paragraph before
"Mathematically, this definition can be represented".

   o  The bit rate of the physical interface of the measurement device
      must be higher than that of the link whose C(T,I,PM) is to be
      measured.

(nit) I thought we were measuring a path, not a link.

Section 5.4

   RTD[dtn-1,dtn] is defined as a sample of the [RFC2681] Round-trip

Do we really want to be using n0[dtn,dtn+1] but RTD[dtn-1,dtn]?  Can we
pick a consistent sub-interval notation?

Section 6.3

   Define the Maximum IP-layer capacity, Maximum_C(T,I,PM), to be the
   maximum number of IP-layer bits n0[dtn,dtn+1] that can be transmitted
   in packets from the Src host and correctly received by the Dst host,

(nit) the relevant formulae include a dt divisor, but I don't see
anything in this prose that would correspond to such a divisor.

   The interval dt SHOULD be set to a natural number m so that T+I = T +
   m*dt with dtn+1 - dtn = dt and with 1 <= n <= m.

nit: "dt [...] set to m" doesn't make any sense.
nit: this looks like more of a "for 1 <= n <= m" than an "and with".

   Mathematically, this definition can be represented as:

                                      max  ( n0[dtn,dtn+1] )
                                     [T,T+I]
                Maximum_C(T,I,PM) = -------------------------
                                               dt
               where:
                  T                                      T+I
                  _________________________________________
                  |   |   |   |   |   |   |   |   |   |   |
              dtn=1   2   3   4   5   6   7   8   9  10  n+1
                                                     n=m

(nit) as mentioned previously, the definition of "dtn" lists it as being
the sub-interval, not the boundary point/time of a sub-interval.
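
For what it's worth, this is how I would expect the formula to read
when written out.  It is a sketch under my own assumptions (a
hypothetical function name, n treated as a 1-indexed sub-interval
number, dt in seconds), not the draft's method.  Note that the division
by dt is explicit here, which is the part I could not find in the
prose.

```python
def maximum_capacity(bits_per_subinterval, dt):
    """Return (n, rate): the sub-interval with the most bits and its rate.

    bits_per_subinterval[n - 1] stands in for n0 of sub-interval n,
    for n = 1..m; the rate is that count divided by dt.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")
    if not bits_per_subinterval:
        raise ValueError("need at least one sub-interval")
    best_n = max(range(1, len(bits_per_subinterval) + 1),
                 key=lambda n: bits_per_subinterval[n - 1])
    return best_n, bits_per_subinterval[best_n - 1] / dt
```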

Section 6.5

   If traffic conditioning (e.g., shaping, policing) applies along a
   path for which Maximum_C(T,I,PM) is to be determined, different
   values for dt SHOULD be picked and measurements be executed during
   multiple intervals [T, T+I].  A single constant interval dt SHOULD be
   chosen so that is an integer multiple of increasing values k times
   serialisation delay of a path MTU at the physical interface speed
   where traffic conditioning is expected.  [...]

nit: "so that is an integer multiple" seems to be missing a word.

Also, this seems to say that different values for dt SHOULD be picked,
but also that a constant dt SHOULD be chosen.  How can those both be
recommended and be consistent with each other?  (I mean, I assume that
the intent is to do multiple runs, each with a different (fixed) dt, but
that's not what the text seems to say.)
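
To make my assumed reading concrete: each test run would use one
constant dt, and successive runs would use larger multiples k of the
serialisation delay of one path-MTU packet.  The sketch below is only
that reading, with hypothetical function names and example values; the
draft does not specify which k to use.

```python
from fractions import Fraction


def serialization_delay(mtu_bytes, link_bps):
    """Seconds needed to clock one MTU-sized packet onto the link."""
    return Fraction(mtu_bytes * 8, link_bps)


def candidate_dts(mtu_bytes, link_bps, ks):
    """One constant dt per test run: k times the serialisation delay."""
    base = serialization_delay(mtu_bytes, link_bps)
    return [k * base for k in ks]
```

For a 1500-byte MTU at 1 Gbit/s the serialisation delay is 12
microseconds, so k = 1000 and k = 10000 give dt values of 12 ms and
120 ms for two separate runs.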

Section 7.3

   Define the IP-layer Sender Bit Rate, B(S,st), to be the number of IP-
   layer bits (including header and data fields) that are transmitted
   from the Source during one contiguous sub-interval, st, during the
   test interval S (where S SHALL be longer than I), and where the
   fixed-size packet count during that single sub-interval st also
   provides the number of IP-layer bits in any interval: n0[stn-1,stn].

(1) there doesn't seem to be any restriction that the observed packets
list Dst as the destination address, so formally it seems this would
count *all* traffic generated by Sender, not just the traffic relevant
for the (path capacity) test.
(2) It seems a little unfortunate that we reuse the 'n0' symbol here for
a different meaning than in the earlier capacity metrics.
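
As an illustration of point (1), a sketch of the restriction I would
have expected: only packets addressed to Dst contribute to the bit
count.  The record type, field names, and addresses are hypothetical
and not from the draft.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentPacket:
    dst: str        # destination address of the packet
    ip_bytes: int   # IP header plus payload length, in bytes


def sender_bit_rate(packets, dst, st):
    """Bits per second sent to dst during one sub-interval of length st."""
    if st <= 0:
        raise ValueError("st must be positive")
    bits = sum(p.ip_bytes * 8 for p in packets if p.dst == dst)
    return bits / st
```

Without the `p.dst == dst` filter, unrelated traffic from the same host
would inflate the result.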

   Measurements according to these definitions SHALL use the UDP
   transport layer.  Any feedback from Dst host to Src host received by
   Src host during an interval [stn-1,stn] MUST NOT result in an
   adaptation of the Src host traffic conditioning during this interval
   (rate adjustment occurs on st boundaries).

Hmm, this "MUST NOT" is interesting, as it seems to imply extremely
tight coordination between the measurement point for this metric and the
Source itself.  (Note that the toplevel §7 admits the possibility that
measurement will occur at a location other than the Src host to network
path interface, via "(or as close as practical)".)

Section 8.1

   At the beginning of a test, the sender begins sending at rate R1 and
   the receiver starts a feedback timer at interval F (while awaiting

It's a little hard to search for, but I didn't find any previous mention
of 'F' or it being defined as a parameter or term.  Should it be a
listed parameter somewhere?

   If the feedback indicates that sequence number anomalies were
   detected OR the delay range was above the upper threshold, the
   offered load rate is decreased.  Also, if congestion is now confirmed
   by the current feedback message being processed, then the offered
   load rate is decreased by more than one rate (e.g., Rx-30).  [...]

Does "congestion is now confirmed" mean that "congestion confirmed" is
like a one-way latch and this transition only occurs at most once over
the course of a test?  Or could the Rx-30 happen multiple times?
(The pseudocode indicates the former.)

   If the feedback indicates that there were no sequence number
   anomalies AND the delay range was above the lower threshold, but
   below the upper threshold, the offered load rate is not changed.

The way this is written suggests that there will always be a lower and
an upper threshold for delay, but the rest of the document so far didn't
give me that impression.  E.g., we talk about PM only as "at least one
fundamental metric and target performance threshold MUST be supplied",
and to me having both upper and lower thresholds would be two
thresholds, not one.
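
To check my understanding of the prose, here is a sketch of the
adjustment step as I read it, with "congestion confirmed" treated as a
one-way latch and with both delay thresholds present.  Everything
beyond the quoted text is an assumption on my part: the names, the
increase branch (not quoted above), the caller-supplied `confirmed_now`
flag, and the clamping to a table range.

```python
from dataclasses import dataclass


@dataclass
class SearchState:
    rx: int = 1                       # current row in the rate table
    congestion_latched: bool = False  # set once, never cleared


def adjust_rate(state, seq_anomalies, delay_range, lower, upper,
                confirmed_now=False, big_step=30, min_row=1, max_row=1000):
    """Apply one feedback message and return the new table row."""
    if seq_anomalies > 0 or delay_range > upper:
        if confirmed_now and not state.congestion_latched:
            # First confirmation only: drop by more than one rate.
            state.congestion_latched = True
            state.rx -= big_step
        else:
            state.rx -= 1
    elif delay_range < lower:
        # Assumed increase branch: no anomalies and low delay.
        state.rx += 1
    # Otherwise the delay is between the thresholds: rate unchanged.
    state.rx = max(min_row, min(state.rx, max_row))
    return state.rx
```

Under this reading the large decrease happens at most once per test;
if it is meant to repeat, the latch would need to be cleared somewhere.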

Section 8.2

   Here, as with any Active Capacity test, the test duration must be
   kept short. 10 second tests for each direction of transmission are
   common today.  The default measurement interval specified here is I =
   10 seconds).  In combination with a fast search method and user-
   network coordination, the concerns raised in RFC 6815[RFC6815] are
   alleviated.  [...]

I skimmed RFC 6815 and had a bit of a hard time making the connection
for why combining a 10-second interval, fast search method, and
user-network coordination alleviates the concerns of RFC 6815.  There
doesn't seem to be much in 6815 itself about how testing in production
can be done safely, so my current working assumption is that the
conclusion presented here reflects the results of "new work" being
recorded for the first time (in the RFC series) in this document.  If
that assumption is correct, I'd suggest spending some more words to
support the conclusion, e.g., making analogies to other "normal" traffic
patterns and how the benchmarking setup is not qualitatively different
from them.

Section 8.3

   As testing continues, implementers should expect some evolution in
   the methods.  The ITU-T has published a Supplement (60) to the
   Y-series of Recommendations, "Interpreting ITU-T Y.1540 maximum IP-
   layer capacity measurements", [Y.Sup60], which is the result of
   continued testing with the metric and method described here.

I pulled up the [Y.Sup60] reference, and it does not seem to reference
this draft by name.  On what basis do we conclude that it "is the result
of continued testing with the metric and method described here"?
Skimming/searching, I do see many similar formulae and methods
presented, but how do we conclude they are precisely the same?

Section 10

Should we say something about making sure that I is reasonably bounded?
IIRC we say so elsewhere in the text but not exactly here.

   2.  A REQUIRED user client-initiated setup handshake between
       cooperating hosts and allows firewalls to control inbound
       unsolicited UDP which either go to a control port [expected and
       w/authentication] or to ephemeral ports that are only created as
       needed.  [...]

nit: the grammar is odd in the first part of this sentence; the part
before the "and" doesn't seem like it can join up with anything after
the "and".  Is the intent something like "It is REQUIRED to have a user
client-initiated setup handshake between cooperating hosts that allows
firewalls to [...]"?

   3.  Integrity protection for feedback messages conveying measurements
       is RECOMMENDED.

(In some sense you want authentication as well as integrity protection.)

   5.  Senders MUST be rate-limited.  This can be accomplished using the
       pre-built table defining all the offered load rates that will be
       supported (Section 8.1).  The recommended load-control search
       algorithm results in "ramp up" from the lowest rate in the table.

nit: since (effectively) each implementation will have their own
pre-built table, I think it should be "using a pre-built table".

Appendix 13

If we start at Rx (row) 1, is it going to cause problems when we drop
down to Rx = 0 in the loss/congestion cases?

The mechanism in the pseudocode to stop taking large increments in
sending rate above the "hSpeedThresh" does not seem to be described in
the prose in §8.1.  (That said, it seems like a good idea, given the
likely table composition.)

(Also, indenting one tab for the outer conditionals and two more for the
inner ones looks a bit unusual.)
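
For reference, this is roughly the behaviour I would expect from the
two points above: large upward steps only below "hSpeedThresh", and
decreases that stop at the lowest usable row.  It is my own sketch, not
a transcription of the draft's pseudocode; `fast_step`, `min_row`, and
`max_row` are hypothetical parameters.

```python
def next_row_up(rx, h_speed_thresh, fast_step=10, max_row=1000):
    """Take a large step below the threshold, a single step above it."""
    step = fast_step if rx < h_speed_thresh else 1
    return min(rx + step, max_row)


def next_row_down(rx, step=1, min_row=1):
    """Decrease the row but never go below the lowest usable row."""
    return max(rx - step, min_row)
```

Whether the floor should be row 0 or row 1 is exactly the question
above; the sketch just makes the choice an explicit parameter.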

Section 14

It's not entirely clear to me why RFC 2330 is classified as normative
but RFC 7312 is informative, just based on the locations where they are
referenced.
