[Dime] Benjamin Kaduk's No Objection on draft-ietf-dime-doic-rate-control-10: (with COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Wed, 23 January 2019 19:03 UTC

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dime-doic-rate-control-10: No Objection

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dime-doic-rate-control/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Thanks for this well-written document!  My comments are all essentially
editorial in nature.

One general comment concerns the usage of the word "indicate" --
usually when I read "indicate" I expect that to be part of some
protocol message or other formal data structure, but IIUC the OCS is
entirely a local matter, so "indicating" something in the OCS could be
equally well said as "storing" or "noting" or similar.  (I do see other
similar usage of "indicate" in RFC 7683, so it's unclear that there are
really any grounds for changing the usage in this document.)

Section 4

nit: Saying that nodes MUST indicate support for *both* loss and rate seems
to duplicate the requirement from RFC 7683 and would potentially complicate
future updates.  The descriptive note about "nodes supporting the rate
feature will support both" seems a better way to phrase things.

Section 5.1

Is keeping track of how much a reacting node is actually sending considered
to not be part of the OCS (as opposed to the allocated rate, which is part
of the OCS as noted here)?

Section 6.2

   This extension does not define new overload report types.  The
   existing report types of host and realm defined in [RFC7683] apply to
   the rate control algorithm.  The peer report type defined in
   [I-D.ietf-dime-agent-overload] also applies to the rate control
   algorithm.

side note: I'm curious how the directionality is such that the report type
applies to the algorithm, as opposed to the other way around.

Section 7.1

   Upon receiving the overload report with a target maximum Diameter
   request rate, each reacting node applies abatement treatment for new
   Diameter requests towards the reporting node.

(nit?) My (hasty) reading of 7683 is that "abatement treatment" means
either diversion or throttling, and that traffic processed normally is not
considered to receive "abatement treatment".  If that reading is correct,
then this text is suggesting that no new requests receive normal treatment
after the reception of an OLR with a target rate, which does not seem quite
right.

Section 7.2

   Note that the value of OC-Maximum-Rate AVP (in request messages per
   second) for the rate algorithm provides an upper bound on the traffic
   sent by the reacting node to the reporting node.

I see that this is not using normative language, and that the following
paragraph does clarify the caveats, but "upper bound" usually is read as
"strict upper bound", and there are several ways in which this bound could
(at least temporarily) not be strict.  Perhaps "loose upper bound" is
better phrasing.

Section 7.3.1

Perhaps note explicitly that "//" denotes comments?

   In determining whether or not to transmit a specific message, the
   reacting node can use any algorithm that limits the message rate to
   the OC-Maximum-Rate AVP value in units of messages per second.  For
   ease of discussion, we define T = 1/[OC-Maximum-Rate] as the target
   inter-Diameter request interval.  It may be strictly deterministic,
   or it may be probabilistic.  It may, or may not, have a tolerance

nit: The intervening sentence defining 'T' seems to change the binding of
"It" away from "the algorithm".

   Note that when the OC-Maximum-Rate value is 0 with a non-zero OC-
   Validity-Duration, then the reacting node should apply abatement
   treatment to 100% of Diameter requests destined to the overloaded
   reporting node.  However, when the OC-Validity-Duration value is 0,
   the reacting node should stop applying abatement treatment.

nit: this paragraph seems like it would be better placed elsewhere, as its
content is independent of any particular throttling algorithm.
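
To illustrate why this rule looks independent of the throttling algorithm,
here is a minimal sketch in which the decision is made before any limiter
is consulted.  The function name and the returned labels are hypothetical,
chosen only for this example; they do not come from the draft.

```python
def abatement_mode(max_rate: int, validity_duration: int) -> str:
    """Classify an overload report before any throttling algorithm runs.

    Assumption: max_rate is the OC-Maximum-Rate value and validity_duration
    is the OC-Validity-Duration value, both as non-negative integers.
    """
    if validity_duration == 0:
        # The report is withdrawn: stop applying abatement treatment.
        return "none"
    if max_rate == 0:
        # Zero rate with a live report: abate 100% of requests.
        return "all"
    # Otherwise hand over to whichever rate-limiting algorithm is in use.
    return "rate"
```

Only the last branch depends on the specific algorithm described in this
section.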

   Reporting nodes with a very large number of reacting nodes, each with
   a relatively small arrival rate, will generally benefit from a
   smaller value for TAU in order to limit queuing (and hence response
   times) at the reporting node when subjected to a sudden surge of
   traffic from all reacting nodes.  Conversely, a reporting node with a
   relatively small number of reacting nodes, each with proportionally
   larger arrival rate, will benefit from a larger value of TAU.

Am I correct in assuming that "larger" and "smaller" values of TAU here are
to be measured with respect to T (i.e., as a ratio)?  This may be worth
stating more explicitly.
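
To make the question concrete, here is a minimal sketch of one
leaky-bucket-style limiter in which the tolerance is given as a multiple
of T.  The class name, the `tau_over_t` parameter, and the admission logic
are assumptions made for this example; the draft's own algorithm may
differ in detail.

```python
from typing import Optional


class LeakyBucketLimiter:
    """Illustrative limiter: T = 1 / max_rate, TAU = tau_over_t * T."""

    def __init__(self, max_rate: float, tau_over_t: float) -> None:
        if max_rate <= 0:
            raise ValueError("max_rate must be positive")
        self.t = 1.0 / max_rate          # target inter-request interval
        self.tau = tau_over_t * self.t   # tolerance, scaled by T (assumption)
        self.level = 0.0                 # bucket fill, in seconds
        self.last: Optional[float] = None  # time of last admitted request

    def admit(self, now: float) -> bool:
        """Return True if a request arriving at time `now` may be sent."""
        if self.last is None:
            drained = 0.0
        else:
            # The bucket drains one second of fill per second of elapsed time.
            drained = max(0.0, self.level - (now - self.last))
        if drained <= self.tau:
            self.level = drained + self.t
            self.last = now
            return True
        return False  # caller applies abatement treatment instead


# With TAU = 0 the limiter enforces strict spacing of T between requests;
# with TAU = T it tolerates one extra back-to-back request.
strict = LeakyBucketLimiter(max_rate=4, tau_over_t=0)
bursty = LeakyBucketLimiter(max_rate=4, tau_over_t=1)
strict_results = [strict.admit(t) for t in (0.0, 0.125, 0.25, 0.5)]
bursty_results = [bursty.admit(t) for t in (0.0, 0.0, 0.0)]
```

In this sketch the same absolute TAU permits very different burst sizes
depending on T, which is why stating the reference scale seems useful.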

Section 8.3

Do you want to add this requirement as a "Note" on the IANA registry
itself?

Section 9

Other than what Mirja has already noted, I only have one minor remark.

It seems that an attacker that can set up reacting nodes has a slightly
different way to disrupt legitimate traffic when "rate" is used vs. "loss",
but the details of any attack depend on implementation behavior at the
reporting node (e.g., whether it divides its total capacity evenly amongst
reacting nodes or uses a more complicated allocation scheme).  And since an
attacker that can set up new reacting nodes is almost certainly able to
send traffic from those nodes, in practice there is no substantial
difference, so the decision to ignore this difference and just refer to
the 7683 security considerations seems justified.
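
As a small worked example of the even-division case mentioned above (the
capacity figure, node counts, and function name are all hypothetical, and
a real reporting node may allocate differently):

```python
def even_share(total_capacity: float, reacting_nodes: int) -> float:
    """Per-node rate if capacity is split evenly (an assumed policy)."""
    if reacting_nodes <= 0:
        raise ValueError("need at least one reacting node")
    return total_capacity / reacting_nodes


# 10 legitimate reacting nodes sharing 1000 requests per second.
before = even_share(1000, 10)
# An attacker sets up 10 additional reacting nodes.
after = even_share(1000, 20)
```

Under that assumed policy, each legitimate node's allocation is halved
without the attacker sending any traffic, although as noted the attacker
could just as well send traffic from the new nodes.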