[Dime] Benjamin Kaduk's No Objection on draft-ietf-dime-group-signaling-13: (with COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Wed, 03 February 2021 01:40 UTC

Benjamin Kaduk has entered the following ballot position for
draft-ietf-dime-group-signaling-13: No Objection

The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-dime-group-signaling/



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I agree with the other commenters that an editorial pass over the
document before it is sent to the RFC Editor is probably in order; I
only noted the more egregious nits that have obvious fixes but skipped
the ones where I did not have a resolution ready at hand to suggest.
Barry's remarks about restrictive vs non-restrictive clauses are
particularly important.

I also have several high-level issues that don't quite rise to
discuss-level but seem worth getting resolution on; they mostly involve
edge cases where the current text either leaves me unclear on what is
supposed to happen or seems to present conflicting guidance:

- what's the error handling in cases of partial success?
  * For operations adding/removing sessions to/from groups, it seems that
    the general principle is that the new changes introduced with any
    group-relevant CCF exchange are to be treated atomically: either all
    group additions/removals succeed or all fail (and when it fails the
    session is forced into a single-session fallback mode).  But for
    server-initiated modifications this is enforced by "if the client
    can't do it, the client MUST tear down the affected session(s)", and
    I'm not sure if there's a case where the client can send a new
    request that includes some modifications and the server also tries
    to make some modifications in its response; such a scenario might in
    effect allow partial success and might also be hard to correctly
    interpret.
  * For operations acting on groups, we do allow partial success
    (DIAMETER_LIMITED_SUCCESS) and attempt to mandate fallback to
    single-session operation for affected sessions, but the specifics of
    how the Failed-AVP effectuates this are not clear to me (see my
    comment on Section 4.4.3).

- we require that group identifiers are globally unique, which can be
  done with the diameter-node namespace prefix.  But for the portion
  after that prefix, we seem to suggest reusing the construction from
  Section 8.8 of RFC 6733, which is just (in essence) a 64-bit global
  counter.  In the vein of draft-gont-numeric-ids-sec-considerations,
  that seems overly constrained, in that it reveals sequencing across
  group creation and rate of per-node group creation to the remote peer.
  It seems that a construction akin to a keyed hash of a counter would
  preserve the guaranteed uniqueness property but avoid leaking such
  information to the network.

- the permissions model is a bit unorthodox, with groups "owned" by
  their creator but the binding of a session to a group "owned" by the
  node that performed the binding.  It seems like there is some risk of
  deadlock situations or conflict, e.g., where a client has added a
  session to a group G but the server wants to remove the session from
  G, or where a session is part of a group but the owner of the group
  wants to delete the group.  For the latter case, my understanding is
  that group deletion "trumps" ownership of the session/group binding,
  such that deletion can proceed even while sessions are in the group,
  but I'm less sure what should happen in the former case (or others).

- Is it possible to have a group with no member sessions?  Section 3.3
  suggests that if the last session is removed the group should be
  deleted, but AFAICT this would still require the node performing the
  removal to unset SESSION_GROUP_STATUS_IND, and I didn't see a
  requirement to do that spelled out.  I do see how it is impossible to
  create a group directly in a state with no member sessions.

- Is there a requirement to always send the current state of group
  membership when acting on a session?  I could (perhaps naively)
  imagine a case where due to previous interactions a session belongs to
  groups G1, G2, and G3, but in some subsequent request the client only
  mentions G1.  Does the membership in G2 and G3 get implicitly retained
  in the absence of an explicit unset SESSION_GROUP_ALLOCATION_ACTION or
  are there some other constraints that make such a scenario impossible?

- Is there supposed to be a strong distinction between "including" an
  AVP and "appending" one?  I see that 6733 does make some fairly clear
  distinction between the terms, but it seems that (e.g.) in Section
  4.2.1 we use both phrasings to discuss Session-Group-Info.
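
To make the group-identifier suggestion above a bit more concrete, here
is a minimal sketch of deriving the post-prefix portion from a counter
without exposing the counter itself.  A plain truncated keyed hash could
in principle collide, so this sketch instead runs the 64-bit counter
through a small Feistel network whose round function is HMAC-SHA256; a
Feistel network is a permutation for any round function, so distinct
counters always map to distinct outputs.  The function names, the
four-round choice, and the "identity;hex" output format are my own
assumptions for illustration and are not taken from the draft or from
RFC 6733.

```python
import hashlib
import hmac

MASK32 = 0xFFFFFFFF


def _round(key: bytes, round_no: int, half: int) -> int:
    """Keyed round function: HMAC-SHA256 truncated to 32 bits."""
    msg = bytes([round_no]) + half.to_bytes(4, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def permute64(key: bytes, counter: int, rounds: int = 4) -> int:
    """Map a 64-bit counter to a 64-bit value via a keyed Feistel network.

    The mapping is a bijection on 64-bit integers, so uniqueness of the
    counter carries over to the output.
    """
    if not 0 <= counter < 2**64:
        raise ValueError("counter must fit in 64 bits")
    left, right = (counter >> 32) & MASK32, counter & MASK32
    for r in range(rounds):
        left, right = right, left ^ _round(key, r, right)
    return (left << 32) | right


def make_group_id(diameter_identity: str, key: bytes, counter: int) -> str:
    """Hypothetical identifier layout: node prefix, then the permuted counter."""
    return f"{diameter_identity};{permute64(key, counter):016x}"
```

The key would be a per-node secret (for example, generated once at
startup), and the node would keep incrementing its counter exactly as
before; only the value placed on the wire changes.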

Now for some section-by-section (mostly editorial or nit-level) notes.

Abstract, Introduction

   a million concurrent Diameter sessions.  Recent use cases have
   revealed the need for Diameter nodes to apply the same operation to a
   large group of Diameter sessions concurrently.  The Diameter base

I note that the -00 is from 2012; are these use cases still "recent"?

Section 3

As an editorial note, the way the current text jumps in to say that
sessions can be assigned to groups leaves the reader uncertain whether
this is describing preexisting functionality or the new mechanisms added
by this document.  A top-level intro paragraph for Section 3 that says
roughly "to accommodate bulk operations on Diameter sessions, the concept
of session groups is introduced; once sessions are added to a group, a
command acting on the group will affect all the member sessions" might
help.

Section 3.3

If I understand correctly, the lines in the table for "remove a session
from an owned Session Group"/"remove a session from a non-owned Session
Group" mean only that this operation can be done sometimes, not that it
can always be done (per the lines about "created the assignment").  Would
it be helpful to indicate this, perhaps by using a different symbol for
those lines and adding a footnote?

Section 4

While I understand the desire to keep the document structure as it is,
with the actual AVPs specified in Section 7, it would have been very
helpful to have a top-level introductory paragraph here that mentions or
references that there is a containing Session-Group-Info grouped AVP
that contains the Session-Group-Control-Vector with information about
the action and group, and zero or one(?) Session-Group-Id AVP to
identify the group when a specific group is being identified.  This
would also allow clarifying whether the Session-Group-Id AVP is
currently only defined to appear within the Session-Group-Info.

Section 4.1.1

                                           Such applications provide
   intrinsic discovery for the support of session grouping capability
   using the assigned Application Id advertised during the capability
   exchange phase two Diameter peers establish a transport connection
   (see Section 5.3 of [RFC6733]).

nit: I think there's a missing word here, perhaps "where" after "phase"?

Section 4.2.1

   The client may also indicate in the request that the server is
   responsible for the assignment of the session in one or multiple
   sessions owned by the server.  [...]

nit(?): is this supposed to be "assignment of the session into one or
multiple session *groups* owned by the server"?  I'm having a hard time
understanding it as written.

   If the assignment of the session to one or some of the multiple
   identified session groups fails, the session group assignment is
   treated as failure.  In such case the session is treated as single
   session without assignment to any session group by the Diameter
   nodes.  The server sends the response to the client and MAY include
   those Session-Group-Info AVPs for which the group assignment failed.
   The SESSION_GROUP_ALLOCATION_ACTION flag of included Session-Group-
   Info AVPs MUST be cleared.

I guess I understand the part where the entire set of group-assignment
operations has to succeed or fail as an atomic unit, but this text
perhaps implies some semantics that the server is supposed to only
explicitly include in the response the subset of group assignments that
were unable to be processed, omitting the ones that could have been
processed (but were not processed since a failure on one means that none
of the operations get applied).  If that's the intent, I'd suggest being
a bit more explicit about what is and isn't sent.
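
For illustration, the reading I describe above (all-or-nothing
assignment, with only the unprocessable groups echoed back) could be
sketched as follows.  This is my interpretation, not text from the
draft; the Session class, the can_join callback, and the function name
are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Session:
    session_id: str
    groups: Set[str] = field(default_factory=set)


def assign_atomically(
    session: Session,
    requested: List[str],
    can_join: Callable[[str], bool],
) -> List[str]:
    """Assign the session to every requested group, or to none of them.

    Returns the group ids that could not be processed.  Under this
    reading, those are the only Session-Group-Info entries the server
    would echo back (with SESSION_GROUP_ALLOCATION_ACTION cleared).
    """
    failed = [g for g in requested if not can_join(g)]
    if failed:
        # Single-session fallback: no group membership at all.
        session.groups.clear()
        return failed
    session.groups.update(requested)
    return []
```

Note that in the failure case the groups that could have been joined
are neither applied nor reported, which is exactly the ambiguity I
would like the text to resolve.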

   A Diameter client, which sent a request for session initiation to a
   Diameter server and appended a single or multiple Session-Group-Id
   AVPs but cannot find any Session-Group-Info AVP in the associated

(editorial) the phrase "cannot find" makes me wonder how hard it's
expected to be looking; more definitive statements about "not present"
seem more typical for RFC style.

Section 4.2.3

   When a Diameter server enforces an update to the assigned groups mid-

nit: this seems to be the only time we use the word "enforce" in this
sense in the document; previous discussion seems to just use "decides to
make an update" or similar.

   answer.  The client subsequently sends a service-specific re-
   authorization request containing one or multiple Session-Group-Info
   AVPs with the SESSION_GROUP_ALLOCATION_ACTION flag set and the
   Session-Group-Id AVP identifying the session group to which the
   session had been previously assigned.  [...]

nit: I think this has to be "group or groups" to be consistent with the
rest of the doc.

Section 4.4.1

   Either Diameter node (client or server) can request the recipient of
   a request to process an associated command for all sessions assigned
   to one or multiple groups by identifying these groups in the request.
   The sender of the request appends for each group, to which the
   command applies, a Session-Group-Info AVP including the Session-
   Group-Id AVP to identify the associated session group.  Both, the
   SESSION_GROUP_ALLOCATION_ACTION flag as well as the
   SESSION_GROUP_STATUS_IND flag MUST be set.

What's the error handling if one or both listed flags are not set --
just ignore the request?
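
As a sketch of the check a receiver would have to make before deciding
how to react, assuming the control vector is a 32-bit flag field: the
SESSION_GROUP_STATUS_IND value is the one quoted from Section 7.2, while
the SESSION_GROUP_ALLOCATION_ACTION value and the function name are
assumptions on my part.  What the receiver does when the check fails is
the open question.

```python
SESSION_GROUP_ALLOCATION_ACTION = 0x00000001  # assumed value, not quoted in this review
SESSION_GROUP_STATUS_IND = 0x00000010         # value as listed in Section 7.2

REQUIRED_FOR_GROUP_COMMAND = (
    SESSION_GROUP_ALLOCATION_ACTION | SESSION_GROUP_STATUS_IND
)


def group_command_flags_ok(control_vector: int) -> bool:
    """Return True only if both required flags are set in the vector."""
    return (control_vector & REQUIRED_FOR_GROUP_COMMAND) == REQUIRED_FOR_GROUP_COMMAND
```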

   Action AVP to ALL_GROUPS (1) or PER_GROUP (2).  If the answer can be
   sent before the complete process of the request for all the sessions
   or if the request timeout timer is high enough, the sender MAY set
   the Group-Response-Action AVP to ALL_GROUPS (1) or PER_GROUP (2).

(side note) just the phrase "high enough" doesn't give much of an
indication of what the criteria are and what numerical values might be
appropriate.  That said, it's not entirely clear how much guidance we
can really give in this situation.

Section 4.4.2

   If the received request identifies multiple groups in multiple
   appended Session-Group-Id AVPs, the receiver SHOULD process the
   associated command for each of these groups.  If a session has been
   assigned to more than one of the identified groups, the receiver MUST
   process the associated command only once per session.

Why is this only a SHOULD for each group -- what other behaviors could
the receiver exhibit?
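
The "only once per session" requirement in the quoted text amounts to
taking the union of the identified groups' memberships before
processing.  A minimal sketch, assuming the receiver keeps a mapping
from group id to member session ids (the mapping and function name are
hypothetical):

```python
from typing import Dict, List


def sessions_to_process(
    group_members: Dict[str, List[str]],
    group_ids: List[str],
) -> List[str]:
    """Collect each member session once, even if it is in several groups.

    Sessions are returned in first-seen order; unknown group ids
    contribute nothing.
    """
    seen = set()
    ordered = []
    for gid in group_ids:
        for sid in group_members.get(gid, []):
            if sid not in seen:
                seen.add(sid)
                ordered.append(sid)
    return ordered
```

If the receiver is allowed to skip some of the identified groups (as
the SHOULD implies), the set of sessions that actually get processed is
no longer this union, which is why the normative level matters here.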

Section 4.4.3

   In the case of limited success, the sessions, for which the
   processing of the group command failed, MUST be identified using a
   Failed-AVP AVP as per Section 7.5 of [RFC6733].  [...]

My reading of the referenced part of RFC 6733 is that there is a single
Failed-AVP pointing to a single AVP that could not be processed
properly.  Is there such a single failed AVP in the case where
processing failed for multiple sessions in the group(s)?  It seems that
the "largest containing AVP" that includes all failed groups might be so
large as to not be useful in indicating the problem.

Section 7.2

   SESSION_GROUP_STATUS_IND (0x00000010)

If there's a mnemonic for the "IND" part of "SESSION_GROUP_STATUS_IND",
that would be helpful to expand.

Section 9.2

The Specification Required policy includes review by Designated Experts;
is there any guidance we should provide to the DEs?

Appendix A.1

      Discon    GASA received                  Cleanup      Idle

Spot-checking against RFC 6733's state machine, the non-group ASA
received case only makes this transition when there was a previous ASR
that was successfully sent.  Is that an important distinction?  (Also,
as an editorial nit, RFC 6733 spells "Clean up" as two words.)