Re: [Dime] Ben Campbell's Discuss on draft-ietf-dime-drmp-05: (with DISCUSS and COMMENT)

"Ben Campbell" <ben@nostrum.com> Fri, 06 May 2016 20:45 UTC

From: "Ben Campbell" <ben@nostrum.com>
To: "Steve Donovan" <srdonovan@usdonovans.com>
Date: Fri, 06 May 2016 15:45:43 -0500
Message-ID: <7EB67D8A-B1DE-4456-B89A-6E049A6BADE5@nostrum.com>
In-Reply-To: <572CF510.20202@usdonovans.com>
References: <20160504023124.8242.52368.idtracker@ietfa.amsl.com> <572A227D.1040203@usdonovans.com> <45E2E5D4-091E-4311-9FDF-271B04D59D05@nostrum.com> <572CF510.20202@usdonovans.com>
MIME-Version: 1.0
Content-Type: text/plain; format=flowed
X-Mailer: MailMate (1.9.4r5234)
Archived-At: <http://mailarchive.ietf.org/arch/msg/dime/KuEQLCcYCgt_4KSKh270oGnZNnI>
Cc: draft-ietf-dime-drmp@ietf.org, "dime-chairs@ietf.org" <dime-chairs@ietf.org>, dime@ietf.org, The IESG <iesg@ietf.org>
Subject: Re: [Dime] Ben Campbell's Discuss on draft-ietf-dime-drmp-05: (with DISCUSS and COMMENT)

On 6 May 2016, at 14:48, Steve Donovan wrote:


[...]

>>
>>>>
>>>> ----------------------------------------------------------------------
>>>> DISCUSS:
>>>> ----------------------------------------------------------------------
>>>>
>>>> I have a few concerns that I think need some discussion.
>>>>
>>>> 1) Priority between applications: The fact that agents can apply priority
>>>> for messages from multiple applications without knowledge of those
>>>> applications seems dangerous. Let's say application A is a critical
>>>> infrastructure application, and application B is not. But clients for
>>>> application B might set requests to have a higher priority than do
>>>> clients for application A.  Further, application B could become a DoS
>>>> vector for application A. One potential (and likely half-baked) way to
>>>> mitigate this would be to say that nodes that are not "application aware"
>>>> can only apply priority among messages for the same application.
>>> SRD> This is similar to saying that priority settings across 
>>> applications need to be set in a consistent way.  We might need to 
>>> define the "priority scheme" or some similar concept as sketched out 
>>> in my response to Alissa's DISCUSS.
>>
>> That would probably help. Am I correct in assuming that would require 
>> some WG discussion?
>> It occurs to me after sleeping on it that, in congestion cases, this 
>> would leave things no worse than when a relay agent throttles 
>> requests with even less information. But it's a little more worrisome 
>> when used in non-congested circumstances, assuming they leave the 
>> possibility of throttling or otherwise rejecting requests under 
>> normal processing conditions.
> SRD> See my proposed changes to the draft sent in a separate email.  I 
> think this addresses the above concern and is currently awaiting WG 
> discussion.

See my comment in that thread. For the record, I think the proposal is 
on the right track, but may still need some tweaks.
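
As a rough illustration of the mitigation floated in point 1 (a node that is not application-aware compares priorities only among messages for the same application), here is a minimal Python sketch. The class name, the round-robin rotation between applications, and the convention that a lower number means a higher priority are all assumptions made for this example; none of them comes from the draft.

```python
import heapq
from collections import OrderedDict
from itertools import count


class PerApplicationScheduler:
    """Orders requests by priority only within a single application.

    Applications take turns (round-robin), so a high priority claimed by
    application B cannot push aside requests for application A.
    Assumption: a lower number means a higher priority.
    """

    def __init__(self):
        self._queues = OrderedDict()  # app_id -> heap of (priority, seq, request)
        self._seq = count()           # tie-breaker: FIFO within equal priority

    def enqueue(self, app_id, priority, request):
        heap = self._queues.setdefault(app_id, [])
        heapq.heappush(heap, (priority, next(self._seq), request))

    def dequeue(self):
        """Return the next request, or None if nothing is queued."""
        if not self._queues:
            return None
        # Take the application at the front of the rotation.
        app_id, heap = next(iter(self._queues.items()))
        _, _, request = heapq.heappop(heap)
        del self._queues[app_id]
        if heap:
            self._queues[app_id] = heap  # move to the back of the rotation
        return request


sched = PerApplicationScheduler()
sched.enqueue("A", 5, "a1")
sched.enqueue("B", 0, "b1")
sched.enqueue("B", 0, "b2")
sched.enqueue("A", 1, "a2")
order = [sched.dequeue() for _ in range(5)]
```

With this workload, application B's priority-0 requests never jump ahead of application A's turn; they are ordered only against each other. Whether round-robin is the right way to share capacity between applications is a separate question that the sketch does not try to answer.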

>>
>>>>
>>>> 2) Priority between clients of the same application: If you have multiple
>>>> clients for the same application, don't they need to use the same
>>>> prioritization strategy? How is this to be managed?
>>> SRD> It is not directly defined.  This is back to the question of 
>>> whether or not the mechanism is constrained to only work in a 
>>> trusted environment.
>>
>> The potential "priority scheme" might help here. But why does a 
>> trusted environment matter? Let's say you have trusted clients from 
>> vendor A and from vendor B, but they select priorities differently?
> SRD> They aren't trusted if they aren't using the defined standards.

What standard? The application specification? (See my question in 
Alissa's thread about whether DRMP can be applied to legacy applications 
that don't (yet) define its use.)

> In addition, trusted environment generally means operated by a single 
> entity.  That operator has the job of ensuring that what you are 
> proposing would not happen.

I can accept if that is the case, but it's not a very satisfying answer. 
It leads to having profiles for each operator. (I realize that the 
primary users of Diameter have always had operator-specific profiles of 
standards, so maybe there's nothing to do here.) This might at least 
imply a requirement that priority definitions need to be configurable.

[...]

>>
>>>>
>>>> 4) I am nervous about the idea that clients and servers would use a
>>>> generic message priority mechanism to manage the allocation of resources
>>>> that result from requests and answers. It seems like that should be
>>>> based on application specific rules and information. (Now, if the point
>>>> is that these same AVPs might be used in an application according to
>>>> application specific rules, that might be okay--but then you might run
>>>> into issues where application-agnostic agents don't know the difference.)
>>> SRD> The definition of what different priority levels mean will 
>>> reflect the application specific knowledge.  Agents just route 
>>> requests and, with the introduction of DOIC, sometimes throttle them. 
>>> The agent doesn't need anything but the priority value, as long as 
>>> the priority values are defined consistently across all 
>>> applications.
>>
>> My concern was more about client and server behavior, but it may be 
>> impacted by relays. I assume that a relay agent would not be making 
>> decisions about resource allocation, at least beyond transaction 
>> state.
>>
>> My concern is twofold: In the case of servers, this seems to give a 
>> server leave to send a successful answer, but not allocate any 
>> necessary resources to support that success. This seems to lead to 
>> circumstances where the server thinks something failed but the client 
>> thinks it worked. (I have no problem with saying that a 
>> resource-constrained server might include priority as a factor in 
>> its decisions about which requests to reject.)
> SRD> DRMP does not change application server behavior.  It just 
> influences the order in which requests are processed.  As such, I 
> don't see how DRMP would cause the server to falsely send a success 
> response.

I agree, but I thought the language in DRMP seemed to allow for it. 
Maybe, rather than saying that a server could use DRMP to decide whether 
to allocate resources for a request, it would be better to say the 
server could use DRMP to decide whether to process or fail a request.

>>
>> In the case of clients, I think we are talking about a client 
>> abandoning some task even though the server thinks it worked. I don't 
>> see that as a problem per se, but I would expect a client to do that 
>> based on some local knowledge of priority. It seems a stretch that it 
>> would do so based on a priority value inserted by the server. But 
>> even if that makes sense, I understand that this draft allows relay 
>> agents with no knowledge of the application semantics to insert, 
>> remove, or change priority values in answer messages. (Maybe the 
>> answer here is stricter guidance on when relay agents can change 
>> priority values, and allowing the client to ignore priority values in 
>> answers (especially success answers).)
> SRD> Clients abandoning things can happen without DRMP.  It's up to 
> the application to define the correct behavior when this happens.  
> DRMP doesn't change anything on that front.  As such, I don't see the 
> concern.

I think that's fine if only application-aware nodes are involved (and 
the application defines how to use DRMP). My concern is that a relay 
agent is allowed to change the priority. So imagine a client sends a 
request with a high priority, and gets back a successful answer with a 
low priority. Section 8 says it MUST use the priority in the answer over 
its local idea of what the priority was. How does it know a relay 
didn't change the priority in the answer?

(Even if nothing changes the answer en route, is it really reasonable 
for the server's idea of priority to always override the client's?)
>>
>>
>>
>>>>
>>>>
>>>> ----------------------------------------------------------------------
>>>> COMMENT:
>>>> ----------------------------------------------------------------------
>>>>
>>>> General: I approached this assuming prioritization would matter only in
>>>> overload scenarios. But I gather you envision using this in non-overload
>>>> scenarios? (This interacts with my discuss point about the use of DRMP to
>>>> prioritize resource allocation that results from successful
>>>> Diameter transactions.) Use case 5.3 is a bit worrisome in this context.
>>> SRD> I don't understand the concern.
>>
>> I'm concerned that this could effectively induce overload for lower 
>> priority messages, when the nodes would normally be able to handle 
>> all messages. In use case 5.3, does the platinum customer buy the 
>> right to cause other customers' requests to fail?
> SRD> It is certainly possible that higher priority messages could 
> starve lower priority messages.  The only case when this would result 
> in a lower priority message not getting service in non-congestion 
> scenarios is if the lower priority message times out. This is a 
> recognized side effect of priority based schemes like this.  An 
> operator offering a service like 5.3 would have to take this into 
> consideration.  I would expect that in a well thought out priority 
> scheme the percentage of high priority messages will be significantly 
> less than low priority messages, reducing the probability of starving 
> the low priority messages.  How this is done, however, is outside the 
> scope of this document.

It's probably worth mentioning in the draft text that a naive approach 
to priority can introduce transaction failure in cases where all 
transactions might have succeeded without it, and that operators and 
implementers need to consider that.
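
To make the starvation point concrete, here is a small Python simulation. It is only a sketch under stated assumptions: the node serves exactly one request per tick, each request carries a hypothetical deadline tick after which it counts as timed out, and a lower number means a higher priority. The function name and the workload are invented for illustration.

```python
import heapq


def simulate(requests, use_priority):
    """Serve one request per tick; return the names of requests that timed out.

    requests: list of (arrival_tick, priority, deadline_tick, name).
    A request fails if it has not been served by its deadline tick.
    Assumption: a lower number means a higher priority.
    """
    pending = sorted(requests, key=lambda r: r[0])  # stable sort by arrival
    queue, failed, tick, i = [], [], 0, 0
    while i < len(pending) or queue:
        # Admit everything that has arrived by this tick.
        while i < len(pending) and pending[i][0] <= tick:
            arrival, prio, deadline, name = pending[i]
            key = (prio, arrival, i) if use_priority else (arrival, i)
            heapq.heappush(queue, (key, deadline, name))
            i += 1
        # Serve at most one request this tick, discarding expired ones.
        while queue:
            _, deadline, name = heapq.heappop(queue)
            if deadline < tick:
                failed.append(name)  # timed out while waiting in the queue
                continue
            break  # this request is served
        tick += 1
    return failed


# Five requests over five ticks: within capacity, but bursty.
workload = [
    (0, 10, 2, "low"),   # low priority, must be served by tick 2
    (0, 1, 10, "h0"),
    (0, 1, 10, "h1"),
    (1, 1, 10, "h2"),
    (2, 1, 10, "h3"),
]
fifo_failures = simulate(workload, use_priority=False)
priority_failures = simulate(workload, use_priority=True)
```

In this workload, first-come-first-served handling completes every request before its deadline, while strict priority ordering keeps "low" waiting behind the high priority arrivals until its deadline has passed. No request is ever rejected for lack of capacity; the failure comes purely from the ordering.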

[...]

>>
>>>>
>>>> -- "Diameter nodes MUST use the priority indicated in the DRMP AVP
>>>> carried in the answer message, if it exists."
>>>> The MUST seems odd, since paying attention to the priority in the initial
>>>> request was only SHOULD.
>>> SRD> The wording here is cumbersome.  The full paragraph is as 
>>> follows:
>>>
>>>    When determining the priority to apply to answer messages, Diameter
>>>    nodes MUST use the priority indicated in the DRMP AVP carried in the
>>>    answer message, if it exists.  Otherwise, the Diameter node MUST use
>>>    the priority indicated in the DRMP AVP of the associated request
>>>    message.
>>>
>>> This is meant to say that a Diameter node receiving an answer 
>>> message MUST use the priority value in the answer message when 
>>> processing the message.  If there is no DRMP AVP in the answer 
>>> message then the receiving node uses the priority that was in the 
>>> request message.  I'd be happy to re-craft this to be more clear.
>>
>> Part of my concern is that the MUST vs SHOULD bit is different for 
>> requests and answers. Statements of the form "If you use the 
>> mechanism, you MUST do X" do not mean the same as "SHOULD do X", even 
>> if the reasoning behind the SHOULD is that a node might not use the 
>> mechanism.
> SRD> There are no SHOULD statements in this paragraph.  This 
> requirement only applies "When determining the priority to apply to 
> __received__ answer messages".  I don't see the conflict.

So this may be down to a nit--but the text as written says that the 
receiver of a _request_ "SHOULD" use the included priority, but the 
receiver of a response "MUST" use the priority. Now, maybe these are 
just different ways of encoding the idea that the node may not implement 
DRMP. But the former seems to allow the node to override the priority in 
the message if it has good reason, but the latter does not allow the 
same for a response.

As I mention in the discussion about my last discuss point, it seems odd 
that a client cannot choose its own idea of priority over the server's 
if it thinks it has a good reason.
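
The difference between the two readings can be sketched in a few lines of Python. The first function follows the answer-handling paragraph quoted above; the second is a hypothetical SHOULD-style variant in which local policy may override the received value. The function names, the `local_override` parameter, and the use of `None` for a missing DRMP AVP are assumptions for this example, and any default priority for the case where neither message carries the AVP is left out.

```python
from typing import Optional


def answer_priority_must(answer_drmp: Optional[int],
                         request_drmp: Optional[int]) -> Optional[int]:
    """Rule as quoted: the answer's DRMP AVP wins if present, else the request's."""
    if answer_drmp is not None:
        return answer_drmp
    return request_drmp


def answer_priority_should(answer_drmp: Optional[int],
                           request_drmp: Optional[int],
                           local_override: Optional[int] = None) -> Optional[int]:
    """Hypothetical SHOULD-style variant: local policy may override the message."""
    if local_override is not None:
        return local_override
    return answer_priority_must(answer_drmp, request_drmp)


# A client sent a request at priority 1; the answer arrives carrying 14,
# whether set by the server or rewritten by a relay on the way back.
strict = answer_priority_must(answer_drmp=14, request_drmp=1)
lenient = answer_priority_should(answer_drmp=14, request_drmp=1, local_override=1)
fallback = answer_priority_must(answer_drmp=None, request_drmp=1)
```

Under the MUST reading the client has to treat the answer as priority 14 even though it cannot tell who set that value; under the SHOULD-style reading it can keep its own priority of 1 when it has a reason to.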

[...]