Re: [Dime] Ben Campbell's Discuss on draft-ietf-dime-drmp-05: (with DISCUSS and COMMENT)

Steve Donovan <srdonovan@usdonovans.com> Fri, 06 May 2016 19:48 UTC

To: Ben Campbell <ben@nostrum.com>, The IESG <iesg@ietf.org>
References: <20160504023124.8242.52368.idtracker@ietfa.amsl.com> <572A227D.1040203@usdonovans.com> <45E2E5D4-091E-4311-9FDF-271B04D59D05@nostrum.com>
From: Steve Donovan <srdonovan@usdonovans.com>
Message-ID: <572CF510.20202@usdonovans.com>
Date: Fri, 06 May 2016 14:48:32 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:38.0) Gecko/20100101 Thunderbird/38.7.2
MIME-Version: 1.0
In-Reply-To: <45E2E5D4-091E-4311-9FDF-271B04D59D05@nostrum.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/dime/XXo_0HmJZRp7mrqjOZozGPk2CSY>
Cc: draft-ietf-dime-drmp@ietf.org, "dime-chairs@ietf.org" <dime-chairs@ietf.org>, dime@ietf.org
Subject: Re: [Dime] Ben Campbell's Discuss on draft-ietf-dime-drmp-05: (with DISCUSS and COMMENT)
Precedence: list

Ben,

See my comments inline.

Steve

On 5/4/16 10:58 PM, Ben Campbell wrote:
> (oops, sent this earlier, but it just went to Steve. Resending to full 
> list)
>
> Thanks for the quick response. Further discussion inline:
>
> Ben.
>
> On 4 May 2016, at 11:25, Steve Donovan wrote:
>
> [...]
>
>>>
>>> ----------------------------------------------------------------------
>>> DISCUSS:
>>> ----------------------------------------------------------------------
>>>
>>> I have a few concerns that I think need some discussion.
>>>
>>> 1) Priority between applications: The fact that agents can apply 
>>> priority
>>> for messages from multiple applications without knowledge of those
>>> applications seems dangerous. Let's say application A is a critical
>>> infrastructure application, and application B is not. But clients for
>>> application B might set requests to have a higher priority than do
>>> clients for application A.  Further, application B could become a DoS
>>> vector for application A. One potential (and likely half-baked) way to
>>> mitigate this would be to say that nodes that are not "application 
>>> aware"
>>> can only apply priority among messages for the same application.
>> SRD> This is similar to saying that priorities across 
>> applications need to be set in a consistent way.  We might need to 
>> define the "priority scheme" or some similar concept as sketched out 
>> in my response to Alissa's DISCUSS.
>
> That would probably help. Am I correct in assuming that would require 
> some WG discussion?
> It occurs to me after sleeping on it that, in congestion cases, this 
> would leave things no worse than when a relay agent throttles requests 
> with even less information. But it's a little more worrisome when used 
> in non-congested circumstances, assuming they leave the possibility of 
> throttling or otherwise rejecting requests under normal processing 
> conditions.
SRD> See my proposed changes to the draft sent in a separate email.  I 
think this addresses the above concern and is currently awaiting WG 
discussion.
>
>>>
>>> 2) Priority between clients of the same application: If you have 
>>> multiple
>>> clients for the same application, don't they need to use the same
>>> prioritization strategy? How is this to be managed?
>> SRD> It is not directly defined.  This is back to the question of 
>> whether or not the mechanism is constrained to only work in a trusted 
>> environment.
>
> The potential "priority scheme" might help here. But why does a 
> trusted environment matter? Let's say you have trusted clients from 
> vendor A and from vendor B, but they select priorities differently?
SRD> They aren't trusted if they aren't using the defined standards.  In 
addition, a trusted environment generally means operated by a single 
entity.  That operator has the job of ensuring that what you are 
proposing would not happen.
>
>>>
>>> 3) Out of order requests: The draft explicitly allows agents to 
>>> re-route
>>> and even explicitly re-order messages. Is it safe to have a
>>> non-application aware node change the order of messages?
>> SRD> This mechanism doesn't change the need for Diameter nodes to 
>> handle messages arriving out of order.  This exists in any protocol 
>> that has agents/proxies.
>
> Okay, I will withdraw that part. (How quick one forgets...)
>
>>>
>>> 4) I am nervous about the idea that clients and servers would use a
>>> generic message priority mechanism to manage the allocation of 
>>> resources
>>> that result from requests and answers. It seems like that should be
>>> based on application specific rules and information. (Now, if the point
>>> is that these same AVPs might be used in an application according to
>>> application specific rules, that might be okay--but then you might run
>>> into issues where application-agnostic agents don't know the 
>>> difference.)
>> SRD> The definition of what different priority levels mean will 
>> reflect the application specific knowledge.  Agents just route 
>> requests and with the introduction of DOIC, sometimes throttle them.  
>> The agent doesn't need anything but the priority value, as long as 
>> the priority values are defined consistently across all applications.
>
> My concern was more about client and server behavior, but it may be 
> impacted by relays. I assume that a relay agent would not be making 
> decisions about resource allocation, at least beyond transaction state.
>
> My concern is twofold: In the case of servers, this seems to give a 
> server leave to send a successful answer, but not allocate any 
> necessary resources to support that success. This seems to lead to 
> circumstances where the server thinks something failed but the client 
> thinks it worked. (I have no problem with saying that a 
> resource-constrained server might include priority as a factor in its 
> decisions about which requests to reject.)
SRD> DRMP does not change application server behavior.  It just 
influences the order in which requests are processed.  As such, I don't 
see how DRMP would cause the server to falsely send a success response.
>
> In the case of clients, I think we are talking about a client 
> abandoning some task even though the server thinks it worked. I don't 
> see that as a problem per se, but I would expect a server to do that 
> based on some local knowledge of priority. It seems a stretch that it 
> would do so based on a priority value inserted by the server. But even 
> if that makes sense, I understand that this draft allows relay agents 
> with no knowledge of the application semantics to insert, remove, or 
> change priority values in answer messages. (Maybe the answer here is 
> stricter guidance on when relay agents can change priority values, and 
> allowing the client to ignore priority values in answers (especially 
> success answers).)
SRD> Clients abandoning things can happen without DRMP.  It's up to the 
application to define the correct behavior when this happens.  DRMP 
doesn't change anything on that front.  As such, I don't see the concern.
>
>
>
>>>
>>>
>>> ----------------------------------------------------------------------
>>> COMMENT:
>>> ----------------------------------------------------------------------
>>>
>>> General: I approached this assuming prioritization would matter only in
>>> overload scenarios. But I gather you envision using this in 
>>> non-overload
>>> scenarios? (This interacts with my discuss point about the use of 
>>> DRMP to
>>> prioritize resource allocation that results from successful
>>> diameter transactions.) Use case 5.3 is a bit worrisome in this 
>>> context.
>> SRD> I don't understand the concern.
>
> I'm concerned that this could effectively induce overload for lower 
> priority messages, when the nodes would normally be able to handle all 
> messages. In use case 5.3, does the platinum customer buy the right to 
> cause other customers' requests to fail?
SRD> It is certainly possible that higher priority messages could starve 
lower priority messages.  The only case when this would result in a 
lower priority message not getting service in non-congestion scenarios 
is if the lower priority message times out. This is a recognized side 
effect of priority-based schemes like this.  An operator offering a 
service like 5.3 would have to take this into consideration.  I would 
expect that in a well-thought-out priority scheme the percentage of high 
priority messages will be significantly less than low priority messages, 
reducing the probability of starving the low priority messages.  How 
this is done, however, is outside the scope of this document.
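
A rough sketch of the starvation effect described above, not taken from 
the draft.  It assumes strict-priority service, that a lower number means 
a higher priority, that every message arrives at tick 0 and takes one 
tick to serve, and that all messages share one timeout.  The name 
serve() and the message labels are hypothetical.

```python
import heapq
from itertools import count


def serve(messages, timeout_ticks):
    """Serve (priority, name) pairs in strict priority order.

    Returns (served, timed_out). A message is served only if its turn
    comes before timeout_ticks have elapsed.
    """
    seq = count()  # tie-breaker: keeps arrival order within one priority
    heap = [(prio, next(seq), name) for prio, name in messages]
    heapq.heapify(heap)

    served, timed_out = [], []
    tick = 0
    while heap:
        _, _, name = heapq.heappop(heap)
        if tick < timeout_ticks:
            served.append(name)
            tick += 1  # each served message consumes one tick
        else:
            timed_out.append(name)
    return served, timed_out


msgs = [(10, "low-1"), (1, "high-1"), (1, "high-2"), (10, "low-2")]
print(serve(msgs, timeout_ticks=10))  # ample time: only the order changes
print(serve(msgs, timeout_ticks=3))   # tight timeout: a low priority message is lost
```

With a generous timeout every message is served and priority only 
changes the order.  A low priority message is lost only when the higher 
priority traffic ahead of it uses up its timeout.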
>
>
>>>
>>>
>>> -6, list item 4: Are there really use cases for answer senders to set a
>>> different priority on the answer than was on the request? That seems to
>>> add quite a bit of complexity.
>> SRD> This was an explicit conversation within the working group. I 
>> don't recall the specific use case off the top of my head, but this 
>> was changed to the current wording after discussions within the 
>> working group.  I can go back to the email archive to refresh my 
>> memory if necessary.
>
> I am willing to accede to working group consensus on this. But one 
> question: Does this apply to answers that indicate success, as well as 
> failure?
SRD> Yes.
>
>>>
>>> - 6, list item 8: I'm not sure what it means for a client to prioritize
>>> answers to its own requests.
>> SRD> The client could choose to complete the transaction, and 
>> initiate other dependent actions, based on the priority received in 
>> the answer message.  It is those dependent actions -- setting up a 
>> data channel, authorizing call completion, etc -- that would be 
>> impacted by the priorities received in the answer.
>
> See the comments on item 4 from my discuss.
SRD> See my response.  Is there something else that I'm missing?
>
>>>
>>> -8,  "Diameter nodes SHOULD
>>>     include Diameter routing message priority in the DRMP AVP in all
>>>     Diameter request messages." :
>>> Does that apply to all nodes that touch a request, or just the request
>>> originator?
>> SRD> This statement was meant to apply to the request originator.  
>> The statement should be updated.
>
> Okay.
>
>>>
>>> -- "Diameter nodes MUST use the priority indicated in the DRMP AVP
>>> carried in the answer message, if it exists."
>>> The MUST seems odd, since paying attention to the priority in the 
>>> initial
>>> request was only SHOULD.
>> SRD> The wording here is cumbersome.  The full paragraph is as follows:
>>
>>    When determining the priority to apply to answer messages, Diameter
>>    nodes MUST use the priority indicated in the DRMP AVP carried in the
>>    answer message, if it exists.  Otherwise, the Diameter node MUST use
>>    the priority indicated in the DRMP AVP of the associated request
>>    message.
>>
>> This is meant to say that a Diameter node receiving an answer message 
>> MUST use the priority value in the answer message when processing the 
>> message.  If there is no DRMP AVP in the answer message then the 
>> receiving node uses the priority that was in the request message.  
>> I'd be happy to re-craft this to be more clear.
>
> Part of my concern is that the MUST vs SHOULD bit is different for 
> requests and answers. Statements of the form "If you use the 
> mechanism, you MUST do X" do not mean the same as "SHOULD do X", even 
> if the reasoning behind the SHOULD is that a node might not use the 
> mechanism.
SRD> There are no SHOULD statements in this paragraph.  This requirement 
only applies "When determining the priority to apply to __received__ 
answer messages".  I don't see the conflict.
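
The selection rule in the quoted paragraph can be sketched as follows.  
This is a minimal illustration rather than text from the draft: the 
function name is hypothetical, and representing an absent DRMP AVP as 
None is an assumption of the sketch.

```python
from typing import Optional


def answer_priority(answer_drmp: Optional[int],
                    request_drmp: Optional[int]) -> Optional[int]:
    """Pick the priority to apply to a received answer message.

    Use the DRMP value carried in the answer if one is present;
    otherwise fall back to the value from the associated request.
    None stands for "no DRMP AVP present".
    """
    # Compare against None explicitly: 0 is a valid priority value and
    # must not be mistaken for "absent".
    if answer_drmp is not None:
        return answer_drmp
    return request_drmp


print(answer_priority(3, 7))     # answer carries a DRMP AVP: use it
print(answer_priority(None, 7))  # no AVP in the answer: use the request's
```

The rule is conditional only on the node applying a priority to the 
received answer at all; it does not by itself require the node to 
prioritize.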
>
>>>
>>> -- "Another is to use the Proxy-Info
>>>        mechanism defined in [RFC6733].":
>>> That probably needs some elaboration.
>> SRD> A reference to the document that defines Proxy-Info isn't 
>> sufficient?
>
> On a re-read, it's probably good enough, although on reflection, I 
> don't think this draft needs to tell relay agents how to manage 
> transaction state. But it doesn't hurt either way.
SRD> This draft doesn't; it lists two options.  The Proxy-Info qualifier 
was inserted because there was a suggestion that the wording indicate 
the state is stored locally by the agent.
