Re: [Dime] New Version Notification for draft-donovan-doic-agent-cases-00.txt

Hi Ulrich,

Thanks for the feedback!

We intended this draft as more of a discussion paper than something that the working group would publish. I will respond to some of your substantive comments inline. If we publish a new revision at some point, we can address the editorial and clarification comments at that time.

Thanks!

Ben.

On Jul 11, 2014, at 7:03 AM, Wiehe, Ulrich (NSN - DE/Munich) <ulrich.wiehe@nsn.com> wrote:

[...]

> 7. clause 2 Deployment Architectures
> There are many more architectures and scenarios which cannot all be covered separately. I therefore propose to focus on general principles that DOIC agents must follow independent from architecture and scenario. These principles are:

I agree the scenarios in the draft are not exhaustive. We picked some that we think exercise certain behaviors, which are discussed in the scenarios and in the recommendations section. I believe they are all reasonable scenarios that DOIC should allow.

> 
> A) When an agent receives a host-routed request that contains an OC-S-F AVP, it takes no DOIC-specific action, i.e. it forwards the request to the next hop without removing, modifying or inserting OC AVPs; also no DOIC specific action is performed when receiving the corresponding answer. The agent may however store OLR information received in the answer for potential later use if it supports the features associated with the OLR.

Do I understand correctly that you mean that an agent MUST NOT remove or modify the OC-S-F from host routed requests? If so, I disagree. There are real world scenarios that will require that modification or removal. (For example, those in section 5.2)

> 
> B) When an agent receives a host-routed request that does not contain an OC-S-F AVP, it performs throttling according to a previously received host-type OLR (if any). If the request survives (or no throttling was performed because no host-type OLR matched), the agent inserts an OC-S-F AVP and sends the request to the next hop. When receiving the corresponding answer, the agent checks whether it contains an OLR update and if so replaces the stored info (if any) with the updated info. In any case the agent removes the OLR from the answer before sending it to the previous hop.

I agree in general, although I think we can separate the "what if this request gets throttled" behaviors from the "how you treat OC-S-F in forwarded requests" aspects.
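
To make that separation concrete, here is a minimal Python sketch. It is not from the draft: the function names, the dict-based message model, and the `reduction_pct` key are all hypothetical, and it assumes a loss-style abatement. The throttling decision and the OC-S-F handling are two independent steps, composed here only for case B.

```python
import random
from typing import Callable, Optional


def survives_throttling(reduction_pct: int,
                        rng: Callable[[], float] = random.random) -> bool:
    """Loss-style abatement (assumed): drop about reduction_pct percent."""
    return rng() * 100.0 >= reduction_pct


def add_supported_features(request: dict, agent_features: dict) -> dict:
    """Return a copy of the request with OC-S-F inserted if it had none."""
    out = dict(request)
    out.setdefault("OC-Supported-Features", agent_features)
    return out


def handle_case_b(request: dict,
                  stored_host_olr: Optional[dict],
                  agent_features: dict,
                  rng: Callable[[], float] = random.random) -> Optional[dict]:
    """Case B: host-routed request that arrived without OC-S-F.

    Returns None if the request is throttled, else the request to forward.
    """
    if stored_host_olr is not None:
        if not survives_throttling(stored_host_olr["reduction_pct"], rng):
            return None  # throttled locally
    return add_supported_features(request, agent_features)
```

Either function can change without touching the other, which is the point of treating the two aspects separately.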

> 
> C) When an agent that is not configured to perform server selection for realm-routed requests receives a realm-routed request that contains an OC-S-F AVP, it takes no DOIC-specific action, i.e. it forwards the request to the next hop without removing, modifying or inserting OC AVPs; also no DOIC specific action is performed when receiving the corresponding answer. The agent may however store OLR information received in the answer for potential later use if it supports the features associated with the OLR.
> 
> D) When an agent that is not configured to perform server selection for realm-routed requests receives a realm-routed request that does not contain an OC-S-F AVP, it performs throttling according to a previously received realm-type OLR (if any). If the request survives (or no throttling was performed because no realm-type OLR matched), the agent inserts an OC-S-F AVP and sends the request to the next hop. When receiving the corresponding answer, the agent checks whether it contains an OLR update and if so replaces the stored info (if any) with the updated info. In any case the agent removes the OLR from the answer before sending it to the previous hop.
> 
> E) When an agent that is configured to perform server selection for realm-routed requests receives a realm-routed request that contains an OC-S-F AVP, it performs server selection, adds the Destination-Host AVP with the value identifying the selected server to the request (this need not be done when the selected server is an immediate peer), replaces the received OC-S-F AVP with its own OC-S-F AVP in the request and forwards the request to the next hop. When receiving the corresponding answer the agent checks whether it contains an OLR update and if so replaces the stored info (if any) with the updated info, calculates an (aggregated) realm-type OLR that fits to the supported features as received in the request. In any case the agent replaces the OC-S-F in the answer with its own OC-S-F (indicating the selected algorithm), removes the OC-OLR from the answer and adds its own calculated aggregated realm-type OLR (if any) to the answer before sending it to the previous hop.
> 
> F) When an agent that is configured to perform server selection for realm-routed requests receives a realm-routed request that does not contain an OC-S-F AVP, logically a combination of case D) followed by case E) is performed.
> 
> 

[...]

> 10. clause 4. Overload Abatement Methods, 4th word:
> Should read "server". Agent overload is out of scope.
> 

It is currently out of scope for the DOIC draft, but I believe it will eventually come into scope for the working group. I'd prefer to keep things general to "node" in this section.

> 11. clause 4. Overload Abatement Method, 7th paragraph:
> Diversion (which is limited to realm-routed requests) can only occur at the Diameter Node that performs the server selection. Topology knowledge is not needed. Overload state knowledge actually is pushed further down the chain. The node can control how the received realm-routed request is routed upstream by inserting a Destination-Host AVP. 

The knowledge that an agent performs server selection is, in itself, topology knowledge. In general, a node (TC or agent) selects the next hop. That hop might be a TS, or it might be an agent. There are scenarios in which a node cannot know which it is (e.g. certain proxies may be indistinguishable from servers.)

I believe there are scenarios where an agent that is not the last hop before the TS could be configured to do diversion. That requires the "topology knowledge" that the set of possible servers that might receive the diverted request does not overlap with the set of possible servers that the request was diverted away from. I'm not saying those scenarios are common, or even a good idea--just that DOIC should not forbid them.
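
The "topology knowledge" in question amounts to a disjointness check between two server sets. A minimal sketch, assuming the agent has been configured with both sets (the function and parameter names are hypothetical, and nothing here is defined by DOIC):

```python
from typing import Iterable


def diversion_is_safe(diverted_away_from: Iterable[str],
                      reachable_after_diversion: Iterable[str]) -> bool:
    """True if no server reachable via the alternate route is one of the
    servers the request was diverted away from."""
    return set(diverted_away_from).isdisjoint(reachable_after_diversion)
```

For example, `diversion_is_safe({"s1"}, {"s2", "s3"})` holds, while any overlap between the two sets makes diversion unsafe.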

> 13. clause 5. DOIC Use Cases:
> I have lots of detailed comments which I can share in a separate mail. The essence is that all the use cases described should follow the general principles A to F outlined above. It may be worth to totally rewrite this clause to illustrate the general principles A to F.

I disagree. Use cases drive behaviors, not the other way around.

> 
> 14. clause 6.1. General Recommendation, last paragraph:
> This is not strictly required. Multiple occurrences of OC-OLR could be regarded as an optimization. But then there are issues: when there are two OLRs in one answer, e.g. one realm-type OLR for loss and one host-type OLR for rate, we would also need two OC-S-F AVPs: one indicating that loss is selected, and one indicating that rate is selected. But this seems to be contradicting information.

I don't think we can solve the need for realm-type reports without allowing multiple OLRs. Once a server declares a host-overload condition, it will continue to send the host-report in every single answer until the overload condition ends. If the agent needs to insert a realm-report, it will have nowhere to put it unless we either let it remove the host report (which I think would be a bad idea) or let it insert another OLR.

I think it's worth discussing whether multiple OLRs in the same answer are allowed to have different algorithms. (But this particular aspect of that discussion would be a non-issue if the OLR included the selected algorithm, as I have previously argued.)
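
As an illustration of the "insert another one" option, here is a minimal sketch that models an answer as a list of (AVP name, value) pairs. The dict layout of each OLR and the function names are hypothetical and not taken from the draft.

```python
from typing import Any, List, Tuple

Avp = Tuple[str, Any]


def add_realm_report(answer_avps: List[Avp], realm_olr: dict) -> List[Avp]:
    """Return a new AVP list with the agent's realm report appended.

    The server's existing host report is left in place.
    """
    return list(answer_avps) + [("OC-OLR", realm_olr)]


def report_types(answer_avps: List[Avp]) -> List[str]:
    """List the report type of every OC-OLR in the answer, in order."""
    return [value["report_type"]
            for name, value in answer_avps if name == "OC-OLR"]
```

With a server answer that already carries a host report, `add_realm_report` yields an answer whose `report_types` are `["host", "realm"]`, which is only legal if multiple OLRs per answer are allowed.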

> 
> 15. clause 6.2.1 Capabilities Exchange Behaviours, 3rd paragraph:
> Should read: "An agent may act as a reporting node on behalf of a non-supporting TS, or as reacting node on behalf of a non-supporting TC.[Section 5.3]". I'm not sure whether we should cover the first case. At least it should be limited to architectures where the agent (acting as reporting node on behalf of the non supporting server) and the (non supporting) server are immediate peers and the server has no other immediate peers, so that the two nodes can be regarded a single supporting server.

What do we gain by that limitation? Keep in mind, we are not suggesting that the DOIC draft specify how you build out networks--it should try really hard to avoid that. We only propose that the specified agent behaviors be flexible enough to handle the listed scenarios.

> 16. clause 6.2.1 Capabilities Exchange Behaviours, 4th paragraph:
> To add some clarifications this should read:
> "An agent that acts as a reacting node must include an OC-Supported-Features in each Diameter request that it forwards in that role. If the inbound request included an OC-Supported-Features AVP, the agent may copy its content to the one in the outbound request (this is the case where the request is a) host-routed or b) realm-routed and the agent is not configured to perform server selection), or may replace the contents indicating the DOIC capabilities of the agent itself (this is the case where the request is realm routed and the agent is configured to perform server selection). If an inbound request does not contain an OC-Supported-Features AVP, the agent must insert one into the outbound request, indicating the DOIC capabilities of the agent itself."

I think that goes into more detail than we need to specify. Specifically, I don't think the type of the request changes whether an agent can change the supported features.

> 
> 17. clause 6.2.1 Capabilities Exchange Behaviours, last paragraph:
> Should read: "An agent that does not support the DOIC mechanism is likely to forward an OC-Supported-Features AVP without modification. A DOIC node must be able to tell between an OC-Supported-Features AVP that was inserted by a node within a trusted domain, and one inserted by a node within a non-trusted domain.[Section 5.4]". This is because the reporting node (if it sends an OLR within an answer) sends its OLR to the node that has inserted the OC-S-F AVP to the request.

I think we agree in general (although I think the last sentence is unnecessary). But until we have an end-to-end security mechanism (or add one to DOIC, which I hope we don't do), trust relationships are really hop-by-hop. So the practical meaning is that you can tell the AVP was inserted (or validated--more on that in a second) by the peer vs. something on the other side of that peer.

When I say "validated", I am talking about a potential transitive trust relationship. That is, I know my peer supports DOIC, and I trust it to only send me stuff that it generated itself, or that it received from a peer that _it_ trusts. (And so on.)

Since I can assume that a peer that does _not_ support DOIC will do neither, it really comes down to knowing whether my peer supports DOIC or not. Any further trust relationship has to be determined administratively.
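
In code terms, the hop-by-hop check reduces to a lookup against administratively configured data. A minimal sketch, assuming the node holds a configured set of peers known to support DOIC (the function name and that set are hypothetical):

```python
from typing import AbstractSet


def oc_sf_is_attributable(peer_id: str,
                          doic_capable_peers: AbstractSet[str]) -> bool:
    """Decide whether an OC-S-F received from peer_id can be attributed to
    that peer (inserted or validated by it).

    A peer that does not support DOIC forwards unknown AVPs untouched, so
    an OC-S-F arriving through it could have been inserted anywhere.
    """
    return peer_id in doic_capable_peers
```

Anything beyond this one-hop decision, such as whether the peer's own peers are trustworthy, is outside what the node can determine by itself.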

> 
> 18. clause 6.2.2 Overload Report Behaviours, 1st sentence, the part in brackets:
> I do not agree; when passing through, the responsibility is at the source, not at the relay. The sentence should read: "When a DOIC-supporting relay inserts or replaces an OC-Supported-Features AVP, it becomes responsible for ensuring that any OLRs it receives from upstream nodes are honored."

I disagree. "Responsible for ensuring..." abatement is not the same as "responsible for performing abatement". In your example, the agent "ensures" abatement by delegating it to the source, which "performs" abatement.

In the context of a transaction, this is the only distinction between an agent that supports DOIC, but forwards OC-S-F without change, and one that does not support it at all.

But I think the trust issues from the previous comment are going to make this moot. If we are to distinguish a supporting agent from a non-supporting one that simply forwards unknown AVPs, then supporting agents are going to need to make _some_ modification to the OC-S-F prior to forwarding it. For example, an agent may need to insert its Diameter identity as a forwarding agent. If we go that route, there will be no such thing as a _supporting_ agent forwarding an OC-S-F AVP without change.
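
A sketch of what such a modification could look like, purely as an illustration. The `forwarded_by` field is hypothetical and does not exist in DOIC; the dict stands in for the grouped OC-S-F AVP.

```python
def forward_supported_features(oc_sf: dict, agent_identity: str) -> dict:
    """Return a copy of OC-S-F with this agent's Diameter identity appended
    to a hypothetical 'forwarded_by' list. The input is not modified."""
    out = dict(oc_sf)
    out["forwarded_by"] = list(oc_sf.get("forwarded_by", [])) + [agent_identity]
    return out
```

A non-supporting relay would pass the AVP through unchanged, so the presence or absence of the peer's identity in the list would tell the receiver which kind of peer it is dealing with.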

> 
> 19. clause 6.2.2 Overload Report Behaviours, 2nd sentence:
> If the abatement is not "Diversion" and "Delegation" is possible, "Delegation" rather than "Throttling" must be done.

Why is this a requirement? If someone wants to deploy an agent that never delegates, we should not prevent them. I wouldn't build my network that way, but I don't see why the IETF should provide anything more than guidance about why delegation might be better.

(That said, it would be perfectly reasonable for 3GPP to require delegation whenever possible.)

> 
> 20. clause 6.2.2 Overload Report Behaviours, 2nd paragraph, last sentence:
> I do not agree. See also comment 11. Diversion is limited to realm-routed requests and can only be performed by nodes that do the server selection. These nodes can convert the realm-routed request to a host-routed request.

I agree that diversion typically cannot be done for host-routed requests, but I don't think we can say it "never" can. There are situations where an agent (or client) can divert host-routed requests. For example, you might have more than one physical server that can handle requests for the same Destination-Host value.

Otherwise, I don't see how your text disagrees with the text in the draft.

> 
> 21. clause 6.2.2 Overload Report Behaviours, 3rd paragraph, last sentence:
> Modifying OLRs must follow strict rules. We either have
> a) one DOIC association between reacting node and reporting node where all agents in between are transparent and do not modify OC-xxx AVPs, or
> b) two independent DOIC associations, one between reacting node and agent (acting as reporting node) and one between the same agent (now acting as reacting node) and reporting node. Here the agent removes the received host-type OLR and inserts its own aggregated realm-type OLR. I would not call this a modification but a replacement. Modifications other than this may not be a good idea.

I'm happy to call a modification a replacement. (Replacing an AVP with one that is similar to the original but slightly different is indistinguishable from modification.) But I don't think we (the IETF) can specify any formal limitations on how an agent can modify any OC-XXX without making perfectly reasonable network designs become illegal. Again, it's perfectly reasonable for 3GPP to add additional constraints.

(After working with Steve to put this draft together, I am no longer convinced that the "DOIC association" is useful as a formal concept.)

> 
> 22. clause 6.2.2 Overload Report Behaviours, last but one paragraph:
> It is the other way round:
> An agent shall not throttle traffic locally when it has already sent (or will soon send) an OLR downstream (i.e. when it can or already has delegated the abatement). When receiving a xxR that contains an OC-S-F AVP (and the xxR matches an OLR) the agent can almost safely assume that this request survived an ongoing throttling downstream. The principle is that throttling should be done as close as possible to the client.

Again, I don't think we should forbid either approach at the IETF. Doing throttling as close to the client as possible is guidance, not a normative rule. Or at least nothing stronger than SHOULD strength.

(And again, 3GPP can add further constraints...)

> 
> Regards,
> Ulrich
> 
> 
> From: DiME [mailto:dime-bounces@ietf.org] On Behalf Of ext Steve Donovan
> Sent: Saturday, July 05, 2014 9:49 PM
> To: dime@ietf.org
> Subject: [Dime] Fwd: New Version Notification for draft-donovan-doic-agent-cases-00.txt
> 
> All,
> 
> The below referenced draft focuses on a number of DOIC deployment scenarios involving agents.  The goal of this draft is to identify any new DOIC behaviors required to address these deployment scenarios.  
> 
> This directly addresses open issues #25, #27, #60 and #61 while indirectly addressing other open issues.
> 
> Regards,
> 
> Steve
> 
> -------- Original Message -------- 
> Subject: New Version Notification for draft-donovan-doic-agent-cases-00.txt
> Date: Thu, 03 Jul 2014 09:17:11 -0700
> From: internet-drafts@ietf.org
> To: Ben Campbell <ben@nostrum.com>, "Steve Donovan" <srdonovan@usdonovans.com>, Steve Donovan <srdonovan@usdonovans.com>, "Ben Campbell" <ben@nostrum.com>
> 
> A new version of I-D, draft-donovan-doic-agent-cases-00.txt
> has been successfully submitted by Steve Donovan and posted to the
> IETF repository.
> 
> Name:		draft-donovan-doic-agent-cases
> Revision:	00
> Title:		Analysis of Agent Use Cases for Diameter Overload Information Conveyance (DOIC)
> Document date:	2014-07-03
> Group:		Individual Submission
> Pages:		34
> URL:            http://www.ietf.org/internet-drafts/draft-donovan-doic-agent-cases-00.txt
> Status:         https://datatracker.ietf.org/doc/draft-donovan-doic-agent-cases/
> Htmlized:       http://tools.ietf.org/html/draft-donovan-doic-agent-cases-00
> 
> 
> Abstract:
>   The Diameter Overload Information Conveyance (DOIC) solution
>   describes a mechanism for exchanging information about Diameter
>   Overload among Diameter nodes.  A DOIC node is a Diameter node that
>   acts as either a reporting node or a reacting node.  A reporting
>   node originates overload reports, requesting reacting nodes to reduce
>   the amount of traffic sent.  DOIC allows Diameter agents to act as
>   reporting nodes, reacting nodes, or both, but does not describe agent
>   behavior.  This document explores several use cases for agents to
>   participate in overload control, and makes recommendations for
>   certain agent behaviors to be added to DOIC.
> 
> 
> 
> 
> Please note that it may take a couple of minutes from the time of submission
> until the htmlized version and diff are available at tools.ietf.org.
> 
> The IETF Secretariat