Re: [Dime] WGLC #1 for draft-ietf-dime-agent-overload-05

Steve Donovan <srdonovan@usdonovans.com> Wed, 08 June 2016 19:06 UTC

To: Maria Cruz Bartolome <maria.cruz.bartolome@ericsson.com>, "dime@ietf.org" <dime@ietf.org>
References: <5bba2470-8921-f7db-0f1b-aad280eae684@gmail.com> <087A34937E64E74E848732CFF8354B92181E2DAF@ESESSMB101.ericsson.se> <0f981f69-cea1-6cc4-6837-213d27649963@usdonovans.com> <087A34937E64E74E848732CFF8354B92181F08F6@ESESSMB101.ericsson.se>
From: Steve Donovan <srdonovan@usdonovans.com>
Message-ID: <77df2a07-ca55-df36-30bb-87a2ff506418@usdonovans.com>
Date: Wed, 08 Jun 2016 14:06:06 -0500
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:45.0) Gecko/20100101 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <087A34937E64E74E848732CFF8354B92181F08F6@ESESSMB101.ericsson.se>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 7bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/dime/BIXLlsQNngcHLGldPBimMtZ6GQ4>
Subject: Re: [Dime] WGLC #1 for draft-ietf-dime-agent-overload-05
List-Id: Diameter Maintanence and Extentions Working Group <dime.ietf.org>


On 6/7/16 1:54 AM, Maria Cruz Bartolome wrote:
> Hello Steve,
> Thanks for your reply.
> See some comments below, just keeping relevant parts of my first email, those that still need clarification.
> Best regards
> /MCruz
>
>
>
>> 2. Clause 5.2.3
>>     "In all cases, if the reacting node is a relay then it MUST strip the
>>      OC-OLR AVP from the message."
>>
>>      But will the relay react to the overload report received? That is, is it a "reacting node" or is it just relaying the message?
> SRD> That is determined by the other statements in that section. If the
> SourceID received in the message matches that of a peer then the relay is a reacting node.  If it doesn't match then it is not a reacting node.  Either way, the OC-OLR AVP is stripped.
>
> MCRUZ> But a relay can't be a "reacting node", can it? A relay does not read or understand any AVP apart from routing related AVPs.
SRD> Yes, a relay is the reacting node for any next hop that generates a
peer overload report.  As with base DOIC, a relay must be able to handle
DOIC AVPs in addition to the routing AVPs.
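
To make the rule concrete, here is a minimal sketch of the decision a relay
makes when an answer carrying a peer overload report arrives. It assumes a
hypothetical flat mapping of AVP names to values and a hypothetical function
name; neither comes from the draft, and the handling of AVPs other than
OC-OLR is not addressed here.

```python
from typing import Any, Dict, Tuple


def process_peer_report(avps: Dict[str, Any], peer_identity: str) -> Tuple[bool, Dict[str, Any]]:
    """Decide whether this relay reacts to a peer report, then strip it.

    avps: hypothetical flat mapping of AVP name -> value for one answer.
    peer_identity: identity of the peer the answer was received from.
    Returns (is_reacting_node, avps_to_forward).
    """
    has_report = "OC-OLR" in avps
    # The relay is the reacting node only if the report's SourceID
    # matches the identity of the peer that sent the message.
    is_reacting = has_report and avps.get("SourceID") == peer_identity
    # Either way, the OC-OLR AVP is stripped before the message is relayed.
    forwarded = {name: value for name, value in avps.items() if name != "OC-OLR"}
    return is_reacting, forwarded
```

The point of the sketch is that the match on SourceID only decides whether
the relay acts on the report; the stripping happens in both branches.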
>
>>     
>> 2. Clause 3:
>>
>> Now:
>>      This section outlines representative use cases for the peer report
>>      used to communicate agent overload.
>>      There are two primary classes of use cases currently identified,
>>      those involving the overload of agents and those involving overload
>>      of Diameter endpoints (Diameter Clients and Diameter Servers) that
>>      wish to use an overload algorithm suited controlling traffic sent
>>      from a peer.
>>
>> Proposed:
>>      This section outlines representative use cases for the peer report
>>      used.
>>      There are two primary classes of use cases currently identified,
>>      those involving the overload of agents and those involving overload
>>      of Diameter endpoints that
>>      wish to use an overload algorithm that requires controlling traffic sent
>>      towards peers.
>>
>> Reasoning:
>>     For the second use case considered the peer report does not communicate agent overload, but Diameter server overload.
>>     Diameter Endpoint is already defined.
>>    The last sentence, as it stands, is a bit difficult to understand.
> SRD> I agree to removing the parenthetical.
>
> SRD> I propose changing the paragraph to the following:
>
>      There are two primary classes of use cases currently identified,
>      those involving the overload of agents and those involving overload
>      of Diameter endpoints.  In both cases the goal is to use an overload
>      algorithm that controls traffic sent towards peers.
>
> MCRUZ> Ok
>
>
>> 3. Clause 3.1.1
>>
>> Now:
>> This will result in the throtting of the abated traffic
>>      that would have been sent to the agent, as there is no alternative
>>      route, with the appropriate indication given to the service request
>>      that resulted in the need for the Diameter transaction.
>>
>> Proposed:
>>     This will result in the queuing (temporarily at least) and/or the throttling of the abated traffic
>>      that would have been sent to the agent, as there is no alternative
>>      route.
>>
>> Reasoning:
>>      Traffic could be queued, at least temporarily, before being throttled.
>>      I do not think it is required to inform about what is sent back to the originator of the initial request.
> SRD> This talks about the abated traffic.  As such, any queuing that
> might have been used has already been done.
>
> SRD> I also think that we should be explicit that a response is sent
> back to the originator of the request.  It would do more harm if
> throttling were interpreted as just dropping the message on the floor.
>
> MCRUZ> Ok, but I suggest we rephrase the sentence a bit, as it is a bit blurry.  For example:
>    This will result in the throttling of the abated traffic
>     that would have been sent to the agent, as there is no alternative
>     route. An appropriate error response is sent back to the originator of the request.
SRD> I'm okay with this change.
>
>
>
>> 7. Clause 3.1.3
>>
>> "Another example of this type of
>>      deployment is when there are multiple sets of servers, each
>>      supporting a subset of the Diameter traffic."
>>
>>     This example does not include an "agent chain", since for each Client-Server connection there is only one single Agent in the chain, right?
> SRD> I don't understand why there would be a single agent in the chain.
> It is valid (and done) to have multiple agents between clients and
> servers in this scenario.
>
> MCRUZ> The possibility of having multiple agents in a chain is covered by the previous sentence in the same paragraph. This sentence seems to point out that
> there may be different sets of servers, and my understanding is that there may be a chain of agents for each set.
> Therefore, I think this sentence can be removed or clarified.
SRD> I agree, it can be removed.
>
>
>> 8. Clause 4
>>
>> "Any messages that survive throttling due
>>      to host or realm reports should then go through abatement for the
>>      peer overload report."
>>
>>     There is an interaction between PEER and HOST reports. Reducing traffic towards a HOST also reduces the traffic through the agents in the path. This should be taken into account when applying the reduction for that particular PEER. However, depending on the routing scheme, it may not be straightforward to identify the reduction for each agent path when reducing traffic towards a HOST.
> SRD> The goal of this statement is to say that when a Diameter node is
> applying overload abatement algorithms, the order in which active
> overload reports are applied is host/realm report first and then peer
> report.  In other words, abatement is done for traffic being sent to a
> host and then independent abatement is done for the peer to which the
> request is to be routed.  If these are treated as independent actions
> then I don't understand the issue you are raising.
>
> MCRUZ> If the PEER algorithm is RATE, then there is no interaction, as long as the PEER abatement, performed after HOST/REALM, simply keeps a RATE. However, if the PEER algorithm is LOSS and it is performed after HOST/REALM, it should be stated that the reduction is calculated against the initial traffic (before any HOST/REALM abatement). So I think a clarification is required.
SRD> While it is true that, as stated, the presence of a HOST LOSS
report and a peer LOSS report could result in extra messages being
abated, I would prefer to keep the definition of the interaction as
simple as possible and not change the requirement.  My reasoning is that
there is value in keeping it simple, especially given that it is a
self-correcting scenario.  The next hop will see more of a reduction than
it was expecting and will subsequently update the requested reduction.
If there isn't consensus on this approach, we can add a special case for
this scenario.
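
As a rough illustration of the extra abatement being discussed, the sketch
below applies a HOST LOSS reduction and then a peer LOSS reduction to the
surviving traffic. The function names and the numbers are assumptions chosen
for illustration only, and the model assumes that all traffic through the
peer is destined for the overloaded host.

```python
def apply_loss_in_sequence(offered, host_reduction_pct, peer_reduction_pct):
    """Apply the host report first, then the peer report to what survives.

    Returns (traffic_after_host_abatement, traffic_sent_to_peer).
    """
    after_host = offered * (1 - host_reduction_pct / 100)
    after_peer = after_host * (1 - peer_reduction_pct / 100)
    return after_host, after_peer


def effective_reduction_pct(offered, sent):
    """Reduction the peer actually observes, relative to the offered traffic."""
    return 100 * (1 - sent / offered)


# Illustrative numbers only: 1000 requests, HOST asks for 20%, peer asks for 10%.
after_host, sent = apply_loss_in_sequence(1000, 20, 10)
observed = effective_reduction_pct(1000, sent)
print(after_host, sent, observed)
```

With these numbers, about 800 requests survive the host abatement and about
720 are sent to the peer, so the peer observes roughly a 28% reduction
against the original traffic rather than the 10% it requested. That gap is
what the next hop would correct for in a later report.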

>
>
>