Re: [Dime] Issue #27 - Result code sent when agent throttles message

Ben Campbell <ben@nostrum.com> Fri, 04 April 2014 19:26 UTC

From: Ben Campbell <ben@nostrum.com>
In-Reply-To: <533C0FDC.1080705@usdonovans.com>
Date: Fri, 04 Apr 2014 14:25:49 -0500
Message-Id: <98602E4A-500A-436B-B559-568ECE61933A@nostrum.com>
References: <53304336.20006@usdonovans.com> <533C0FDC.1080705@usdonovans.com>
To: Steve Donovan <srdonovan@usdonovans.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/dime/1VWD8xftrXdN5xYNC4z7_3gplys
Cc: dime@ietf.org
Subject: Re: [Dime] Issue #27 - Result code sent when agent throttles message

On Apr 2, 2014, at 8:25 AM, Steve Donovan <srdonovan@usdonovans.com> wrote:

> All,
> 
> Do we have any thoughts on the options outlined below?
> 
> Regards,
> 
> Steve
> 
> On 3/24/14, 3:37 PM, Steve Donovan wrote:
>> All,
>> 
>> We don't have a clean solution for this, as no existing error result code produces the desired behavior: the reacting node does not retry the request on another path, even if one is available.
>> 
>> It has been proposed that we define an extension to 6733 that introduces a new result code.  This would be something like:
>> 
>>    DIAMETER_REQUEST_THROTTLED  3011
>> 
>>     A request was received by an agent and the agent determined that the request needed to be throttled due to an existing overload
>>     condition.  The client SHOULD NOT attempt to retry the request and the client SHOULD NOT attempt to send the request to
>>     an alternate peer.

Why SHOULD NOTs and not MUST NOTs?

OTOH, I'm not sure the proscription against sending to alternate peers is generally applicable. There may be cases where you want the downstream node to resend to an alternate peer, e.g., some agent overload scenarios. Maybe we need two separate codes? (Or maybe TOO_BUSY is enough for the other case?)
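
To make the distinction concrete, here is a rough sketch of how a reacting node might dispatch on the two codes. It is only an illustration: 3011 is the value proposed in this thread and is not assigned, and the Action names and the action_for function are made up for the example. The DIAMETER_TOO_BUSY value (3004) and its alternate-peer guidance come from RFC 6733, Section 7.1.3.

```python
from enum import Enum

DIAMETER_TOO_BUSY = 3004           # RFC 6733, Section 7.1.3
DIAMETER_REQUEST_THROTTLED = 3011  # proposed in this thread, NOT an assigned code


class Action(Enum):
    """Hypothetical reactions a client could take to an error answer."""
    RETRY_ALTERNATE_PEER = "retry on an alternate peer"
    DO_NOT_RETRY = "do not retry at all"
    EXISTING_RULES = "handle per existing rules"


def action_for(result_code: int) -> Action:
    """Map a result code to a reaction (illustrative policy only)."""
    if result_code == DIAMETER_TOO_BUSY:
        # TOO_BUSY tells the node to try another peer if it has one.
        return Action.RETRY_ALTERNATE_PEER
    if result_code == DIAMETER_REQUEST_THROTTLED:
        # The proposed code asks for no retry and no alternate peer.
        return Action.DO_NOT_RETRY
    return Action.EXISTING_RULES
```

If we did end up with two codes, the second branch is where the "resend to an alternate peer" variant would need its own case.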


>> 
>> My assumption is that this extension would be included in the CER/CEA exchange so an agent would know if it is supported.  A client that supports this extension is referred to as a 6733+ client below.  A client that doesn't is referred to as a 6733 client.
>> 
>> With this extension we have two options for wording to be put into the DOIC specification.
>> 
>> 1) Indicate that a DOIC agent that throttles a request MUST send a 3011 error response for all clients, 6733 and 6733+.  This would depend on default processing in clients for result codes that the client does not understand.  This default processing is not defined in 6733 (at least I didn't find it).  The closest I could find was the following.  If an unrecognized result code was interpreted as the answer message containing an error then throttling a request would result in the session being terminated.  

I could have sworn that I read that for unknown errors in a known class, the error should be treated as the base for the class. But I can't find that anywhere.
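
For reference, the fallback I have in mind would look roughly like the sketch below. The class boundaries (1xxx through 5xxx) are the ones listed in RFC 6733, Section 7.1. The fallback itself is my assumption about how a client could behave, not something I can point to in the spec, and the function name is made up.

```python
# Result code classes by thousands digit, per RFC 6733, Section 7.1.
RESULT_CODE_CLASSES = {
    1: "informational",
    2: "success",
    3: "protocol error",
    4: "transient failure",
    5: "permanent failure",
}


def result_code_class(result_code: int) -> str:
    """Return the class of a result code, or "unknown" for other ranges.

    A client using class-based fallback (hypothetical behavior) would
    treat an unrecognized code such as 3011 as a generic member of
    its class instead of as a malformed answer.
    """
    return RESULT_CODE_CLASSES.get(result_code // 1000, "unknown")
```

Under that assumption an unrecognized 3011 would be handled as a generic protocol error, which still says nothing about whether the client retries elsewhere.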

>>  
>> In the case where the answer message itself contains errors, any
>>    related session SHOULD be terminated by sending an STR or ASR
>>    message.  The Termination-Cause AVP in the STR MAY be filled with the
>>    appropriate value to indicate the cause of the error.  An application
>>    MAY also send an application-specific request instead of an STR or
>>    ASR message to signal the error in the case where no state is
>>    maintained or to allow for some form of error recovery with the
>>    corresponding Diameter entity.
>> 

Would that guidance still make sense if one or both peers in question are agents?

>> 2) Indicate that a DOIC agent that throttles a request MUST send a 3011 response to 6733+ clients.  For 6733 clients the agent MUST send DIAMETER_TOO_BUSY 3004.  This is not perfect as 3004 says the client should try to send to an alternative peer, but it is as close as we can get.
>> 

I am skeptical of requiring different behavior depending on whether the client supports 6733bis. That would require a version number change, or some other way to signal 6733bis support. If we were to go that route, it might make more sense to make the new code part of DOIC.
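
To spell out what option 2 asks of the agent, here is a minimal sketch. It assumes the agent already knows, per peer, whether the new code is supported; how that gets signaled is exactly the open question, so the peer_supports_throttled flag is hypothetical. 3011 is the value proposed in this thread, not an assigned code.

```python
DIAMETER_TOO_BUSY = 3004           # RFC 6733, Section 7.1.3
DIAMETER_REQUEST_THROTTLED = 3011  # proposed in this thread, NOT an assigned code


def throttle_result_code(peer_supports_throttled: bool) -> int:
    """Pick the result code an agent sends when it throttles a request.

    peer_supports_throttled is a hypothetical per-peer flag; nothing in
    the thread defines how it would be learned.
    """
    if peer_supports_throttled:
        return DIAMETER_REQUEST_THROTTLED
    # Fallback for peers that do not know the new code.
    return DIAMETER_TOO_BUSY
```

The logic is trivial; the cost is in maintaining that flag for every peer.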

>> Neither of these solutions is perfect, and either would come with the strong recommendation that Diameter nodes that support DOIC should also support the above extension.

So the real reason for separating this from DOIC is some sense that nodes might support the new code when they otherwise don't support DOIC. Do we expect that to really happen?

>> 
>> I propose number 1, as it at least reduces the traffic sent toward the overloaded server.
>> 
>> Regards,
>> 
>> Steve
>> 
>> _______________________________________________
>> DiME mailing list
>> 
>> DiME@ietf.org
>> https://www.ietf.org/mailman/listinfo/dime