Re: [Dime] Proposed Example Text for draft-docdt-dime-ovli-01

Ben Campbell <ben@nostrum.com> Tue, 26 November 2013 20:20 UTC

From: Ben Campbell <ben@nostrum.com>
In-Reply-To: <66DEB166-8DEB-42CA-A46E-8128F0D0900B@nostrum.com>
Date: Tue, 26 Nov 2013 14:20:47 -0600
Message-Id: <4CFE9D80-E25A-4B8F-96D1-EB7C21F2F11A@nostrum.com>
References: <66DEB166-8DEB-42CA-A46E-8128F0D0900B@nostrum.com>
To: "dime@ietf.org list" <dime@ietf.org>

[I've received a number of in-person comments on this, but none on the list. Given that I will never remember everything people said to me while traveling, I'd appreciate it if people would repeat their comments on the list. Failing to do so risks me describing your argument incorrectly, resulting in an unintentional strawman.]

That being said, if I understand the comments correctly, there are a few major open questions here:

1) Is this a real problem?

Several people argued to me that, if the agent intelligently load-balances traffic across all the servers, then the servers should become overloaded at roughly the same time. OTOH, certain sessions may have much more traffic than other sessions. If these "big" sessions are distributed proportionately across the servers, load should still be fairly balanced.

IMHO, we cannot assume smart load balancing, as there is no standard for doing it. Our solution should still work for agents that use simple (and perhaps naive) load distribution strategies. But even if we do have super-smart load balancing, things like hardware failures can throw things out of balance. Also, the argument that "big" sessions should still be proportionately distributed only works if you have a large enough sample of "big" sessions to distribute, compared to the number of servers. That may be true most of the time, but I don't think we want a solution that fails if it's not true.
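
To put a rough number on the sample-size point, here is a small sketch. It assumes (purely for illustration; no agent is required to behave this way) that each "big" session lands on a server chosen uniformly at random, and it computes by brute-force enumeration the expected share of big sessions that ends up on the busiest server. The function name is made up.

```python
from itertools import product


def expected_max_share(n_servers: int, n_big: int) -> float:
    """Expected fraction of "big" sessions on the busiest server.

    Assumes each big session is assigned to a server uniformly at random,
    independently of the others (an illustrative assumption only).
    """
    total_max = 0
    n_assignments = 0
    # Enumerate every possible assignment of big sessions to servers.
    for assignment in product(range(n_servers), repeat=n_big):
        counts = [0] * n_servers
        for server in assignment:
            counts[server] += 1
        total_max += max(counts)
        n_assignments += 1
    # Average busiest-server count, expressed as a share of all big sessions.
    return total_max / (n_assignments * n_big)


if __name__ == "__main__":
    for n_big in (1, 3, 9):
        print(n_big, expected_max_share(3, n_big))
```

With three servers, a perfectly even split would put 1/3 of the big sessions on each. With only a handful of big sessions, the busiest server's expected share is well above that, and it approaches 1/3 only as the number of big sessions grows.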

2)  Lionel argued that OLRs are best used to describe overload for the realm as a whole. Overload for specific nodes would be better handled by sending Diameter error codes, either by fixing one or more existing error codes to have the correct semantics, or introducing new ones. If we did this, we would not need different report types.

My opinion is that I would like a consistent way of reporting overload, that is, I don't like using OLRs for one kind of overload, and error result codes for another. In particular, I would like to be able to report overload before reaching the point that I need to fail a transaction, i.e. I would like to be able to report any kind of overload in a Diameter answer message with a result code of DIAMETER_SUCCESS.

3) Are we allowed to put more than one OLR in a single answer, as my example shows in step 8?

It might be possible to construct an example where you never see more than one OLR in a single answer. But I don't see what purpose would be served by such a limitation, as long as multiple OLRs do not contradict each other. Since the reports in the example have different report types, there is no conflict. One disadvantage of _not_ allowing multiple reports in one answer is that, if the servers choose to send reports in every answer, life gets complicated for the agent when trying to find a place to put the "realm" report. It either needs to strip the server reports (which is hard to justify, given that the server overload conditions are best handled by the clients), or it needs to originate its own answers, which means forcing the failure of at least some transactions.
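
As a rough illustration of what "do not contradict each other" could mean in practice, the sketch below treats two reports in the same answer as conflicting only when they share a report type. The class, the function, the "host"/"realm" labels, and the percentages are all hypothetical and not taken from the draft.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class OverloadReport:
    report_type: str           # "host" or "realm" (hypothetical labels)
    reduction_percentage: int  # requested traffic reduction, 0..100


def add_report(reports: List[OverloadReport],
               new: OverloadReport) -> List[OverloadReport]:
    """Return a new report list for one answer, with `new` appended.

    Assumption for this sketch: two reports conflict only if they have the
    same report type, so a host report and a realm report can coexist.
    """
    for existing in reports:
        if existing.report_type == new.report_type:
            raise ValueError(
                f"answer already carries a {new.report_type!r} report")
    return reports + [new]


if __name__ == "__main__":
    # Step 8 of the example: the agent keeps S2's host report and adds
    # its own realm report to the same answer.
    from_server = [OverloadReport("host", 30)]
    answer = add_report(from_server, OverloadReport("realm", 10))
    print(answer)
```

Under that assumption the agent neither strips the server's report nor originates a separate answer; it only appends.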

On Nov 4, 2013, at 6:36 PM, Ben Campbell <ben@nostrum.com> wrote:

> Hi,
> 
> draft-docdt-dime-ovli-01 has a TBD item in appendix C, section 3. Namely,  an example showing a mix of Destination-Realm and Destination-Host routed requests. Here's a straw man proposal for that example. I don't expect this to be the "one true way" to approach the scenario; rather, it's a possible way to do it, and there are likely other ways as well.
> 
> Comments solicited.
> 
> Thanks!
> 
> Ben.
> 
> ----------------------------
> 
> C.3.  Mix of Destination-Realm routed requests and Destination-Host
>       routed requests
> 
>    Diameter allows a client to optionally select the destination server
>    of a request, even if there are agents between the client and the
>    server.  The client does this using the Destination-Host AVP.  In
>    cases where the client does not care if a specific server receives
>    the request, it can omit Destination-Host and route the request using
>    the Destination-Realm and ApplicationId AVPs, effectively letting an
>    agent select the server.
> 
>    Clients commonly send mixtures of Destination-Host and Destination-
>    Realm routed requests.  For example, in an application that uses user
>    sessions, a client typically won't care which server handles a
>    session-initiating request.  But once the session is initiated, the
>    client will send all subsequent requests in that session to the same
>    server.  Therefore it would send the initial request with no
>    Destination-Host AVP.  If it receives a successful answer, the client
>    would copy the Origin-Host value from the answer message into a
>    Destination-Host AVP in each subsequent request in the session.
> 
>    An agent has very limited options in applying overload abatement to
>    requests that contain Destination-Host AVPs.  It typically cannot
>    route the request to a different server than the one identified in
>    Destination-Host.  Its only remaining options are to throttle such
>    requests locally, or to send an overload report back towards the
>    client so the client can throttle the requests.  The second choice is
>    usually more efficient, since it prevents any throttled requests from
>    being sent in the first place, and removes the agent's need to send
>    errors back to the client for each dropped request.
> 
>    On the other hand, an agent has much more leeway to apply overload
>    abatement for requests that do not contain Destination-Host AVPs.  If
>    the agent has multiple servers in its peer table for the given realm
>    and application, it can route such requests to other, less overloaded
>    servers.
> 
>    If the overload severity increases, the agent may reach a point where
>    there is not sufficient capacity across all servers to handle even
> 
> 
> 
> Korhonen, et al.          Expires May 03, 2014                 [Page 35]
> Internet-Draft                    DOIC                      October 2013
> 
> 
>    realm-routed requests.  In this case, the realm itself can be
>    considered overloaded.  The agent may need the client to throttle
>    realm-routed requests in addition to Destination-Host routed
>    requests.  The overload severity may be different for each server,
>    and the severity for the realm is likely to be different than for
>    any specific server.  Therefore, an agent may need to forward, or
>    originate, multiple overload reports with differing ReportType and
>    Reduction-Percentage values.
> 
>    Figure 8 illustrates such a mixed-routing scenario.  In this example,
>    the servers S1, S2, and S3 handle requests for the realm "realm".
>    Any of the three can handle requests that are not part of a user
>    session (i.e. routed by Destination-Realm).  But once a session is
>    established, all requests in that session must go to the same server.
> 
>                   Client     Agent      S1        S2        S3
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |(1) Request (DR:realm)       |         |
>                      |-------->|         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |Agent selects S1   |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |(2) Request (DR:realm)       |
>                      |         |-------->|         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |S1 overloaded, returns OLR
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |(3) Answer (OR:realm,OH:S1,OLR:RT=DH)
>                      |         |<--------|         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |sees OLR,routes DR traffic to S2&S3
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |(4) Answer (OR:realm,OH:S1, OLR:RT=DH) |
>                      |<--------|         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |Client throttles requests with DH:S1   |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |(5) Request (DR:realm)       |         |
>                      |-------->|         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |Agent selects S2   |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |(6) Request (DR:realm)       |
>                      |         |------------------>|         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |S2 is overloaded...
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |(7) Answer (OH:S2, OLR:RT=DH)|
>                      |         |<------------------|         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |Agent sees OLR, realm now overloaded
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |(8) Answer (OR:realm,OH:S2, OLR:RT=DH, OLR: RT=R)
>                      |<--------|         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |Client throttles DH:S1, DH:S2, and DR:realm
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
>                      |         |         |         |         |
> 
> 
>       Figure 8: Mix of Destination-Host and Destination-Realm Routed
>                                  Requests
> 
>    1: The client sends a request with no Destination-Host AVP (that is,
>       a Destination-Realm routed request.)
> 
>    2: The agent follows local policy to select a server from its peer
>       table.  In this case, the agent selects S1 and forwards the
>       request.
> 
>    3: S1 is overloaded.  It sends an answer indicating success, but also
>       includes an overload report.  Since the overload report only
>       applies to S1, the ReportType is "Destination-Host".
> 
>    4: The agent sees the overload report, and records that S1 is
>       overloaded by the value in the Reduction-Percentage AVP.  It
>       begins diverting the indicated percentage of realm-routed traffic
>       from S1 to S2 and S3.  Since it can't divert Destination-Host
>       routed traffic, it forwards the overload report to the client.
>       This effectively delegates the throttling of traffic with
>       Destination-Host:S1 to the client.
> 
>    5: The client sends another Destination-Realm routed request.
> 
>    6: The agent selects S2, and forwards the request.
> 
>    7: It turns out that S2 is also overloaded, perhaps due to all that
>       traffic it took over for S1.  S2 returns a successful answer
>       containing an overload report.  Since this report only applies to
>       S2, the ReportType is "Destination-Host".
> 
>    8: The agent sees that S2 is also overloaded by the value in
>       Reduction-Percentage.  This value is probably different than the
>       value from S1's report.  The agent diverts the remaining traffic
>       to S3 as best as it can, but it calculates that the remaining
>       capacity across all three servers is no longer sufficient to
>       handle all of the realm-routed traffic.  This means the realm
>       itself is overloaded.  The realm's overload percentage is most
>       likely different than that for either S1 or S2.  The agent
>       forwards S2's report back to the client in the Diameter answer.
>       Additionally, the agent generates a new report for the realm of
>       "realm", and inserts that report into the answer.  The client
>       throttles requests with Destination-Host:S1 at one rate, requests
>       with Destination-Host:S2 at another rate, and requests with no
>       Destination-Host AVP at yet a third rate.  (Since S3 has not
>       indicated overload, the client does not throttle requests with
>       Destination-Host:S3.)
> 
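
The client behavior at the end of step 8 in the quoted example can be sketched as a simple lookup. This is only an illustration: the function name, the dictionary layout, and the percentages are assumptions, and the draft does not prescribe any particular data structure.

```python
from typing import Dict, Optional


def reduction_for(dest_host: Optional[str],
                  host_reductions: Dict[str, int],
                  realm_reduction: int) -> int:
    """Pick the reduction percentage that applies to one outgoing request.

    dest_host is the Destination-Host value, or None for a request routed
    by Destination-Realm only.  Hosts with no report are not throttled.
    """
    if dest_host is None:
        # Realm-routed request: governed by the realm report.
        return realm_reduction
    # Host-routed request: governed by that host's report, if any.
    return host_reductions.get(dest_host, 0)


if __name__ == "__main__":
    # Hypothetical state after step 8: S1 and S2 have reported overload,
    # the agent has reported realm overload, and S3 has reported nothing.
    hosts = {"S1": 50, "S2": 30}
    realm = 10
    for dest in ("S1", "S2", "S3", None):
        print(dest, reduction_for(dest, hosts, realm))
```

The point of the sketch is that the three rates are independent: a request with Destination-Host:S3 is not throttled even though the realm report is active.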