Re: [Dime] Review of

Eric McMurry <emcmurry@computer.org> Fri, 26 July 2013 13:10 UTC

From: Eric McMurry <emcmurry@computer.org>
Date: Thu, 25 Jul 2013 23:36:55 +0200
To: Alan DeKok <aland@deployingradius.com>
Cc: "dime@ietf.org" <dime@ietf.org>
Subject: Re: [Dime] Review of

On Jul 25, 2013, at 14:50 , Alan DeKok <aland@deployingradius.com> wrote:

> Eric McMurry wrote:
>> I understand your point about out of band.  That would make sense for several reasons.  However, using the existing protocol is highly desirable since it is already in place and can support this.
> 
> Using the existing protocol is fine.  I'm just wary of overloading a
> link-layer QoS protocol, and turning into a multi-hop routing protocol.
> 
> It would be better (IMHO) to signal routing changes independent from
> link changes.

I'm still a bit lost here.  This signaling piggybacks directly on Diameter messages and doesn't define any link or transport protocols.  It does not signal link changes or routing changes.  Can you point me to the language in the draft that is giving that impression?

> 
>> so having the introduction of the new bit in the higher level behavior section was the source of confusion?
> 
> Yes.  If the first use of "overload metric" assumes you know the
> meaning, it's confusing.  e.g. Section 1.3 says:
> 
>  ... Overload-Metric (which is input to the negotiated
>  load abatement algorithm).
> 
> It might as well be called "foo".  It's an opaque input parameter,
> with no meaning, and no context.
> 
> I find it clearer to lay out the document by introducing a concept,
> defining it, giving a hint as to how it's used, and then later using it
> in a more complex context.
> 
>> yes, Diameter clients are different.  They tend to be sophisticated servers doing a number of functions with Diameter being used for some of their control signaling.  Their other functions usually require the use of other protocols and they may have to consider information in making overload control decisions that servers do not have.  The requirements draft may be helpful for describing the problem domain for this (http://tools.ietf.org/html/draft-ietf-dime-overload-reqs-09)
> 
> I scanned it.  I'll take a deeper look.

Understood.  It was a good review for a scan.  I wasn't complaining, just trying to answer your question.

> 
>> As far as what to do with it, not specifying that was intentional.  What to do with load information is implementation specific.
> 
> Then I'm not sure why this document is useful.  It would seem to be
> useful to describe both the data, and how the data is used.

There is more than one thing going on here.  I was talking about the load information used for proactive mitigation.  The document also describes overload control information, which has more constraints on its usage.

> 
>>> This section is a bit opaque to me.  It's not clear *why* these
>>> calculations are being done.  I presume it's taking a convolution of the
>>> suggested load distribution with the servers current load.  But it's not
>>> clear.
>> 
>> this example just uses the load information to scale SRV weights, so you have the gist of it.  It is non-normative and just an example of how load information can be used.  Perhaps some of the discussed name changing will help illustrate that the load information usage is different than the reactive overload control information.
> 
> I was looking for an English explanation of what was going on.  Right
> now, the calculation looks a lot like:
> 
> A = 5
> B = 6
> C = 10
> 
> D = A * B / C + 2*A
> E = D^A + C
> 
> ??? OK, that's nice.  WHY is this being done?  What does it mean?

Does the text in Section 3.4 on proactive mitigation help with this?
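
For readers following the thread, here is a minimal sketch of what "using load information to scale SRV weights" could look like.  It is not the calculation from the draft: the scaling rule (weight multiplied by the unused fraction of capacity), the function names, and the server values are all assumptions made up for illustration.

```python
# Hypothetical illustration only: NOT the draft's formula.
# Assumption: each server reports a load between 0 and 100 (percent busy),
# and a client scales the DNS SRV weight by the remaining headroom.

def scale_weights(servers):
    """Return load-adjusted weights.

    servers maps a server name to (srv_weight, load_percent).
    """
    return {
        name: weight * (100 - load) / 100
        for name, (weight, load) in servers.items()
    }


def selection_shares(scaled):
    """Convert adjusted weights into the fraction of new requests per server."""
    total = sum(scaled.values())
    if total == 0:
        # Every server is fully loaded; no server gets a share.
        return {name: 0.0 for name in scaled}
    return {name: weight / total for name, weight in scaled.items()}


# Made-up example: server "a" has the larger SRV weight but is half loaded.
servers = {"a": (60, 50), "b": (40, 0)}
scaled = scale_weights(servers)
shares = selection_shares(scaled)
```

In plain English: a server that reports more load has its weight reduced, so a client picking servers in proportion to weight sends it a smaller share of new requests before it ever becomes overloaded.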

> 
>> you're not the only one to think that :-)  we have to pick a default algorithm to use and drop has the nice characteristic of being quite simple.  What actually happens here is up for debate.
> 
> Traffic is bursty.  Let's say a server gets a burst of 100Mbps
> traffic.  It signals to the client "drop traffic by 90%".  By the time
> the message makes it to the client, the burst is over.  Traffic is down
> to 5Mbps.  So the client happily caps it at .5Mbps, which is not what
> the server wanted.
> 
> Signaling a percentage drop is meaningless, because it's taken at a
> time on the server.  The client doesn't know when that measurement was
> done on the server.
> 
> So either you signal "percentage drop AND (servers' load OR server
> time)", or you just signal "limit traffic to X bps".  Both have the same
> effect.  The second is simpler.

This depends entirely on the traffic mix and the number of clients.  In some cases it is true; in others (e.g. less bursty traffic, large numbers of clients, ...), it is not.  I don't disagree that rate-based control would be better in many instances.  It requires more consideration of the types of use cases and topologies that are expected to be typical (similar to what happened in SIP overload, as Janet pointed out).  Of course, either one may not be appropriate in some cases, which is why it's extensible.  Thanks for the vote on rate-based.  I think there will be more discussion on this coming from several folks.
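
The arithmetic in the quoted burst example can be sketched as follows.  This only reproduces the numbers from that example (100 Mbps burst, 90% drop, 5 Mbps afterwards); the function names are hypothetical, and the single-client view is an assumption that ignores the traffic-mix and client-count effects mentioned above.

```python
# Illustration of the quoted example; names are hypothetical.

def apply_percentage_drop(offered_mbps, drop_percent):
    """Traffic the client sends after dropping drop_percent of what it has."""
    return offered_mbps * (100 - drop_percent) / 100


def apply_rate_cap(offered_mbps, cap_mbps):
    """Traffic the client sends when limited to an absolute rate."""
    return min(offered_mbps, cap_mbps)


burst_mbps = 100.0   # load when the server took its measurement
later_mbps = 5.0     # load when the client receives the signal
drop_percent = 90

# What the server meant: cut the burst down to this level.
intended_mbps = apply_percentage_drop(burst_mbps, drop_percent)

# What a percentage signal yields once the burst has already ended.
stale_result_mbps = apply_percentage_drop(later_mbps, drop_percent)

# What an absolute cap at the intended level yields at the same moment.
capped_result_mbps = apply_rate_cap(later_mbps, intended_mbps)
```

With these inputs the server intended a limit of 10 Mbps, the stale percentage drop leaves 0.5 Mbps, and the rate cap lets all 5 Mbps through.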