Re: [OPSAWG] AD review of draft-ietf-opsawg-large-flow-load-balancing (draft response)

Anoop Ghanwani <anoop@alumni.duke.edu> Wed, 16 April 2014 00:55 UTC

In-Reply-To: <534D2030.3020809@cisco.com>
References: <CA+-tSzxDpD2V7Q15Jjgzz2A+d5Gn_92YQ-1_Zvx2AP=s5AWpxA@mail.gmail.com> <534BD465.4090503@cisco.com> <CA+-tSzxh-R_4W+bqy7ATrjVf5cmPx29Oo_371BcrZXDODFTEug@mail.gmail.com> <534D2030.3020809@cisco.com>
Date: Tue, 15 Apr 2014 17:55:49 -0700
Message-ID: <CA+-tSzy4+nswSyj27cKSTKr3jo=sZXpo1Eut894B+=7h_30zsg@mail.gmail.com>
From: Anoop Ghanwani <anoop@alumni.duke.edu>
To: Benoit Claise <bclaise@cisco.com>
Archived-At: http://mailarchive.ietf.org/arch/msg/opsawg/OnguBVRdUeF1uq1ju-RSYO-MUvU
Cc: "opsawg@ietf.org" <opsawg@ietf.org>
Subject: Re: [OPSAWG] AD review of draft-ietf-opsawg-large-flow-load-balancing (draft response)

Benoit,

Please see inline.


On Tue, Apr 15, 2014 at 5:04 AM, Benoit Claise <bclaise@cisco.com> wrote:

>  On 14/04/2014 19:09, Anoop Ghanwani wrote:
>
> Hi Benoit,
>
>  I will work on the editorials shortly and I'm removing those from the
> discussion.   See below:
>
>
> On Mon, Apr 14, 2014 at 5:28 AM, Benoit Claise <bclaise@cisco.com> wrote:
>
>>  Hi Anoop,
>>
>> Thanks for the new draft version.
>> I removed some of the points
>>
>>
>>
>>
>>    On Tue, Feb 18, 2014 at 7:55 AM, Benoit Claise <bclaise@cisco.com> wrote:
>>
>>>  -
>>>
>>>    A number of routers support sampling techniques such as sFlow [sFlow-
>>>    v5, sFlow-LAG], PSAMP [RFC 5475] and NetFlow Sampling [RFC 3954].
>>>    For the purpose of large flow identification, sampling must be
>>>    enabled on all of the egress ports in the router where such
>>>    measurements are desired.
>>>
>>> I don't understand the second sentence.
>>> One way to read this is:  sampling must be *enabled* on all of the
>>> egress ports where such measurements are desired.
>>>     Ok, this is an obvious statement. If the measurements are desired,
>>> enable them
>>>
>>
>>  Yes,
>>
>>
>>>  Or maybe you want to say: *sampling* must be enabled on all of the
>>> egress ports where such measurements are desired.
>>>     This is a false statement: if you have the choice between sampling
>>> and non sampling, use non sampling measurements.
>>> Or maybe you want to say: sampling must be enabled on *all* of the
>>> egress ports where such measurements are desired.
>>>     This is a false statement: if I have ECMP on 2 links, and only one
>>> of them can't do non sampling, then we should not force
>>>     sampling on both links.
>>> You see, I'm confused.
>>>
>>> You miss a couple of key messages:
>>> - if unsampled measurements are available, use those.
>>> - egress means where LAG/ECMP are enabled (this is important for the
>>> paragraph starting with "If egress sampling is not available, ingress
>>> sampling can suffice since the central management entity use")
>>>
>>
>>   We were not intending to discuss a mix of sampling and non-sampling
>> interfaces in the same router, but this is a reasonable point and it will
>> be clarified (i.e. we will state that it's possible to mix sampled and non
>> sampled interfaces as long as the function of large flow
>> detection/identification can be performed).
>>
>> You're still missing the point that unsampled measurements are better than
>> sampled ones.
>>
>
>  We do point this out in Section 4.3.4.
>
> http://tools.ietf.org/html/draft-ietf-opsawg-large-flow-load-balancing-10#section-4.3.4
>  >>>
>
>         As link speeds get higher, sampling rates are typically reduced
>         to keep the number of samples manageable which places a lower
>         bound on the detection time.  With automatic hardware
>         recognition, large flows can be detected in shorter windows on
>         higher link speeds since every packet is accounted for in
>         hardware [NDTM].
>
>  >>>
>
> I've seen that, but why do you equate automatic *hardware* recognition to
> unsampled measurements?
> Whether it's done in hardware or software is orthogonal.
>
>
OK, I think I see the reason for the disconnect.  In the draft we only
talked about automatic hardware recognition and sampling as methods for
large flow recognition.  It seems you're suggesting there's a third way --
unsampled measurements (likely in hardware) but use of software for the
actual recognition of large flows from those measurements?  Can you
confirm?  If so, we can add that to the draft as well.
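
To make sure we are describing the same thing, here is a minimal sketch of
that third approach as I read it: per-flow byte counters that account for
every packet, with software doing the recognition. The function name, the
dictionary layout, and the threshold expressed as a fraction of link
bandwidth are illustrative assumptions, not text from the draft.

```python
def find_large_flows(prev_bytes, curr_bytes, interval_s, link_bps, threshold_fraction):
    """Flag flows whose measured rate meets a bandwidth threshold.

    prev_bytes / curr_bytes: hypothetical snapshots mapping a flow id to its
    cumulative unsampled byte count at the start and end of the interval.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    threshold_bps = link_bps * threshold_fraction
    large = []
    for flow_id, bytes_now in curr_bytes.items():
        # A flow absent from the previous snapshot is treated as starting at 0.
        delta_bytes = bytes_now - prev_bytes.get(flow_id, 0)
        rate_bps = delta_bytes * 8 / interval_s
        if rate_bps >= threshold_bps:
            large.append(flow_id)
    return sorted(large)


# Example: 10 Gb/s link, 10 s observation interval, threshold at 10% of the link.
prev = {"flow-a": 0, "flow-b": 0}
curr = {"flow-a": 1_250_000_000, "flow-b": 1_000_000}
large_flows = find_large_flows(prev, curr, 10, 10_000_000_000, 0.1)
```

Because every packet is counted, the only inaccuracy here comes from how
often the counters are read, not from sampling.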


>
>> Is this what you mean by:
>>
>> It is possible that a router may have line cards that support a
>> sampling technique while other line cards support automatic hardware
>> detection of large flows.
>>
>>  It's not very clear.
>>
>>
>  No, this does not address your point.  This is talking about the case
> where line cards have different capabilities, rather than a line card that
> supports both.
>
>  Since we already have the advantages and disadvantages listed in 4.3.4,
> do you still see a need for explicitly mentioning that automatic hardware
> detection is to be preferred over sampling if both are available?
>
>  We did debate the point about accuracy quite a bit among the authors.
>  The question is -- does that level of accuracy really matter for the large
> flow case?
>
> Maybe not (for the details: http://dl.acm.org/citation.cfm?id=1791959),
> but I don't understand why you want to limit this mechanism to sampling
> only. Simply state that sampled data could be good enough, but that if you
> have unsampled data, you will get better accuracy.
>
>
Thanks for the reference.

>   Since we are dealing with flows that need to consume a certain percent
> of the link bandwidth, sampling, if configured correctly,
>
> And you don't go in the details of "sampling, if configured correctly"...
>

There are suggestions in some of the references (e.g. the DevoFlow paper),
but there are also other references, e.g.
http://www.sflow.org/packetSamplingBasics/index.htm.   This is a general
sampling problem, rather than something that was introduced by this draft.
 If you think it would be useful to add something (or maybe just pointers
to the references), that can be done.
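
As a rough illustration of what "configured correctly" involves, the sketch
below scales a 1-in-N packet sample count up to an estimated flow rate and
gives an approximate 95% relative error on the estimated packet count. The
error bound is the usual binomial approximation for random packet sampling,
which depends only on the number of samples seen for the flow. The function
names and the use of an average packet size are assumptions for illustration
and are not taken from the draft.

```python
import math


def estimate_flow_rate_bps(samples, sampling_n, avg_packet_bytes, window_s):
    """Scale sampled packets for one flow up to an estimated rate in bit/s."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    estimated_packets = samples * sampling_n  # 1-in-N sampling
    return estimated_packets * avg_packet_bytes * 8 / window_s


def relative_error_95(samples):
    """Approximate 95% relative error of the scaled-up packet count."""
    if samples <= 0:
        raise ValueError("need at least one sample")
    return 1.96 * math.sqrt(1.0 / samples)


# Example: 100 samples of a flow in a 1 s window at 1-in-1000 sampling,
# assuming 1000-byte packets on average.
rate_bps = estimate_flow_rate_bps(100, 1000, 1000, 1.0)
error = relative_error_95(100)
```

The practical consequence is that the sampling rate and observation window
have to be chosen so that a flow at the large-flow threshold yields enough
samples for the error to be acceptable.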

Anoop