Re: [bmwg] draft-morton-bmwg-b2b-frame-05

"MORTON, ALFRED C (AL)" <acm@research.att.com> Thu, 04 July 2019 15:47 UTC

From: "MORTON, ALFRED C (AL)" <acm@research.att.com>
To: "Maciek Konstantynowicz (mkonstan)" <mkonstan@cisco.com>
CC: "bmwg@ietf.org" <bmwg@ietf.org>
Thread-Topic: draft-morton-bmwg-b2b-frame-05
Thread-Index: AQHVKoByNbM/rQa8vkyTnQVsC3xjaKa6fRyg
Date: Thu, 04 Jul 2019 15:46:53 +0000
Message-ID: <4D7F4AD313D3FC43A053B309F97543CFA0AA7FC7@njmtexg5.research.att.com>
References: <81391894-FC37-4048-9371-453BCE6E4EF8@cisco.com>
In-Reply-To: <81391894-FC37-4048-9371-453BCE6E4EF8@cisco.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/0sBUwICUdsyeUhJFXbPlw6oZZ2A>
Subject: Re: [bmwg] draft-morton-bmwg-b2b-frame-05
Precedence: list

Hi Maciek,

Thanks very much for your comments and suggestions.
Your support of this draft means much to me, because 
you have demonstrated your benchmarking expertise on 
many occasions. 

I'll respond to your comments below, and the new draft
(adopted by the working group earlier this year) 
will reflect the changes you suggested.

regards,
Al


> -----Original Message-----
> From: Maciek Konstantynowicz (mkonstan) [mailto:mkonstan@cisco.com]
> Sent: Monday, June 24, 2019 7:32 AM
> To: MORTON, ALFRED C (AL) <acm@research.att.com>
> Cc: bmwg@ietf.org
> Subject: Re: draft-morton-bmwg-b2b-frame-05
> 
> Hi Al,
> 
> I finally had time to properly review this draft.
> 
> It is a very well written benchmarking specification that does address
> an important area of measuring ingress packet buffer capacity used for
> adapting packet arrival rate to not-always-steady packet processing rate
> capability in any packet processing system.
> 
> It does take into consideration specifics of NFV systems (i.e.
> vswitches, other VNFs) where this capability is of elevated importance
> based on experience from OPNFV VSPERF benchmarking (backed up by a
> number of measurement references).
> 
> The draft rightly notes that the size of this buffer matters, as it
> enables storing packets during disruptions in the software packet
> processor operation due to some external system interference. This is a
> very important aspect of NFV, and as such I would suggest making it
> more prominent in the draft, instead of burying it at the end of section
> 3, as is the case now.
[acm] 
Thanks for that suggestion. This is one of the 
"learnings along the way", and it has proven to 
be a valuable motivation for this update.
There's a sentence in the introduction now,
explaining why "buffer-size matters".

> 
> Two other general observations:
> 
> - The goal is measuring (ingress) buffer size in front of HeaderProc,
>    per DUT definition in section 3.
>    - This works if there is no other buffering in the system under test.
>    - Suggest to add a paragraph dictating the setup where no egress
>      queue build up is possible.
[acm] 
I touched on this at the end of the scope section, citing 
Jacob and Lucien's RFC 8239, but I have made it a requirement
now.

> 
> - Proposed methodology works for setups where the DUT's (composed here
>    of "Buffer" and "HeaderProc" functions) behaviour can be measured
>    with external tester:
>    - This requires that any "noise" impacting DUT's behaviour is
>      identified and isolated.
>    - Potential sources of "noise":
>      - In-path active components, other than DUT, noted in draft as
>        "Ingress", "Egress".
>      - Operating system environment interrupting DUT operation.
>      - Shared resource(s) access collisions between DUT and some
>        off-path component(s), impacting DUT's behaviour, a.k.a. "noisy
>        neighbour" problem.
[acm] 
Good point, I made this a consideration among the Pre-requisites.
For example, if the links/LAN to the DUT are causing some loss,
that should always be found and fixed *first*.

>    - To deal with this e.g. for NFV DUT, the draft suggests to use
>      enhanced Binary Search with Loss Verification as specified in
>      [TST009], sec. 5.2. Plus repeating the test N times, sec. 5.3.
>      - Agree this is the right way to isolate the DUT behaviour.
[acm] 
Ok.... and this is where I have added references to the 
promising work-in-progress you cited below...
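
As an illustration of the loss-verification idea discussed here, the
following is a minimal sketch, not the procedure from [TST009] or the
draft. It assumes the search runs over burst length, that a trial with
loss is repeated up to max_repeats times, and that a loss-free repeat
marks the earlier loss as transient. The names run_trial, make_dut, and
max_repeats are hypothetical.

```python
from typing import Callable, Set


def search_with_loss_verification(
    run_trial: Callable[[int], int],
    low: int,
    high: int,
    max_repeats: int = 2,
) -> int:
    """Return the longest burst in [low, high] with zero loss, or low - 1.

    run_trial(burst_len) is a hypothetical callback that sends one burst
    and returns the number of lost frames.
    """
    best = low - 1
    while low <= high:
        mid = (low + high) // 2
        lost = run_trial(mid)
        repeats = 0
        # Loss verification: repeat a lossy trial before accepting the loss.
        while lost > 0 and repeats < max_repeats:
            lost = run_trial(mid)
            repeats += 1
        if lost == 0:
            best = mid        # loss-free (possibly after a transient)
            low = mid + 1
        else:
            high = mid - 1    # loss confirmed on every repeat
    return best


def make_dut(capacity: int, transient_calls: Set[int]) -> Callable[[int], int]:
    """Simulated DUT (assumption): loses frames beyond `capacity`, plus one
    spurious lost frame on the trial numbers listed in `transient_calls`."""
    calls = {"n": 0}

    def run_trial(burst_len: int) -> int:
        calls["n"] += 1
        if calls["n"] in transient_calls:
            return 1
        return max(0, burst_len - capacity)

    return run_trial


if __name__ == "__main__":
    with_lv = search_with_loss_verification(make_dut(1000, {1}), 1, 2000)
    without_lv = search_with_loss_verification(
        make_dut(1000, {1}), 1, 2000, max_repeats=0
    )
    print(with_lv, without_lv)
```

In this simulation a single transient loss on the first trial makes the
plain binary search (max_repeats=0) settle below the true burst
capacity, while the verified search recovers it.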

>    - But I am puzzled when it comes to proposed calculation of
>      "Corrected DUT Buffer Time", sec. 5.4.
>      - There "Measured Throughput" is measured as per RFC2544, instead
>        of referring to Binary Search with Loss Verification [TST009], or
>        possibly using MLRsearch [draft-vpolak-mkonstan-bmwg-mlrsearch]
>        to find NDR (non drop rate) and PDR (partial drop rate) and use
>        those to calculate the actual DUT buffer time.
[acm] 
We should talk about how these can apply, because we are searching
for a single loss level (zero), and need to make the changes quickly,
but I have referenced both drafts as informational in Section 5.2
(where BSxLV is introduced and described).
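
To make the role of "Measured Throughput" in this discussion concrete,
here is a minimal sketch based only on the simple Buffer + HeaderProc
model. It assumes the DUT keeps forwarding at the measured throughput
while the burst arrives at the maximum theoretical frame rate, so only
the difference accumulates in the buffer. That assumption and the
function names are illustrative; the exact formula should be taken from
Section 5.4 of the draft itself.

```python
def implied_buffer_time(avg_b2b_frames: float, max_frame_rate: float) -> float:
    """Express the average back-to-back burst length in seconds."""
    if max_frame_rate <= 0:
        raise ValueError("max_frame_rate must be positive")
    return avg_b2b_frames / max_frame_rate


def corrected_buffer_time(
    avg_b2b_frames: float,
    max_frame_rate: float,
    measured_throughput: float,
) -> float:
    """Buffer time after discounting frames forwarded during the burst.

    Assumption of this sketch: frames drain at measured_throughput while
    arriving at max_frame_rate, so the fraction of the burst actually
    held in the buffer is (1 - measured_throughput / max_frame_rate).
    """
    if not 0 <= measured_throughput <= max_frame_rate:
        raise ValueError("throughput must lie in [0, max_frame_rate]")
    implied = implied_buffer_time(avg_b2b_frames, max_frame_rate)
    return implied * (1.0 - measured_throughput / max_frame_rate)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    print(implied_buffer_time(1000, 10_000))
    print(corrected_buffer_time(1000, 10_000, 8_000))
```

Under this model the choice of throughput value (RFC 2544, BSwLV, or an
NDR/PDR result) directly scales the correction, which is why the
question of which search produces it matters.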

>      - Another potential candidate for the "Measured Throughput" is the
>        maximum measured throughput regardless of loss, metric popular
>        with academics dealing with software network processing, also
>        defined in FD.io CSIT as Maximum Receive Rate.
[acm] 
BMWG literature defines this as Maximum Forwarding Rate in RFC 2285,
https://tools.ietf.org/html/rfc2285#page-16
but this rate will be present when there is
lots of frame loss, buffers overloading everywhere in the DUT,
and the painted manufacturer's name peeling off the 
front panel due to the heat generated :-)

The latency of the "surviving packets" says something about
total buffer size, but I don't want to try to unpack that from other
sources of latency in the DUT while delivering Maximum Forwarding Rate. 

> 
> Hope this makes sense.
[acm] It does - all very clear to me!
> 
> Cheers,
> -Maciek
