Re: [bmwg] An Upgrade to Benchmarking Methodology for Network Interconnect Devices -- Fwd: New Version Notification for draft-lencse-bmwg-rfc2544-bis-00.txt

Lencse Gábor <lencse@hit.bme.hu> Mon, 01 June 2020 09:44 UTC

To: "bmwg@ietf.org" <bmwg@ietf.org>
References: <158995996438.13925.2934780472900149847@ietfa.amsl.com> <14002442-9713-d474-8012-bca5dcd6976c@hit.bme.hu> <4D7F4AD313D3FC43A053B309F97543CF0108A5BA22@njmtexg5.research.att.com> <598e85fd-cf9b-1cdd-61c0-3a76623145f9@hit.bme.hu> <4D7F4AD313D3FC43A053B309F97543CF0108A5BC52@njmtexg5.research.att.com>
From: Lencse Gábor <lencse@hit.bme.hu>
Message-ID: <1c81f904-bb24-5f42-2ac4-919913fddf8a@hit.bme.hu>
Date: Mon, 01 Jun 2020 11:43:50 +0200
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.8.1
MIME-Version: 1.0
In-Reply-To: <4D7F4AD313D3FC43A053B309F97543CF0108A5BC52@njmtexg5.research.att.com>
Content-Type: multipart/alternative; boundary="------------FFB54AC89F4DA700B1E95EB8"
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/50qoL0gxTEKGU6CkUwPIf8FO-hc>
Subject: Re: [bmwg] An Upgrade to Benchmarking Methodology for Network Interconnect Devices -- Fwd: New Version Notification for draft-lencse-bmwg-rfc2544-bis-00.txt
List-Id: Benchmarking Methodology Working Group <bmwg.ietf.org>

Dear Al,

Thank you very much for your reply. Please see my answers inline. (I 
keep only those parts of the text that I reply to.)

On 2020.05.23. at 19:40, MORTON, ALFRED C (AL) wrote:
>
>> As for the non-overlapping areas of our draft and the documents you
>> cited, I have not found anything in them about our suggestion for
>> "Improved Throughput and Frame Loss Rate Measurement Procedures using
>> Individual Frame Timeout".
>>
>> If you think this one joins well into your efforts to update RFC 2544,
>> then we could focus on this one first, and deal with some others later
>> one by one (in different documents).
>>
>> What do you think of it?
> [acm]
> Using a constant frame timeout for declaration of Loss (and delay) is
> much like the IPPM WG metrics and methods (see RFC 7679 and RFC 7680)
> for the production network measurements (Tmax). A constant waiting time for
> frames to arrive at the receiver simply excludes frames on an on-going
> basis. RFC 2544 has a waiting time at the end of the trial, where the
> tester must wait for 2 seconds for buffers to clear but the first frame
> sent has the entire trial duration+2sec to arrive.
>

Yes, this is why I feel that some of the frames we count as "received" 
may be completely useless for the applications.

> Perhaps the biggest potential change to the Throughput definition is
> whether or not we demand frame correspondence between sender and receiver,
> so that we can calculate one-way delay. RFC 2544 and ETSI NFV TST009 only
> require equal send and receive frame counts to satisfy the zero loss
> criteria in the Throughput benchmark definition. But in ETSI NFV TST009,
> the Capacity at X% Loss Ratio metric-variant definition begins to infer
> frame correspondence. Later, the definitions of (one-way) Latency and
> Delay Variation and Loss allow a Sample Filter, which could be a constant
> maximum time-out for individual frames, as you suggest. Note that a Sample
> Filter could be applied in post-measurement processing, assuming all the
> delays are available.

I have implemented siitperf-pdv (part of siitperf, available from: 
https://github.com/lencsegabor/siitperf ) so that, depending on the 
value of its last parameter (*frame timeout*), it does the 
post-processing in one of two ways:
- If the value of *frame timeout* is 0, then a proper PDV measurement is 
done.
- If the value of *frame timeout* is higher than zero, then a special 
throughput (or frame loss rate) measurement is performed instead, where 
the tester checks *frame timeout* for each frame individually: if the 
measured delay of a frame is longer than the timeout, then the frame is 
reclassified as lost.
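
To make the second mode concrete, here is a minimal Python sketch of the 
per-frame check. It is only an illustration and not the siitperf-pdv 
source: the function name, the list-based timestamp layout, and the use 
of None for frames that never arrived are my assumptions for this example.

```python
def count_frames_within_timeout(send_ts, recv_ts, frame_timeout):
    """Count the frames that arrived within frame_timeout seconds.

    send_ts: send timestamps in seconds, indexed by frame number
    recv_ts: receive timestamps in seconds; None means never arrived
    (Hypothetical data layout, chosen only for this illustration.)
    """
    received = 0
    for sent, arrived in zip(send_ts, recv_ts):
        if arrived is None:
            continue  # the frame was really lost
        if arrived - sent > frame_timeout:
            continue  # the frame arrived too late: reclassified as lost
        received += 1
    return received


# Four frames: delays of 10 ms and 50 ms, one 150 ms straggler, one lost.
send_ts = [0.000, 0.001, 0.002, 0.003]
recv_ts = [0.010, 0.151, None, 0.053]
on_time = count_frames_within_timeout(send_ts, recv_ts, frame_timeout=0.1)
```

With a 100ms timeout, only two of the four frames count as received: the 
straggler is treated the same way as the frame that never arrived.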

> But the real question in my mind now, after looking into the possibilities
> to take frame correspondence with a fixed timeout into account, is whether
> we can find some examples where adding timeout criteria makes a significant
> difference for Throughput measurements that would not be accounted for by
> revised/modernized Latency and new Delay Variation Benchmarks and
> metric variants?

In other words (as my co-author, Keiichi Shima, expressed it):

"Can defining a per-frame timeout test provide more sense than the delay 
variation test?"

IMHO, we can answer "yes".

Let us consider the following example. A delay-sensitive application can 
tolerate at most 100ms delay and 0.01% frame loss. (The "lost" frames 
also include frames with a delay higher than 100ms.)

Section 7.3.1 of RFC 8219 
(https://tools.ietf.org/html/rfc8219#section-7.3.1) defines PDV as follows:

    PDV = D99.9thPercentile - Dmin

    Where:

    o  D99.9thPercentile = the 99.9th percentile (as described in
       [RFC5481]) of the one-way delay for the stream

    o  Dmin = the minimum one-way delay in the stream


There can be two problems with using PDV:
- As PDV uses the 99.9th percentile, even if we measure PDV as 99ms, it 
gives a guarantee only for 99.9% of the frames: 0.1% of them may have 
higher than 100ms latency, whereas our system tolerates only 0.01% frame 
loss (including frames with higher latency than 100ms).
- Dmin is subtracted from the 99.9th percentile, thus the final result 
of the PDV measurement is not exactly what we need.
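
Both problems can be shown with a small made-up delay distribution. The 
Python sketch below is only an illustration: the delay values are 
invented, and I assume the simple nearest-rank percentile here, which may 
differ in detail from the procedure described in RFC 5481.

```python
import math


def percentile_nearest_rank(sorted_values, p):
    """Nearest-rank percentile (an assumed, simple percentile definition)."""
    rank = math.ceil(p / 100.0 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


def pdv(delays):
    """PDV = D99.9thPercentile - Dmin, following the RFC 8219 formula."""
    ordered = sorted(delays)
    return percentile_nearest_rank(ordered, 99.9) - ordered[0]


def late_ratio(delays, timeout):
    """Fraction of frames whose delay is higher than the timeout."""
    return sum(1 for d in delays if d > timeout) / len(delays)


# Invented sample of 100,000 one-way delays in milliseconds:
# one frame at 10 ms, 99,949 frames at 60 ms, 50 frames at 150 ms.
delays_ms = [10] + [60] * 99_949 + [150] * 50

measured_pdv = pdv(delays_ms)              # 60 - 10 = 50 (ms)
too_late = late_ratio(delays_ms, 100)      # 50 / 100,000 = 0.05%
```

Here PDV is 50ms, which looks comfortable, yet 0.05% of the frames are 
later than 100ms, five times the 0.01% that the application tolerates. 
Also, the 50ms result hides the fact that the typical delay is 60ms, 
because Dmin has been subtracted.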

IMHO, our suggested method can provide a better solution:
- The *frame timeout* parameter of siitperf-pdv should be set to 100ms.
- The bash shell script that performs the binary search (and executes 
siitperf-pdv in every single step) should allow 0.01% frame loss.
Thus the measurement can be easily performed.
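
The search logic can be sketched as follows. This is a Python 
illustration rather than the actual bash script: run_trial is a 
hypothetical callback standing in for one execution of the tester, and 
fake_trial is an invented device model (it forwards at most 500,000 
frames per second on time during an assumed 60 s trial) that is there 
only to make the example runnable.

```python
def binary_search_rate(run_trial, lo, hi, max_loss_ratio):
    """Find the highest rate in [lo, hi] whose loss ratio is acceptable.

    run_trial(rate) must return (sent, received), where 'received'
    counts only the frames that arrived within the per-frame timeout.
    Assumes that the loss ratio does not decrease as the rate grows.
    """
    best = 0
    while lo <= hi:
        rate = (lo + hi) // 2
        sent, received = run_trial(rate)
        if sent - received <= max_loss_ratio * sent:
            best = rate      # the trial passed: try a higher rate
            lo = rate + 1
        else:
            hi = rate - 1    # the trial failed: try a lower rate
    return best


def fake_trial(rate, duration=60, capacity=500_000):
    """Invented DUT model: frames above 'capacity' fps are late or lost."""
    return rate * duration, min(rate, capacity) * duration


with_tolerance = binary_search_rate(fake_trial, 1, 1_000_000, 0.0001)
zero_loss = binary_search_rate(fake_trial, 1, 1_000_000, 0.0)
```

With this model, the search with 0.01% allowed loss ends slightly above 
the zero-loss result, as expected, because a few late frames per trial 
are tolerated.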

What do you think of it?

Best regards,

Gábor