Re: [bmwg] WG Last Call: draft-ietf-bmwg-ngfw-performance-05 Tue, 12 January 2021 15:36 UTC

From: <>
To: <>, "'MORTON, ALFRED C \(AL\)'" <>
Cc: "'Bala Balarajah'" <>, "'Carsten Rossenhoevel'" <>
References: <> <> <027801d6da0e$3ac20740$b04615c0$>
In-Reply-To: <027801d6da0e$3ac20740$b04615c0$>
Date: Tue, 12 Jan 2021 10:36:35 -0500
Message-ID: <018201d6e8f8$b69e2800$23da7800$>
Subject: Re: [bmwg] WG Last Call: draft-ietf-bmwg-ngfw-performance-05
List-Id: Benchmarking Methodology Working Group <>

Al et al,

Given that there have been no further comments we will update the draft and
post it for, hopefully, final review.


-----Original Message-----
From: <> 
Sent: December 24, 2020 11:03 AM
To: 'Vratko Polak -X (vrpolak - PANTHEON TECH SRO at Cisco)'
Cc: 'MORTON, ALFRED C (AL)' <>; 'Bala Balarajah' <>;
'Carsten Rossenhoevel' <>
Subject: RE: [bmwg] WG Last Call: draft-ietf-bmwg-ngfw-performance-05

Vratko et al,

See comments from the authors inline below - preceded by [authors].


-----Original Message-----
From: bmwg <> On Behalf Of Vratko Polak -X (vrpolak -
Sent: December 23, 2020 10:31 AM
Subject: Re: [bmwg] WG Last Call: draft-ietf-bmwg-ngfw-performance-05

> Please read and express your opinion on whether or not this 
> Internet-Draft should be forwarded to the Area Directors for 
> publication as an Informational RFC.

The current draft is a large document, and I will have multiple comments.
I expect some of them will be addressed by creating -06 version, so my
opinion is -05 should not be forwarded for publication.

> Send your comments to this list or to co-chairs at 

The issue is, I do not have all the comments ready yet.
In general, I need to spend some effort when turning my nebulous ideas into
coherent sentences (mostly because only when writing the sentences I realize
the topic is even more complicated than I thought at first).

Also, specifically for BMWG, I want my comments to be more complete than
Not just "I do not like/understand this sentence", but give a new sentence
and a short explanation why the new sentence is better.
I have two reasons for aiming for high quality comments.
First, I imagine many people are reading this list.
That means, if I write a lazy superficial comment, I save my time, but
readers will spend more time trying to reconstruct my meaning.
(Similar to how in software development, code is written once but read many
times.) The second reason is the high latency on this mailing list.
Usually, by the time the author reacts to the comments, the reviewer has
switched their attention to other tasks, so it is better when the first
comment does not need any subsequent clarifications from the reviewer.

> allow for holidays and other competing topics

I reserved some time before holidays, originally for improving MLRsearch,
but NGFW is closer to publishing so it takes precedence.

My plan is to start by giving a few low-quality comments, mainly to hint
at which areas I want to see improved.
After holidays, I will write higher quality comments, one e-mail per area.
This e-mail contains the low-quality comments (in decreasing order of

1. Test Bed Considerations. Useful, but maybe should be expanded into a
separate draft.
(Mainly expanding on "testbed reference pre-tests", and what to do if they
fail but we still want some results.)

[authors] The section "Test Bed Considerations" only gives a recommendation
(even though we have not used the capitalized keyword "RECOMMENDED"). The
section explains why the pre-test is important and outlines what it
involves. Test labs or any other user can decide for themselves whether the
pre-test is needed for their test. However, based on our discussions with
test labs, they usually perform such a pre-test. In our opinion, we should
keep this section in the draft. It just creates an awareness of pre-test to the

2. Sentence with "safety margin of 10%". Unclear.
If you want to add or subtract, name both the quantity before and after the
operation, so in later references it is clear which quantity is referenced.
Also, why 10% and not something else (e.g. 5%)?

[authors] You are right. Either we need to change the wording or remove the
whole sentence. We suggest removing it.

3. Is it "test bed" or "testbed"?
I assume it means "SUT" plus "test equipment" together, but it should be

[authors] According to the Oxford and Cambridge dictionaries, it should be
"test bed". We will resolve the inconsistency in the next version. A test
bed should also include test equipment. We will describe this in the next
version.

4. Sustain phase follows after ramp-up phase immediately, without any pause,
right? Then there is in-flight traffic at sustain phase start and end,
making it hard to get precise counters.

[authors] We don't think we can add a pause between the ramp-up and sustain
phases. Since measurements are taken every 2 seconds and the sustain phase
lasts 300 s in total, we don't think the in-flight traffic will affect the
accuracy of the results. However, we have two suggestions here:
1. Ask test tool vendors whether there is any way to add a pause between the
two phases.
2. Describe in the draft that the measurement should occur between X sec
(e.g. 2 sec) after ramp-up begins and X sec before ramp-up ends.
If it doesn't appear to be possible to build in a pause, we would go with
option 2.
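
Option 2 could look roughly like the following Python sketch. It assumes
the guard interval X is trimmed from both ends of the sustain phase (the
wording above says "ramp-up", so this reading is an assumption). The
function name, the 60 s ramp-up, and the sample values are hypothetical and
are not taken from the draft.

```python
def trim_sustain_samples(samples, sustain_start, sustain_end, guard=2.0):
    """Keep (timestamp, value) samples that fall inside the sustain phase,
    excluding `guard` seconds at each end (assumed reading of option 2)."""
    lo = sustain_start + guard
    hi = sustain_end - guard
    if lo > hi:
        raise ValueError("guard interval is longer than half the sustain phase")
    return [(t, v) for (t, v) in samples if lo <= t <= hi]


# Hypothetical run: ramp-up ends at t=60 s, sustain lasts 300 s,
# and one sample is taken every 2 s.
samples = [(float(t), 1000.0) for t in range(0, 421, 2)]
kept = trim_sustain_samples(samples, sustain_start=60.0, sustain_end=360.0)
```

With a 2 s guard, only the samples nearest the phase boundaries are
dropped, which is a small share of a 300 s sustain phase.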

5. Validation criteria. The draft contains terms "target throughput" and
"initial throughput", but also phrases like "the maximum and average
achievable throughput within the validation criteria".
I am not even sure if validation criteria apply to a trial (e.g. telemetry
suggests test equipment behavior was not stable enough) or a whole search
(e.g. maximum achievable throughput is below acceptance threshold).

[authors] Section 6.1 describes the average throughput. Due to the
behavior of stateful traffic (TCP) and of the test tools, getting a
100% linear (stable) throughput is not easy. There will always be continuous
minor spikes. That is why we chose to measure the average values.
We will remove the wording "maximum ..." in the next version. Also, we will
clarify that throughput always means average throughput. For example,
"target throughput" means "average target throughput".
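
To illustrate why the average is less sensitive to minor spikes than the
maximum, here is a small Python sketch. The sample values and the function
name are hypothetical; the draft does not prescribe this code.

```python
from statistics import mean


def average_throughput(samples_bps):
    """Arithmetic mean of per-interval throughput samples (bits per second)."""
    if not samples_bps:
        raise ValueError("no samples collected during the sustain phase")
    return mean(samples_bps)


# Hypothetical per-interval samples with one short spike.
samples = [9.8e9, 10.1e9, 9.9e9, 10.4e9, 9.8e9]
avg = average_throughput(samples)
peak = max(samples)
```

Here the single spike sets the maximum, while the average stays close to
the level the SUT actually sustained.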

6. It seems the same word "throughput" is used to mean different quantities
depending on context.
Close examination suggests it probably means forwarding rate [0] except the
offered load [1] is not given explicitly (and maybe is not even constant).
When I see "throughput" I think [2] (max offered load with no loss), which
does not work as generally the draft allows some loss.
Also, some terms (e.g. "http throughput") do not refer to packets, but other

[authors] The throughput measurement defined in [2] doesn't fit L7
stateful traffic. For example, TCP retransmissions are not always caused by
packet loss. Due to the test complexity and test tool behavior, we have to
allow some transaction failures. Therefore, we needed a different
definition for the KPI throughput. Section 6.1 describes that the KPI
measures the average Layer 2 throughput. But you are right; the term "http
throughput" can be considered as L7 throughput or goodput. We will work on
this in the next draft.

7. SUT state affecting performance. The draft does not mention any, so I
think it assumes "stateless" SUT.
An example of "stateful" SUT is NAT, where opening sessions has lower
performance than forwarding on already opened sessions.
Or maybe it is assumed any such state enters a stationary state during
ramp-up, so in sustain phase the performance is stable (e.g. NAT sessions
may be timing out, but at a stable rate).

[authors] The SUT MUST be stateful, and it must perform stateful inspection.
That does not mean the SUT must do NAT when it is in stateful mode. NAT is
just another feature that may or may not be enabled, depending on the
customer scenario.
The traffic profile has a limited number of transactions (e.g. 10 for the
throughput test) per TCP connection, and the session is closed once the
transactions are completed. The SUT then removes the session entries from
its session table. This means new stateful sessions are always being opened
and established during the sustain period as well. Apart from this, we can
consider whether we want to add NAT as an optional feature in the feature
table (table 2).
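
As a rough illustration of the session churn described above: if every TCP
connection carries a fixed number of transactions and closes immediately
afterwards, the steady-state rate of new connections is the transaction
rate divided by that number. The Python sketch below makes exactly that
assumption; the function name and the 50,000 transactions/s figure are
hypothetical.

```python
def new_connections_per_second(transactions_per_second,
                               transactions_per_connection=10):
    """Steady-state connection setup rate, assuming each connection closes
    as soon as its fixed number of transactions has completed."""
    if transactions_per_connection <= 0:
        raise ValueError("transactions_per_connection must be positive")
    return transactions_per_second / transactions_per_connection


# Hypothetical load: 50,000 transactions/s with 10 transactions per connection.
setup_rate = new_connections_per_second(50_000)
```

Under these assumptions the SUT keeps creating and removing session table
entries throughout the sustain phase, not only during ramp-up.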

8. Stateless or stateful traffic generation. Here stateless means
predetermined packets are sent at predetermined times.
Stateful means time or content of next-to-send packet depends on time or
content of previously received packets.
Draft section 7.1 looks like stateless traffic to me (think IMIX [3]), while
others look like stateful (you cannot count http transaction rate from lossy
stateless traffic).
In general, stateful traffic is more resource intensive for test equipment,
so it is harder to achieve high enough offered load.
Also, stateful traffic generation is more sensitive to packet loss and
latency of SUT.
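
The distinction above can be sketched in Python as two ways of computing
send times. This is a simplified model under stated assumptions: the
stateful sender issues one request at a time and waits for each response,
there is no loss, and both function names are hypothetical.

```python
def stateless_send_times(rate_pps, count):
    """Predetermined schedule: packet i leaves at i / rate, whatever the
    SUT does with earlier packets."""
    return [i / rate_pps for i in range(count)]


def stateful_send_times(count, response_delay_s):
    """Each request is sent only after the previous response has arrived,
    so the schedule depends on the delay seen through the SUT."""
    times = []
    now = 0.0
    for _ in range(count):
        times.append(now)
        now += response_delay_s
    return times


fixed = stateless_send_times(rate_pps=1000, count=4)
fast_sut = stateful_send_times(count=4, response_delay_s=0.25)
slow_sut = stateful_send_times(count=4, response_delay_s=0.5)
```

In this model, doubling the response delay halves the load the stateful
sender offers, while the stateless schedule is unchanged.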

[authors] This is not IMIX [3]. IMIX [3] defines a mix based on variable
packet sizes. Here in the draft, we define the traffic mix based on
different applications and their object sizes. For example, an application
mix can be HTTPS, HTTPS, DNS (UDP), VOIP (TCP and UDP), etc. In this example
we have a mix of stateful and stateless traffic, and each application has
different object sizes. One object can consist of multiple packets with
different sizes. The packet sizes depend on multiple factors, namely TCP
behavior, MTU size, and total object size.
Note: Stateful traffic generators MUST be used for all benchmarking tests
and we used/are using stateful traffic generators for the NSO certification
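
To show how one object turns into packets of different sizes, here is a
minimal Python sketch. It assumes IPv4 and TCP headers without options
(20 + 20 bytes) and ignores retransmissions, TLS overhead, and any other
TCP behavior; the function name and the 4000-byte object are hypothetical.

```python
def segment_sizes(object_bytes, mtu=1500, header_bytes=40):
    """Split an object into TCP payload sizes: full-MSS segments followed
    by one smaller segment carrying the remainder, if any."""
    mss = mtu - header_bytes
    if mss <= 0:
        raise ValueError("MTU must be larger than the header overhead")
    full, rest = divmod(object_bytes, mss)
    return [mss] * full + ([rest] if rest else [])


# Hypothetical 4000-byte object on a 1500-byte MTU path.
sizes = segment_sizes(4000)
```

Even in this simplified model a single object yields packets of more than
one size, which is why the mix is defined per object and not per packet.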



-----Original Message-----
From: bmwg <> On Behalf Of MORTON, ALFRED C (AL)
Sent: Friday, 2020-December-18 19:16
Subject: [bmwg] WG Last Call: draft-ietf-bmwg-ngfw-performance-05


We will start a WG Last Call for

Benchmarking Methodology for Network Security Device Performance

The WGLC will close on 22 January, 2021, allow for holidays and other
competing topics (IOW, plenty of time!)

Please read and express your opinion on whether or not this Internet-Draft
should be forwarded to the Area Directors for publication as an
Informational RFC.  Send your comments to this list or to co-chairs at

for the co-chairs,

bmwg mailing list