[Iot-directorate] [iotdir] telechat Review for draft-ietf-bmwg-ngfw-performance-13

Date: Sun, 30 Jan 2022 13:16:29 +0100
From: tte@cs.fau.de
To: iot-directorate@ietf.org, evyncke@cisco.com
Cc: draft-ietf-bmwg-ngfw-performance.all@ietf.org, mariainesrobles@googlemail.com, bmwg@ietf.org

Reviewer: Toerless Eckert
Review result: On the right track

Summary:
Thanks a lot for this work. It's an immensely complex and important problem to
tackle. In my time I have only measured router traffic performance, and that was
already an infinite matrix. This looks to me like a problem some order of infinity bigger.

Meaning: however nitpicky my review's feedback about the document may be,
I think that the document in its existing form is already a great advancement in measuring
performance for these security devices, and when in doubt it should be progressed faster rather
than slower, especially because in my (limited) understanding of the market, many security device
vendors will only provide actual feedback once it is an RFC (that community is, I think, overall
more conservative in adopting IETF work, with most not proactively engaging during the draft stage).

But of course: feel free to improve the document with any of the feedback/suggestions
in my review that you feel are useful.

Maybe at a high level, I would most importantly suggest adding more explanations, especially in
an appropriate section about those aspects known NOT to be considered (but potentially
important), so that the applicability of the tests described is better put into
perspective by adopters of the draft for their real-world situations.

Favorite pet topic: Add a requirement to measure the DUT through a power meter and report its
power consumption, so we can start making sure products with lower power consumption see sales
benefits when numbers from this document are reported (see details inline).
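
To make this concrete, a strawman sketch of such an efficiency KPI (Python; the
function name and the numbers are purely illustrative, not from the draft):

    def mbps_per_watt(throughput_mbps: float, avg_power_watts: float) -> float:
        # Hypothetical KPI: measured throughput per watt of power drawn,
        # averaged over the sustain phase of a benchmark run.
        return throughput_mbps / avg_power_watts

    print(mbps_per_watt(10_000, 250))  # 40.0 Mbit/s per watt for a 10 Gbit/s DUT at 250 W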

Formal:
I chose to keep the whole document inline to make it easier for readers to vet
my comments without having to open a copy of the whole document in parallel.

The rest is inline - the email ends with the string EOF (I have seen some email truncation happen).

Thanks!
    Toerless

---
Please fix the following nits - from https://www.ietf.org/tools/idnits
idnits 2.17.00 (12 Aug 2021)

> /tmp/idnits29639/draft-ietf-bmwg-ngfw-performance-13.txt:
> ... 
> 
>   Checking nits according to https://www.ietf.org/id-info/checklist :
>   ----------------------------------------------------------------------------
> 
>   ** The abstract seems to contain references ([RFC3511]), which it
>      shouldn't.  Please replace those with straight textual mentions of the
>      documents in question.
> 
>   == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
>      in the document.  If these are example addresses, they should be changed.
> 
>   == There are 1 instance of lines with non-RFC3849-compliant IPv6 addresses
>      in the document.  If these are example addresses, they should be changed.
> 
>   -- The draft header indicates that this document obsoletes RFC3511, but the
>      abstract doesn't seem to directly say this.  It does mention RFC3511
>      though, so this could be OK.
> 
> 
>   Miscellaneous warnings:
>   ----------------------------------------------------------------------------
> 
>   == The document seems to lack the recommended RFC 2119 boilerplate, even if
>      it appears to use RFC 2119 keywords. 
> 
>      (The document does seem to have the reference to RFC 2119 which the
>      ID-Checklist requires).
> 
> 

The lines in the following commented copy of the document are from idnits too

When a comment/question is preceded by "Nit:", it indicates that it seems to me
the best answer would be modified draft text.

When a comment/question is preceded by "Q:", I am actually not so sure what the
outcome could be, so an answer by mail would be a start.

2	Benchmarking Methodology Working Group                      B. Balarajah
3	Internet-Draft
4	Obsoletes: 3511 (if approved)                            C. Rossenhoevel
5	Intended status: Informational                                  EANTC AG
6	Expires: 16 July 2022                                         B. Monkman
7	                                                              NetSecOPEN
8	                                                            January 2022

10	    Benchmarking Methodology for Network Security Device Performance
11	                  draft-ietf-bmwg-ngfw-performance-13

13	Abstract

15	   This document provides benchmarking terminology and methodology for
16	   next-generation network security devices including next-generation
17	   firewalls (NGFW), next-generation intrusion prevention systems
18	   (NGIPS), and unified threat management (UTM) implementations.  The

Nit: Why does it have to be next-generation for all example device types
except UTMs, and what does next-generation mean?
I would suggest rewriting the text so the reader does not ask herself these
questions.

18	   (NGIPS), and unified threat management (UTM) implementations.  The
19	   main areas covered in this document are test terminology, test
20	   configuration parameters, and benchmarking methodology for NGFW and
21	   NGIPS.  This document aims to improve the applicability,

I don't live and breathe the security device TLA space, but I start to
suspect a UTM is some platform on which FW and IPS could run as software
modules, and because it's only software you assume the UTM does not have
to be next-gen? I wonder how much of this guesswork/thought process you
want the reader to have, or whether you want to avoid that by being somewhat
clearer...

21	   NGIPS.  This document aims to improve the applicability,
22	   reproducibility, and transparency of benchmarks and to align the test
23	   methodology with today's increasingly complex layer 7 security
24	   centric network application use cases.  As a result, this document
25	   makes [RFC3511] obsolete.

[minor] I kind of wonder if / how obsoleting RFC3511 could/should work.
I understand it when we do a bis of a standard protocol and really don't
want anyone to implement the older version. But unless there is a
similar IETF mandate going along with this draft that says
non-NG FW and non-NG IPS are hereby obsoleted by the IETF, I cannot
see how this draft can obsolete RFC3511, because it simply applies
to a different type of benchmarked entity. And RFC3511 would stay
on forever for whatever we call non-NG.

[minor] At least I think that is the case, unless this document actually does apply
to non-NG FW/IPS as well and can therefore supersede RFC3511 and actually obsolete it. But the
text so far says the opposite.

[major] I observe that RFC3511 asks to measure and report goodput (5.6.5.2), whereas this document
does not mention the term, and the loss in performance of client/server TCP
or QUIC connections through DUT behavior (such as proxying) is at best
covered indirectly by mentioning parameters such as a less-than-5% reduction in
throughput. If this document is superseding RFC3511, I think it should have a very
explicit section discussing goodput - and maybe expanding on it.

Consider, for example, the impact on TCP connection throughput and goodput.
Very likely a DUT proxying TCP connections will have quite a different performance/goodput
impact for a classical web page vs. video streaming. Therefore I am also worried
about sending only average bitrates per session, as opposed to some sessions going
up to e.g. 500Mbps for a video streaming connection (the best commercially available
UHD video streaming today). Those types of sessions might incur a lot of goodput loss
with bad DUTs, but if I understand the test profiles correctly, their per-TCP-connection
throughput will be much less than 100Mbps. If such a range of client session
bitrates is not meant to be tested, it might at least be useful to add a section listing
candidate gaps like this. Another one, for example, is the impact of higher RTT, especially
between DUT and server in the Internet. This mostly challenges TCP window size
operation on DUTs operating as TCP hosts, and also their ability to buffer for retransmissions.
Test equipment IMHO may/should be able to emulate such long RTTs, but this is not included
in this document (RTT is not mentioned).

Besides goodput-related issues, there are a couple of other points in this review that may be too
difficult to fix this late in the development of the document; for any of those
considered to be useful input, maybe add them to a section "out-of-scope (for future versions)
considerations" or the like to capture them.

27	Status of This Memo

29	   This Internet-Draft is submitted in full conformance with the
30	   provisions of BCP 78 and BCP 79.

32	   Internet-Drafts are working documents of the Internet Engineering
33	   Task Force (IETF).  Note that other groups may also distribute
34	   working documents as Internet-Drafts.  The list of current Internet-
35	   Drafts is at https://datatracker.ietf.org/drafts/current/.

37	   Internet-Drafts are draft documents valid for a maximum of six months
38	   and may be updated, replaced, or obsoleted by other documents at any
39	   time.  It is inappropriate to use Internet-Drafts as reference
40	   material or to cite them other than as "work in progress."

42	   This Internet-Draft will expire on 5 July 2022.

44	Copyright Notice

46	   Copyright (c) 2022 IETF Trust and the persons identified as the
47	   document authors.  All rights reserved.

49	   This document is subject to BCP 78 and the IETF Trust's Legal
50	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
51	   license-info) in effect on the date of publication of this document.
52	   Please review these documents carefully, as they describe your rights
53	   and restrictions with respect to this document.  Code Components
54	   extracted from this document must include Revised BSD License text as
55	   described in Section 4.e of the Trust Legal Provisions and are
56	   provided without warranty as described in the Revised BSD License.

58	Table of Contents

60	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
61	   2.  Requirements  . . . . . . . . . . . . . . . . . . . . . . . .   4
62	   3.  Scope . . . . . . . . . . . . . . . . . . . . . . . . . . . .   4
63	   4.  Test Setup  . . . . . . . . . . . . . . . . . . . . . . . . .   4
64	     4.1.  Testbed Configuration . . . . . . . . . . . . . . . . . .   5
65	     4.2.  DUT/SUT Configuration . . . . . . . . . . . . . . . . . .   6
66	       4.2.1.  Security Effectiveness Configuration  . . . . . . . .  12
67	     4.3.  Test Equipment Configuration  . . . . . . . . . . . . . .  12
68	       4.3.1.  Client Configuration  . . . . . . . . . . . . . . . .  12
69	       4.3.2.  Backend Server Configuration  . . . . . . . . . . . .  15
70	       4.3.3.  Traffic Flow Definition . . . . . . . . . . . . . . .  17
71	       4.3.4.  Traffic Load Profile  . . . . . . . . . . . . . . . .  17
72	   5.  Testbed Considerations  . . . . . . . . . . . . . . . . . . .  18
73	   6.  Reporting . . . . . . . . . . . . . . . . . . . . . . . . . .  19
74	     6.1.  Introduction  . . . . . . . . . . . . . . . . . . . . . .  19
75	     6.2.  Detailed Test Results . . . . . . . . . . . . . . . . . .  21
76	     6.3.  Benchmarks and Key Performance Indicators . . . . . . . .  21
77	   7.  Benchmarking Tests  . . . . . . . . . . . . . . . . . . . . .  23
78	     7.1.  Throughput Performance with Application Traffic Mix . . .  23
79	       7.1.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  23
80	       7.1.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  23
81	       7.1.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  23
82	       7.1.4.  Test Procedures and Expected Results  . . . . . . . .  25
83	     7.2.  TCP/HTTP Connections Per Second . . . . . . . . . . . . .  26
84	       7.2.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  26
85	       7.2.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  27
86	       7.2.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  27
87	       7.2.4.  Test Procedures and Expected Results  . . . . . . . .  28
88	     7.3.  HTTP Throughput . . . . . . . . . . . . . . . . . . . . .  30
89	       7.3.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  30
90	       7.3.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  30
91	       7.3.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  30
92	       7.3.4.  Test Procedures and Expected Results  . . . . . . . .  32
93	     7.4.  HTTP Transaction Latency  . . . . . . . . . . . . . . . .  33
94	       7.4.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  33
95	       7.4.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  33
96	       7.4.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  34
97	       7.4.4.  Test Procedures and Expected Results  . . . . . . . .  35
98	     7.5.  Concurrent TCP/HTTP Connection Capacity . . . . . . . . .  36
99	       7.5.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  36
100	       7.5.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  36
101	       7.5.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  37
102	       7.5.4.  Test Procedures and Expected Results  . . . . . . . .  38
103	     7.6.  TCP/HTTPS Connections per Second  . . . . . . . . . . . .  39
104	       7.6.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  40
105	       7.6.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  40
106	       7.6.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  40
107	       7.6.4.  Test Procedures and Expected Results  . . . . . . . .  42
108	     7.7.  HTTPS Throughput  . . . . . . . . . . . . . . . . . . . .  43
109	       7.7.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  43
110	       7.7.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  43
111	       7.7.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  43
112	       7.7.4.  Test Procedures and Expected Results  . . . . . . . .  45
113	     7.8.  HTTPS Transaction Latency . . . . . . . . . . . . . . . .  46
114	       7.8.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  46
115	       7.8.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  46
116	       7.8.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  46
117	       7.8.4.  Test Procedures and Expected Results  . . . . . . . .  48
118	     7.9.  Concurrent TCP/HTTPS Connection Capacity  . . . . . . . .  49
119	       7.9.1.  Objective . . . . . . . . . . . . . . . . . . . . . .  49
120	       7.9.2.  Test Setup  . . . . . . . . . . . . . . . . . . . . .  49
121	       7.9.3.  Test Parameters . . . . . . . . . . . . . . . . . . .  49
122	       7.9.4.  Test Procedures and Expected Results  . . . . . . . .  51
123	   8.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .  52
124	   9.  Security Considerations . . . . . . . . . . . . . . . . . . .  53
125	   10. Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  53
126	   11. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  53
127	   12. References  . . . . . . . . . . . . . . . . . . . . . . . . .  53
128	     12.1.  Normative References . . . . . . . . . . . . . . . . . .  53
129	     12.2.  Informative References . . . . . . . . . . . . . . . . .  53
130	   Appendix A.  Test Methodology - Security Effectiveness
131	           Evaluation  . . . . . . . . . . . . . . . . . . . . . . .  54
132	     A.1.  Test Objective  . . . . . . . . . . . . . . . . . . . . .  55
133	     A.2.  Testbed Setup . . . . . . . . . . . . . . . . . . . . . .  55
134	     A.3.  Test Parameters . . . . . . . . . . . . . . . . . . . . .  55
135	       A.3.1.  DUT/SUT Configuration Parameters  . . . . . . . . . .  55
136	       A.3.2.  Test Equipment Configuration Parameters . . . . . . .  55
137	     A.4.  Test Results Validation Criteria  . . . . . . . . . . . .  56
138	     A.5.  Measurement . . . . . . . . . . . . . . . . . . . . . . .  56
139	     A.6.  Test Procedures and Expected Results  . . . . . . . . . .  57
140	       A.6.1.  Step 1: Background Traffic  . . . . . . . . . . . . .  57
141	       A.6.2.  Step 2: CVE Emulation . . . . . . . . . . . . . . . .  58
142	   Appendix B.  DUT/SUT Classification . . . . . . . . . . . . . . .  58
143	   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .  58

145	1.  Introduction

147	   18 years have passed since IETF recommended test methodology and
148	   terminology for firewalls initially ([RFC3511]).  The requirements
149	   for network security element performance and effectiveness have
150	   increased tremendously since then.  In the eighteen years since

[nit] What is a network security element? Please provide a reference or define the term.
If we are talking about them in this document, why are they not mentioned in the
abstract?

150	   increased tremendously since then.  In the eighteen years since
151	   [RFC3511] was published, recommending test methodology and
152	   terminology for firewalls, requirements and expectations for network
153	   security elements has increased tremendously.  Security function

[nit] This does not parse as correct English to me: "recommending test methodology ...
has increased tremendously". It would if you meant that more and more
test methodologies were recommended, but not if there is an outstanding
need to do so (which this document intends to fill).

[nit] Why does the recommending part apply only to firewalls, and the requirements
and expectations only to security elements?

153	   security elements has increased tremendously.  Security function

[nit] What is a security function? (I know, but I don't know if the reader is
supposed to know.) Aka: provide a reference, add a terminology section, or define it.
Maybe it is easiest to restructure this intro paragraph to start with the
explanation of the evolution from firewalls to network security elements
which support one or more security functions including firewall, intrusion
detection, etc. - and then conclude how this requires this document to
define all the good BMWG stuff it hopefully does.

Although a terminology section is never a bad thing either ;-)

154	   implementations have evolved to more advanced areas and have
155	   diversified into intrusion detection and prevention, threat
156	   management, analysis of encrypted traffic, etc.  In an industry of
157	   growing importance, well-defined, and reproducible key performance
158	   indicators (KPIs) are increasingly needed to enable fair and
159	   reasonable comparison of network security functions.  All these

[nit] Maybe add what to compare - performance, functionality, scale,
flexibility, adjustability - or, if you knowingly discuss only a subset
of these aspects, then maybe still list all the aspects you are aware
of as being of interest to likely readers of this document, and summarize
those that you will and those that you won't cover in this document, so
that readers don't have to keep reading the document hoping to find them described.

160	   reasons have led to the creation of a new next-generation network
161	   security device benchmarking document, which makes [RFC3511]
162	   obsolete.

[nit] As mentioned above, whether or not the "obsoletes" is true is
not yet clear to me.

164	2.  Requirements

166	   The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
167	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
168	   "OPTIONAL" in this document are to be interpreted as described in BCP
169	   14 [RFC2119], [RFC8174] when, and only when, they appear in all
170	   capitals, as shown here.

172	3.  Scope

174	   This document provides testing terminology and testing methodology
175	   for modern and next-generation network security devices that are
176	   configured in Active ("Inline", see Figure 1 and Figure 2) mode.  It

[nit] The word Active does not appear again in the document; instead, the
description on line 261 defines Inline mode as "active", which in my book
makes 176+261 a perfectly circular definition. I would suggest a
terminology section that defines "Inline", for example by also describing the
most likely alternative mode.

177	   covers the validation of security effectiveness configurations of

[nit] security configuration effectiveness?

179	   network security devices, followed by performance benchmark testing.
179	   This document focuses on advanced, realistic, and reproducible
180	   testing methods.  Additionally, it describes testbed environments,

[nit] Are you sure advanced and realistic are meant to characterize the
testing method, rather than the scenario that is being tested? "Reproducible
testing methods for advanced real-world scenarios"?

181	   test tool requirements, and test result formats.

183	4.  Test Setup

185	   Test setup defined in this document applies to all benchmarking tests

[nit] "/Test setup defined/The test setup defined/

186	   described in Section 7.  The test setup MUST be contained within an
187	   Isolated Test Environment (see Section 3 of [RFC6815]).

189	4.1.  Testbed Configuration

191	   Testbed configuration MUST ensure that any performance implications
192	   that are discovered during the benchmark testing aren't due to the

[nit] /aren't/are not/

193	   inherent physical network limitations such as the number of physical
194	   links and forwarding performance capabilities (throughput and
195	   latency) of the network devices in the testbed.  For this reason,
196	   this document recommends avoiding external devices such as switches
197	   and routers in the testbed wherever possible.

199	   In some deployment scenarios, the network security devices (Device
200	   Under Test/System Under Test) are connected to routers and switches,
201	   which will reduce the number of entries in MAC or ARP tables of the
202	   Device Under Test/System Under Test (DUT/SUT).  If MAC or ARP tables
203	   have many entries, this may impact the actual DUT/SUT performance due
204	   to MAC and ARP/ND (Neighbor Discovery) table lookup processes.  This
205	   document also recommends using test equipment with the capability of

[nit] /also/therefore/

206	   emulating layer 3 routing functionality instead of adding external
207	   routers in the testbed.

209	   The testbed setup Option 1 (Figure 1) is the RECOMMENDED testbed
210	   setup for the benchmarking test.

212	   +-----------------------+                   +-----------------------+
213	   | +-------------------+ |   +-----------+   | +-------------------+ |
214	   | | Emulated Router(s)| |   |           |   | | Emulated Router(s)| |
215	   | |    (Optional)     | +----- DUT/SUT  +-----+    (Optional)     | |
216	   | +-------------------+ |   |           |   | +-------------------+ |
217	   | +-------------------+ |   +-----------+   | +-------------------+ |
218	   | |     Clients       | |                   | |      Servers      | |
219	   | +-------------------+ |                   | +-------------------+ |
220	   |                       |                   |                       |
221	   |   Test Equipment      |                   |   Test Equipment      |
222	   +-----------------------+                   +-----------------------+

224	                     Figure 1: Testbed Setup - Option 1

226	   If the test equipment used is not capable of emulating layer 3
227	   routing functionality or if the number of used ports is mismatched
228	   between test equipment and the DUT/SUT (need for test equipment port
229	   aggregation), the test setup can be configured as shown in Figure 2.

231	    +-------------------+      +-----------+      +--------------------+
232	    |Aggregation Switch/|      |           |      | Aggregation Switch/|
233	    | Router            +------+  DUT/SUT  +------+ Router             |
234	    |                   |      |           |      |                    |
235	    +----------+--------+      +-----------+      +--------+-----------+
236	               |                                           |
237	               |                                           |
238	   +-----------+-----------+                   +-----------+-----------+
239	   |                       |                   |                       |
240	   | +-------------------+ |                   | +-------------------+ |
241	   | | Emulated Router(s)| |                   | | Emulated Router(s)| |
242	   | |     (Optional)    | |                   | |     (Optional)    | |
243	   | +-------------------+ |                   | +-------------------+ |
244	   | +-------------------+ |                   | +-------------------+ |
245	   | |      Clients      | |                   | |      Servers      | |
246	   | +-------------------+ |                   | +-------------------+ |
247	   |                       |                   |                       |
248	   |    Test Equipment     |                   |    Test Equipment     |
249	   +-----------------------+                   +-----------------------+

251	                     Figure 2: Testbed Setup - Option 2

[nit] Please elaborate on the "number of used ports", and if possible show it in
Figure 2 by drawing multiple links. I guess that in a common case the test
equipment might provide few but fast ports, whereas the DUT/SUT might provide
more, slower ports, and one would then use external switches as port multiplexers?
Or vice versa? But if such adaptation is performed, I wonder how different
setups might impact the measurements. For example, let's say the Test Equipment
(TE) has a 100Gbps port and the DUT has 4 * 10Gbps ports, so you need on each
side a switch with one 100Gbps and 2 * 10Gbps ports. Would you try to use VLANs into the
TE, or would you just build a single LAN? Any recommendations for the switch
config, and why?

[major] The fact that the left side says only clients and the right side says only
servers is worth some more discussion, especially because the filtering in
Figure 3 also makes me wonder in which direction traffic is meant to be filtered/inspected.
Are you considering the case where clients are responders to (TCP/QUIC/UDP) connections?
For example, the left side is "inside", the DUT is a site firewall to the Internet (right side),
and there is some server on the left side (e.g. SMTP). How about having on the right
an Internet interface and a separate site DMZ interface, and then of course
traffic not only between left and right, but also between those interfaces on the right?

More broadly applicable: dynamic port discovery for ICE/STUN, where you want to permit
inside-to-outside connections (to the STUN server) in order to permit new connections from other
external nodes to go back inside. E.g.: it would be good to have some elaboration about the
type of connections covered by this document. If it's only initiators on the left and
responders on the right, that is fine, but it should be said so, maybe pointing to
the above cases (DMZ, inside servers, STUN/ICE) as not covered by this document.

253	4.2.  DUT/SUT Configuration

255	   A unique DUT/SUT configuration MUST be used for all benchmarking
256	   tests described in Section 7.  Since each DUT/SUT will have its own
257	   unique configuration, users SHOULD configure their device with the
258	   same parameters and security features that would be used in the
259	   actual deployment of the device or a typical deployment in order to
260	   achieve maximum network security coverage.  The DUT/SUT MUST be

[nit] What is a "unique configuration" ? It could be different configurations
across two different DUT but both achieving the same service/filtering, just
difference in syntax, or it could be difference in functional outcome. Would
be good to be more precise what is meant.

[nit] Why would a user choose an actual deployment vs. a typical deployment ?
I am imagining that a user would choose an actual deployment to measure performance
specifically for that deployment but a typical deployment when the DUT would
need to be deployed in different setups but not each of those can be measured
individually, or because the results are meant to be comparable with other
users who may have taken performance numbers. WOuld be good to elaborate a bit
more so readers have a clearer understanding what "actual deployment" and
"typical deployment" means and how/why to pick one over the other.

[nit] I do not understand how the text up to "in order to" justifies that it will
achieve the maximum network security coverage. I also do not know what
"maximum network security coverage" means. If there is a definition, please
provide it. Else introduce it.

260	   achieve maximum network security coverage.  The DUT/SUT MUST be
261	   configured in "Inline" mode so that the traffic is actively inspected
262	   by the DUT/SUT.  Also "Fail-Open" behavior MUST be disabled on the
263	   DUT/SUT.

265	   Table 1 and Table 2 below describe the RECOMMENDED and OPTIONAL sets
266	   of network security feature list for NGFW and NGIPS respectively.
267	   The selected security features SHOULD be consistently enabled on the
268	   DUT/SUT for all benchmarking tests described in Section 7.

270	   To improve repeatability, a summary of the DUT/SUT configuration
271	   including a description of all enabled DUT/SUT features MUST be
272	   published with the benchmarking results.

274	          +============================+=============+==========+
275	          | DUT/SUT (NGFW) Features    | RECOMMENDED | OPTIONAL |
276	          +============================+=============+==========+
277	          | SSL Inspection             |      x      |          |
278	          +----------------------------+-------------+----------+
279	          | IDS/IPS                    |      x      |          |
280	          +----------------------------+-------------+----------+
281	          | Anti-Spyware               |      x      |          |
282	          +----------------------------+-------------+----------+
283	          | Anti-Virus                 |      x      |          |
284	          +----------------------------+-------------+----------+
285	          | Anti-Botnet                |      x      |          |
286	          +----------------------------+-------------+----------+
287	          | Web Filtering              |             |    x     |
288	          +----------------------------+-------------+----------+
289	          | Data Loss Protection (DLP) |             |    x     |
290	          +----------------------------+-------------+----------+
291	          | DDoS                       |             |    x     |
292	          +----------------------------+-------------+----------+
293	          | Certificate Validation     |             |    x     |
294	          +----------------------------+-------------+----------+

[major] This may be bogus because I don't know well enough how, for the purposes
of this document, security devices are expected to inspect HTTP connections
from client to server. Maybe it is a sane approach where the security device
operates as a client-trusted HTTPS proxy, maybe it's one of the more hacky approaches
(faked server certs). But however it works, I think a security device cannot
get away without validating the certificate of the server in a connection. Else
it shouldn't be called a security DUT.

294	          +----------------------------+-------------+----------+
295	          | Logging and Reporting      |      x      |          |
296	          +----------------------------+-------------+----------+
297	          | Application Identification |      x      |          |
298	          +----------------------------+-------------+----------+

300	                      Table 1: NGFW Security Features

[nit] Why are "Web Filtering"..."Certificate Validation" only MAY ?
Please point to a place in the document (or elsewhere) that rationales
the SHOULD/MAY recommendations. Same applies to Table 2.

[nit] 

302	          +============================+=============+==========+
303	          | DUT/SUT (NGIPS) Features   | RECOMMENDED | OPTIONAL |
304	          +============================+=============+==========+
305	          | SSL Inspection             |      x      |          |
306	          +----------------------------+-------------+----------+
307	          | Anti-Malware               |      x      |          |
308	          +----------------------------+-------------+----------+
309	          | Anti-Spyware               |      x      |          |
310	          +----------------------------+-------------+----------+
311	          | Anti-Botnet                |      x      |          |
312	          +----------------------------+-------------+----------+
313	          | Logging and Reporting      |      x      |          |
314	          +----------------------------+-------------+----------+
315	          | Application Identification |      x      |          |
316	          +----------------------------+-------------+----------+
317	          | Deep Packet Inspection     |      x      |          |
318	          +----------------------------+-------------+----------+
319	          | Anti-Evasion               |      x      |          |
320	          +----------------------------+-------------+----------+

322	                      Table 2: NGIPS Security Features

[nit] I ended up scrolling up and down to compare the tables.
It might be useful for other readers like me to merge the tables,
aka put the columns for NGFW and NGIPS into one table.

[nit] Please start with Table 3, as it introduces the security features;
otherwise the two tables above introduce a lot of features without defining them.


324	   The following table provides a brief description of the security
325	   features.

327	    +================+================================================+
328	    | DUT/SUT        | Description                                    |
329	    | Features       |                                                |
330	    +================+================================================+
331	    | SSL Inspection | DUT/SUT intercepts and decrypts inbound HTTPS  |
332	    |                | traffic between servers and clients.  Once the |
333	    |                | content inspection has been completed, DUT/SUT |
334	    |                | encrypts the HTTPS traffic with ciphers and    |
335	    |                | keys used by the clients and servers.          |
336	    +----------------+------------------------------------------------+
337	    | IDS/IPS        | DUT/SUT detects and blocks exploits targeting  |
338	    |                | known and unknown vulnerabilities across the   |
339	    |                | monitored network.                             |
340	    +----------------+------------------------------------------------+
341	    | Anti-Malware   | DUT/SUT detects and prevents the transmission  |
342	    |                | of malicious executable code and any           |
343	    |                | associated communications across the monitored |
344	    |                | network.  This includes data exfiltration as   |
345	    |                | well as command and control channels.          |
346	    +----------------+------------------------------------------------+
347	    | Anti-Spyware   | Anti-Spyware is a subcategory of Anti Malware. |
348	    |                | Spyware transmits information without the      |
349	    |                | user's knowledge or permission.  DUT/SUT       |
350	    |                | detects and block initial infection or         |
351	    |                | transmission of data.                          |
352	    +----------------+------------------------------------------------+
353	    | Anti-Botnet    | DUT/SUT detects traffic to or from botnets.    |
354	    +----------------+------------------------------------------------+
355	    | Anti-Evasion   | DUT/SUT detects and mitigates attacks that     |
356	    |                | have been obfuscated in some manner.           |
357	    +----------------+------------------------------------------------+
358	    | Web Filtering  | DUT/SUT detects and blocks malicious website   |
359	    |                | including defined classifications of website   |
360	    |                | across the monitored network.                  |
361	    +----------------+------------------------------------------------+
362	    | DLP            | DUT/SUT detects and prevents data breaches and |
363	    |                | data exfiltration, or it detects and blocks    |
364	    |                | the transmission of sensitive data across the  |
365	    |                | monitored network.                             |
366	    +----------------+------------------------------------------------+
367	    | Certificate    | DUT/SUT validates certificates used in         |
368	    | Validation     | encrypted communications across the monitored  |
369	    |                | network.                                       |
370	    +----------------+------------------------------------------------+
371	    | Logging and    | DUT/SUT logs and reports all traffic at the    |
372	    | Reporting      | flow level across the monitored network.       |
373	    +----------------+------------------------------------------------+
374	    | Application    | DUT/SUT detects known applications as defined  |
375	    | Identification | within the traffic mix selected across the     |
376	    |                | monitored network.                             |
377	    +----------------+------------------------------------------------+

379	                   Table 3: Security Feature Description

[nit] Why are DDoS and DPI not listed in this table? I just randomly stumbled across
that one, but maybe there are more mismatches with Tables 1 and 2. Please make
sure all Table 1/2 features are mentioned.

[nit] I have about 1000 questions and concerns about this stuff: Are there
actually IETF specifications for how any of these features on the DUT work or
should work, or is this all vendor-proprietary functionality? For anything that
is vendor / market proprietary, how would the TE (Test Equipment)
know what the DUT does, so that it can effectively test it? I imagine that
if there is a difference in how a particular feature functions across different
vendors' DUTs, the same is true for TE, so some TE would have more functional
overlap with a given DUT than others?

[nit (continued)] E.g.: let's say some DUT1 feature, e.g. DLP, is really simple
and therefore not very secure, but that makes it a lot faster than a DUT2 DLP
feature which is a lot more secure. Maybe there is a metric for this security -
like, if I remember correctly from the past, the number of signatures in virus
detection or the like... How would such differences be taken into account in
measurement?

381	   Below is a summary of the DUT/SUT configuration:

383	   *  DUT/SUT MUST be configured in "inline" mode.

385	   *  "Fail-Open" behavior MUST be disabled.

387	   *  All RECOMMENDED security features are enabled.

389	   *  Logging SHOULD be enabled.  DUT/SUT SHOULD log all traffic at the
390	      flow level - Logging to an external device is permissible.

[nit] Does that mean logging of ALL flows, or only of flows that trigger some
security issue? Logging of ALL flows seems like a big performance hog, may be
infeasible in fast deployments, and may need to be tested as a separate case
by itself (but my concern may be outdated).

[nit] If logging is to an external device, it may be useful to indicate such a
logging receiver in Figures 1/2, and ideally have it operate via a link from the DUT that
does not pass test traffic, so that it does not interfere.

392	   *  Geographical location filtering, and Application Identification
393	      and Control SHOULD be configured to trigger based on a site or
394	      application from the defined traffic mix.

[nit] Geographical location filtering does not sound like a generically necessary
or applicable security feature. If you are, for example, a high-tech manufacturer
that sells all over the world, you may appreciate customers visiting your
webserver from countries that happen to also host a lot of botnets. Or is this
document focused on a narrower set of use cases? E.g. a DUT only filtering
anything that cannot be put into the cloud (such as web services)? It would
be good to write up some justification for the GeoLoc SHOULD that would
then help readers better understand when/how to configure it and when/how not to.

396	   In addition, a realistic number of access control rules (ACL) SHOULD
397	   be configured on the DUT/SUT where ACLs are configurable and
398	   reasonable based on the deployment scenario.  This document
399	   determines the number of access policy rules for four different
400	   classes of DUT/SUT: Extra Small (XS), Small (S), Medium (M), and
401	   Large (L).  A sample DUT/SUT classification is described in
402	   Appendix B.

[major] IMHO you cannot put numbers such as those in Figure 3 into the main
text of the document while putting the speed definitions of the four classes into an
Appendix B. It seems clear to me that the numbers in Figure 3 (and probably elsewhere) were
derived from the assumption that the four speed classes are defined as in
Appendix B. Suggestion: inline the text of Appendix B here and mention that numbers
such as those in Figure 3 are derived from the assumption of those XS/S/M/L numbers.
Add (if necessary, else not) that it may be appropriate to choose other numbers for
XS/S/M/L, but if one does that, then the dependent numbers (such as those from Figure 3)
may also need to be re-evaluated.

404	   The Access Control Rules (ACL) defined in Figure 3 MUST be configured
405	   from top to bottom in the correct order as shown in the table.  This
406	   is due to ACL types listed in specificity decreasing order, with
407	   "block" first, followed by "allow", representing a typical ACL based
408	   security policy.  The ACL entries SHOULD be configured with routable
409	   IP subnets by the DUT/SUT.  (Note: There will be differences between
410	   how security vendors implement ACL decision making.)  The configured

[nit] /security vendors/DUT/

[nit] I don't understand what I am supposed to learn from the (Note: ...) sentence.
Rephrase, or remove?

410	   how security vendors implement ACL decision making.)  The configured
411	   ACL MUST NOT block the security and measurement traffic used for the
412	   benchmarking tests.

[nit] what is "security traffic" ? what is "measurement traffic" ?  Don't see these
terms defined before. Those two terms do not immediately click to me. I guess
measured user/client-server traffic vs. test-setup management traffic (including logging) ??
In any case introduce the terms, define them and use them consistently. Whatever they are.

414	                                                       +---------------+
415	                                                       | DUT/SUT       |
416	                                                       | Classification|
417	                                                       | # Rules       |
418	   +-----------+-----------+--------------------+------+---+---+---+---+
419	   |           | Match     |                    |      |   |   |   |   |
420	   | Rules Type| Criteria  |   Description      |Action| XS| S | M | L |
421	   +-------------------------------------------------------------------+
422	   |Application|Application| Any application    | block| 5 | 10| 20| 50|
423	   |layer      |           | not included in    |      |   |   |   |   |
424	   |           |           | the measurement    |      |   |   |   |   |
425	   |           |           | traffic            |      |   |   |   |   |
426	   +-------------------------------------------------------------------+
427	   |Transport  |SRC IP and | Any SRC IP subnet  | block| 25| 50|100|250|
428	   |layer      |TCP/UDP    | used and any DST   |      |   |   |   |   |
429	   |           |DST ports  | ports not used in  |      |   |   |   |   |
430	   |           |           | the measurement    |      |   |   |   |   |
431	   |           |           | traffic            |      |   |   |   |   |
432	   +-------------------------------------------------------------------+
433	   |IP layer   |SRC/DST IP | Any SRC/DST IP     | block| 25| 50|100|250|
434	   |           |           | subnet not used    |      |   |   |   |   |
435	   |           |           | in the measurement |      |   |   |   |   |
436	   |           |           | traffic            |      |   |   |   |   |
437	   +-------------------------------------------------------------------+

[nit] Would suggest removing the word "Any" to minimize misinterpretation.

[nit] These three blocks seem to never get exercised by the actual measurement
traffic, right? So the purpose would then be simply to load up the DUT with
them, in case the DUT implementation is stupid enough to let these cause relevant
performance impacts even when not exercised by traffic. It would be good to write
this down as a rationale after the table, especially because the "Any" had me
confused at first: in a real-world deployment you would of course not include
250 individual application/port/prefix entries; you would just have some simple block-all.

[nit] Even 27 years ago I saw routers acting as firewalls for universities
that had thousands of such ACL entries. Aka: I think these numbers are way too low.

438	   |Application|Application| Half of the        | allow| 10| 10| 10| 10|
439	   |layer      |           | applications       |      |   |   |   |   |
440	   |           |           | included in the    |      |   |   |   |   |
441	   |           |           | measurement traffic|      |   |   |   |   |
442	   |           |           |(see the note below)|      |   |   |   |   |
443	   +-------------------------------------------------------------------+
444	   |Transport  |SRC IP and | Half of the SRC    | allow| >1| >1| >1| >1|
445	   |layer      |TCP/UDP    | IPs used and any   |      |   |   |   |   |
446	   |           |DST ports  | DST ports used in  |      |   |   |   |   |
447	   |           |           | the measurement    |      |   |   |   |   |
448	   |           |           | traffic            |      |   |   |   |   |
449	   |           |           | (one rule per      |      |   |   |   |   |
450	   |           |           | subnet)            |      |   |   |   |   |
451	   +-------------------------------------------------------------------+
452	   |IP layer   |SRC IP     | The rest of the    | allow| >1| >1| >1| >1|
453	   |           |           | SRC IP subnet      |      |   |   |   |   |
454	   |           |           | range used in the  |      |   |   |   |   |
455	   |           |           | measurement        |      |   |   |   |   |
456	   |           |           | traffic            |      |   |   |   |   |
457	   |           |           | (one rule per      |      |   |   |   |   |
458	   |           |           | subnet)            |      |   |   |   |   |
459	   +-----------+-----------+--------------------+------+---+---+---+---+

[major] There should be an explanation of how this is supposed to work, and
it seems there are rules missing:

      The rule on row 438 explicitly permits half the traffic sent by the test
      equipment, so supposedly only the other half has to be checked by the rule
      on row 444. When 444 says "Half of the SRC...", is that half of the total?
      Would that have to be set up so that after 444 we now have 75% of the
      measurement traffic going through? Likewise, does rule 452 then bring the
      total amount of permitted traffic to 87.5%?
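
My reading in code form (Python; this is just my interpretation - the draft
should state it explicitly):

    permitted = 0.5                    # row 438: half of the applications
    permitted += (1 - permitted) / 2   # row 444 -> 0.75, if "half" means half of the rest
    permitted += (1 - permitted) / 2   # row 452 -> 0.875? or does "the rest" bring it to 1.0?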

[nit] Ultimately, we only have "allows" here.
      Is there an assumption that after row 459 there is an implicit deny-anything-else?
      I guess so, but it should be written out explicitly in the table.

461	                       Figure 3: DUT/SUT Access List

463	   Note: If half of the applications included in the measurement traffic
464	   is less than 10, the missing number of ACL entries (dummy rules) can
465	   be configured for any application traffic not included in the
466	   measurement traffic.

468	4.2.1.  Security Effectiveness Configuration

470	   The Security features (defined in Table 1 and Table 2) of the DUT/SUT
471	   MUST be configured effectively to detect, prevent, and report the
472	   defined security vulnerability sets.  This section defines the
473	   selection of the security vulnerability sets from Common

[nit] "from the CVE" ?!

474	   vulnerabilities and Exposures (CVE) list for the testing.  The

[nit] Add a reference for CVE. (Not sure what's the best spec - Wikipedia, cve.org, ...)

475	   vulnerability set SHOULD reflect a minimum of 500 CVEs from no older
476	   than 10 calendar years to the current year.  These CVEs SHOULD be
477	   selected with a focus on in-use software commonly found in business
478	   applications, with a Common vulnerability Scoring System (CVSS)
479	   Severity of High (7-10).

481	   This document is primarily focused on performance benchmarking.
482	   However, it is RECOMMENDED to validate the security features
483	   configuration of the DUT/SUT by evaluating the security effectiveness
484	   as a prerequisite for performance benchmarking tests defined in the

[nit]  /in the/in/

485	   section 7.  In case the benchmarking tests are performed without
486	   evaluating security effectiveness, the test report MUST explain the
487	   implications of this.  The methodology for evaluating security
488	   effectiveness is defined in Appendix A.

490	4.3.  Test Equipment Configuration

492	   In general, test equipment allows configuring parameters in different
493	   protocol layers.  These parameters thereby influence the traffic
494	   flows which will be offered and impact performance measurements.

496	   This section specifies common test equipment configuration parameters
497	   applicable for all benchmarking tests defined in Section 7.  Any
498	   benchmarking test specific parameters are described under the test
499	   setup section of each benchmarking test individually.

501	4.3.1.  Client Configuration

503	   This section specifies which parameters SHOULD be considered while
504	   configuring clients using test equipment.  Also, this section
505	   specifies the RECOMMENDED values for certain parameters.  The values
506	   are the defaults used in most of the client operating systems
507	   currently.

509	4.3.1.1.  TCP Stack Attributes

511	   The TCP stack SHOULD use a congestion control algorithm at client and
512	   server endpoints.  The IPv4 and IPv6 Maximum Segment Size (MSS)
513	   SHOULD be set to 1460 bytes and 1440 bytes respectively and a TX and
514	   RX initial receive windows of 64 KByte.  Client initial congestion
515	   window SHOULD NOT exceed 10 times the MSS.  Delayed ACKs are
516	   permitted and the maximum client delayed ACK SHOULD NOT exceed 10
517	   times the MSS before a forced ACK.  Up to three retries SHOULD be
518	   allowed before a timeout event is declared.  All traffic MUST set the
519	   TCP PSH flag to high.  The source port range SHOULD be in the range
520	   of 1024 - 65535.  Internal timeout SHOULD be dynamically scalable per
521	   RFC 793.  The client SHOULD initiate and close TCP connections.  The
522	   TCP connection MUST be initiated via a TCP three-way handshake (SYN,
523	   SYN/ACK, ACK), and it MUST be closed via either a TCP three-way close
524	   (FIN, FIN/ACK, ACK), or a TCP four-way close (FIN, ACK, FIN, ACK).

[nit] It would be nice to have a reference for where/how these parameters were
determined, and to mention why these parameters were chosen - probably to reflect the
most common current TCP behavior that achieves the best performance?
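
For what it's worth, a minimal sketch of how a test client might pin some of
these attributes on Linux (Python sockets; TCP_MAXSEG and the buffer sizes are
hints the kernel may adjust, and the initial congestion window is a route
attribute rather than a socket option):

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Clamp the IPv4 MSS to 1460 bytes (section 4.3.1.1).
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG, 1460)
    # Request 64 KByte initial receive/transmit windows.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 64 * 1024)
    # Flush writes immediately, approximating "PSH flag set to high".
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.connect(("198.51.100.10", 80))  # RFC 5737 documentation address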

[minor] The document mentions QUIC in three places but has no section for QUIC
equivalent to the one here for TCP. I would suggest adding a section here,
even if it just says: "Due to the absence of sufficient experience, QUIC parameters
are unspecified. Similarly to TCP, parameters should be chosen that best reflect
state-of-the-art performance results for QUIC client/server traffic".

526	4.3.1.2.  Client IP Address Space

528	   The sum of the client IP space SHOULD contain the following
529	   attributes.

531	   *  The IP blocks SHOULD consist of multiple unique, discontinuous
532	      static address blocks.

534	   *  A default gateway is permitted.

[comment] How is this relevant - what do you expect it to do? What would happen
if you just removed it?

536	   *  The DSCP (differentiated services code point) marking is set to DF
537	      (Default Forwarding) '000000' on IPv4 Type of Service (ToS) field
538	      and IPv6 traffic class field.

540	   The following equation can be used to define the total number of
541	   client IP addresses that will be configured on the test equipment.

543	   Desired total number of client IP = Target throughput [Mbit/s] /
544	   Average throughput per IP address [Mbit/s]

546	   As shown in the example list below, the value for "Average throughput
547	   per IP address" can be varied depending on the deployment and use
548	   case scenario.

550	   (Option 1)  DUT/SUT deployment scenario 1 : 6-7 Mbit/s per IP (e.g.
551	               1,400-1,700 IPs per 10Gbit/s throughput)

553	   (Option 2)  DUT/SUT deployment scenario 2 : 0.1-0.2 Mbit/s per IP
554	               (e.g.  50,000-100,000 IPs per 10Gbit/s throughput)

556	   Based on deployment and use case scenario, client IP addresses SHOULD
557	   be distributed between IPv4 and IPv6.  The following options MAY be
558	   considered for a selection of traffic mix ratio.

560	   (Option 1)  100 % IPv4, no IPv6

562	   (Option 2)  80 % IPv4, 20% IPv6

564	   (Option 3)  50 % IPv4, 50% IPv6

566	   (Option 4)  20 % IPv4, 80% IPv6

568	   (Option 5)  no IPv4, 100% IPv6

[minor] This guidance is IMHO not very helpful. It seems to me the first
guidance should be that the percentage of IPv4 vs. IPv6 addresses is based
on the expected ratio of IPv4 vs. IPv6 traffic in the target deployment,
because with the way the test setup is done, N% IPv4 addresses will also
roughly result in N% IPv4 traffic in the test.

That type of explanation might be very helpful, because the risk is that
readers may think they can derive the percentage of test IPv4/IPv6
addresses from the ratio of IPv4/IPv6 addresses in the target deployment,
but that will very often not work:

For example, in the common dual-stack deployment every client has an IPv4
and an IPv6 address, so it's 50% IPv4, but the actual percentage of IPv4
traffic will very much depend on the application scenario. Some
enterprises may go up to 90% or more IPv6 traffic if the main traffic is
all newer cloud services traffic. And vice versa, it could be as little as
10% IPv6 if all the cloud services are legacy apps in the cloud not
supporting IPv6.
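
In numbers (my own illustration): a tester who knows the target deployment
carries, say, 70% IPv6 traffic would size the emulated address pools from
the traffic ratio, not from the 50/50 dual-stack address ratio:

    total_ips = 1539            # from the throughput equation above
    ipv6_traffic_share = 0.70   # traffic ratio, NOT address ratio
    ipv6_ips = round(total_ips * ipv6_traffic_share)  # 1077
    ipv4_ips = total_ips - ipv6_ips                   # 462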

570	   Note: The IANA has assigned IP address range for the testing purpose
571	   as described in Section 8.  If the test scenario requires more IP
572	   addresses or subnets than the IANA assigned, this document recommends
573	   using non routable Private IPv4 address ranges or Unique Local
574	   Address (ULA) IPv6 address ranges for the testing.

[minor] See comments in Section 8. It might be useful to merge the text of this
paragraph with the one in Section 8, else the addressing recommendations are
somewhat split in the middle.

[minor] It would be prudent to add a disclaimer that this document does
not attempt to determine whether the DUT may embody performance
optimizations for known testing address ranges. Such a disclaimer could be
more general and go at the end of the document, e.g. before the IANA
section: no considerations are made against DUT optimizations for known
test scenarios, including addressing ranges or other test-profile specific
parameters.

576	4.3.1.3.  Emulated Web Browser Attributes

578	   The client emulated web browser (emulated browser) contains
579	   attributes that will materially affect how traffic is loaded.  The

[nit] What does "how traffic is loaded" mean? Rephrase.

580	   objective is to emulate modern, typical browser attributes to improve
581	   realism of the result set.

[nit] /result set/resulting traffic/ ?

583	   For HTTP traffic emulation, the emulated browser MUST negotiate HTTP
584	   version 1.1 or higher.  Depending on test scenarios and chosen HTTP
585	   version, the emulated browser MAY open multiple TCP connections per
586	   Server endpoint IP at any time depending on how many sequential
587	   transactions need to be processed.  For HTTP/2 or HTTP/3, the
588	   emulated browser MAY open multiple concurrent streams per connection
589	   (multiplexing).  HTTP/3 emulated browser uses QUIC ([RFC9000]) as
590	   transport protocol.  HTTP settings such as number of connection per
591	   server IP, number of requests per connection, and number of streams
592	   per connection MUST be documented.  This document refers to [RFC8446]
593	   for HTTP/2.  The emulated browser SHOULD advertise a User-Agent
594	   header.  The emulated browser SHOULD enforce content length
595	   validation.  Depending on test scenarios and selected HTTP version,
596	   HTTP header compression MAY be set to enable or disable.  This
597	   setting (compression enabled or disabled) MUST be documented in the
598	   report.

600	   For encrypted traffic, the following attributes SHALL define the
601	   negotiated encryption parameters.  The test clients MUST use TLS
602	   version 1.2 or higher.  TLS record size MAY be optimized for the

[minor] I would bet SEC review will challenge you to comment on TLS 1.3.
It would make sense to add a sentence stating that the ratio of TLS 1.2
vs. TLS 1.3 traffic should be chosen based on the expected target
deployment and may range from 100% TLS 1.2 to 100% TLS 1.3. In the
absence of known ratios, a 50/50% ratio is RECOMMENDED.

602	   version 1.2 or higher.  TLS record size MAY be optimized for the
603	   HTTPS response object size up to a record size of 16 KByte.  If
604	   Server Name Indication (SNI) is required in the traffic mix profile,
605	   the client endpoint MUST send TLS extension Server Name Indication
606	   (SNI) information when opening a security tunnel.  Each client

[minor] SNI is pretty standard today. I would remove the "if" and make the
whole sentence a MUST.

606	   (SNI) information when opening a security tunnel.  Each client
607	   connection MUST perform a full handshake with server certificate and
608	   MUST NOT use session reuse or resumption.

610	   The following TLS 1.2 supported ciphers and keys are RECOMMENDED to
611	   use for HTTPS based benchmarking tests defined in Section 7.

613	   1.  ECDHE-ECDSA-AES128-GCM-SHA256 with Prime256v1 (Signature Hash
614	       Algorithm: ecdsa_secp256r1_sha256 and Supported group: secp256r1)

616	   2.  ECDHE-RSA-AES128-GCM-SHA256 with RSA 2048 (Signature Hash
617	       Algorithm: rsa_pkcs1_sha256 and Supported group: secp256r1)

619	   3.  ECDHE-ECDSA-AES256-GCM-SHA384 with Secp521 (Signature Hash
620	       Algorithm: ecdsa_secp384r1_sha384 and Supported group: secp521r1)

622	   4.  ECDHE-RSA-AES256-GCM-SHA384 with RSA 4096 (Signature Hash
623	       Algorithm: rsa_pkcs1_sha384 and Supported group: secp256r1)

625	   Note: The above ciphers and keys were those commonly used enterprise
626	   grade encryption cipher suites for TLS 1.2.  It is recognized that
627	   these will evolve over time.  Individual certification bodies SHOULD
628	   use ciphers and keys that reflect evolving use cases.  These choices
629	   MUST be documented in the resulting test reports with detailed
630	   information on the ciphers and keys used along with reasons for the
631	   choices.
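
[comment] A side note for testers: whether an emulated server actually
negotiates one of the recommended suites is easy to sanity-check. A
minimal sketch using Python's ssl module (the address is a placeholder
from the TEST-NET-3 range, not from the draft):

    import socket
    import ssl

    # Pin TLS 1.2 and offer only the two AES128-GCM suites from the
    # list above; s.cipher() reports what was actually negotiated.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE   # test certificates, not real PKI
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.maximum_version = ssl.TLSVersion.TLSv1_2
    ctx.set_ciphers("ECDHE-ECDSA-AES128-GCM-SHA256:"
                    "ECDHE-RSA-AES128-GCM-SHA256")
    with ctx.wrap_socket(socket.create_connection(("203.0.113.10", 443))) as s:
        print(s.cipher())  # (suite name, protocol version, secret bits)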

633	   [RFC8446] defines the following cipher suites for use with TLS 1.3.

635	   1.  TLS_AES_128_GCM_SHA256

637	   2.  TLS_AES_256_GCM_SHA384

639	   3.  TLS_CHACHA20_POLY1305_SHA256

641	   4.  TLS_AES_128_CCM_SHA256

643	   5.  TLS_AES_128_CCM_8_SHA256

645	4.3.2.  Backend Server Configuration

647	   This section specifies which parameters should be considered while
648	   configuring emulated backend servers using test equipment.

650	4.3.2.1.  TCP Stack Attributes

652	   The TCP stack on the server side SHOULD be configured similar to the
653	   client side configuration described in Section 4.3.1.1.  In addition,
654	   server initial congestion window MUST NOT exceed 10 times the MSS.
655	   Delayed ACKs are permitted and the maximum server delayed ACK MUST
656	   NOT exceed 10 times the MSS before a forced ACK.

658	4.3.2.2.  Server Endpoint IP Addressing

660	   The sum of the server IP space SHOULD contain the following
661	   attributes.

663	   *  The server IP blocks SHOULD consist of unique, discontinuous
664	      static address blocks with one IP per server Fully Qualified
665	      Domain Name (FQDN) endpoint per test port.

[minor] The "per FQDN per test port" is likely underspecified/confusing.
How would you recommend to configure the testbed if the same FQDN may be reachable
across more than one DUT server port and the DUT is doing load balancing ?
If that is not supposed to be considered, then it seems as if every FQDN is
supposed to be reachable across only one DUT port, but then the sentence
ikely should just say "per FQDN" (without the "per test port qualification").
Not 100% sure...

[minor] Especially for IPv4, there is obviously a big trend in DCs to save
IPv4 address space by using SNI. Therefore a realistic scenario would be
to have more than one FQDN per IPv4 address, maybe as high as 10:1
(guesswork). In any case I think it is prudent to include testing of such
SNI overload of IP addresses, because it likely can impact performance
(demux of processing state not solely based on the 5-tuple).

667	   *  A default gateway is permitted.  The DSCP (differentiated services

[minor] Again, I wonder why the default gateway adds value to the doc.

667	   *  A default gateway is permitted.  The DSCP (differentiated services
668	      code point) marking is set to DF (Default Forwarding) '000000' on
669	      IPv4 Type of Service (ToS) field and IPv6 traffic class field.

671	   *  The server IP addresses SHOULD be distributed between IPv4 and
672	      IPv6 with a ratio identical to the clients distribution ratio.

674	   Note: The IANA has assigned IP address range for the testing purpose
675	   as described in Section 8.  If the test scenario requires more IP
676	   addresses or subnets than the IANA assigned, this document recommends
677	   using non routable Private IPv4 address ranges or Unique Local
678	   Address (ULA) IPv6 address ranges for the testing.

[minor] Same note as in the client section about moving these addressing
recommendations out.

680	4.3.2.3.  HTTP / HTTPS Server Pool Endpoint Attributes

682	   The server pool for HTTP SHOULD listen on TCP port 80 and emulate the
683	   same HTTP version (HTTP 1.1 or HTTP/2 or HTTP/3) and settings chosen
684	   by the client (emulated web browser).  The Server MUST advertise
685	   server type in the Server response header [RFC7230].  For HTTPS
686	   server, TLS 1.2 or higher MUST be used with a maximum record size of
687	   16 KByte and MUST NOT use ticket resumption or session ID reuse.  The
688	   server SHOULD listen on TCP port 443 for HTTP version 1.1 and 2.  For
689	   HTTP/3 (HTTP over QUIC) the server SHOULD listen on UDP 443.  The
690	   server SHALL serve a certificate to the client.  The HTTPS server
691	   MUST check host SNI information with the FQDN if SNI is in use.
692	   Cipher suite and key size on the server side MUST be configured
693	   similar to the client side configuration described in
694	   Section 4.3.1.3.

696	4.3.3.  Traffic Flow Definition

698	   This section describes the traffic pattern between client and server
699	   endpoints.  At the beginning of the test, the server endpoint
700	   initializes and will be ready to accept connection states including
701	   initialization of the TCP stack as well as bound HTTP and HTTPS
702	   servers.  When a client endpoint is needed, it will initialize and be
703	   given attributes such as a MAC and IP address.  The behavior of the
704	   client is to sweep through the given server IP space, generating a
705	   recognizable service by the DUT.  Sequential and pseudorandom sweep
706	   methods are acceptable.  The method used MUST be stated in the final
707	   report.  Thus, a balanced mesh between client endpoints and server
708	   endpoints will be generated in a client IP and port to server IP and
709	   port combination.  Each client endpoint performs the same actions as
710	   other endpoints, with the difference being the source IP of the
711	   client endpoint and the target server IP pool.  The client MUST use
712	   the server IP address or FQDN in the host header [RFC7230].

[minor] Given the prevalence of SNI-centric server selection, I would
suggest changing server IP to server FQDN and noting that the server IP
is simply derived from the server FQDN. Likewise the server port is
derived from the server protocol, which seems to be just HTTP or HTTPS,
so it's unclear to me where we would get ports different from 80 and 443
(maybe that's mentioned later). Aka: the server port may not be relevant
to mention.


714	4.3.3.1.  Description of Intra-Client Behavior

716	   Client endpoints are independent of other clients that are
717	   concurrently executing.  When a client endpoint initiates traffic,
718	   this section describes how the client steps through different
719	   services.  Once the test is initialized, the client endpoints
720	   randomly hold (perform no operation) for a few milliseconds for
721	   better randomization of the start of client traffic.  Each client
722	   will either open a new TCP connection or connect to a TCP persistence
723	   stack still open to that specific server.  At any point that the
724	   traffic profile may require encryption, a TLS encryption tunnel will
725	   form presenting the URL or IP address request to the server.  If
726	   using SNI, the server MUST then perform an SNI name check with the
727	   proposed FQDN compared to the domain embedded in the certificate.
728	   Only when correct, will the server process the HTTPS response object.
729	   The initial response object to the server is based on benchmarking
730	   tests described in Section 7.  Multiple additional sub-URLs (response
731	   objects on the service page) MAY be requested simultaneously.  This
732	   MAY be to the same server IP as the initial URL.  Each sub-object
733	   will also use a canonical FQDN and URL path, as observed in the
734	   traffic mix used.

[minor] This may be necessary to keep the configuration complexity at
bay, but in practice each particular IP client will likely exhibit quite
different traffic profiles. One may continuously request HTTP video
segments when streaming video. Another one may continuously do WebRTC
(Zoom), and the like. By having every client randomly do all the services
(this is what I figure from the above description), you forego the
important performance aspect of the "worst hit client" if the DUT
exhibits specific issues with specific services (false filtering,
performance degradation, etc.). IMHO it would be great if test equipment
could create different client traffic profiles by segmenting the possible
application space into groups and then assigning new clients randomly to
groups. Besides making it easier to find performance issues, it also
results in more real-world performance, which might be higher. For
example, in a multi-core CPU based DUT there may be heuristics assigning
different clients' traffic to different CPU cores, so that the L1..L3
caches of a CPU core can be better kept focused on the code space for a
particular type of client inspection. (Just guessing.)

736	4.3.4.  Traffic Load Profile

738	   The loading of traffic is described in this section.  The loading of
739	   a traffic load profile has five phases: Init, ramp up, sustain, ramp
740	   down, and collection.

742	   1.  Init phase: Testbed devices including the client and server
743	       endpoints should negotiate layer 2-3 connectivity such as MAC
744	       learning and ARP.  Only after successful MAC learning or ARP/ND
745	       resolution SHALL the test iteration move to the next phase.  No
746	       measurements are made in this phase.  The minimum RECOMMENDED
747	       time for Init phase is 5 seconds.  During this phase, the
748	       emulated clients SHOULD NOT initiate any sessions with the DUT/
749	       SUT, in contrast, the emulated servers should be ready to accept
750	       requests from DUT/SUT or from emulated clients.

752	   2.  Ramp up phase: The test equipment SHOULD start to generate the
753	       test traffic.  It SHOULD use a set of the approximate number of
754	       unique client IP addresses to generate traffic.  The traffic
755	       SHOULD ramp up from zero to desired target objective.  The target
756	       objective is defined for each benchmarking test.  The duration
757	       for the ramp up phase MUST be configured long enough that the
758	       test equipment does not overwhelm the DUT/SUTs stated performance
759	       metrics defined in Section 6.3 namely, TCP Connections Per
760	       Second, Inspected Throughput, Concurrent TCP Connections, and
761	       Application Transactions Per Second.  No measurements are made in
762	       this phase.

764	   3.  Sustain phase: Starts when all required clients are active and
765	       operating at their desired load condition.  In the sustain phase,
766	       the test equipment SHOULD continue generating traffic to constant
767	       target value for a constant number of active clients.  The
768	       minimum RECOMMENDED time duration for sustain phase is 300
769	       seconds.  This is the phase where measurements occur.  The test
770	       equipment SHOULD measure and record statistics continuously.  The
771	       sampling interval for collecting the raw results and calculating
772	       the statistics SHOULD be less than 2 seconds.

774	   4.  Ramp down phase: No new connections are established, and no
775	       measurements are made.  The time duration for ramp up and ramp
776	       down phase SHOULD be the same.

778	   5.  Collection phase: The last phase is administrative and will occur
779	       when the test equipment merges and collates the report data.
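
[comment] For my own understanding I condensed the five phases into a
load-vs-time sketch; the 5 s init and 300 s sustain are the recommended
minimums from above, while the ramp durations are placeholder guesses of
mine:

    INIT, RAMP_UP, SUSTAIN, RAMP_DOWN = 5, 60, 300, 60  # seconds

    def offered_load(t):
        """Fraction of the target objective offered at elapsed time t."""
        if t < INIT:
            return 0.0                      # no client sessions yet
        t -= INIT
        if t < RAMP_UP:
            return t / RAMP_UP              # zero -> target objective
        t -= RAMP_UP
        if t < SUSTAIN:
            return 1.0                      # measurements happen here
        t -= SUSTAIN
        if t < RAMP_DOWN:
            return 1.0 - t / RAMP_DOWN      # existing sessions drain
        return 0.0                          # collection: reporting only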

781	5.  Testbed Considerations

783	   This section describes steps for a reference test (pre-test) that
784	   control the test environment including test equipment, focusing on
785	   physical and virtualized environments and as well as test equipments.
786	   Below are the RECOMMENDED steps for the reference test.

788	   1.  Perform the reference test either by configuring the DUT/SUT in
789	       the most trivial setup (fast forwarding) or without presence of

[nit] Define/explain or provide reference for "fast forwarding".

790	       the DUT/SUT.

[minor] Is the DUT/SUT assumed to operate as a router or as a transparent
L2 switch? Asking because "or without presence" should be amended (IMHO)
to mention that instead of the DUT one would put a router or switch in
its place that is pre-loaded with a config equivalent to that of the DUT
but without any security functions, just passing traffic at rates that
bring the TE to its limits.

792	   2.  Generate traffic from traffic generator.  Choose a traffic
793	       profile used for HTTP or HTTPS throughput performance test with
794	       smallest object size.

796	   3.  Ensure that any ancillary switching or routing functions added in
797	       the test equipment does not limit the performance by introducing
798	       network metrics such as packet loss and latency.  This is
799	       specifically important for virtualized components (e.g.,
800	       vSwitches, vRouters).

802	   4.  Verify that the generated traffic (performance) of the test
803	       equipment matches and reasonably exceeds the expected maximum
804	       performance of the DUT/SUT.

806	   5.  Record the network performance metrics packet loss latency
807	       introduced by the test environment (without DUT/SUT).

809	   6.  Assert that the testbed characteristics are stable during the
810	       entire test session.  Several factors might influence stability
811	       specifically, for virtualized testbeds.  For example, additional
812	       workloads in a virtualized system, load balancing, and movement
813	       of virtual machines during the test, or simple issues such as
814	       additional heat created by high workloads leading to an emergency
815	       CPU performance reduction.

[minor] Add something to test the performance of the logging system.
Without the DUT actually generating logging, this will so far not have
been validated. Maybe the TE can generate logging records? Especially
burst logging from the DUT without loss is important to verify (no loss
of logged events).

817	   The reference test SHOULD be performed before the benchmarking tests
818	   (described in section 7) start.

820	6.  Reporting

[minor] I would swap sections 6 and 7, because it is problematic to read
what's to be reported without knowing what's to be measured first. For
example, when I read 6 first, it was not clear to me if/how you would
test the performance limits, so the report data raised a lot of questions
for me.

Of course, when you actually run the testbed you should have read both
sections first anyway.

822	   This section describes how the benchmarking test report should be
823	   formatted and presented.  It is RECOMMENDED to include two main
824	   sections in the report, namely the introduction and the detailed test
825	   results sections.

827	6.1.  Introduction

829	   The following attributes SHOULD be present in the introduction
830	   section of the test report.

[minor] I'd suggest saying here that the test report needs to include all
information sufficient for independent third-party reproduction of the
test setup, to permit third-party falsification of the test results. This
includes, but may not be limited to, the following...
 

832	   1.  The time and date of the execution of the tests

834	   2.  Summary of testbed software and hardware details
835	       a.  DUT/SUT hardware/virtual configuration

837	           *  This section SHOULD clearly identify the make and model of
838	              the DUT/SUT

840	           *  The port interfaces, including speed and link information

842	           *  If the DUT/SUT is a Virtual Network Function (VNF), host
843	              (server) hardware and software details, interface
844	              acceleration type such as DPDK and SR-IOV, used CPU cores,
845	              used RAM, resource sharing (e.g.  Pinning details and NUMA
846	              Node) configuration details, hypervisor version, virtual
847	              switch version

849	           *  details of any additional hardware relevant to the DUT/SUT
850	              such as controllers

852	       b.  DUT/SUT software

854	           *  Operating system name

856	           *  Version

858	           *  Specific configuration details (if any)

[minor] Any software details necessary and sufficient to reproduce the
software setup of the DUT/SUT.

860	       c.  DUT/SUT enabled features

862	           *  Configured DUT/SUT features (see Table 1 and Table 2)

864	           *  Attributes of the above-mentioned features

866	           *  Any additional relevant information about the features

868	       d.  Test equipment hardware and software

870	           *  Test equipment vendor name

872	           *  Hardware details including model number, interface type

874	           *  Test equipment firmware and test application software
875	              version

877	       e.  Key test parameters

879	           *  Used cipher suites and keys

881	           *  IPv4 and IPv6 traffic distribution
882	           *  Number of configured ACL

884	       f.  Details of application traffic mix used in the benchmarking
885	           test "Throughput Performance with Application Traffic Mix"
886	           (Section 7.1)

888	           *  Name of applications and layer 7 protocols

890	           *  Percentage of emulated traffic for each application and
891	              layer 7 protocols

893	           *  Percentage of encrypted traffic and used cipher suites and
894	              keys (The RECOMMENDED ciphers and keys are defined in
895	              Section 4.3.1.3)

897	           *  Used object sizes for each application and layer 7
898	              protocols

900	   3.  Results Summary / Executive Summary

902	       a.  Results SHOULD resemble a pyramid in how it is reported, with
903	           the introduction section documenting the summary of results
904	           in a prominent, easy to read block.

906	6.2.  Detailed Test Results

908	   In the result section of the test report, the following attributes
909	   SHOULD be present for each benchmarking test.

911	   a.  KPIs MUST be documented separately for each benchmarking test.
912	       The format of the KPI metrics SHOULD be presented as described in
913	       Section 6.3.

915	   b.  The next level of details SHOULD be graphs showing each of these
916	       metrics over the duration (sustain phase) of the test.  This
917	       allows the user to see the measured performance stability changes
918	       over time.

920	6.3.  Benchmarks and Key Performance Indicators

922	   This section lists key performance indicators (KPIs) for overall
923	   benchmarking tests.  All KPIs MUST be measured during the sustain
924	   phase of the traffic load profile described in Section 4.3.4.  All
925	   KPIs MUST be measured from the result output of test equipment.

[minor] Somewhere else in the document I think I remember observing DUT
self-reporting. Shouldn't the self-reporting of the DUT then be vetted as
well, e.g. compared against the TE report data?

927	   *  Concurrent TCP Connections
928	      The aggregate number of simultaneous connections between hosts
929	      across the DUT/SUT, or between hosts and the DUT/SUT (defined in
930	      [RFC2647]).

[minor] Add a reference to the section in RFC 2647 where this is defined.
Also: if you refer to the definition but do not reproduce it, readers
have to pull up RFC 2647 to interpret the results.

932	   *  TCP Connections Per Second

934	      The average number of successfully established TCP connections per
935	      second between hosts across the DUT/SUT, or between hosts and the
936	      DUT/SUT.  The TCP connection MUST be initiated via a TCP three-way
937	      handshake (SYN, SYN/ACK, ACK).  Then the TCP session data is sent.
938	      The TCP session MUST be closed via either a TCP three-way close
939	      (FIN, FIN/ACK, ACK), or a TCP four-way close (FIN, ACK, FIN, ACK),
940	      and MUST NOT by RST.

942	   *  Application Transactions Per Second

944	      The average number of successfully completed transactions per
945	      second.  For a particular transaction to be considered successful,
946	      all data MUST have been transferred in its entirety.  In case of
947	      HTTP(S) transactions, it MUST have a valid status code (200 OK),
948	      and the appropriate FIN, FIN/ACK sequence MUST have been
949	      completed.

951	   *  TLS Handshake Rate

953	      The average number of successfully established TLS connections per
954	      second between hosts across the DUT/SUT, or between hosts and the
955	      DUT/SUT.

957	   *  Inspected Throughput

959	      The number of bits per second of examined and allowed traffic a
960	      network security device is able to transmit to the correct
961	      destination interface(s) in response to a specified offered load.
962	      The throughput benchmarking tests defined in Section 7 SHOULD
963	      measure the average Layer 2 throughput value when the DUT/SUT is
964	      "inspecting" traffic.  This document recommends presenting the
965	      inspected throughput value in Gbit/s rounded to two places of
966	      precision with a more specific Kbit/s in parenthesis.
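
[comment] The recommended presentation then boils down to (my
illustration):

    def format_inspected_throughput(bits_per_second):
        # Gbit/s rounded to two places, more specific Kbit/s in parens.
        return (f"{bits_per_second / 1e9:.2f} Gbit/s "
                f"({bits_per_second / 1e3:.0f} Kbit/s)")

    format_inspected_throughput(9_437_184_000)
    # -> '9.44 Gbit/s (9437184 Kbit/s)'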

968	   *  Time to First Byte (TTFB)

970	      TTFB is the elapsed time between the start of sending the TCP SYN
971	      packet from the client and the client receiving the first packet
972	      of application data from the server or DUT/SUT.  The benchmarking
973	      tests HTTP Transaction Latency (Section 7.4) and HTTPS Transaction
974	      Latency (Section 7.8) measure the minimum, average and maximum
975	      TTFB.  The value SHOULD be expressed in milliseconds.

977	   *  URL Response time / Time to Last Byte (TTLB)

979	      URL Response time / TTLB is the elapsed time between the start of
980	      sending the TCP SYN packet from the client and the client
981	      receiving the last packet of application data from the server or
982	      DUT/SUT.  The benchmarking tests HTTP Transaction Latency
983	      (Section 7.4) and HTTPS Transaction Latency (Section 7.8) measure
984	      the minimum, average and maximum TTLB.  The value SHOULD be
985	      expressed in millisecond.
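
[comment] For readers who, like me, had to think about where the clocks
start: both timers start with the client's connection attempt (the TCP
SYN). A rough single-connection sketch over plain HTTP/1.1 (host, port
and request are placeholders of mine):

    import socket
    import time

    def measure_ttfb_ttlb(host, port, request_bytes):
        # Both timers start when the client begins connecting, i.e.
        # roughly when the TCP SYN is sent, as defined above.
        start = time.monotonic()
        ttfb = None
        with socket.create_connection((host, port)) as s:
            s.sendall(request_bytes)
            while chunk := s.recv(65536):
                if ttfb is None:
                    ttfb = time.monotonic() - start  # first data back
            ttlb = time.monotonic() - start          # last data back
        return ttfb * 1000.0, ttlb * 1000.0          # milliseconds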

[minor] Up to this point I don't think the report would include a
comparison of these KPIs between no-DUT-present vs. DUT-present. Is that
true? How then is the reader of the report meant to be able to vet the
relative impact of the DUT for all these metrics vs. the DUT not being
present?

987	7.  Benchmarking Tests

[minor] I think it would be good to insert here some descriptive and
comparative overview of the tests from the different 7.x sections.

For example, I guess (but don't know from the text) that the 7.1 test
should(?) perform a throughput test for non-HTTP/HTTPS applications, or
else, if all the applications in 7.1 were HTTP/HTTPS, it would duplicate
the results of 7.3 and 7.7, right? Not sure though if/where it is written
out that you therefore want a traffic mix of only non-HTTP/HTTPS
application traffic for 7.1.

If instead the customer-relevant application mix (7.1.1) does include
some percentage of HTTP/HTTPS applications, then shouldn't all the tests,
even those focusing on the HTTP/HTTPS characteristics, also always
include the non-HTTP/HTTPS application flows as a kind of "background"
traffic, even if not measured in the tests of a particular 7.x
sub-section?

[minor] Section 7 is a lot of work to get right. I observe that there is
a lot of procedural replication across the steps. It would be easier to
read if all that duplication was removed and described once - such as the
initial/max/iterative step description. But I can understand how much
work this might be, to then especially extract only the differences for
each 7.x and describe only those 7.x differences there.

989	7.1.  Throughput Performance with Application Traffic Mix

991	7.1.1.  Objective

993	   Using a relevant application traffic mix, determine the sustainable
994	   inspected throughput supported by the DUT/SUT.

996	   Based on the test customer's specific use case, testers can choose
997	   the relevant application traffic mix for this test.  The details
998	   about the traffic mix MUST be documented in the report.  At least the
999	   following traffic mix details MUST be documented and reported
1000	   together with the test results:

1002	      Name of applications and layer 7 protocols

1004	      Percentage of emulated traffic for each application and layer 7
1005	      protocol

1007	      Percentage of encrypted traffic and used cipher suites and keys
1008	      (The RECOMMENDED ciphers and keys are defined in Section 4.3.1.3.)

1010	      Used object sizes for each application and layer 7 protocols

1012	7.1.2.  Test Setup

1014	   Testbed setup MUST be configured as defined in Section 4.  Any
1015	   benchmarking test specific testbed configuration changes MUST be
1016	   documented.

1018	7.1.3.  Test Parameters

1020	   In this section, the benchmarking test specific parameters SHOULD be
1021	   defined.

1023	7.1.3.1.  DUT/SUT Configuration Parameters

1025	   DUT/SUT parameters MUST conform to the requirements defined in
1026	   Section 4.2.  Any configuration changes for this specific
1027	   benchmarking test MUST be documented.  In case the DUT/SUT is
1028	   configured without SSL inspection, the test report MUST explain the
1029	   implications of this to the relevant application traffic mix
1030	   encrypted traffic.

[nit] /SSL inspection/SSL Inspection/ - capitalized in all other places in the doc.

[minor] I am not quite familiar with the details, so I hope a reader
knows what the "MUST explain the implications" means.

[minor] What is the equivalent for TLS (inspection), and why is it not
equally mentioned?

1032	7.1.3.2.  Test Equipment Configuration Parameters

1034	   Test equipment configuration parameters MUST conform to the
1035	   requirements defined in Section 4.3.  The following parameters MUST
1036	   be documented for this benchmarking test:

1038	      Client IP address range defined in Section 4.3.1.2

1040	      Server IP address range defined in Section 4.3.2.2

1042	      Traffic distribution ratio between IPv4 and IPv6 defined in
1043	      Section 4.3.1.2

1045	      Target inspected throughput: Aggregated line rate of interface(s)
1046	      used in the DUT/SUT or the value defined based on requirement for
1047	      a specific deployment scenario

[minor] Maybe add: or based on DUT-specified performance limits (the DUT
may not always provide "line rate" throughput, so the ultimate test would
be to see if/how much of the vendor-promised performance is reachable).

1049	      Initial throughput: 10% of the "Target inspected throughput" Note:
1050	      Initial throughput is not a KPI to report.  This value is
1051	      configured on the traffic generator and used to perform Step 1:
1052	      "Test Initialization and Qualification" described under the
1053	      Section 7.1.4.

1055	      One of the ciphers and keys defined in Section 4.3.1.3 are
1056	      RECOMMENDED to use for this benchmarking test.

1058	7.1.3.3.  Traffic Profile

1060	   Traffic profile: This test MUST be run with a relevant application
1061	   traffic mix profile.

1063	7.1.3.4.  Test Results Validation Criteria

1065	   The following criteria are the test results validation criteria.  The
1066	   test results validation criteria MUST be monitored during the whole
1067	   sustain phase of the traffic load profile.

1069	   a.  Number of failed application transactions (receiving any HTTP
1070	       response code other than 200 OK) MUST be less than 0.001% (1 out
1071	       of 100,000 transactions) of total attempted transactions.

[minor] So this is the right number, as opposed to the 0.01% in A.4...
If you don't intend to fix A.4 (requested there), pls. explain the reason for the
difference.

1073	   b.  Number of Terminated TCP connections due to unexpected TCP RST
1074	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
1075	       connections) of total initiated TCP connections.
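
[comment] Both criteria reduce to the same strict-less-than check; note
that with the 0.001% limit a single failure already fails a run of fewer
than 100,000 attempts (my sketch):

    def meets_validation_criteria(failed_txns, total_txns,
                                  rst_conns, total_conns,
                                  limit=1e-5):  # 0.001%
        # Criterion a: failed transactions; criterion b: unexpected RSTs.
        return (failed_txns < limit * total_txns and
                rst_conns < limit * total_conns)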

1077	7.1.3.5.  Measurement

1079	   Following KPI metrics MUST be reported for this benchmarking test:

1081	   Mandatory KPIs (benchmarks): Inspected Throughput, TTFB (minimum,
1082	   average, and maximum), TTLB (minimum, average, and maximum) and
1083	   Application Transactions Per Second

1085	   Note: TTLB MUST be reported along with the object size used in the
1086	   traffic profile.

1088	   Optional KPIs: TCP Connections Per Second and TLS Handshake Rate

[minor] I would prefer for TCP Connections Per Second to be mandatory
too. It makes it easier to communicate test data with lower-layer folks.
For example, network-layer equipment often has per-5-tuple flow state,
also with build/churn-rate limits, so to match a security SUT with the
other networking equipment this TCP connection rate is quite important.

1090	7.1.4.  Test Procedures and Expected Results

1092	   The test procedures are designed to measure the inspected throughput
1093	   performance of the DUT/SUT at the sustaining period of traffic load
1094	   profile.  The test procedure consists of three major steps: Step 1
1095	   ensures the DUT/SUT is able to reach the performance value (initial
1096	   throughput) and meets the test results validation criteria when it
1097	   was very minimally utilized.  Step 2 determines the DUT/SUT is able
1098	   to reach the target performance value within the test results
1099	   validation criteria.  Step 3 determines the maximum achievable
1100	   performance value within the test results validation criteria.

1102	   This test procedure MAY be repeated multiple times with different IP
1103	   types: IPv4 only, IPv6 only, and IPv4 and IPv6 mixed traffic
1104	   distribution.

1106	7.1.4.1.  Step 1: Test Initialization and Qualification

1108	   Verify the link status of all connected physical interfaces.  All
1109	   interfaces are expected to be in "UP" status.

1111	   Configure traffic load profile of the test equipment to generate test
1112	   traffic at the "Initial throughput" rate as described in
1113	   Section 7.1.3.2.  The test equipment SHOULD follow the traffic load
1114	   profile definition as described in Section 4.3.4.  The DUT/SUT SHOULD
1115	   reach the "Initial throughput" during the sustain phase.  Measure all
1116	   KPI as defined in Section 7.1.3.5.  The measured KPIs during the
1117	   sustain phase MUST meet all the test results validation criteria
1118	   defined in Section 7.1.3.4.

1120	   If the KPI metrics do not meet the test results validation criteria,
1121	   the test procedure MUST NOT be continued to step 2.

1123	7.1.4.2.  Step 2: Test Run with Target Objective

1125	   Configure test equipment to generate traffic at the "Target inspected
1126	   throughput" rate defined in Section 7.1.3.2.  The test equipment
1127	   SHOULD follow the traffic load profile definition as described in
1128	   Section 4.3.4.  The test equipment SHOULD start to measure and record
1129	   all specified KPIs.  Continue the test until all traffic profile
1130	   phases are completed.

1132	   Within the test results validation criteria, the DUT/SUT is expected
1133	   to reach the desired value of the target objective ("Target inspected
1134	   throughput") in the sustain phase.  Follow step 3, if the measured
1135	   value does not meet the target value or does not fulfill the test
1136	   results validation criteria.

1138	7.1.4.3.  Step 3: Test Iteration

1140	   Determine the achievable average inspected throughput within the test
1141	   results validation criteria.  Final test iteration MUST be performed
1142	   for the test duration defined in Section 4.3.4.
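
[comment] The draft leaves the step-3 search strategy open. One obvious
realization is a bounded binary search between the last passing and the
first failing rate; this is my sketch, where run_trial() would execute a
full traffic load profile and check the validation criteria:

    def find_max_inspected_throughput(run_trial, low_pass, high_fail,
                                      resolution=0.02):
        # run_trial(rate) -> True iff all validation criteria were met.
        while (high_fail - low_pass) / high_fail > resolution:
            mid = (low_pass + high_fail) / 2.0
            if run_trial(mid):
                low_pass = mid
            else:
                high_fail = mid
        # Re-run the final iteration at low_pass for the full duration
        # defined in Section 4.3.4.
        return low_pass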

1144	7.2.  TCP/HTTP Connections Per Second

1146	7.2.1.  Objective

1148	   Using HTTP traffic, determine the sustainable TCP connection
1149	   establishment rate supported by the DUT/SUT under different
1150	   throughput load conditions.

1152	   To measure connections per second, test iterations MUST use different
1153	   fixed HTTP response object sizes (the different load conditions)
1154	   defined in Section 7.2.3.2.

1156	7.2.2.  Test Setup

1158	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1159	   specific testbed configuration changes (number of interfaces and
1160	   interface type, etc.)  MUST be documented.

1162	7.2.3.  Test Parameters

1164	   In this section, benchmarking test specific parameters SHOULD be
1165	   defined.

1167	7.2.3.1.  DUT/SUT Configuration Parameters

1169	   DUT/SUT parameters MUST conform to the requirements defined in
1170	   Section 4.2.  Any configuration changes for this specific
1171	   benchmarking test MUST be documented.

1173	7.2.3.2.  Test Equipment Configuration Parameters

1175	   Test equipment configuration parameters MUST conform to the
1176	   requirements defined in Section 4.3.  The following parameters MUST
1177	   be documented for this benchmarking test:

1179	   Client IP address range defined in Section 4.3.1.2

1181	   Server IP address range defined in Section 4.3.2.2

1183	   Traffic distribution ratio between IPv4 and IPv6 defined in
1184	   Section 4.3.1.2

1186	   Target connections per second: Initial value from product datasheet
1187	   or the value defined based on requirement for a specific deployment
1188	   scenario

1190	   Initial connections per second: 10% of "Target connections per
1191	   second" (Note: Initial connections per second is not a KPI to report.
1192	   This value is configured on the traffic generator and used to perform
1193	   the Step1: "Test Initialization and Qualification" described under
1194	   the Section 7.2.4.

1196	   The client SHOULD negotiate HTTP and close the connection with FIN
1197	   immediately after completion of one transaction.  In each test
1198	   iteration, client MUST send GET request requesting a fixed HTTP
1199	   response object size.

1201	   The RECOMMENDED response object sizes are 1, 2, 4, 16, and 64 KByte.

1203	7.2.3.3.  Test Results Validation Criteria

1205	   The following criteria are the test results validation criteria.  The
1206	   Test results validation criteria MUST be monitored during the whole
1207	   sustain phase of the traffic load profile.

1209	   a.  Number of failed application transactions (receiving any HTTP
1210	       response code other than 200 OK) MUST be less than 0.001% (1 out
1211	       of 100,000 transactions) of total attempted transactions.

1213	   b.  Number of terminated TCP connections due to unexpected TCP RST
1214	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
1215	       connections) of total initiated TCP connections.

1217	   c.  During the sustain phase, traffic SHOULD be forwarded at a
1218	       constant rate (considered as a constant rate if any deviation of
1219	       traffic forwarding rate is less than 5%).

1221	   d.  Concurrent TCP connections MUST be constant during steady state
1222	       and any deviation of concurrent TCP connections SHOULD be less
1223	       than 10%. This confirms the DUT opens and closes TCP connections
1224	       at approximately the same rate.

1226	7.2.3.4.  Measurement

1228	   TCP Connections Per Second MUST be reported for each test iteration
1229	   (for each object size).

[minor] Add variance or min/max rates to the report in case the problem
in point d above (line 1221) does exist?
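
Something along these lines (my sketch of the deviation check behind
point d, over the sustain-phase samples):

    def max_deviation_pct(samples):
        # Largest deviation of concurrent-connection samples from their
        # sustain-phase mean; criterion d wants this below 10%.
        mean = sum(samples) / len(samples)
        return max(abs(x - mean) for x in samples) / mean * 100.0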

1231	7.2.4.  Test Procedures and Expected Results

1233	   The test procedure is designed to measure the TCP connections per
1234	   second rate of the DUT/SUT at the sustaining period of the traffic
1235	   load profile.  The test procedure consists of three major steps: Step
1236	   1 ensures the DUT/SUT is able to reach the performance value (Initial
1237	   connections per second) and meets the test results validation
1238	   criteria when it was very minimally utilized.  Step 2 determines the
1239	   DUT/SUT is able to reach the target performance value within the test
1240	   results validation criteria.  Step 3 determines the maximum
1241	   achievable performance value within the test results validation
1242	   criteria.

1244	   This test procedure MAY be repeated multiple times with different IP
1245	   types: IPv4 only, IPv6 only, and IPv4 and IPv6 mixed traffic
1246	   distribution.

1248	7.2.4.1.  Step 1: Test Initialization and Qualification

1250	   Verify the link status of all connected physical interfaces.  All
1251	   interfaces are expected to be in "UP" status.

1253	   Configure the traffic load profile of the test equipment to establish
1254	   "Initial connections per second" as defined in Section 7.2.3.2.  The
1255	   traffic load profile SHOULD be defined as described in Section 4.3.4.

1257	   The DUT/SUT SHOULD reach the "Initial connections per second" before
1258	   the sustain phase.  The measured KPIs during the sustain phase MUST
1259	   meet all the test results validation criteria defined in
1260	   Section 7.2.3.3.

1262	   If the KPI metrics do not meet the test results validation criteria,
1263	   the test procedure MUST NOT continue to "Step 2".

1265	7.2.4.2.  Step 2: Test Run with Target Objective

1267	   Configure test equipment to establish the target objective ("Target
1268	   connections per second") defined in Section 7.2.3.2.  The test
1269	   equipment SHOULD follow the traffic load profile definition as
1270	   described in Section 4.3.4.

1272	   During the ramp up and sustain phase of each test iteration, other
1273	   KPIs such as inspected throughput, concurrent TCP connections and
1274	   application transactions per second MUST NOT reach the maximum value
1275	   the DUT/SUT can support.  The test results for specific test
1276	   iterations SHOULD NOT be reported, if the above-mentioned KPI
1277	   (especially inspected throughput) reaches the maximum value.
1278	   (Example: If the test iteration with 64 KByte of HTTP response object
1279	   size reached the maximum inspected throughput limitation of the DUT/
1280	   SUT, the test iteration MAY be interrupted and the result for 64
1281	   KByte SHOULD NOT be reported.)

1283	   The test equipment SHOULD start to measure and record all specified
1284	   KPIs.  Continue the test until all traffic profile phases are
1285	   completed.

1287	   Within the test results validation criteria, the DUT/SUT is expected
1288	   to reach the desired value of the target objective ("Target
1289	   connections per second") in the sustain phase.  Follow step 3, if the
1290	   measured value does not meet the target value or does not fulfill the
1291	   test results validation criteria.

1293	7.2.4.3.  Step 3: Test Iteration

1295	   Determine the achievable TCP connections per second within the test
1296	   results validation criteria.

1298	7.3.  HTTP Throughput

1300	7.3.1.  Objective

1302	   Determine the sustainable inspected throughput of the DUT/SUT for
1303	   HTTP transactions varying the HTTP response object size.

[nit] At a high level, what is the difference between 7.2 and 7.3? Some
more explanation would be useful. One interpretation I came up with is
that 7.2 measures the performance of, e.g., HTTP connections where each
connection performs a single GET, and 7.3 measures long-lived HTTP
connections in which a high rate of HTTP GETs is performed (so as to
differentiate transactions at the TCP+HTTP level (7.2) from those only
happening at the HTTP level (7.3)). If that is a lucky guess, writing it
out more explicitly might help other similarly guessing readers.

1305	7.3.2.  Test Setup

1307	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1308	   specific testbed configuration changes (number of interfaces and
1309	   interface type, etc.)  MUST be documented.

1311	7.3.3.  Test Parameters

1313	   In this section, benchmarking test specific parameters SHOULD be
1314	   defined.

1316	7.3.3.1.  DUT/SUT Configuration Parameters

1318	   DUT/SUT parameters MUST conform to the requirements defined in
1319	   Section 4.2.  Any configuration changes for this specific
1320	   benchmarking test MUST be documented.

1322	7.3.3.2.  Test Equipment Configuration Parameters

1324	   Test equipment configuration parameters MUST conform to the
1325	   requirements defined in Section 4.3.  The following parameters MUST
1326	   be documented for this benchmarking test:

1328	   Client IP address range defined in Section 4.3.1.2

1330	   Server IP address range defined in Section 4.3.2.2

1332	   Traffic distribution ratio between IPv4 and IPv6 defined in
1333	   Section 4.3.1.2

1335	   Target inspected throughput: Aggregated line rate of interface(s)
1336	   used in the DUT/SUT or the value defined based on requirement for a
1337	   specific deployment scenario
1338	   Initial throughput: 10% of "Target inspected throughput" Note:
1339	   Initial throughput is not a KPI to report.  This value is configured
1340	   on the traffic generator and used to perform Step 1: "Test
1341	   Initialization and Qualification" described under Section 7.3.4.

1343	   Number of HTTP response object requests (transactions) per
1344	   connection: 10

1346	   RECOMMENDED HTTP response object size: 1, 16, 64, 256 KByte, and
1347	   mixed objects defined in Table 4.

1349	           +=====================+============================+
1350	           | Object size (KByte) | Number of requests/ Weight |
1351	           +=====================+============================+
1352	           | 0.2                 | 1                          |
1353	           +---------------------+----------------------------+
1354	           | 6                   | 1                          |
1355	           +---------------------+----------------------------+
1356	           | 8                   | 1                          |
1357	           +---------------------+----------------------------+
1358	           | 9                   | 1                          |
1359	           +---------------------+----------------------------+
1360	           | 10                  | 1                          |
1361	           +---------------------+----------------------------+
1362	           | 25                  | 1                          |
1363	           +---------------------+----------------------------+
1364	           | 26                  | 1                          |
1365	           +---------------------+----------------------------+
1366	           | 35                  | 1                          |
1367	           +---------------------+----------------------------+
1368	           | 59                  | 1                          |
1369	           +---------------------+----------------------------+
1370	           | 347                 | 1                          |
1371	           +---------------------+----------------------------+

1373	                          Table 4: Mixed Objects

[minor] Interesting/useful data. If there were any reference/explanation
of how these numbers were derived, that would be great to add.
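
For what it's worth, since all Table 4 weights are 1, the mix averages
out as follows (my arithmetic):

    sizes_kbyte = [0.2, 6, 8, 9, 10, 25, 26, 35, 59, 347]
    print(sum(sizes_kbyte) / len(sizes_kbyte))  # 52.52 KByte mean object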

1375	7.3.3.3.  Test Results Validation Criteria

1377	   The following criteria are the test results validation criteria.  The
1378	   test results validation criteria MUST be monitored during the whole
1379	   sustain phase of the traffic load profile.

1381	   a.  Number of failed application transactions (receiving any HTTP
1382	       response code other than 200 OK) MUST be less than 0.001% (1 out
1383	       of 100,000 transactions) of attempt transactions.

1385	   b.  Traffic SHOULD be forwarded at a constant rate (considered as a
1386	       constant rate if any deviation of traffic forwarding rate is less
1387	       than 5%).

1389	   c.  Concurrent TCP connections MUST be constant during steady state
1390	       and any deviation of concurrent TCP connections SHOULD be less
1391	       than 10%. This confirms the DUT opens and closes TCP connections
1392	       at approximately the same rate.

1394	7.3.3.4.  Measurement

1396	   Inspected Throughput and HTTP Transactions per Second MUST be
1397	   reported for each object size.

1399	7.3.4.  Test Procedures and Expected Results

1401	   The test procedure is designed to measure HTTP throughput of the DUT/
1402	   SUT.  The test procedure consists of three major steps: Step 1
1403	   ensures the DUT/SUT is able to reach the performance value (Initial
1404	   throughput) and meets the test results validation criteria when it
1405	   was very minimal utilized.  Step 2 determines the DUT/SUT is able to
1406	   reach the target performance value within the test results validation
1407	   criteria.  Step 3 determines the maximum achievable performance value
1408	   within the test results validation criteria.

1410	   This test procedure MAY be repeated multiple times with different
1411	   IPv4 and IPv6 traffic distribution and HTTP response object sizes.

1413	7.3.4.1.  Step 1: Test Initialization and Qualification

1415	   Verify the link status of all connected physical interfaces.  All
1416	   interfaces are expected to be in "UP" status.

1418	   Configure traffic load profile of the test equipment to establish
1419	   "Initial inspected throughput" as defined in Section 7.3.3.2.

1421	   The traffic load profile SHOULD be defined as described in
1422	   Section 4.3.4.  The DUT/SUT SHOULD reach the "Initial inspected
1423	   throughput" during the sustain phase.  Measure all KPI as defined in
1424	   Section 7.3.3.4.

1426	   The measured KPIs during the sustain phase MUST meet the test results
1427	   validation criteria "a" defined in Section 7.3.3.3.  The test results
1428	   validation criteria "b" and "c" are OPTIONAL for step 1.

1430	   If the KPI metrics do not meet the test results validation criteria,
1431	   the test procedure MUST NOT be continued to "Step 2".

1433	7.3.4.2.  Step 2: Test Run with Target Objective

1435	   Configure test equipment to establish the target objective ("Target
1436	   inspected throughput") defined in Section 7.3.3.2.  The test
1437	   equipment SHOULD start to measure and record all specified KPIs.
1438	   Continue the test until all traffic profile phases are completed.

1440	   Within the test results validation criteria, the DUT/SUT is expected
1441	   to reach the desired value of the target objective in the sustain
1442	   phase.  Follow step 3, if the measured value does not meet the target
1443	   value or does not fulfill the test results validation criteria.

1445	7.3.4.3.  Step 3: Test Iteration

1447	   Determine the achievable inspected throughput within the test results
1448	   validation criteria and measure the KPI metric Transactions per
1449	   Second.  Final test iteration MUST be performed for the test duration
1450	   defined in Section 4.3.4.

1452	7.4.  HTTP Transaction Latency

[nit] It would be nice to have explanatory text explaining why 7.4
requires different test runs, as opposed to just measuring the
transaction latency as part of 7.2 and 7.3. I have not tried to compare
the descriptions here in detail to figure out the differences in test
runs, but even if there are differences, why would transaction latency
not also be measured in 7.2 and 7.3 as a metric?

1454	7.4.1.  Objective

1456	   Using HTTP traffic, determine the HTTP transaction latency when DUT
1457	   is running with sustainable HTTP transactions per second supported by
1458	   the DUT/SUT under different HTTP response object sizes.

1460	   Test iterations MUST be performed with different HTTP response object
1461	   sizes in two different scenarios.  One with a single transaction and
1462	   the other with multiple transactions within a single TCP connection.
1463	   For consistency both the single and multiple transaction test MUST be
1464	   configured with the same HTTP version

1466	   Scenario 1: The client MUST negotiate HTTP and close the connection
1467	   with FIN immediately after completion of a single transaction (GET
1468	   and RESPONSE).

1470	   Scenario 2: The client MUST negotiate HTTP and close the connection
1471	   FIN immediately after completion of 10 transactions (GET and
1472	   RESPONSE) within a single TCP connection.

1474	7.4.2.  Test Setup

1476	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1477	   specific testbed configuration changes (number of interfaces and
1478	   interface type, etc.)  MUST be documented.

1480	7.4.3.  Test Parameters

1482	   In this section, benchmarking test specific parameters SHOULD be
1483	   defined.

1485	7.4.3.1.  DUT/SUT Configuration Parameters

1487	   DUT/SUT parameters MUST conform to the requirements defined in
1488	   Section 4.2.  Any configuration changes for this specific
1489	   benchmarking test MUST be documented.

1491	7.4.3.2.  Test Equipment Configuration Parameters

1493	   Test equipment configuration parameters MUST conform to the
1494	   requirements defined in Section 4.3.  The following parameters MUST
1495	   be documented for this benchmarking test:

1497	   Client IP address range defined in Section 4.3.1.2

1499	   Server IP address range defined in Section 4.3.2.2

1501	   Traffic distribution ratio between IPv4 and IPv6 defined in
1502	   Section 4.3.1.2

1504	   Target objective for scenario 1: 50% of the connections per second
1505	   measured in benchmarking test TCP/HTTP Connections Per Second
1506	   (Section 7.2)

1508	   Target objective for scenario 2: 50% of the inspected throughput
1509	   measured in benchmarking test HTTP Throughput (Section 7.3)

1511	   Initial objective for scenario 1: 10% of "Target objective for
1512	   scenario 1"

1514	   Initial objective for scenario 2: 10% of "Target objective for
1515	   scenario 2"

1517	   Note: The Initial objectives are not a KPI to report.  These values
1518	   are configured on the traffic generator and used to perform the
1519	   Step1: "Test Initialization and Qualification" described under the
1520	   Section 7.4.4.

1522	   HTTP transaction per TCP connection: Test scenario 1 with single
1523	   transaction and test scenario 2 with 10 transactions.

1525	   HTTP with GET request requesting a single object.  The RECOMMENDED
1526	   object sizes are 1, 16, and 64 KByte.  For each test iteration,
1527	   client MUST request a single HTTP response object size.

1529	7.4.3.3.  Test Results Validation Criteria

1531	   The following criteria are the test results validation criteria.  The
1532	   Test results validation criteria MUST be monitored during the whole
1533	   sustain phase of the traffic load profile.

1535	   a.  Number of failed application transactions (receiving any HTTP
1536	       response code other than 200 OK) MUST be less than 0.001% (1 out
1537	       of 100,000 transactions) of attempt transactions.

1539	   b.  Number of terminated TCP connections due to unexpected TCP RST
1540	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
1541	       connections) of total initiated TCP connections.

1543	   c.  During the sustain phase, traffic SHOULD be forwarded at a
1544	       constant rate (considered as a constant rate if any deviation of
1545	       traffic forwarding rate is less than 5%).

1547	   d.  Concurrent TCP connections MUST be constant during steady state
1548	       and any deviation of concurrent TCP connections SHOULD be less
1549	       than 10%. This confirms the DUT opens and closes TCP connections
1550	       at approximately the same rate.

1552	   e.  After ramp up the DUT MUST achieve the "Target objective" defined
1553	       in Section 7.4.3.2 and remain in that state for the entire test
1554	       duration (sustain phase).

1556	7.4.3.4.  Measurement

1558	   TTFB (minimum, average, and maximum) and TTLB (minimum, average and
1559	   maximum) MUST be reported for each object size.
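
[comment] For readers building their own tooling, a minimal single-connection
probe for these two KPIs could look like the sketch below (illustrative only;
real test equipment measures this at scale, and the draft's exact TTFB/TTLB
definitions apply):

    import socket, time

    def ttfb_ttlb(host, port, path="/"):
        # Time from connect() start; adjust to the draft's definition as needed.
        t0 = time.monotonic()
        s = socket.create_connection((host, port))
        s.sendall((f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"
                   "Connection: close\r\n\r\n").encode())
        first = None
        while True:
            chunk = s.recv(65536)
            if not chunk:
                break                    # server closed: last byte has been seen
            if first is None:
                first = time.monotonic()
        last = time.monotonic()
        s.close()
        return first - t0, last - t0     # (TTFB, TTLB) in seconds

Minimum, average, and maximum would then be aggregated per object size over many
such samples.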

1561	7.4.4.  Test Procedures and Expected Results

1563	   The test procedure is designed to measure TTFB or TTLB when the DUT/
1564	   SUT is operating close to 50% of its maximum achievable connections
1565	   per second or inspected throughput.  The test procedure consists of
1566	   two major steps: Step 1 ensures the DUT/SUT is able to reach the
1567	   initial performance values and meets the test results validation
1568	   criteria when it was very minimally utilized.  Step 2 measures the
1569	   latency values within the test results validation criteria.

1571	   This test procedure MAY be repeated multiple times with different IP
1572	   types (IPv4 only, IPv6 only and IPv4 and IPv6 mixed traffic
1573	   distribution), HTTP response object sizes and single and multiple
1574	   transactions per connection scenarios.

1576	7.4.4.1.  Step 1: Test Initialization and Qualification

1578	   Verify the link status of all connected physical interfaces.  All
1579	   interfaces are expected to be in "UP" status.

1581	   Configure traffic load profile of the test equipment to establish
1582	   "Initial objective" as defined in Section 7.4.3.2.  The traffic load
1583	   profile SHOULD be defined as described in Section 4.3.4.

1585	   The DUT/SUT SHOULD reach the "Initial objective" before the sustain
1586	   phase.  The measured KPIs during the sustain phase MUST meet all the
1587	   test results validation criteria defined in Section 7.4.3.3.

1589	   If the KPI metrics do not meet the test results validation criteria,
1590	   the test procedure MUST NOT be continued to "Step 2".

1592	7.4.4.2.  Step 2: Test Run with Target Objective

1594	   Configure test equipment to establish "Target objective" defined in
1595	   Section 7.4.3.2.  The test equipment SHOULD follow the traffic load
1596	   profile definition as described in Section 4.3.4.

1598	   The test equipment SHOULD start to measure and record all specified
1599	   KPIs.  Continue the test until all traffic profile phases are
1600	   completed.

1602	   Within the test results validation criteria, the DUT/SUT MUST reach
1603	   the desired value of the target objective in the sustain phase.

1605	   Measure the minimum, average, and maximum values of TTFB and TTLB.

1607	7.5.  Concurrent TCP/HTTP Connection Capacity

[nit] again a summary comparison of the traffic in 7.5 vs. the prior traffic profiles
would be helpful to understand the benefit of these test runs. Is this about any
real-world requirement or more a synthetic performance number for unrealistic HTTP
connections (which would still be a useful number IMHO, just want to know) ?

The traffic profile below is somewhat strange because it defines the rate of GET
requests within a TCP connection based not on real-world application behavior, but
just to create some rate of GET per TCP connection over the steady state.
I guess the goal is something like "measure the maximum sustainable number of TCP/HTTP
connections, whereas each connection carries as little traffic as possible and a
sufficiently low number of HTTP (GET) transactions that the DUT is loaded not so much
with HTTP level inspection, but mostly with HTTP/TCP flow maintenance" ??

In general, describing for each of the 7.x sections upfront the goal and design criteria
of the test runs in those high-level terms is IMHO very beneficial for reviewers to
vet if/how well the detailed description does meet the goals. Otherwise one is somewhat
left puzzling about that question. Aka: enhance the 7.x.1 objective sections with that
amount of detail.

1609	7.5.1.  Objective

1611	   Determine the number of concurrent TCP connections that the DUT/ SUT
1612	   sustains when using HTTP traffic.

1614	7.5.2.  Test Setup

1616	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1617	   specific testbed configuration changes (number of interfaces and
1618	   interface type, etc.)  MUST be documented.

1620	7.5.3.  Test Parameters

1622	   In this section, benchmarking test specific parameters SHOULD be
1623	   defined.

1625	7.5.3.1.  DUT/SUT Configuration Parameters

1627	   DUT/SUT parameters MUST conform to the requirements defined in
1628	   Section 4.2.  Any configuration changes for this specific
1629	   benchmarking test MUST be documented.

1631	7.5.3.2.  Test Equipment Configuration Parameters

1633	   Test equipment configuration parameters MUST conform to the
1634	   requirements defined in Section 4.3.  The following parameters MUST
1635	   be noted for this benchmarking test:

1637	      Client IP address range defined in Section 4.3.1.2

1639	      Server IP address range defined in Section 4.3.2.2

1641	      Traffic distribution ratio between IPv4 and IPv6 defined in
1642	      Section 4.3.1.2

1644	      Target concurrent connection: Initial value from product datasheet
1645	      or the value defined based on requirement for a specific
1646	      deployment scenario.

1648	      Initial concurrent connection: 10% of "Target concurrent
1649	      connection" Note: Initial concurrent connection is not a KPI to
1650	      report.  This value is configured on the traffic generator and
1651	      used to perform the Step1: "Test Initialization and Qualification"
1652	      described under the Section 7.5.4.

1654	      Maximum connections per second during ramp up phase: 50% of
1655	      maximum connections per second measured in benchmarking test TCP/
1656	      HTTP Connections per second (Section 7.2)

1658	      Ramp up time (in traffic load profile for "Target concurrent
1659	      connection"): "Target concurrent connection" / "Maximum
1660	      connections per second during ramp up phase"

1662	      Ramp up time (in traffic load profile for "Initial concurrent
1663	      connection"): "Initial concurrent connection" / "Maximum
1664	      connections per second during ramp up phase"

1666	   The client MUST negotiate HTTP and each client MAY open multiple
1667	   concurrent TCP connections per server endpoint IP.

1669	   Each client sends 10 GET requests requesting 1 KByte HTTP response
1670	   object in the same TCP connection (10 transactions/TCP connection)
1671	   and the delay (think time) between each transaction MUST be X
1672	   seconds.

1674	   X = ("Ramp up time" + "steady state time") /10

1676	   The established connections SHOULD remain open until the ramp down
1677	   phase of the test.  During the ramp down phase, all connections
1678	   SHOULD be successfully closed with FIN.
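
[comment] A worked example of the ramp up and think time arithmetic above
(illustrative sketch, hypothetical inputs):

    # Parameter derivation for Section 7.5.3.2; numbers are hypothetical.
    target_cc = 1_000_000         # Target concurrent connections
    ramp_cps = 50_000             # 50% of CPS measured in Section 7.2
    steady_state_time = 300.0     # seconds, from the traffic load profile

    ramp_up_time = target_cc / ramp_cps                     # 20 s in this example
    think_time = (ramp_up_time + steady_state_time) / 10    # X = 32 s

    # 10 transactions spaced X seconds apart keep every connection open
    # across ramp up plus steady state, as the text above intends.
    print(f"ramp up {ramp_up_time:.0f} s, think time X = {think_time:.1f} s")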

1680	7.5.3.3.  Test Results Validation Criteria

1682	   The following criteria are the test results validation criteria.  The
1683	   Test results validation criteria MUST be monitored during the whole
1684	   sustain phase of the traffic load profile.

1686	   a.  Number of failed application transactions (receiving any HTTP
1687	       response code other than 200 OK) MUST be less than 0.001% (1 out
1688	       of 100,000 transactions) of total attempted transactions.

1690	   b.  Number of terminated TCP connections due to unexpected TCP RST
1691	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
1692	       connections) of total initiated TCP connections.

1694	   c.  During the sustain phase, traffic SHOULD be forwarded at a
1695	       constant rate (considered as a constant rate if any deviation of
1696	       traffic forwarding rate is less than 5%).
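
[comment] The three criteria reduce to simple checks over sustain-phase counters,
e.g. (sketch; names hypothetical, deviation taken here against the sustain-phase
mean):

    def criteria_met(failed_txn, total_txn, rst_conns, total_conns, rate_samples):
        ok_a = failed_txn / total_txn < 1e-5     # a: < 0.001% failed transactions
        ok_b = rst_conns / total_conns < 1e-5    # b: < 0.001% unexpected RSTs
        mean = sum(rate_samples) / len(rate_samples)
        ok_c = all(abs(r - mean) / mean < 0.05 for r in rate_samples)  # c: < 5%
        return ok_a and ok_b and ok_c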

1698	7.5.3.4.  Measurement

1700	   Average Concurrent TCP Connections MUST be reported for this
1701	   benchmarking test.

1703	7.5.4.  Test Procedures and Expected Results

1705	   The test procedure is designed to measure the concurrent TCP
1706	   connection capacity of the DUT/SUT at the sustaining period of
1707	   traffic load profile.  The test procedure consists of three major
1708	   steps: Step 1 ensures the DUT/SUT is able to reach the performance
1709	   value (Initial concurrent connection) and meets the test results
1710	   validation criteria when it was very minimally utilized.  Step 2
1711	   determines the DUT/SUT is able to reach the target performance value
1712	   within the test results validation criteria.  Step 3 determines the
1713	   maximum achievable performance value within the test results
1714	   validation criteria.

1716	   This test procedure MAY be repeated multiple times with different
1717	   IPv4 and IPv6 traffic distribution.

1719	7.5.4.1.  Step 1: Test Initialization and Qualification

1721	   Verify the link status of all connected physical interfaces.  All
1722	   interfaces are expected to be in "UP" status.

1724	   Configure test equipment to establish "Initial concurrent TCP
1725	   connections" defined in Section 7.5.3.2.  Except ramp up time, the
1726	   traffic load profile SHOULD be defined as described in Section 4.3.4.

1728	   During the sustain phase, the DUT/SUT SHOULD reach the "Initial
1729	   concurrent TCP connections".  The measured KPIs during the sustain
1730	   phase MUST meet all the test results validation criteria defined in
1731	   Section 7.5.3.3.

1733	   If the KPI metrics do not meet the test results validation criteria,
1734	   the test procedure MUST NOT be continued to "Step 2".

1736	7.5.4.2.  Step 2: Test Run with Target Objective

1738	   Configure test equipment to establish the target objective ("Target
1739	   concurrent TCP connections").  The test equipment SHOULD follow the
1740	   traffic load profile definition (except ramp up time) as described in
1741	   Section 4.3.4.

1743	   During the ramp up and sustain phase, the other KPIs such as
1744	   inspected throughput, TCP connections per second, and application
1745	   transactions per second MUST NOT reach the maximum value the DUT/SUT
1746	   can support.

1748	   The test equipment SHOULD start to measure and record KPIs defined in
1749	   Section 7.5.3.4.  Continue the test until all traffic profile phases
1750	   are completed.

1752	   Within the test results validation criteria, the DUT/SUT is expected
1753	   to reach the desired value of the target objective in the sustain
1754	   phase.  Follow step 3, if the measured value does not meet the target
1755	   value or does not fulfill the test results validation criteria.

1757	7.5.4.3.  Step 3: Test Iteration

1759	   Determine the achievable concurrent TCP connections capacity within
1760	   the test results validation criteria.

1762	7.6.  TCP/HTTPS Connections per Second

[minor] The one big performance factor that I think is not documented or suggested
to be compared is the cost of certificate (chain) validation for different key-length
certificates used for the TCP/HTTPS connections. The parameters for TLS 1.2 and TLS 1.3
mentioned earlier in the document do not cover that.  I think it would be prudent
to figure out an Internet common minimum (fastest to process) certificate and a
common maximum complexity certificate. The latter one may simply be when revocation
is enabled, e.g.: checking the server certificate against a revocation list.

Just saying because server certificate verification may monopolise connection setup
performance - unless you want to make the argument that it is irrelevant because,
due to the limited number of servers in the test, the DUT is assumed/known to be able
to cache server certificate validation results during the ramp up phase, so it becomes
irrelevant during the steady state phase. But it would be at least good to describe this
in text.
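
As an illustrative client-side sketch of the kind of comparison I mean (the DUT's
validation cost will differ, but the variability is the point; file names are
hypothetical):

    import socket, ssl, time

    def handshake_time(host, port, check_crl=False):
        ctx = ssl.create_default_context(cafile="test-ca.pem")
        if check_crl:
            # CRLs in PEM form load the same way; verify_flags must be set.
            ctx.load_verify_locations("test-ca-crl.pem")
            ctx.verify_flags |= ssl.VERIFY_CRL_CHECK_LEAF
        t0 = time.monotonic()
        with socket.create_connection((host, port)) as raw:
            with ctx.wrap_socket(raw, server_hostname=host):
                pass                 # handshake completes inside wrap_socket
        return time.monotonic() - t0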


1763	7.6.1.  Objective

1765	   Using HTTPS traffic, determine the sustainable SSL/TLS session
1766	   establishment rate supported by the DUT/SUT under different
1767	   throughput load conditions.

1769	   Test iterations MUST include common cipher suites and key strengths
1770	   as well as forward looking stronger keys.  Specific test iterations
1771	   MUST include ciphers and keys defined in Section 7.6.3.2.

1773	   For each cipher suite and key strength, test iterations MUST use a
1774	   single HTTPS response object size defined in Section 7.6.3.2 to
1775	   measure connections per second performance under a variety of DUT/SUT
1776	   security inspection load conditions.

1778	7.6.2.  Test Setup

1780	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1781	   specific testbed configuration changes (number of interfaces and
1782	   interface type, etc.)  MUST be documented.

1784	7.6.3.  Test Parameters

1786	   In this section, benchmarking test specific parameters SHOULD be
1787	   defined.

1789	7.6.3.1.  DUT/SUT Configuration Parameters

1791	   DUT/SUT parameters MUST conform to the requirements defined in
1792	   Section 4.2.  Any configuration changes for this specific
1793	   benchmarking test MUST be documented.

1795	7.6.3.2.  Test Equipment Configuration Parameters

1797	   Test equipment configuration parameters MUST conform to the
1798	   requirements defined in Section 4.3.  The following parameters MUST
1799	   be documented for this benchmarking test:

1801	   Client IP address range defined in Section 4.3.1.2

1803	   Server IP address range defined in Section 4.3.2.2

1805	   Traffic distribution ratio between IPv4 and IPv6 defined in
1806	   Section 4.3.1.2

1808	   Target connections per second: Initial value from product datasheet
1809	   or the value defined based on requirement for a specific deployment
1810	   scenario.

1812	   Initial connections per second: 10% of "Target connections per
1813	   second" Note: Initial connections per second is not a KPI to report.
1814	   This value is configured on the traffic generator and used to perform
1815	   the Step1: "Test Initialization and Qualification" described under
1816	   the Section 7.6.4.

1818	   RECOMMENDED ciphers and keys defined in Section 4.3.1.3

1820	   The client MUST negotiate HTTPS and close the connection with FIN
1821	   immediately after completion of one transaction.  In each test
1822	   iteration, client MUST send GET request requesting a fixed HTTPS
1823	   response object size.  The RECOMMENDED object sizes are 1, 2, 4, 16,
1824	   and 64 KByte.
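
[comment] One client iteration of this test amounts to: TCP connect, TLS
handshake, one GET, drain the response, close with FIN. A minimal sketch
(illustrative only; verification is disabled because a lab DUT with test
certificates is assumed):

    import socket, ssl, time

    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE          # lab test certificates assumed

    def one_https_connection(host, port, path="/1kb.bin"):
        t0 = time.monotonic()
        with socket.create_connection((host, port)) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                hs_done = time.monotonic()   # input to the TLS Handshake Rate KPI
                tls.sendall((f"GET {path} HTTP/1.1\r\nHost: {host}\r\n"
                             "Connection: close\r\n\r\n").encode())
                while tls.recv(65536):
                    pass                     # drain the fixed-size response
        return hs_done - t0                  # per-connection setup time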

1826	7.6.3.3.  Test Results Validation Criteria

1828	   The following criteria are the test results validation criteria.  The
1829	   test results validation criteria MUST be monitored during the whole
1830	   test duration.

1832	   a.  Number of failed application transactions (receiving any HTTP
1833	       response code other than 200 OK) MUST be less than 0.001% (1 out
1834	       of 100,000 transactions) of attempted transactions.

1836	   b.  Number of terminated TCP connections due to unexpected TCP RST
1837	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
1838	       connections) of total initiated TCP connections.

1840	   c.  During the sustain phase, traffic SHOULD be forwarded at a
1841	       constant rate (considered as a constant rate if any deviation of
1842	       traffic forwarding rate is less than 5%).

1844	   d.  Concurrent TCP connections MUST be constant during steady state
1845	       and any deviation of concurrent TCP connections SHOULD be less
1846	       than 10%. This confirms the DUT opens and closes TCP connections
1847	       at approximately the same rate.

1849	7.6.3.4.  Measurement

1851	   TCP connections per second MUST be reported for each test iteration
1852	   (for each object size).

1854	   The KPI metric TLS Handshake Rate can be measured in the test using 1
1855	   KByte object size.

1857	7.6.4.  Test Procedures and Expected Results

1859	   The test procedure is designed to measure the TCP connections per
1860	   second rate of the DUT/SUT at the sustaining period of traffic load
1861	   profile.  The test procedure consists of three major steps: Step 1
1862	   ensures the DUT/SUT is able to reach the performance value (Initial
1863	   connections per second) and meets the test results validation
1864	   criteria when it was very minimally utilized.  Step 2 determines the
1865	   DUT/SUT is able to reach the target performance value within the test
1866	   results validation criteria.  Step 3 determines the maximum
1867	   achievable performance value within the test results validation
1868	   criteria.

1870	   This test procedure MAY be repeated multiple times with different
1871	   IPv4 and IPv6 traffic distribution.

1873	7.6.4.1.  Step 1: Test Initialization and Qualification

1875	   Verify the link status of all connected physical interfaces.  All
1876	   interfaces are expected to be in "UP" status.

1878	   Configure traffic load profile of the test equipment to establish
1879	   "Initial connections per second" as defined in Section 7.6.3.2.  The
1880	   traffic load profile SHOULD be defined as described in Section 4.3.4.

1882	   The DUT/SUT SHOULD reach the "Initial connections per second" before
1883	   the sustain phase.  The measured KPIs during the sustain phase MUST
1884	   meet all the test results validation criteria defined in
1885	   Section 7.6.3.3.

1887	   If the KPI metrics do not meet the test results validation criteria,
1888	   the test procedure MUST NOT be continued to "Step 2".

1890	7.6.4.2.  Step 2: Test Run with Target Objective

1892	   Configure test equipment to establish "Target connections per second"
1893	   defined in Section 7.6.3.2.  The test equipment SHOULD follow the
1894	   traffic load profile definition as described in Section 4.3.4.

1896	   During the ramp up and sustain phase, other KPIs such as inspected
1897	   throughput, concurrent TCP connections, and application transactions
1898	   per second MUST NOT reach the maximum value the DUT/SUT can support.
1899	   The test results for specific test iteration SHOULD NOT be reported,
1900	   if the above mentioned KPI (especially inspected throughput) reaches
1901	   the maximum value.  (Example: If the test iteration with 64 KByte of
1902	   HTTPS response object size reached the maximum inspected throughput
1903	   limitation of the DUT, the test iteration MAY be interrupted and the
1904	   result for 64 KByte SHOULD NOT be reported).

1906	   The test equipment SHOULD start to measure and record all specified
1907	   KPIs.  Continue the test until all traffic profile phases are
1908	   completed.

1910	   Within the test results validation criteria, the DUT/SUT is expected
1911	   to reach the desired value of the target objective ("Target
1912	   connections per second") in the sustain phase.  Follow step 3, if the
1913	   measured value does not meet the target value or does not fulfill the
1914	   test results validation criteria.

1916	7.6.4.3.  Step 3: Test Iteration

1918	   Determine the achievable connections per second within the test
1919	   results validation criteria.
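
[comment] The draft leaves the Step 3 search strategy open; a binary search
between the last passing and first failing load is one plausible approach
(sketch; run_test() is a hypothetical harness hook that returns True when all
validation criteria are met):

    def step3_max_achievable(passing, failing, run_test, resolution=0.01):
        # passing: highest load known to pass; failing: load that failed Step 2
        while (failing - passing) / failing > resolution:
            mid = (passing + failing) / 2
            if run_test(mid):
                passing = mid
            else:
                failing = mid
        return passing       # maximum load meeting the validation criteria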

1921	7.7.  HTTPS Throughput

1923	7.7.1.  Objective

1925	   Determine the sustainable inspected throughput of the DUT/SUT for
1926	   HTTPS transactions varying the HTTPS response object size.

1928	   Test iterations MUST include common cipher suites and key strengths
1929	   as well as forward looking stronger keys.  Specific test iterations
1930	   MUST include the ciphers and keys defined in Section 7.7.3.2.

1932	7.7.2.  Test Setup

1934	   Testbed setup SHOULD be configured as defined in Section 4.  Any
1935	   specific testbed configuration changes (number of interfaces and
1936	   interface type, etc.)  MUST be documented.

1938	7.7.3.  Test Parameters

1940	   In this section, benchmarking test specific parameters SHOULD be
1941	   defined.

1943	7.7.3.1.  DUT/SUT Configuration Parameters

1945	   DUT/SUT parameters MUST conform to the requirements defined in
1946	   Section 4.2.  Any configuration changes for this specific
1947	   benchmarking test MUST be documented.

1949	7.7.3.2.  Test Equipment Configuration Parameters

1951	   Test equipment configuration parameters MUST conform to the
1952	   requirements defined in Section 4.3.  The following parameters MUST
1953	   be documented for this benchmarking test:

1955	   Client IP address range defined in Section 4.3.1.2

1957	   Server IP address range defined in Section 4.3.2.2

1959	   Traffic distribution ratio between IPv4 and IPv6 defined in
1960	   Section 4.3.1.2

1962	   Target inspected throughput: Aggregated line rate of interface(s)
1963	   used in the DUT/SUT or the value defined based on requirement for a
1964	   specific deployment scenario.

1966	   Initial throughput: 10% of "Target inspected throughput" Note:
1967	   Initial throughput is not a KPI to report.  This value is configured
1968	   on the traffic generator and used to perform the Step1: "Test
1969	   Initialization and Qualification" described under the Section 7.7.4.

1971	   Number of HTTPS response object requests (transactions) per
1972	   connection: 10

1974	   RECOMMENDED ciphers and keys defined in Section 4.3.1.3

1976	   RECOMMENDED HTTPS response object size: 1, 16, 64, 256 KByte, and
1977	   mixed objects defined in Table 4 under Section 7.3.3.2.

1979	7.7.3.3.  Test Results Validation Criteria

1981	   The following criteria are the test results validation criteria.  The
1982	   test results validation criteria MUST be monitored during the whole
1983	   sustain phase of the traffic load profile.

1985	   a.  Number of failed application transactions (receiving any HTTP
1986	       response code other than 200 OK) MUST be less than 0.001% (1 out
1987	       of 100,000 transactions) of attempted transactions.

1989	   b.  Traffic SHOULD be forwarded at a constant rate (considered as a
1990	       constant rate if any deviation of traffic forwarding rate is less
1991	       than 5%).

1993	   c.  Concurrent TCP connections MUST be constant during steady state
1994	       and any deviation of concurrent TCP connections SHOULD be less
1995	       than 10%. This confirms the DUT opens and closes TCP connections
1996	       at approximately the same rate.

1998	7.7.3.4.  Measurement

2000	   Inspected Throughput and HTTP Transactions per Second MUST be
2001	   reported for each object size.

2003	7.7.4.  Test Procedures and Expected Results

2005	   The test procedure consists of three major steps: Step 1 ensures the
2006	   DUT/SUT is able to reach the performance value (Initial throughput)
2007	   and meets the test results validation criteria when it was very
2008	   minimally utilized.  Step 2 determines the DUT/SUT is able to reach
2009	   the target performance value within the test results validation
2010	   criteria.  Step 3 determines the maximum achievable performance value
2011	   within the test results validation criteria.

2013	   This test procedure MAY be repeated multiple times with different
2014	   IPv4 and IPv6 traffic distribution and HTTPS response object sizes.

2016	7.7.4.1.  Step 1: Test Initialization and Qualification

2018	   Verify the link status of all connected physical interfaces.  All
2019	   interfaces are expected to be in "UP" status.

2021	   Configure traffic load profile of the test equipment to establish
2022	   "Initial throughput" as defined in Section 7.7.3.2.

2024	   The traffic load profile SHOULD be defined as described in
2025	   Section 4.3.4.  The DUT/SUT SHOULD reach the "Initial throughput"
2026	   during the sustain phase.  Measure all KPI as defined in
2027	   Section 7.7.3.4.

2029	   The measured KPIs during the sustain phase MUST meet the test results
2030	   validation criteria "a" defined in Section 7.7.3.3.  The test results
2031	   validation criteria "b" and "c" are OPTIONAL for step 1.

2033	   If the KPI metrics do not meet the test results validation criteria,
2034	   the test procedure MUST NOT be continued to "Step 2".

2036	7.7.4.2.  Step 2: Test Run with Target Objective

2038	   Configure test equipment to establish the target objective ("Target
2039	   inspected throughput") defined in Section 7.7.3.2.  The test
2040	   equipment SHOULD start to measure and record all specified KPIs.
2041	   Continue the test until all traffic profile phases are completed.

2043	   Within the test results validation criteria, the DUT/SUT is expected
2044	   to reach the desired value of the target objective in the sustain
2045	   phase.  Follow step 3, if the measured value does not meet the target
2046	   value or does not fulfill the test results validation criteria.

2048	7.7.4.3.  Step 3: Test Iteration

2050	   Determine the achievable average inspected throughput within the test
2051	   results validation criteria.  Final test iteration MUST be performed
2052	   for the test duration defined in Section 4.3.4.

2054	7.8.  HTTPS Transaction Latency

2056	7.8.1.  Objective

2058	   Using HTTPS traffic, determine the HTTPS transaction latency when
2059	   DUT/SUT is running with sustainable HTTPS transactions per second
2060	   supported by the DUT/SUT under different HTTPS response object size.

2062	   Scenario 1: The client MUST negotiate HTTPS and close the connection
2063	   with FIN immediately after completion of a single transaction (GET
2064	   and RESPONSE).

2066	   Scenario 2: The client MUST negotiate HTTPS and close the connection
2067	   with FIN immediately after completion of 10 transactions (GET and
2068	   RESPONSE) within a single TCP connection.

2070	7.8.2.  Test Setup

2072	   Testbed setup SHOULD be configured as defined in Section 4.  Any
2073	   specific testbed configuration changes (number of interfaces and
2074	   interface type, etc.)  MUST be documented.

2076	7.8.3.  Test Parameters

2078	   In this section, benchmarking test specific parameters SHOULD be
2079	   defined.

2081	7.8.3.1.  DUT/SUT Configuration Parameters

2083	   DUT/SUT parameters MUST conform to the requirements defined in
2084	   Section 4.2.  Any configuration changes for this specific
2085	   benchmarking test MUST be documented.

2087	7.8.3.2.  Test Equipment Configuration Parameters

2089	   Test equipment configuration parameters MUST conform to the
2090	   requirements defined in Section 4.3.  The following parameters MUST
2091	   be documented for this benchmarking test:

2093	   Client IP address range defined in Section 4.3.1.2

2095	   Server IP address range defined in Section 4.3.2.2
2096	   Traffic distribution ratio between IPv4 and IPv6 defined in
2097	   Section 4.3.1.2

2099	   RECOMMENDED cipher suites and key sizes defined in Section 4.3.1.3

2101	   Target objective for scenario 1: 50% of the connections per second
2102	   measured in benchmarking test TCP/HTTPS Connections per second
2103	   (Section 7.6)

2105	   Target objective for scenario 2: 50% of the inspected throughput
2106	   measured in benchmarking test HTTPS Throughput (Section 7.7)

2108	   Initial objective for scenario 1: 10% of "Target objective for
2109	   scenario 1"

2111	   Initial objective for scenario 2: 10% of "Target objective for
2112	   scenario 2"

2114	   Note: The Initial objectives are not a KPI to report.  These values
2115	   are configured on the traffic generator and used to perform the
2116	   Step1: "Test Initialization and Qualification" described under the
2117	   Section 7.8.4.

2119	   HTTPS transaction per TCP connection: Test scenario 1 with single
2120	   transaction and scenario 2 with 10 transactions

2122	   HTTPS with GET request requesting a single object.  The RECOMMENDED
2123	   object sizes are 1, 16, and 64 KByte.  For each test iteration,
2124	   client MUST request a single HTTPS response object size.

2126	7.8.3.3.  Test Results Validation Criteria

2128	   The following criteria are the test results validation criteria.  The
2129	   Test results validation criteria MUST be monitored during the whole
2130	   sustain phase of the traffic load profile.

2132	   a.  Number of failed application transactions (receiving any HTTP
2133	       response code other than 200 OK) MUST be less than 0.001% (1 out
2134	       of 100,000 transactions) of attempted transactions.

2136	   b.  Number of terminated TCP connections due to unexpected TCP RST
2137	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
2138	       connections) of total initiated TCP connections.

2140	   c.  During the sustain phase, traffic SHOULD be forwarded at a
2141	       constant rate (considered as a constant rate if any deviation of
2142	       traffic forwarding rate is less than 5%).

2144	   d.  Concurrent TCP connections MUST be constant during steady state
2145	       and any deviation of concurrent TCP connections SHOULD be less
2146	       than 10%. This confirms the DUT opens and closes TCP connections
2147	       at approximately the same rate.

2149	   e.  After ramp up the DUT/SUT MUST achieve the "Target objective"
2150	       defined in the parameter Section 7.8.3.2 and remain in that state
2151	       for the entire test duration (sustain phase).

2153	7.8.3.4.  Measurement

2155	   TTFB (minimum, average, and maximum) and TTLB (minimum, average and
2156	   maximum) MUST be reported for each object size.

2158	7.8.4.  Test Procedures and Expected Results

2160	   The test procedure is designed to measure TTFB or TTLB when the DUT/
2161	   SUT is operating close to 50% of its maximum achievable connections
2162	   per second or inspected throughput.  The test procedure consists of
2163	   two major steps: Step 1 ensures the DUT/SUT is able to reach the
2164	   initial performance values and meets the test results validation
2165	   criteria when it was very minimally utilized.  Step 2 measures the
2166	   latency values within the test results validation criteria.

2168	   This test procedure MAY be repeated multiple times with different IP
2169	   types (IPv4 only, IPv6 only and IPv4 and IPv6 mixed traffic
2170	   distribution), HTTPS response object sizes and single, and multiple
2171	   transactions per connection scenarios.

2173	7.8.4.1.  Step 1: Test Initialization and Qualification

2175	   Verify the link status of all connected physical interfaces.  All
2176	   interfaces are expected to be in "UP" status.

2178	   Configure traffic load profile of the test equipment to establish
2179	   "Initial objective" as defined in the Section 7.8.3.2.  The traffic
2180	   load profile SHOULD be defined as described in Section 4.3.4.

2182	   The DUT/SUT SHOULD reach the "Initial objective" before the sustain
2183	   phase.  The measured KPIs during the sustain phase MUST meet all the
2184	   test results validation criteria defined in Section 7.8.3.3.

2186	   If the KPI metrics do not meet the test results validation criteria,
2187	   the test procedure MUST NOT be continued to "Step 2".

2189	7.8.4.2.  Step 2: Test Run with Target Objective

2191	   Configure test equipment to establish "Target objective" defined in
2192	   Section 7.8.3.2.  The test equipment SHOULD follow the traffic load
2193	   profile definition as described in Section 4.3.4.

2195	   The test equipment SHOULD start to measure and record all specified
2196	   KPIs.  Continue the test until all traffic profile phases are
2197	   completed.

2199	   Within the test results validation criteria, the DUT/SUT MUST reach
2200	   the desired value of the target objective in the sustain phase.

2202	   Measure the minimum, average, and maximum values of TTFB and TTLB.

2204	7.9.  Concurrent TCP/HTTPS Connection Capacity

2206	7.9.1.  Objective

2208	   Determine the number of concurrent TCP connections the DUT/SUT
2209	   sustains when using HTTPS traffic.

2211	7.9.2.  Test Setup

2213	   Testbed setup SHOULD be configured as defined in Section 4.  Any
2214	   specific testbed configuration changes (number of interfaces and
2215	   interface type, etc.)  MUST be documented.

2217	7.9.3.  Test Parameters

2219	   In this section, benchmarking test specific parameters SHOULD be
2220	   defined.

2222	7.9.3.1.  DUT/SUT Configuration Parameters

2224	   DUT/SUT parameters MUST conform to the requirements defined in
2225	   Section 4.2.  Any configuration changes for this specific
2226	   benchmarking test MUST be documented.

2228	7.9.3.2.  Test Equipment Configuration Parameters

2230	   Test equipment configuration parameters MUST conform to the
2231	   requirements defined in Section 4.3.  The following parameters MUST
2232	   be documented for this benchmarking test:

2234	      Client IP address range defined in Section 4.3.1.2

2236	      Server IP address range defined in Section 4.3.2.2
2237	      Traffic distribution ratio between IPv4 and IPv6 defined in
2238	      Section 4.3.1.2

2240	      RECOMMENDED cipher suites and key sizes defined in Section 4.3.1.3

2242	      Target concurrent connections: Initial value from product
2243	      datasheet or the value defined based on requirement for a specific
2244	      deployment scenario.

2246	      Initial concurrent connections: 10% of "Target concurrent
2247	      connections" Note: Initial concurrent connection is not a KPI to
2248	      report.  This value is configured on the traffic generator and
2249	      used to perform the Step1: "Test Initialization and Qualification"
2250	      described under the Section 7.9.4.

2252	      Connections per second during ramp up phase: 50% of maximum
2253	      connections per second measured in benchmarking test TCP/HTTPS
2254	      Connections per second (Section 7.6)

2256	      Ramp up time (in traffic load profile for "Target concurrent
2257	      connections"): "Target concurrent connections" / "Maximum
2258	      connections per second during ramp up phase"

2260	      Ramp up time (in traffic load profile for "Initial concurrent
2261	      connections"): "Initial concurrent connections" / "Maximum
2262	      connections per second during ramp up phase"

2264	   The client MUST perform HTTPS transaction with persistence and each
2265	   client can open multiple concurrent TCP connections per server
2266	   endpoint IP.

2268	   Each client sends 10 GET requests requesting 1 KByte HTTPS response
2269	   objects in the same TCP connection (10 transactions/TCP connection)
2270	   and the delay (think time) between each transaction MUST be X
2271	   seconds.

2273	   X = ("Ramp up time" + "steady state time") /10

2275	   The established connections SHOULD remain open until the ramp down
2276	   phase of the test.  During the ramp down phase, all connections
2277	   SHOULD be successfully closed with FIN.

2279	7.9.3.3.  Test Results Validation Criteria

2281	   The following criteria are the test results validation criteria.  The
2282	   Test results validation criteria MUST be monitored during the whole
2283	   sustain phase of the traffic load profile.

2285	   a.  Number of failed application transactions (receiving any HTTP
2286	       response code other than 200 OK) MUST be less than 0.001% (1 out
2287	       of 100,000 transactions) of total attempted transactions.

2289	   b.  Number of terminated TCP connections due to unexpected TCP RST
2290	       sent by DUT/SUT MUST be less than 0.001% (1 out of 100,000
2291	       connections) of total initiated TCP connections.

2293	   c.  During the sustain phase, traffic SHOULD be forwarded at a
2294	       constant rate (considered as a constant rate if any deviation of
2295	       traffic forwarding rate is less than 5%).

2297	7.9.3.4.  Measurement

2299	   Average Concurrent TCP Connections MUST be reported for this
2300	   benchmarking test.

2302	7.9.4.  Test Procedures and Expected Results

2304	   The test procedure is designed to measure the concurrent TCP
2305	   connection capacity of the DUT/SUT at the sustaining period of
2306	   traffic load profile.  The test procedure consists of three major
2307	   steps: Step 1 ensures the DUT/SUT is able to reach the performance
2308	   value (Initial concurrent connection) and meets the test results
2309	   validation criteria when it was very minimally utilized.  Step 2
2310	   determines the DUT/SUT is able to reach the target performance value
2311	   within the test results validation criteria.  Step 3 determines the
2312	   maximum achievable performance value within the test results
2313	   validation criteria.

2315	   This test procedure MAY be repeated multiple times with different
2316	   IPv4 and IPv6 traffic distribution.

2318	7.9.4.1.  Step 1: Test Initialization and Qualification

2320	   Verify the link status of all connected physical interfaces.  All
2321	   interfaces are expected to be in "UP" status.

2323	   Configure test equipment to establish "Initial concurrent TCP
2324	   connections" defined in Section 7.9.3.2.  Except ramp up time, the
2325	   traffic load profile SHOULD be defined as described in Section 4.3.4.

2327	   During the sustain phase, the DUT/SUT SHOULD reach the "Initial
2328	   concurrent TCP connections".  The measured KPIs during the sustain
2329	   phase MUST meet the test results validation criteria "a" and "b"
2330	   defined in Section 7.9.3.3.

2332	   If the KPI metrics do not meet the test results validation criteria,
2333	   the test procedure MUST NOT be continued to "Step 2".

2335	7.9.4.2.  Step 2: Test Run with Target Objective

2337	   Configure test equipment to establish the target objective ("Target
2338	   concurrent TCP connections").  The test equipment SHOULD follow the
2339	   traffic load profile definition (except ramp up time) as described in
2340	   Section 4.3.4.

2342	   During the ramp up and sustain phase, the other KPIs such as
2343	   inspected throughput, TCP connections per second, and application
2344	   transactions per second MUST NOT reach the maximum value that the
2345	   DUT/SUT can support.

2347	   The test equipment SHOULD start to measure and record KPIs defined in
2348	   Section 7.9.3.4.  Continue the test until all traffic profile phases
2349	   are completed.

2351	   Within the test results validation criteria, the DUT/SUT is expected
2352	   to reach the desired value of the target objective in the sustain
2353	   phase.  Follow step 3, if the measured value does not meet the target
2354	   value or does not fulfill the test results validation criteria.

2356	7.9.4.3.  Step 3: Test Iteration

2358	   Determine the achievable concurrent TCP connections within the test
2359	   results validation criteria.

[major] I would really love to see DUT power consumption numbers captured and reported
for the 10% and the maximum achieved rates for the 7.x tests (during steady state).

Energy consumption is becoming a more and more important factor in networking, and the
high-touch operations of security devices are amongst the most power/compute hungry
operations of any network device, with a wide variety depending on how it is implemented.
It is also extremely simple to just plug a power meter into the supply line of the DUT.

This would encourage DUT vendors to reduce power consumption, something that often
can be achieved by just selecting appropriate components (lowest power CPU options, going
FPGA etc. routes).

Personally, I am of course also interested in easily derived performance factors such as
comparing 100% power consumption for the HTTP vs. HTTPS case - the cost of end-to-end
security, that is. If a DUT shows line rate for both HTTP and HTTPS, but with double the
power consumption when using HTTPS, that may even impact deployment - even in small
deployments with a single 19" rack, limited ventilation, and a limited power budget, it
makes a difference whether the DUT draws 100 or 500 W.
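
The derived metric would be trivial to compute, e.g. (hypothetical numbers):

    # Watts per inspected Gbit/s, HTTP vs. HTTPS; numbers are made up.
    http_gbps, http_watts = 40.0, 150.0
    https_gbps, https_watts = 40.0, 310.0

    http_eff = http_watts / http_gbps       # 3.75 W per Gbit/s
    https_eff = https_watts / https_gbps    # 7.75 W per Gbit/s
    print(f"HTTPS costs {https_eff / http_eff:.1f}x the power per Gbit/s here")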


2361	8.  IANA Considerations

2363	   This document makes no specific request of IANA.

2365	   The IANA has assigned IPv4 and IPv6 address blocks in [RFC6890] that
2366	   have been registered for special purposes.  The IPv6 address block
2367	   2001:2::/48 has been allocated for the purpose of IPv6 Benchmarking
2368	   [RFC5180] and the IPv4 address block 198.18.0.0/15 has been allocated
2369	   for the purpose of IPv4 Benchmarking [RFC2544].  This assignment was
2370	   made to minimize the chance of conflict in case a testing device were
2371	   to be accidentally connected to part of the Internet.

[minor] I don't think the second paragraph belongs in an IANA considerations
section. This section is usually reserved only for actions IANA is supposed to
take for this document. I would suggest moving this paragraph to an earlier
section, maybe even simply making one up: "Addressing for tests".
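
For test implementers, the two special-purpose blocks are easy to consume
programmatically, e.g. (illustrative):

    import ipaddress

    v4_bench = ipaddress.ip_network("198.18.0.0/15")   # RFC 2544 / RFC 6890
    v6_bench = ipaddress.ip_network("2001:2::/48")     # RFC 5180

    # Hypothetical client/server sub-ranges carved from the IPv4 block.
    clients = ipaddress.ip_network("198.18.0.0/24")
    servers = ipaddress.ip_network("198.19.0.0/24")
    assert clients.subnet_of(v4_bench) and servers.subnet_of(v4_bench)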

2373	9.  Security Considerations

2375	   The primary goal of this document is to provide benchmarking
2376	   terminology and methodology for next-generation network security
2377	   devices for use in a laboratory isolated test environment.  However,
2378	   readers should be aware that there is some overlap between
2379	   performance and security issues.  Specifically, the optimal
2380	   configuration for network security device performance may not be the
2381	   most secure, and vice-versa.  The cipher suites recommended in this
2382	   document are for test purpose only.  The cipher suite recommendation
2383	   for a real deployment is outside the scope of this document.

2385	10.  Contributors

2387	   The following individuals contributed significantly to the creation
2388	   of this document:

2390	   Alex Samonte, Amritam Putatunda, Aria Eslambolchizadeh, Chao Guo,
2391	   Chris Brown, Cory Ford, David DeSanto, Jurrie Van Den Breekel,
2392	   Michelle Rhines, Mike Jack, Ryan Liles, Samaresh Nair, Stephen
2393	   Goudreault, Tim Carlin, and Tim Otto.

2395	11.  Acknowledgements

2397	   The authors wish to acknowledge the members of NetSecOPEN for their
2398	   participation in the creation of this document.  Additionally, the
2399	   following members need to be acknowledged:

2401	   Anand Vijayan, Chris Marshall, Jay Lindenauer, Michael Shannon, Mike
2402	   Deichman, Ryan Riese, and Toulnay Orkun.

2404	12.  References

2406	12.1.  Normative References

2408	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2409	              Requirement Levels", BCP 14, RFC 2119,
2410	              DOI 10.17487/RFC2119, March 1997,
2411	              <https://www.rfc-editor.org/info/rfc2119>.

2413	   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
2414	              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
2415	              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

2417	12.2.  Informative References

2419	   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
2420	              Network Interconnect Devices", RFC 2544,
2421	              DOI 10.17487/RFC2544, March 1999,
2422	              <https://www.rfc-editor.org/info/rfc2544>.

2424	   [RFC2647]  Newman, D., "Benchmarking Terminology for Firewall
2425	              Performance", RFC 2647, DOI 10.17487/RFC2647, August 1999,
2426	              <https://www.rfc-editor.org/info/rfc2647>.

2428	   [RFC3511]  Hickman, B., Newman, D., Tadjudin, S., and T. Martin,
2429	              "Benchmarking Methodology for Firewall Performance",
2430	              RFC 3511, DOI 10.17487/RFC3511, April 2003,
2431	              <https://www.rfc-editor.org/info/rfc3511>.

2433	   [RFC5180]  Popoviciu, C., Hamza, A., Van de Velde, G., and D.
2434	              Dugatkin, "IPv6 Benchmarking Methodology for Network
2435	              Interconnect Devices", RFC 5180, DOI 10.17487/RFC5180, May
2436	              2008, <https://www.rfc-editor.org/info/rfc5180>.

2438	   [RFC6815]  Bradner, S., Dubray, K., McQuaid, J., and A. Morton,
2439	              "Applicability Statement for RFC 2544: Use on Production
2440	              Networks Considered Harmful", RFC 6815,
2441	              DOI 10.17487/RFC6815, November 2012,
2442	              <https://www.rfc-editor.org/info/rfc6815>.

2444	   [RFC6890]  Cotton, M., Vegoda, L., Bonica, R., Ed., and B. Haberman,
2445	              "Special-Purpose IP Address Registries", BCP 153,
2446	              RFC 6890, DOI 10.17487/RFC6890, April 2013,
2447	              <https://www.rfc-editor.org/info/rfc6890>.

2449	   [RFC7230]  Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer
2450	              Protocol (HTTP/1.1): Message Syntax and Routing",
2451	              RFC 7230, DOI 10.17487/RFC7230, June 2014,
2452	              <https://www.rfc-editor.org/info/rfc7230>.

2454	   [RFC8446]  Rescorla, E., "The Transport Layer Security (TLS) Protocol
2455	              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018,
2456	              <https://www.rfc-editor.org/info/rfc8446>.

2458	   [RFC9000]  Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based
2459	              Multiplexed and Secure Transport", RFC 9000,
2460	              DOI 10.17487/RFC9000, May 2021,
2461	              <https://www.rfc-editor.org/info/rfc9000>.

2463	Appendix A.  Test Methodology - Security Effectiveness Evaluation

[nit] /Evaluation/Test/ - called test in the rest of this doc.

2464	A.1.  Test Objective

2466	   This test methodology verifies the DUT/SUT is able to detect,

[nit] /verifies the/ verifies that the/

2467	   prevent, and report the vulnerabilities.

2469	   In this test, background test traffic will be generated to utilize
2470	   the DUT/SUT.  In parallel, the CVEs will be sent to the DUT/SUT in
2471	   encrypted as well as clear text payload formats using a traffic
2472	   generator.  The selection of the CVEs is described in Section 4.2.1.

2474	   The following KPIs are measured in this test:

2476	   *  Number of blocked CVEs

2478	   *  Number of bypassed (nonblocked) CVEs

2480	   *  Background traffic performance (verify if the background traffic
2481	      is impacted while sending CVEs toward the DUT/SUT)

2483	   *  Accuracy of DUT/SUT statistics in terms of vulnerability
2484	      reporting

2486	A.2.  Testbed Setup

2488	   The same testbed MUST be used for the security effectiveness test as
2489	   well as for the benchmarking test cases defined in Section 7.

2491	A.3.  Test Parameters

2493	   In this section, the benchmarking test specific parameters SHOULD be
2494	   defined.

[nit] /SHOULD/are/ - a requirement against the authors of the document to write
desirable text in the document is not normative.

2496	A.3.1.  DUT/SUT Configuration Parameters

2498	   DUT/SUT configuration parameters MUST conform to the requirements
2499	   defined in Section 4.2.  The same DUT configuration MUST be used for
2500	   the security effectiveness test as well as for the benchmarking test
2501	   cases defined in Section 7.  The DUT/SUT MUST be configured in inline
2502	   mode and all detected attack traffic MUST be dropped and the session

[nit] /detected traffic/detected CVE traffic/ - there is also background traffic, which I guess should not be dropped, right ?

[nit] /the session/its session/ ?

2503	   SHOULD be reset.

2505	A.3.2.  Test Equipment Configuration Parameters

2507	   Test equipment configuration parameters MUST conform to the
2508	   requirements defined in Section 4.3.  The same client and server IP
2509	   ranges MUST be configured as used in the benchmarking test cases.  In
2510	   addition, the following parameters MUST be documented for this
2511	   benchmarking test:

2513	   *  Background Traffic: 45% of maximum HTTP throughput and 45% of
2514	      Maximum HTTPS throughput supported by the DUT/SUT (measured with
2515	      object size 64 KByte in the benchmarking tests "HTTP(S)
2516	      Throughput" defined in Section 7.3 and Section 7.7).

[nit] RECOMMENDED Background Traffic ?

2518	   *  RECOMMENDED CVE traffic transmission Rate: 10 CVEs per second

2520	   *  It is RECOMMENDED to generate each CVE multiple times
2521	      (sequentially) at 10 CVEs per second

2523	   *  Ciphers and keys for the encrypted CVE traffic MUST use the same
2524	      cipher configured for HTTPS traffic related benchmarking tests
2525	      (Section 7.6 - Section 7.9)
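
[comment] The background load derivation above in sketch form (hypothetical
measured inputs):

    # Appendix A.3.2 background load; inputs are hypothetical.
    max_http_gbps = 40.0       # 64 KByte objects, Section 7.3
    max_https_gbps = 18.0      # 64 KByte objects, Section 7.7

    background_http = 0.45 * max_http_gbps      # 18.0 Gbit/s
    background_https = 0.45 * max_https_gbps    # 8.1 Gbit/s
    cve_rate = 10                               # RECOMMENDED CVEs per second
    print(background_http, background_https, cve_rate)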

2527	A.4.  Test Results Validation Criteria

2529	   The following criteria are the test results validation criteria.  The
2530	   test results validation criteria MUST be monitored during the whole
2531	   test duration.

[nit] /criteria are/lists/ - duplication of criteria in sentence.

2533	   a.  Number of failed application transaction in the background
2534	       traffic MUST be less than 0.01% of attempted transactions.

2536	   b.  Number of terminated TCP connections of the background traffic
2537	       (due to unexpected TCP RST sent by DUT/SUT) MUST be less than
2538	       0.01% of total initiated TCP connections in the background
2539	       traffic.

[comment] That is quite high. Shouldn't this at least be 5 nines of
success ? 99.999% -> 0.001% maximum rate of errors ? I thought that's the common-lore
minimum service provider product quality requirement.

2541	   c.  During the sustain phase, traffic SHOULD be forwarded at a
2542	       constant rate (considered as a constant rate if any deviation of
2543	       traffic forwarding rate is less than 5%).

[minor] This seems underspecified. I guess in the ideally behaving DUT case
all background traffic is passed unmodified and all CVE connection traffic is dropped.
So the total amount of traffic with CVE events must be configured to be less than
5% ?! What additional information would this 5% tell me that I do not already
get from a. and b. ? E.g.: if I fail some background connection, then the impact
depends on how big that connection would have been, but it doesn't seem as if
I get new information if a big NetFlix background flow got killed and therefore
5 Gigabyte less background traffic were observed, or if the same happened to
a 200 KByte Amazon shopping connection. It would just cause the DUT to maybe do
less inspection on big flows in fear of triggering false resets on them ?? Is
that what we want from DUTs ?

2545	   d.  False positive MUST NOT occur in the background traffic.

[comment]  I do not understand d. When a background transaction from a. fails,
how is that different from false-positively being classified as a CVE - it would
be dropped then, right ? Or are you saying that a./b. is the case where the
background traffic receives errors from the DUT even though the DUT does NOT
recognize it as a CVE ?  Any example reason why that would happen ?

2547	A.5.  Measurement

2549	   Following KPI metrics MUST be reported for this test scenario:

2551	   Mandatory KPIs:

2553	   *  Blocked CVEs: It SHOULD be represented in the following ways:

2555	      -  Number of blocked CVEs out of total CVEs

2557	      -  Percentage of blocked CVEs

2559	   *  Unblocked CVEs: It SHOULD be represented in the following ways:

2561	      -  Number of unblocked CVEs out of total CVEs

2563	      -  Percentage of unblocked CVEs

2565	   *  Background traffic behavior: It SHOULD be represented in one of
2566	      the following ways:

2568	      -  No impact: Considered as "no impact" if any deviation of
2569	         traffic forwarding rate is less than or equal to 5% (constant
2570	         rate)

2572	      -  Minor impact: Considered as "minor impact" if any deviation of
2573	         traffic forwarding rate is greater than 5% and less than or
2574	         equal to 10% (i.e. small spikes)

2576	      -  Heavily impacted: Considered as "Heavily impacted" if any
2577	         deviation of traffic forwarding rate is greater than 10% (i.e.
2578	         large spikes) or reduced the background HTTP(S) throughput
2579	         greater than 10%
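
[comment] The classification above as a simple check (sketch; max_deviation is
the largest observed relative deviation of the background forwarding rate):

    def classify_impact(max_deviation):
        if max_deviation <= 0.05:
            return "no impact"
        if max_deviation <= 0.10:
            return "minor impact"
        return "heavily impacted"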

[minor] I would prefer reporting the a./b. numbers, e.g.: percentage of
failed background connections. As mentioned before, I find the total background
traffic rate impact a rather problematic/less valuable metric.

2581	   *  DUT/SUT reporting accuracy: DUT/SUT MUST report all detected
2582	      vulnerabilities.

2584	   Optional KPIs:

2586	   *  List of unblocked CVEs

[minor] I think this KPI is a SHOULD or even MUST. Otherwise one can not trace
security impacts (when one does not know which CVE it is). This is still the
security effectiveness appendix, and reporting is not effective without this.

2588	A.6.  Test Procedures and Expected Results

2590	   The test procedure is designed to measure the security effectiveness
2591	   of the DUT/SUT at the sustaining period of the traffic load profile.
2592	   The test procedure consists of two major steps.  This test procedure
2593	   MAY be repeated multiple times with different IPv4 and IPv6 traffic
2594	   distribution.

2596	A.6.1.  Step 1: Background Traffic

2598	   Generate background traffic at the transmission rate defined in
2599	   Appendix A.3.2.

2601	   The DUT/SUT MUST reach the target objective (HTTP(S) throughput) in
2602	   sustain phase.  The measured KPIs during the sustain phase MUST meet
2603	   all the test results validation criteria defined in Appendix A.4.

2605	   If the KPI metrics do not meet the acceptance criteria, the test
2606	   procedure MUST NOT be continued to "Step 2".

2608	A.6.2.  Step 2: CVE Emulation

2610	   While generating background traffic (in sustain phase), send the CVE
2611	   traffic as defined in the parameter section.

2613	   The test equipment SHOULD start to measure and record all specified
2614	   KPIs.  Continue the test until all CVEs are sent.

2616	   The measured KPIs MUST meet all the test results validation criteria
2617	   defined in Appendix A.4.

2619	   In addition, the DUT/SUT SHOULD report the vulnerabilities correctly.

2621	Appendix B.  DUT/SUT Classification

2623	   This document aims to classify the DUT/SUT in four different
2624	   categories based on its maximum supported firewall throughput
2625	   performance number defined in the vendor datasheet.  This
2626	   classification MAY help the user to determine specific configuration
2627	   scale (e.g., number of ACL entries), traffic profiles, and attack
2628	   traffic profiles, scaling those proportionally to DUT/SUT sizing
2629	   category.

2631	   The four different categories are Extra Small (XS), Small (S), Medium
2632	   (M), and Large (L).  The RECOMMENDED throughput values for the
2633	   following categories are:

2635	   Extra Small (XS) - Supported throughput less than or equal to 1Gbit/s

2637	   Small (S) - Supported throughput greater than 1Gbit/s and less than
2638	   or equal to 5Gbit/s

2640	   Medium (M) - Supported throughput greater than 5Gbit/s and less than
2641	   or equal to 10Gbit/s

2643	   Large (L) - Supported throughput greater than 10Gbit/s
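
[comment] The four category boundaries as a lookup (sketch; input is the
datasheet throughput in Gbit/s):

    def classify_dut(gbps):
        if gbps <= 1:
            return "XS"       # Extra Small
        if gbps <= 5:
            return "S"        # Small
        if gbps <= 10:
            return "M"        # Medium
        return "L"            # Large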

2645	Authors' Addresses

2647	   Balamuhunthan Balarajah
2648	   Berlin
2649	   Germany

2651	   Email: bm.balarajah@gmail.com
2652	   Carsten Rossenhoevel
2653	   EANTC AG
2654	   Salzufer 14
2655	   10587 Berlin
2656	   Germany

2658	   Email: cross@eantc.de

2660	   Brian Monkman
2661	   NetSecOPEN
2662	   417 Independence Court
2663	   Mechanicsburg, PA 17050
2664	   United States of America

2666	   Email: bmonkman@netsecopen.org

EOF