Re: [bmwg] [iotdir] telechat Review for draft-ietf-bmwg-ngfw-performance-13

Carsten Rossenhoevel <cross@eantc.de> Mon, 31 January 2022 17:00 UTC

Message-ID: <3df1c388-f0cf-921e-077b-9c0ea537c3c4@eantc.de>
Date: Mon, 31 Jan 2022 18:00:20 +0100
To: Toerless Eckert <tte@cs.fau.de>
Cc: iot-directorate@ietf.org, evyncke@cisco.com, draft-ietf-bmwg-ngfw-performance.all@ietf.org, mariainesrobles@googlemail.com, bmwg@ietf.org
References: <Yfeun+FpkbxbYTqD@faui48e.informatik.uni-erlangen.de>
From: Carsten Rossenhoevel <cross@eantc.de>
Organization: EANTC AG
In-Reply-To: <Yfeun+FpkbxbYTqD@faui48e.informatik.uni-erlangen.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/mxyuI_QO_LMO4VVd6cQ3DfN6AYc>

Hi Toerless,

Thanks for your lightning-fast and detailed response:

- We agree to rewrite the intro slightly, introducing the terms NGFW and 
UTM and detailing the minimum working knowledge we expect of readers.  
Unfortunately, the IETF has not introduced network security 
architectures in other RFCs.  Earlier, we were discouraged from 
referencing external sources such as Wikipedia for an introduction to 
the general topic of network security, because RFCs typically do not 
provide such references and the IESG objects to them.  If you have any 
advice, please let me know.

- We will add a paragraph explaining the rationale for obsoleting 
RFC3511 and the differences in the new document.  Note that we do not 
plan to obsolete RFC2647; we will mention that as well and explain 
briefly that we will just amend its terminology with additional terms 
such as HTTP throughput.  We will also explain how firewalls have 
evolved from simple stateless ACL filters since RFC3511 was published.

- We will rename "DDoS" in table 1 to "DDoS protection".

- Regarding the term DPI, we will make tables 1, 2, and 3 consistent 
(and re-order them as agreed before, moving table 3 before table 1).

- We will add a note to the ACL numbers in figure 3, explaining that 
these are minimum numbers that can be increased at the discretion of the 
test lab.  Initially, we had intended much higher ACL numbers, but NGFW 
vendors explained that these numbers no longer influence the performance 
of firewalls during operation as much.  Our PoC tests with a number of 
DUT candidates confirmed these claims.

- We will rename "security and measurement traffic" to "test traffic" to 
avoid confusion

- Regarding your proposed caveat to the test setups, we agree to adopt 
it with some small modifications.  The test topology should be as simple 
as possible, but the test traffic flows certainly shall not be 
over-simplified.  In fact, we are defining multiple different traffic 
mixes to adapt to real-world traffic in a reproducible way.
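
As an illustration of such a mix definition (the application names and 
percentages below are hypothetical placeholders, not the mixes defined 
in the draft), here is a minimal Python sketch:

    # Hypothetical traffic mix for one benchmarking run; the draft
    # defines its own mixes, these numbers are illustrative only.
    TRAFFIC_MIX = {
        "HTTPS": 55,      # percent of total test traffic
        "HTTP": 15,
        "SMTP": 10,
        "DNS": 5,
        "Other TCP": 15,
    }

    def validate_mix(mix: dict) -> None:
        """A mix is only reproducible if it is completely specified."""
        total = sum(mix.values())
        if total != 100:
            raise ValueError(f"mix sums to {total}%, expected 100%")

    validate_mix(TRAFFIC_MIX)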

- We can remove "in order to achieve maximum network security" from 
section 4.2.

- We will rename "A unique DUT/SUT configuration" to "The same DUT/SUT 
configuration" in the first sentence of section 4.2.



More feedback on your comments that we believe does not require 
modifications to the draft:

- Badly working TCP stacks would result in bad performance as measured 
by the test methodology in section 7.  From our point of view, there 
should not be a specific discussion of underlying protocol issues in 
this application-layer testing document.  We discussed the TCP issues 
within the BMWG for some time.  By the way, the moment we start talking 
about TCP details, IETF people immediately show up asking "what about 
TCPv6?" and "what about QUIC?"

- Certain use case scenarios allow validating SSL certificates (such as 
protecting web server farms with well-known certificates, including the 
private keys).  But in perimeter scenarios, SSL private keys are 
unknown.  And even a simple SSL certificate validation ("does the public 
key have a valid keychain?") cannot be done exhaustively.  The NGFW is 
/between/ client and server, but not necessarily /in the middle/ (as in 
decoding and re-encoding traffic) for all application scenarios.
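
For reference, a "simple" chain check as mentioned above can be sketched 
with Python's standard-library defaults, which verify the server's chain 
against the local trust store plus the hostname; this is purely 
illustrative and not part of the draft's methodology:

    import socket
    import ssl

    def check_server_chain(host: str, port: int = 443) -> str:
        """Handshake with default verification: fails unless the server
        presents a certificate with a valid chain to a trusted root."""
        context = ssl.create_default_context()  # verifies chain + hostname
        with socket.create_connection((host, port), timeout=5) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version()

    # check_server_chain("example.com")
    # raises ssl.SSLCertVerificationError if the chain does not verify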

- We think the word "unique" is not confusing and would like to keep the 
original wording.

- Appendix B was created following guidance from the BMWG co-chairs, as 
the numbers in figure 3 affect the benchmarking methodology while the 
numbers in Appendix B do not.  We would like to keep it this way 
unless the co-chairs agree with the suggested change.

- The acceptable failure rates are the result of extensive discussions 
between the authors and contributors from the BMWG and represent a 
compromise.  In the end it is a finite number: some people wanted a zero 
failure rate (zero packets), while others wanted much higher failure 
rates to be permitted.  The agreed-on number was also influenced by 
extensive benchmarking within the BMWG and by related Linux Foundation 
work on failure rates of virtualized network functions.  The foreground 
failure rates are lower by one order of magnitude because foreground 
traffic is considered more important and DUTs/SUTs are expected to act 
more precisely on foreground traffic.
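
To make the orders of magnitude concrete: with the 0.01% background 
figure you asked about and the one-order-lower foreground rate described 
above, an acceptance check is simple arithmetic.  A minimal sketch (the 
function and names are ours, not the draft's):

    # Thresholds as discussed in this thread: 0.01% for background
    # traffic, one order of magnitude stricter for foreground traffic.
    BACKGROUND_MAX = 0.0001          # 0.01% expressed as a fraction
    FOREGROUND_MAX = BACKGROUND_MAX / 10

    def passes(sent: int, failed: int, max_rate: float) -> bool:
        """True if the observed failure rate stays within the threshold."""
        return failed / sent <= max_rate

    # Out of 10 million background packets, up to 1,000 failures are
    # tolerated; the same count would fail the foreground threshold.
    assert passes(10_000_000, 1_000, BACKGROUND_MAX)
    assert not passes(10_000_000, 1_000, FOREGROUND_MAX)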


Best regards,
      Bala, Brian, and Carsten



On 31.01.2022 10:40, Toerless Eckert wrote:
> On Mon, Jan 31, 2022 at 08:20:00AM +0100, Carsten Rossenhoevel wrote:
>>   * NGFW and UTM are fixed terms in the industry.
> The terminology is fine. I just felt it would be fairly easy to somewhat
> rewrite the intro such that it correctly introduces those terms and how
> they relate to each other instead of expecting them to be known. Wrt
> your argument that everybody knows these terms: Just mentor a few young folks
> coming to IETF primarily to learn, and then revisit whether your statement
> holds true for the target readers you want to address as opposed to the
> experts you work with ;-)
>
>>   * We have had extensive discussions of goodput, throughput, and other
>>     terms describing how many packets of which type make it from source
>>     to destination.
> A summary of what was changed in scope and terminology from RFC3511.
> Such a change summary section is also quite common (and i'd say required)
> for RFCs that are meant to obsolete another draft. Just write there
> that goodput has fallen out of fashion or the like.
>
> Technically, my main concern is just badly working TCP stacks in the
> NGFW that create unnecessarily many retransmissions in the face of bad
> paths to either client or server, but let me check if/what standard
> TCP measurements are recommended for that case and get back to you.
>
>>   * Adding a terminology section would not have increased readability
>>     for the target audience.
>>   * This document cannot and does not aim to introduce the whole world
>>     of network security terms.
>>   * This is just a benchmarking methodology document.  We have to assume
>>     that readers are network security professionals understanding the basics
>>     of today's network security functions.
> Is there any other IETF RFC that introduces/explains the network security
> device terminology/architecture? If not an RFC, then is there any other
> reference you could add?
>
>>     I am not sure how to respond to some of your questions (e.g.
>>     about making SSL inspection mandatory
> I did not ask for that. I just wondered about the text "MUST explain the implication".
>
>
>>     , or recommending to implement Web Filtering across the
>>     industry [row 294]).
> Row 294 comments were not about web filtering.
> I was asking for row 294 whether my understanding of your term
> "Certificate Validation" is correct, because the way i understand it,
> it could never be optional for a security device: The NGFW must
> verify the server certificate if it is putting itself in the middle of
> a client/server TLS session.
>
>>     DDoS is not a firewall feature (it is an attack type), so it is not
>>     listed as a security feature in table 3.
> DDoS is listed in Table 1 which is titled "NGFW Security Features".
> Maybe you want to rename that Table 1 line entry to something like "DDoS protection".
> But in any case there seems to be some NGFW feature related to DDoS,
> and i just wonder if/how that feature is reflected in Table 3.
>
>>     DPI is an obsolete term.
> Why then is it listed in Table 2? Or do you consider the written-out term
> "Deep Packet Inspection" (as in Table 2) to be something different than DPI?
> I was just doing consistency checking. Table 1 and Table 2 claim to be
> NGFW/NGIPS Security Features, and Table 3 claims to be "Security Feature Description",
> so i was just trying to check completeness of Table 3.
>
>>     Any NGFW does some type of deep-packet
>>     inspection.  More detailed terms in table 3 describe the different
>>     aspects of packet inspections.
> It would be easier to read if the terms used across Tables 1, 2, and 3 were consistent.
> Maybe just put only those terms into Table 3 that apply to Table 2?
>
>>   * The time of ACL-based routers acting as firewalls is over.
>>     Managing thousands of ACL entries is difficult and often bears a
>>     security risk in itself.  It is not a typical application scenario
>>     today.  Security is achieved by advanced features such as IDS and
>>     malware detection.
> That would be a good, simple explanation for the rather low tested ACL value
> that should go somewhere in the document, because it too is a part of the
> evolution from RFC3511.
>
> As an example counterpoint: There are new IETF protocols, such as MUD (RFC8520), that
> ultimately result in automated ACL building: for example, for each IoT
> device on the client side (let's say a large enterprise), an ACL with a
> few entries (defining the permitted traffic flows between this IoT device
> and the Internet). So if the enterprise has maybe 4000 IoT devices, it could easily
> be a (SrcIP,DstIP,Proto,*,DstPort) ACL list of maybe 15000 entries.
>
> If such novel IETF work cannot be put into an NGFW because that new firewall
> will not support such long ACLs, that is your NGFW prerogative, except that i would
> feel a lot safer if that was written somewhere, such as in the list of
> exemptions not covered by this document.
>
> I have seen similar automation of security through combinations of controllers
> and firewalls in other areas as well, such as Telephony.
>
>>   * "security and measurement traffic" in row 411 can be changed if you
>>     feel that the use of these descriptive words requires a definition.
>>     If you like, we can remove "security and measurement", keeping
>>     "traffic".
> It seems to me that there are two types of traffic. One is the traffic passing
> through the DUT to measure it; the other is all the other collateral
> traffic needed to make the test setup work. Those two types of traffic just
> need two defined terms. Unfortunately, i am not aware of any
> industry-agreed-upon terms for these two types of traffic.
>
>> *Obsoleting RFC3511*: RFC3511 shall indeed be obsoleted; we have had
>> extensive discussions about it e.g. in IETF110
>> <https://datatracker.ietf.org/meeting/110/materials/minutes-110-bmwg-01.pdf>
>> and on the BMWG mailing list. Rationales: 1) Allowing the use of a more than
>> 18-year-old benchmarking methodology for the same group of network security
>> solutions in parallel to the new one would really confuse the market and not
>> be good SDO workmanship.  2) All relevant benchmarks in RFC3511 are
>> technically superseded and improved by the new draft; only a few L4
>> (TCP/UDP) test cases have deliberately been obsoleted.  These test cases do
>> not make any sense for today's NGFWs.  Even old firewalls (pre-NGFW) can be
>> tested much more accurately with the new methodology.
> I wasn't challenging the goal, i was just wondering if the document in
> its current version has all the bells and whistles normally expected
> for a document doing the obsoletion, such as a "diff-from-obsoleted-rfc"
> section and explanations like the ones in your paragraph above.
>
>> *Energy efficiency measurements* are of paramount importance - I agree.
>> Large-scale groups in ATIS and other SDOs are working on standardizing
>> energy efficiency measurements.  I invite contributors to create a new draft
>> for NGFW energy efficiency benchmarks. Unfortunately, attaching a power
>> meter is not sufficient and is not a key performance indicator (KPI) by
>> itself.
> But it does become a very useful KPI for readers of reports as soon as
> they compare different products, each with the security/performance feature
> set appropriate for their particular use case.
>
> I actually ran through similar vetting with routers and customers. They
> effectively were weighing Capex (equipment cost, space cost) and
> Opex (power consumption) against performance for a particular feature set.
>
>>   A firewall with fewer security functions is not better because it
>> takes less power per Gigabit than a strong firewall. Balancing these KPIs
>> with power usage is a difficult and sensitive task that requires a lot of
>> compromises and industry consensus.
> I am trying to parse what you say but have difficulties. Are you
> afraid that capturing the power consumption number during the steady-state
> run of their devices would aggravate people in the industry, because they
> fear the power consumption numbers would make their devices look bad, and
> that you would therefore prefer for BMWG not to ask for such power
> measurement numbers?
>
> IMHO, a simple power consumption measurement taken the way i suggest
> (at the 10% and max-performance measurement points) is a very
> useful start, because it can help to motivate vendors to optimize within
> their existing designs. For example, CPU-based NGFWs would have more
> of an incentive to go beyond a simple PMD (Poll Mode Driver) for their
> CPU forwarding plane (which is the worst CPU burner), and to enable
> low-power modes dynamically anywhere the HW supports it when
> only low performance is required. A lot of firewalls at the edge of the
> Internet spend a lot of time in which only low performance is needed, but
> they need to be bought for peak performance (think of most enterprise
> offices outside office hours).
>
>>   I sincerely hope that the IETF
>> recognizes the unparalleled importance of energy-efficiency standardization
>> and provides guidance to all areas and WGs; then the BMWG and NGFW
>> benchmarks could well be expanded in future work.
> I don't think we are setting a good example by simply pushing off the
> task instead of taking care of low-hanging fruit when it offers itself,
> as i think it does in this case.
>
>> The *test setups* (Figure 2) are to be used as specified:
>>
>>   * The goal of this benchmarking draft is to enable reproducible
>>     results in controlled, minimum lab setups.  Of course we could
>>     complicate it with mixing trusted and untrusted zones, adding more
>>     security zones, etc.  Within the four years of development, the
>>     group concluded that adding such complexity would not improve the
>>     reproducibility and readability of results.  "actual deployments"
>>     and "typical deployments" relate to the "parameters and security
>>     features" (line 258) - not to the network topology setup.
>>     Reproducibility is gained by adequate documentation of the
>>     parameters and security features (see section 6 of the draft as
>>     well) - not by nailing down specific configurations.  The NGFWs are
>>     too diverse to aim for such a goal.
> Agreed. But i just came to that realization after going through the whole
> document and also not being able to come up with much broader testing without
> a lot more complexity.
>
> I think a sentence like the following would be a good explanation for the
> text: "The DUT test topology and test traffic flows do not aim
> to exercise the variety of real-world traffic flows and security zones
> often attached to an NGIPS/NGFW, but to represent the simplest topology
> in which the performance of traffic flows through the DUT can be measured."
>
>>   * Of course, measurements with different sets of parameters will yield
>>     different results (your comment on line 380).  Detailed reporting
>>     (section 6) will allow readers to interpret results correctly.  One
>>     potential use of this draft is to establish an external
>>     certification program ("NetSecOPEN" initiative).  For such a
>>     program, parameter sets need to be defined in more detail.  But the
>>     consensus among authors and the WG was that the draft shall not be
>>     limited to very specific certification setups.
> That's fine. It's again that the text is terse and leaves one guessing at
> what you explain so much better above.
>
>>   * "maximum security coverage" is a blanket clause, indicating that
>>     should focus configurations on best security not only on achieving
>>     maximum performance.  This is a typical conflict of goals in network
>>     security benchmark testing, specifically if vendors carry out tests.
> Seems to me as if you could simply delete "in order to achieve maximum network security coverage",
> and the remaining text is still perfectly fine, and you avoid having readers wonder
> about an undefined blanket clause.
>
>>   * The technique of aggregating lower-speed interfaces from test
>>     equipment for a higher-speed DUT interface is considered common lab
>>     knowledge and thus not explained in this draft.
>>
>>   * In line 255, the word "unique" could be misunderstood indeed. Maybe
>>     the word "single" would explain it better?
> "single common"?
> I am not a native English speaker, so please apply your best knowledge
> to resolve these language nits if you agree with them ;-)
>
>> *DUT classifications* into XS, S, M, and L were made in the main document to
>> ensure this classification bears some weight.  It is important for
>> apples-to-apples comparisons of benchmarks. While the requirements for the
>> number of rules per DUT classification are expected to be stable, the actual
>> device scaling will change faster due to innovations.  This is why the DUT
>> types are specified in Appendix B.  If you feel differently, please suggest
>> text.
> It just seems to me that the numbers in Figure 3 are completely dependent on the
> XS/S/M/L classifications. If you'd make all XS/S/M/L devices in the future
> 10x faster, then you would need to equally change the numbers in Figure 3.
> That makes the attempt of putting some numbers into an appendix, but not
> their dependent numbers, somewhat feeble.
>
> Aka: I wouldn't bother with Appendix B. Just inline the text; that also makes
> the document easier to read. Or else you may want to move Figure 3 into
> Appendix B as well. Those seem to be the two consistent options to me.
>
> And yes, sorry, this is also just structural text nitpicking, nothing
> substantial, but hopefully adds to text quality.
>
>> *Test Case descriptions*.
>>
>>   * Your comment "Section 7 is a lot of work to get right" is
>>     interesting. Procedural replication is intentional in test plans, to
>>     make sure that each test case is complete.  Readers do not typically
>>     appreciate complex referrals and footnotes when executing test cases
>>     (speaking from a few years of experience). Being very descriptive in
>>     test case descriptions improves the quality and reproducibility of
>>     test execution.
> Yes. Indeed. This was mostly from the concerns of a poor reviewer trying to
> compare details across tests. Carry on.
>
> (in real test plans i have also often seen big tables across test runs,
>   making comparisons easy when one has large printers... oh well ;-)
>
>>   * Test case scale goals are already aligned with promised vs. measured
>>     performance, as per the text in rows 1046/1047.
>>   * Row 1070 relates to foreground measurement traffic; appendix A.4 relates
>>     to background traffic failure rates.
> Ah, ok. Thanks. But just because it's called background traffic from the
> security perspective does not mean to me that a 0.01% failure rate is
> appropriate. That traffic could potentially all be business-critical traffic,
> and a good amount of it might not be using resilient application code that
> recovers from failures well.
>
> E.g.: how did you folks come up with the 0.01%?
>
> Cheers
>      Toerless
>
>> Best regards, Carsten

-- 
Carsten Rossenhövel
Managing Director, EANTC AG (European Advanced Networking Test Center)
Salzufer 14, 10587 Berlin, Germany
office +49.30.3180595-21, fax +49.30.3180595-10, mobile +49.177.2505721
cross@eantc.de, https://www.eantc.de

Place of Business/Sitz der Gesellschaft: Berlin, Germany
Chairman/Vorsitzender des Aufsichtsrats: Herbert Almus
Managing Directors/Vorstand: Carsten Rossenhövel, Gabriele Schrenk
Registered: HRB 73694, Amtsgericht Charlottenburg, Berlin, Germany
EU VAT No: DE812824025