Re: [bmwg] WGLC: draft-ietf-bmwg-ipflow-meth-03

"Jan Novak (janovak)" <janovak@cisco.com> Mon, 26 September 2011 11:44 UTC

From: "Jan Novak (janovak)" <janovak@cisco.com>
To: "Paul Aitken (paitken)" <paitken@cisco.com>
Cc: Al Morton <acmorton@att.com>, bmwg@ietf.org
Date: Mon, 26 Sep 2011 13:46:26 +0200
Subject: Re: [bmwg] WGLC: draft-ietf-bmwg-ipflow-meth-03

P.S.

 

I have pasted the file below so you could see it if you wanted ...

 

2.2.3 Active Timeout

 

Definition:

      For long-running Flows, the time interval after which the Metering
      Process expires a Cache entry so that only regular Flow updates are
      exported.

 

PJ: I don't understand what "so that only regular Flow updates are
exported" means. Perhaps, "to ensure that flow data is regularly
updated" ?

 

 

======================JN============================================

I have gone through at least 5-10 iterations of this definition;
the current wording is from Benoit - I supposed it was authoritative
:-).

======================JN============================================

 

 

PJ: this definition is terribly vague. How much is "several multiples",
and how many is "a large number of packets than usual"? I'd expect
fixed limits.

 

======================JN============================================

There really aren't any limits on that, are there? That's why I suppose
this definition is missing from the IPFIX docs - I would rather simply
remove it.

======================JN============================================

 

 

2.2.5 Flow Export Rate

   Definition:
      The number of Flow Records, corresponding to Cache entries that
      expire from the Cache (as defined by the Flow Expiration term),
      that are exported to the Collector within a measurement time
      interval.

 

PJ: are all expired cache entries exported? Possibly not, eg if there is
export filtering: in the case of "flow sampling" (NB not packet
sampling) flows may be discarded rather than exported.
You assume that expired entries are immediately exported: what if the
window is sufficiently wide that cache entries expire but are not
exported within the measured time interval?

 

PJ: does all the exported data come from the Cache? Possibly not. Eg,
IPFIX options typically contain uncached metadata about the monitoring
system. Consider explicitly excluding such data sources.

 

======================JN============================================

This is a physical quantity definition, so what is exported and when
does not matter here. What you suggest is detailed in section 5.

I would like to have the options export EXPLICITLY INCLUDED in the
measured export rate, to cover for misconfigs or different defaults
etc - so the measured value includes everything the DUT exported.

======================JN============================================
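
For concreteness, a minimal sketch of the rate computation being discussed,
counting everything the DUT exported (Data Records and Control Information
alike), as preferred above. The record list, field names and numbers are
hypothetical illustrations, not text from the draft:

    # 'records' is a hypothetical list of (arrival_time, kind) tuples decoded
    # elsewhere from the Collector-side capture; kind is "data" or "control".
    def flow_export_rate(records, t_start, t_stop):
        """Records exported within [t_start, t_stop], per second of interval."""
        exported = [kind for t, kind in records if t_start <= t <= t_stop]
        return len(exported) / (t_stop - t_start)

    # Example: 1200 records captured over a 60 s measurement interval -> 20/s.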

 

 

Same section

 

PJ: it may be limited by the exporter to collector bandwidth, or by the
collector itself. Eg, I've heard of a collector which performs poorly
when connected to gigEth, because it cannot service the incoming data
fast enough.

 

======================JN============================================

 

I have added a sentence in this section to exclude testbed issues;
what you say is already covered in section 4.4.

 

======================JN============================================
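
As a back-of-the-envelope illustration of the kind of testbed check being
referred to (all rates and sizes below are assumptions for illustration
only, not values from the draft):

    # Rough check that the export path cannot become the bottleneck.
    flow_export_rate = 100_000   # Flow Records per second (assumed test target)
    record_size      = 64        # bytes per exported Flow Record (assumed)
    header_overhead  = 1.05      # allowance for IPFIX/UDP/IP/Ethernet headers (assumed)
    link_capacity    = 1e9       # bits per second (Gigabit Ethernet)

    export_load = flow_export_rate * record_size * 8 * header_overhead
    print(f"export load ~{export_load / 1e6:.1f} Mb/s "
          f"({100 * export_load / link_capacity:.1f}% of the export link)")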

 

 

3.1 The Definition

 

   Maximum Flow Monitoring Throughput

 

PJ: The actual, average or effective "Flow Monitoring Throughput" may be
much lower, so you should clearly name this quantity as a *maximum*.

 

 

======================JN============================================

Where does your definition of Flow Monitoring Throughput come from, eh
:-)? I thought there wasn't one before - that's what is "invented" as
part of this doc - and the definition here is a maximum - but for the
sake of a good relationship ....

======================JN============================================

 

same section

 

 

 

PJ: note that the DUT may form an export message containing both flow
records which expired before the end of the test and flow records which
expired afterwards. Ideally the testing method will allow any final
export messages to be received and count any flow records within those
messages which expired before the end of the test. Else, a device which
delays the export packet for longer will have more granular results than
a device which exports records immediately.

 

======================JN============================================

Yes agreed, it is discussed in detail in section 5.

======================JN============================================

 

Section 3.2

 

PJ: per earlier comment, please give a lot more detail of the
architecture.

 

======================JN============================================

It is done in section 3.3 - it can't appear three times within the doc -
in the introduction, here and then in 3.3.
This just says on what type of devices we measure - that's what Brian
always wanted - his probes etc.
I suppose you don't want me to discuss the possible HW architectures
of all DUTs or the architectures of Flow monitoring implementations.

======================JN============================================

 

 

 

PJ: you must define "corrupted".

 

======================JN============================================

I would say it is a common technology term and I can't define every
single term I use, but I will try.

======================JN============================================

 

 

 

PJ: assuming that the DUT uses keys from the observed traffic.

 

======================JN============================================

Not quite sure what you mean - where else could it get the keys from?
Or do you mean not to confuse traffic with export data?

======================JN============================================

 

 

PJ: then there needs to be another link directly from the sender to the
receiver (ie, not through the DUT), so the receiver can make that
comparison.

 

======================JN============================================

Standard testers like IXIA can do this without any direct link - the
sending port knows what it sent and can pass that info to the receiving
port for the check - it is just an ordinary traffic tester function.

======================JN============================================

 

   without congestion. In other words, the export interface MUST NOT be
   a bottleneck during the measurement.

 

PJ: what if the receiving interface, or the collector itself, is a
bottleneck? Previously you said it doesn't matter, and I disagreed.

 

======================JN============================================

These issues are covered in section 4.4. I don't think I said it did
not matter anywhere - I think the intention here is clear; if my
wording is bad, please suggest text where necessary.
I think I corrected what I said previously on the Collector
performance, if that's what you are referring to - if it is not
sufficient please suggest text.

======================JN============================================

 

 

PJ: then it might not be configured as it would in real life
conditions. Wouldn't it be better to record the real-life results than
to tweak the settings just for testing purposes?

 

======================JN============================================

I would think we are trying to measure the DUT performance, and if the
perf "degradation" can be improved by some trivial config change then
let's do it and apply it in real life if it really helps.

======================JN============================================

 

PJ: how/why does IPFIX make the comparison easier?

 

======================JN============================================

It was Benoit - he wanted to have IPFIX all over the place, measure ONLY
with IPFIX etc etc - so I have just removed it.

======================JN============================================

 

 

   4.3.3 Exporting Process

Also, I didn't yet read anything about allowing for export templates
and/or options. Should these be included or excluded from the
measurements?

 

=====================JN============================================

It was in the definition 2.2.5 Flow Export Rate - under Control
Information - it is defined in RFC5470 and I believe it covers both of
the things you refer to. Benoit used to really hassle me about this
term and I couldn't make him read the RFC and see it is defined there
- it was all the time: "and what is Control Information" ...
It is discussed in 4.3.3 Exporting Process and in 5.6 The Measurement
Procedure, and it is part of the test report.
I would like to avoid further discussion of this in the document since
it would lead to an endless draft - I am not writing an IPFIX bible,
just some info/experience on how to measure performance ..

=====================JN============================================

 

 

      The Exporting Process SHOULD be configured with IPFIX [RFC5101] as
      the protocol to use to format the Flow Export data. If the Flow
      monitoring implementation does not support it, proprietary
      protocols MAY be used.

 

 

PJ: what difference would this make to the testing?
Eg, NFv5 is a proprietary protocol which has smaller data records and
no templates - so it requires less export bandwidth. Wouldn't the
results look better if NFv5 was used rather than IPFIX?

 

=====================JN============================================

Again, this was a result of Benoit's constant push for IPFIX
everywhere. I don't really care, I don't think it makes any difference,
I just wanted to show some willingness/collaborativeness. I am happy to
remove it if it concerns you.
Benoit wanted me to call it IPFIX performance measurement, which I had
to strictly refuse, so I wanted to make some "concessions" elsewhere ...

=====================JN============================================

 

 

PJ: the same as what? Consider "Preferably, the export of Control
Information SHOULD be per a consistent configuration across all
testing."

 

=====================JN============================================

I am always open to better language - just change it :-) ..

=====================JN============================================

 

 

PJ: this seems vague: is any field sufficient? Is it possible to
compare results based on different fields? Some fields may be more
costly or difficult to acquire or process, therefore reducing the
metric.

 

=====================JN============================================

There isn't really anything else - out of 24 bits, 20 are the label ...
I removed it.

=====================JN============================================

 

 

 

PJ: The two SHOULDs make this sound vague. What if these aren't
followed (since SHOULD isn't compulsory)?

 

=====================JN============================================

I know, nothing really - these are just to make people think about what
they are doing and to take care with MPLS - so now we measure label
pop, label imposition etc.

=====================JN============================================

 

 

PJ: No; just above you said that the data received at the collector
should be checked.

 

=====================JN============================================

Data captured at the Collector needs to be checked, yes, but it doesn't
need to be checked at the Collector. Or how would you express that - or
isn't it just obvious what it wants to say?

=====================JN============================================

 

 

PJ: In 3.2 you said, "The Collector performance is out of scope of this
document." Yet here you place requirements on the collector performance
- so it's not out of scope. Rather, a minimum performance level is
required.

 

=====================JN============================================

I tried to clarify that, please let me know if that would do - you
sound like a lawyer here - I just don't want to measure the Collector
but I need some functionality there.

=====================JN============================================

 

 

PJ: Why Ethernet? What kind of ethernet?
You set many requirements without explaining any of them. It would be
better to explain them so the reader can understand the reason and make
informed choices about any compromises which may be necessary.

 

=====================JN============================================

It was just somebody's wish at one of the meetings, I have removed it

=====================JN============================================ 

 

   If measurements are performed with Flows containing more than one
   packet per Flow (see section 6.4 of this document) the sampling ratio
   SHOULD always be higher than the number of packets in the Flows (for
   small number of packets per Flow). This significantly decreases the
   probability of erasing a whole Flow to a minimum and the measured
   Flow Expiration Rate stays unaffected by sampling.

 

PJ: I don't understand what you're saying here.

 

=====================JN============================================

If I have flows with 10 packets each, I want to sample 1:11, for
example, so that just one packet from each flow is removed and the flow
itself stays. If I do 1:1 and have flows of 3 packets per flow there is
a higher probability of erasing a whole flow and changing the flow
rates, while with sampling 1:4 the probability is near zero, I would
think. This probably assumes some ordered traffic streams - I don't
know, should I just remove it?

=====================JN============================================
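
For what it is worth, a small probability sketch of the point above. It
assumes each packet of a Flow is independently missed by the Metering
Process with probability p_miss, and deliberately takes no position on
whether the draft's "1:M" ratio denotes the kept or the removed fraction:

    # Probability that sampling erases a whole Flow (no packet of it is metered).
    def p_whole_flow_erased(packets_per_flow: int, p_miss: float) -> float:
        return p_miss ** packets_per_flow

    for n in (1, 3, 10):
        for p_miss in (0.5, 0.25, 1 / 11):
            print(f"packets/Flow={n:2d}  p_miss={p_miss:.3f}  "
                  f"P(Flow erased)={p_whole_flow_erased(n, p_miss):.2e}")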

 

 

section 5.4 

 

PJ: show that calculation here.

 

=====================JN============================================

Why - it is just a trivial subtraction?

=====================JN============================================
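
For readers wondering what the subtraction might look like, here is one
plausible reading (an illustrative assumption, not a quote of section 5.4):
the interval used for the rate calculation is the traffic duration extended
by the Inactive Timeout, or by the time needed to fill the Cache if the
Cache fills first. All numbers are made up:

    traffic_duration = 60.0       # seconds of generated test traffic (assumed)
    inactive_timeout = 15.0       # configured Inactive Timeout in seconds (assumed)
    cache_size       = 500_000    # Flow Cache entries (assumed)
    new_flow_rate    = 50_000     # new Flows offered per second (assumed)

    cache_fill_time = cache_size / new_flow_rate             # 10 s in this example
    expiry_delay    = min(inactive_timeout, cache_fill_time)
    measurement_interval = traffic_duration + expiry_delay   # divide exported records by this
    print(measurement_interval)                              # 70.0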

 

   The Collector MUST stop collecting the Flow Export data at the
   measurement stop time.

 

PJ: So if I use an immediate cache, won't my figures look better since
more data is exported?

 

=====================JN============================================

No, the Inactive Timeout is zero here, so the result is what it should
be, no?
While with a non-zero timeout you were exporting for a shorter period,
so you divide by a shorter period.
The stop time and collection stop time are exactly the same for both.

=====================JN============================================

 

 

PJ: Agreed. Shouldn't the method be: present a known amount of traffic,
and measure from the first export to the last export (ie, regardless of
the presented traffic)?

 

=====================JN============================================

No, this is aimed mainly at emergency export - and it stops the
emergency export once the traffic is stopped. We don't want to measure
the DUT during the time when it exports at its leisure at a rate of
1000 records per second - we want to know what it can do at its
maximum.

=====================JN============================================

 

 

PJ: here, you're saying that collection of all the exported traffic is
important - which contradicts what you previously said, ie to stop
measuring the export as soon as the input traffic stream is stopped.

 

=====================JN============================================

It is important (in the bloody black box measurement scheme, else we
would just use the DUT counters) to capture ALL export for a certain
traffic rate, to see if we experienced any Flow Record losses.
To calculate the export rate we use the measurement interval as defined
and the portion of the export as defined, otherwise the export rate
includes the DUT's "leisure" export rate, which we don't want.

=====================JN============================================

 

 

PJ: The DUT could be filling an export packet at the moment the input
traffic is stopped - so that export packet could contain flow records
which were expired before and after the boundary. Therefore it's
necessary to examine the timestamps of the individual flow records
rather than the timestamps of the exported packets or the netflow/IPFIX
headers within those packets.

 

=====================JN============================================

Perhaps - if you are really interested we can have a chat about this -
to avoid the "leisure" export rate as stated above.

Considering we test with, let's say, 1 million flow records, one export
packet has marginal meaning, but I will put your note there - a good
point for the measurement statistics :-)

=====================JN============================================
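
The "marginal meaning" is easy to quantify with some assumed sizes (the MTU
and record size below are illustrative guesses, not values from the draft):

    # Roughly how much of a 1,000,000-record test fits into one export packet.
    total_flow_records = 1_000_000
    export_mtu         = 1500          # bytes, assumed Ethernet MTU
    header_bytes       = 20 + 8 + 16   # IPv4 + UDP + IPFIX message header, roughly
    record_size        = 40            # bytes per Data Record (assumed template)

    records_per_packet = (export_mtu - header_bytes) // record_size   # 36 here
    share = records_per_packet / total_flow_records
    print(f"~{records_per_packet} records per packet, i.e. {share:.4%} of the test")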

 

 

PJ: you don't know what implementation-dependent schemes may be in
place to cope with a cache-full situation. Eg, the DUT may simply
ignore any further packets which don't pertain to existing flows (per
FNF permanent cache).

PJ: The Flow Export Rate may be proportional to the packet rate, but
I'd not expect them to be at all equal, since the export rate will be a
small proportion of the presented traffic rate.

 

=====================JN============================================

True, I have actually experienced that on EARL8 - once the cache got
full, it started to drop the records somewhere - removed.

=====================JN============================================

 

 

section 6.5

 

PJ: I'm confused by the final part. What are you trying to avoid?

 

=====================JN============================================

RFC2544 does not like to test with a single pair of IP addresses in the
test traffic.

=====================JN============================================

 

PJ: I'd like to see a greater discussion of the relevance of sampling,
throughout this document. I've already indicated a few places where
sampling would impact the test.

 

=====================JN============================================

I am sorry Paul, it has been three years of twisting and editing this
document and I don't have the mental strength (or even the test
experience) to do that - if you have, feel free to co-author the
document.

=====================JN============================================

The climate of Edinburgh is such that the weak succumb young .... 

and the strong envy them.

                                 Dr. Johnson

 

 

From: Paul Aitken (paitken) 
Sent: 09 September 2011 14:01
To: Jan Novak (janovak)
Cc: Al Morton; bmwg@ietf.org
Subject: Re: WGLC: draft-ietf-bmwg-ipflow-meth-03

 

Jan,

I reviewed the changes from -02 to -03, then the changes from -01 to
-03, and finally I reviewed what you've done in -03 in response to my
previous feedback.

The new version is a big improvement over -01 which I last reviewed - so
I have many minor comments, along with several points you seem to have
missed or overlooked.

Please see my comments inline below.

I look forward to seeing a -04.

P.




Paul,
 
I would like to thank you for a very detailed review, including the
language, and mainly for improving/introducing the use of correct
terminology - I think it made the doc much better !!!

I have worked on the text since you first provided the review back
in May, so I hope it reflects and satisfies all your comments.

I have agreed with the vast majority of the changes you wanted, except
those below, which I consider substantial enough to mention separately.
Otherwise see some answers to your questions/concerns throughout the
review in the attached text file.
 
1) Maximum Flow Monitoring Throughput
 

	PJ: The actual, average or effective "Flow Monitoring Throughput"
	may be much lower, so you should clearly name this quantity as
	a *maximum*.

There is no such quantity as actual/average/effective Flow Monitoring
Throughput - there is simply no definition anywhere for those. Flow
Monitoring Throughput is a "Maximum" by definition - please see section
3.1 of the reviewed document.
This is in exact analogy with section 3.17 of RFC1242, which defines
the packet forwarding throughput.
One could look at your suggestion as just a change of name for the
defined quantity, but the "Maximum" here would be an adjective to
something else and you would have to define what the rest behind the
adjective is ...


OK.





2) Export of options and templates
 

	Also, I didn't yet read anything about allowing for export
templates
	and/or options. Should these be included or excluded from the 
	measurements?

I would like to strongly disagree here. We had extensive discussions
about it already with one of the previous reviewers and I always
pointed to the definition of Control Information in RFC5470 section 2
(and I think also in other IPFIX documents), which in my opinion covers
the export of options and templates, but I never got any answer
regarding that - not from you yet either.
The reviewed document has explicit statements about the export of this
information in several places, unless my understanding of the RFC5470
definition is wrong.
I hope not, though, since I wouldn't like to dive into the definitions
of what those are in IPFIX. I hope you can confirm that.


OK, I'm happy with that. Please clarify this in section 4.3.3, eg:

      Various Flow monitoring implementations might use different
      default values regarding the export of Control Information
      [RFC5470]. Note that Control Information includes IPFIX Options
      and Templates [RFC5101].
      The Flow Export corresponding to Control Information ...





3) Packet Sampling 
 

	PJ: I'd like to see a greater discussion of the relevance of
	sampling, throughout this document. I've already indicated a few
	places where sampling would impact the test.

Packet sampling is out of the scope of this document. If you prefer, I
will replace section 4.5 of the reviewed document with an explicit
statement saying just that instead. It is a separate, I would say
research, subject which can be undertaken by future work/a future
document.


Per Brian's feedback, please add to section 4.5:

    Packet sampling and flow sampling is out of scope of this document.
    This document applies to situations without packet or flow sampling.


Further specific comments:



Section 1: remove "the":

    is provided in the section 3.3.

(And many more times, throughout the document - eg, at the end of
Section 3.1, in Section 3.4.1, at the very end of 3.4.2, ...  Please
search yourself.)



In section 1, at the bottom of page 2, write "DUT's" and remove
"support":

    The only restriction may be the DUT's lack
    of support for Flow monitoring support of the particular traffic
    type.



Section 2.2.3: add "the"

    when the Active Timeout is zero



Section 2.2.5:

     mention that there SHOULD NOT be any export filtering, so that all
the expired cache entries are exported. If there is export filtering and
it can't be disabled, this needs to be noted. 



Also in section 3.1, I had said:

    note that the DUT may form an export message containing both flow
records which expired before the end of the test and flow records which
expired afterwards. Ideally the testing method will allow any final
export messages to be received and count any flow records within those
messages which expired before the end of the test. Else, a device which
delays the export packet for longer will have more granular results than
a device which exports records immediately.



Our implementation puts the expired records into an export packet - but
doesn't export this immediately, in order to save CPU and export
bandwidth. It's exported once it's full, or after a short time. It's
possible that this export packet may be being built when the testing
period ends - so these flow records won't be seen during the testing
window, and this implementation will seem less efficient (since less
data is exported) although it's actually more efficient (by using less
packets, CPU and export bandwidth).
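
A minimal sketch of the counting rule described in the comment above:
select Flow Records by their own end/expiry timestamps rather than by the
arrival time of the export packet that carried them. The field name
end_time is hypothetical:

    # Count only Flow Records whose own end timestamp falls inside the test
    # window, regardless of when their export packet reached the Collector.
    def records_in_window(flow_records, test_start, test_end):
        return sum(1 for r in flow_records if test_start <= r.end_time <= test_end)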



Section 3.4.1:

   In this scenario every packet seen by the DUT creates a
   new Cache entry and forces the DUT to the full Cache processing fill
the Cache
   instead of just updating packet and byte counters on of an already
   existing Cache entry.



Section 3.4.2: the indentation of the second line in each point is wrong
- compare with other a/b/c immediately above and below:

    The required test traffic analysis mainly involves the following:

    a. Which packet header parameters are incremented or changed during
    traffic generation
    b. Which Flow Keys the Flow monitoring configuration uses to
generate
    Flow Records



Section 4.1:

s/if/whether/

   The ideal way to implement the measurement is by using a single
   device to provide the sender and receiver capabilities with a sending
   port and a receiving port. This allows for an easy check if whether
all the
   traffic sent by the sender was re-transmitted by the DUT and received
   at the receiver.

PJ: If the sender and receiver are independent (ie, two devices), then
there needs to be another link directly from the sender to the receiver
(ie, not through the DUT), so the receiver can make that comparison.

- how can the comparison be made if the tester is unable to implement
your suggestion of a single device?



Section 4.1:

    In all measurements, the export
    interface MUST have enough bandwidth to transmit Flow Export data
    without congestion. In other words, the export interface MUST NOT be
    a bottleneck during the measurement.

 

PJ: What if the receiving interface, or the collector itself, is a
bottleneck? Previously you said it doesn't matter, and I disagreed.

- data could be lost at the receiving end, making the DUT look worse
than it really is.



Section 4.2:

    The DUT export interface (see Figure 2) MUST be configured with
    sufficient output buffers to avoid dropping the Flow Export data due
    to a simple lack of resources in the interface hardware. 

 

PJ: then it might not be configured as it would in real life conditions.
Wouldn't it be better to record the real-life results than to tweak the
settings just for testing purposes?

- eg, consider if the DUT operates with insufficient buffers in the
real-world scenario, and therefore drops a lot more data than when
testing under "ideal" conditions in the lab.



Section 4.3:

    The DUT SHOULD support IPFIX [RFC5101] 

Say why?



Section 4.3.1:

      In the case when both ingress and egress Flow monitoring is 
      enabled on one DUT the results analysis needs to take into account
      that each Flow will be represented in the DUT Cache by two Flow
      Records (one for each direction) and therefore also the Flow
      Export will contain those two Flow Records.

PJ: you assume a single cache. ingress and egress may potentially be
recorded in separate caches. This may, or may not, impact the
performance.

- so mention that whether the combined ingress and egress traffic is
measured in one cache or each separately in its own cache, may impact
performance and should be recorded.



Section 4.3.2: add "of", remove "the", and move "instantly" :

      The Cache's Inactive and Active Timeouts MUST be known and taken
      into account when designing the measurement as specified in
      section 5. If the Flow monitoring implementation allows only
      timeouts of zero (e.g. immediate timeout or non-existent Cache)
then
      the measurement conditions in the section 5 are fulfilled
      inherently without any additional configuration. The DUT simply
      instantly exports instantly information about every single packet.

PJ: Assuming that the cache works with timeouts. What if it uses some
other mechanism, eg number of packets in the flow?

- state what impact other such mechanisms may have.



Section 4.3.3: add "of" :

    Section 10 of [RFC5101] and section 8.1 of [RFC5470] discuss



Section 4.3.3:

    The Exporting Process SHOULD be configured with IPFIX [RFC5101] as
    the protocol to use to format the Flow Export data. If the Flow
    monitoring implementation does not support it, proprietary
    protocols MAY be used.

 

PJ: what difference would this make to the testing?

Eg, NFv5 is a proprietary protocol which has smaller data records and no
templates - so it requires less export bandwidth. Wouldn't the results
look better if NFv5 was used rather than IPFIX?

- at least state that since proprietary protocols may make a
considerable difference to the testing, the exact protocol being tested
(and any related configuration parameters) MUST be recorded and only
similar protocols should be compared.



Section 4.3.5, remove "on" :

      The test report should therefore contain information
      containing on how many Metering and Exporting processes were
      configured on the DUT for the selected Observation Points.



Section 4.3.6: insert "The"

    The forwarding performance document [RFC5695] specifies



Section 4.4:

   However if the Collector is also used to decode the Flow Export data
   then it SHOULD support IPFIX [RFC5101] for easier results analysis.

PJ: Again, why does IPFIX make it easier?



Section 4.9.1:

   A packet with destination IP address equal to A is sent every 10
   seconds, so the Cache entry would be refreshed in the Cache every 10
   seconds. However, the Inactive Timeout is 5 seconds, so the Cache
   entries will expire from the Cache due to the Inactive Timeout and
   when a new packet is sent with the same IP address A it will create a
   new entry in the Cache.

PJ: theoretically. In practice... the DUT has to check those 10,000
cache entries within the 10 seconds to ensure that expired cache entries
are exported. If it checks 1,000 cache entries per second, it may only
just be ready to expire the existing cache entry when the new packet
arrives. Therefore the new packet may sometimes be added to the existing
cache entry, giving occasional 2 packet flows!

- so note that this behaviour depends upon the design and efficiency of
the cache ager, and incidences of multi-packet flows observed during
this test should be noted.
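
A back-of-the-envelope sketch of this point, using the numbers from this
comment (10,000 entries, an assumed 1,000 checks per second) and the 4.9.1
example (5 second Inactive Timeout, one packet per Flow every 10 seconds):

    # Worst case: can the entry for address A still be in the Cache when its
    # next packet arrives?
    cache_entries    = 10_000   # entries the ager must walk
    ager_rate        = 1_000    # entries checked per second (assumed)
    inactive_timeout = 5        # seconds, per the 4.9.1 example
    packet_period    = 10       # seconds between packets of the same Flow

    full_scan_time    = cache_entries / ager_rate           # 10 s per sweep
    worst_case_expiry = inactive_timeout + full_scan_time   # entry may live this long
    if worst_case_expiry >= packet_period:
        print("the refresh can hit a not-yet-expired entry -> occasional 2-packet Flows")
    else:
        print("the entry always expires before the next packet arrives")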



At the end of section 4.9.1:

   with large Cache Sizes and high packet rate where the DUT's actual



Section 4.9.2:

   So each stream has a packet rate of 10 packets per second. The
packets



Section 4.9.2:

PJ:

   A packet with destination IP address equal to A is sent every 0.1
   second, so it means that the Cache entry is refreshed in the Cache   



Section 5.1: Flow Monitoring Configuration

PJ: the discussion of Cache Size only applies to devices which implement
a fixed-size cache, and not to devices which allocate their cache
dynamically.

- A certain architecture and design are assumed here. However RFC 5470
doesn't even mention a cache, never mind the cache allocation policy.

So when you say:

      The number of unique Flow Keys sets that the traffic
      generator (sender) provides should be multiple times larger than
      the Cache Size, to ensure that the existing Cache entries are
      never updated before Flow Expiration and Flow Export.

- this may not be possible if the cache is dynamically allocated,
growing and shrinking as flows are added and removed.



Section 5.1: Active Timeout and Inactive Timeout:

PJ: Then timeouts are required. A cache which uses a different method to
expire entries can't be tested.

- this is discussed to some extent in section 5.1.1 of RFC 5470.

I think you could work around all the config issues by simply saying
that the DUT should conform to the IPFIX Config model described in
draft-ietf-ipfix-configuration-model (which is in the RFC Editor's
queue, waiting for the IPFIX MIB).



Section 5.3: the "Page 19" header/footer is broken - search for [Page
19].



Section 5.4:

   Otherwise the time to fill up the Cache needs to be used for
   calculation of the measurement time interval in the place of the
   Inactive Timeout.

PJ: show that calculation here.

- show the exact calculation you have in mind, to ensure that every
reader calculates the same.



Section 5.6: add "of" :

    b. the number of Flow Records corresponding to



Section 6.5, point "b" :

      In the
      particular set-up discussed here this would mean a traffic stream
      with just one pair of unique source and destination IP addresses
      (but could be avoided if Flow Keys were for example UDP/TCP source
      and destination ports and Flow Keys did not contain the
      addresses).

PJ: I'm confused by the final part. What are you trying to avoid?



Section 7: "The" shouldn't be capitalised:

   The pure Flow Monitoring Throughput measurement in section 5 provides
   The capability to verify the Flow monitoring accuracy in terms of the



B.1 add "the" and "a":

      where one traffic component exercises the Flow
      Monitoring Plane and the second traffic component loads only
      the Forwarding Plane without affecting Flow monitoring (e.g. it
      creates just a certain amount of permanent Cache entries).


Cheers,
P.