Re: [bmwg] updates to the data center benchmarking drafts

Dean Lee <dlee@ixiacom.com> Wed, 13 November 2013 03:03 UTC

From: Dean Lee <dlee@ixiacom.com>
To: "MORTON, ALFRED C (AL)" <acmorton@att.com>, "Lucien Avramov (lavramov)" <lavramov@cisco.com>, "bmwg@ietf.org" <bmwg@ietf.org>
Date: Wed, 13 Nov 2013 03:02:00 +0000
Subject: Re: [bmwg] updates to the data center benchmarking drafts

Hi Lucien,

As one of my action items from the IETF 88 meeting, I'd like to comment on
the limitations of the snake test in this thread, to keep everyone in the
loop.

The snake test is most popular in manufacturing testing, where the design
implementation has already been fully characterized in system QA; there it
is highly effective at identifying issues caused by the manufacturing
process. As Al pointed out in his comments, however, the snake test is not
deterministic across all the ingress ports connected to the DUT: the DUT's
behavior on its egress ports affects the traffic characteristics on the
ingress ports connected to it. I can summarize the limitations of the
snake test as follows:

* The DUT's performance and behavior may create a nondeterministic
offered load across all ingress ports connected to the DUT.
* When packet loss is observed, the test tool will not be able to identify
the source of the problem.
* The loading pattern of the snake test can be considered similar to the
RFC 2544 port-pair test; however, it is drastically different from the
RFC 2889 full-mesh test. The full-mesh test takes into consideration the
frame ordering and arrival times applied to all ingress ports, whereas
the snake test has no control over the timing of the applied load (see
the sketch below).
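
To make that third point concrete, here is a small Python sketch (my own
illustration, not from either draft) that enumerates the flows offered in
an RFC 2544-style port-pair test versus an RFC 2889 full-mesh test; the
full mesh forces every egress port to absorb simultaneous arrivals from
all other ports, a load pattern the snake topology cannot reproduce.

    def port_pair_flows(n_ports):
        # RFC 2544-style port pairs: each egress port receives
        # traffic from exactly one ingress port.
        return [(p, p + 1) for p in range(1, n_ports, 2)]

    def full_mesh_flows(n_ports):
        # RFC 2889-style full mesh: every port sends to every other
        # port, so each egress port must absorb frames arriving from
        # n_ports - 1 ingress ports at once.
        return [(s, d) for s in range(1, n_ports + 1)
                       for d in range(1, n_ports + 1) if s != d]

    print(len(port_pair_flows(8)))  # 4 flows
    print(len(full_mesh_flows(8)))  # 56 flows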


Dean





On 10/31/13, 9:41 AM, "MORTON, ALFRED C (AL)" <acmorton@att.com> wrote:

>Re-reading the draft, I realized we are *not* talking about
>measuring variation in packet spacing here in BMWG; we ARE talking
>about variation in delay from packet to packet.
>
>While the source jitter could have some influence on the result,
>it's much less than I implied in this comment, so please disregard:
>
>> But what if the tester's stream is not perfectly CBR?
>> How would we account for this in the jitter measurement,
>> when using inter-packet variation as the benchmark?
>> (I need to go back and read some of the earlier RFCs...)
>> 
>> It occurs to me that we should be prepared to account for "source
>> jitter" in the inter-packet delay variation benchmark somehow,
>> *or* use a different metric where we only have to confirm the tester
>> applies the send and receive timestamps accurately (which would be PDV
>> as I mentioned last week).
>
>I have another comment about jitter, but I'm clearly guilty of mixing
>different threads this morning - sorry for the confusion.
>Al
>
>> -----Original Message-----
>> From: bmwg-bounces@ietf.org [mailto:bmwg-bounces@ietf.org] On Behalf Of
>> MORTON, ALFRED C (AL)
>> Sent: Thursday, October 31, 2013 12:27 PM
>> To: Lucien Avramov (lavramov); bmwg@ietf.org
>> Subject: Re: [bmwg] updates to the data center benchmarking drafts
>> 
>> Hi Lucien, Jacob, and all,
>> 
>> I've been reading the revised drafts, and had a few comments
>> which continue the ideas expressed in our "jitter" discussion.
>> 
>> If anyone else wants an advance copy, please contact me or
>> the authors. If there are enough requests, we'll post them somewhere
>> and send a link, but I think the plan is to upload when
>> submission opens again on Monday...
>> 
>> Al
>> (as participant)
>> 
>> Both drafts describe the "snake test", where traffic from a tester with
>> insufficient ports to connect to all DUT ports is supplemented by
>> connecting cables that loop DUT outputs back to inputs. The methodology
>> section 2.2 wisely excludes latency and jitter benchmarks from this
>> set-up, which is a limitation I was looking for. It seems to me that
>> the throughput results could be affected in such a configuration, too.
>> For example, if the test traffic characteristics (like inter-packet
>> spacing) are drastically modified on any pass through the DUT, the next
>> pass is stressed in different ways and loss may be more likely.
>> 
>> I'd like to hear if others have similar concerns before we recommend
>> any change for the draft on the snake test topic.
>> 
>> This got me thinking about the test streams we generate, and that these
>> streams will have some inter-packet spacing relationships as they leave
>> the tester and enter the DUT. If the streams have perfectly regular
>> spacing between packets (they are absolutely constant bit rate, CBR),
>> then any variation from the fixed spacing is attributable to the DUT.
>> 
>> But what if the tester's stream is not perfectly CBR?
>> How would we account for this in the jitter measurement,
>> when using inter-packet variation as the benchmark?
>> (I need to go back and read some of the earlier RFCs...)
>> 
>> It occurs to me that we should be prepared to account for "source
>> jitter" in the inter-packet delay variation benchmark somehow,
>> *or* use a different metric where we only have to confirm the tester
>> applies the send and receive timestamps accurately (which would be PDV
>> as I mentioned last week).
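>> 
>> To make the distinction concrete, here is a minimal Python sketch (my
>> own illustration, not text from either draft): inter-packet delay
>> variation compares the one-way delays of consecutive packets, while
>> PDV in the RFC 5481 sense compares each packet's delay to the minimum
>> observed delay, so it depends only on accurate send and receive
>> timestamps.
>> 
>>     def one_way_delays(send_ts, recv_ts):
>>         # One-way delay per packet from tester timestamps.
>>         return [r - s for s, r in zip(send_ts, recv_ts)]
>> 
>>     def ipdv(send_ts, recv_ts):
>>         # Inter-packet delay variation: delay difference between
>>         # consecutive packets (an RFC 3393 selection function).
>>         d = one_way_delays(send_ts, recv_ts)
>>         return [d[i] - d[i - 1] for i in range(1, len(d))]
>> 
>>     def pdv(send_ts, recv_ts):
>>         # Packet delay variation: each delay minus the minimum
>>         # delay (RFC 5481 style); source jitter does not skew it
>>         # as long as the timestamps themselves are accurate.
>>         d = one_way_delays(send_ts, recv_ts)
>>         dmin = min(d)
>>         return [x - dmin for x in d]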
>> 
>> Also, I read in section 2.3 of the methodology:
>>    -for latency and jitter, provide minimum, average and maximum values.
>>    if different iterations are done to gather the minimum, average and
>>    maximum, it SHOULD be specified in the report along with a
>>    justification on why the information could not have been gathered at
>>    the same test iteration
>> Collecting single aspects of the distribution in separate iterations
>> of the test could lead to inconsistency. This might be an area where
>> we encourage test equipment vendors to add a few capabilities to claim
>> compliance.
>> 
>> I had a few specific comments on the definitions draft, which says:
>> 3.3 Measurement Units
>> 
>>    The jitter MUST be measured when sending packets of the same size.
>>    Jitter MUST be measured as packet to packet delay variation and delta
>>    between min and max packet delay variation of all packets sent. A
>>    histogram MAY be provided as a population of packets measured per
>>    latency or latency buckets.
>> 
>> The first sentence is really part of the methodology, so it should
>> move to that draft.
>> The second sentence is actually part of the definition, 3.1.
>> Somehow, the topic of measurement units doesn't get covered here :-(
>> though I've seen the units (milliseconds or microseconds) mentioned
>> elsewhere.
>> 
>> Please look more carefully at the BMWG template for terminology and
>> how it has been used in our RFCs, and reorganize this material so that
>> our loyal followers don't become confused. (The same thing seems to
>> happen in section 4, and section 5 is closer, but it includes
>> methodology too, etc.)
>> 
>> The section 6 Buffer Size definition is quite long; I was looking for
>> a tight sentence that gives the meaning of the term...
>> 
>> Section 7 on Application Throughput: Data Center Goodput, says:
>> 7.1. Definition
>> 
>>    In Data Center Networking, a balanced network is a function of
>>    maximal throughput 'and' minimal loss at any given time. This is
>>    defined by the Goodput. Goodput is the application-level throughput.
>>    It is measured in bytes / second. Goodput is the measurement of the
>>    actual payload of the packet being sent.
>> I suspect there's an existing definition we can cite for this.
>> The "measurement units" creep in as the third sentence; you know
>> where they go now.
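>> 
>> For what it's worth, the arithmetic is easy to pin down (an
>> illustrative sketch with made-up numbers, not from the draft): goodput
>> counts only application payload bytes delivered per unit time, so
>> frame headers and any retransmitted bytes are excluded.
>> 
>>     def goodput_bps(payload_bytes_delivered, duration_s):
>>         # Application-level payload delivered per second; protocol
>>         # headers and retransmissions do not count.
>>         return payload_bytes_delivered / duration_s
>> 
>>     # Hypothetical example: 100,000 frames carrying 1460-byte TCP
>>     # payloads inside 1518-byte Ethernet frames, delivered in 1 s.
>>     print(goodput_bps(100_000 * 1460, 1.0))  # 146,000,000 bytes/s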
>> 
>> 
>> > -----Original Message-----
>> > From: bmwg-bounces@ietf.org [mailto:bmwg-bounces@ietf.org] On
>> > Behalf Of Lucien Avramov (lavramov)
>> > Sent: Friday, October 25, 2013 12:51 PM
>> > To: bmwg@ietf.org
>> > Subject: [bmwg] updates to the data center benchmarking drafts
>> >
>> > Hi all,
>> >
>> > As the submission for updates is closed until November 4th 2013, I
>> > would like to provide you with an update of the changes we are
>> > incorporating, in addition to the current thread regarding jitter:
>> >
>> > A] First draft draft-dcbench-def-01 has the following changes to date:
>> >
>> > change 'measurement' -> to 'measurement units' [all sections]
>> > 2.1 change 'latency' -> to 'latency interval'
>> > 2.1 added content:
>> >
>> > Another possibility to summarize the four different definitions
>> > above is to refer to the bit positions as they normally occur:
>> > input to output.
>> >      FILO is FL (First bit Last bit)
>> >      FIFO is FF (First bit First bit)
>> >      LILO is LL (Last bit Last bit)
>> >      LIFO is LF (Last bit First bit)
>> >
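>> > One way to read this table (an illustrative sketch, not draft
>> > text): with first-bit and last-bit timestamps taken at the DUT
>> > input and output, the four definitions are four subtractions:
>> >
>> >     def latencies(in_first, in_last, out_first, out_last):
>> >         # Timestamps: first/last bit into the DUT and first/last
>> >         # bit out of the DUT, for a single frame.
>> >         return {
>> >             "FILO (FL)": out_last - in_first,
>> >             "FIFO (FF)": out_first - in_first,
>> >             "LILO (LL)": out_last - in_last,
>> >             "LIFO (LF)": out_first - in_last,
>> >         }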
>> >
>> > 2.2 Edits around FILO due to the conversation; changed it based on
>> > the feedback provided.
>> >
>> > 3.1 added the following:
>> >
>> > Even with the reference to RFC 3393, there are many definitions of
>> >     "jitter" possible. The one selected for Data Center Benchmarking
>> >     is closest to RFC 3393.
>> >
>> > [note that we will update the current jitter conversation as well]
>> >
>> > B] draft-bmwg-dcbench-methodology-02:
>> >
>> > section 2.2 added the snake test for throughput, as many customers
>> > may not have the luxury of having so many IXIA ports:
>> >
>> >     Alternatively, when a traffic generator CAN NOT be connected
>> >     to all ports on the DUT, a snake test MUST be used for line
>> >     rate testing, excluding latency and jitter as those then
>> >     become irrelevant. The snake test consists of the following
>> >     method:
>> >     -connect the first and last port of the DUT to a traffic
>> >      generator
>> >     -connect back to back sequentially all the ports in between:
>> >      port 2 to port 3, port 4 to port 5, etc., up to port n-2 to
>> >      port n-1, where n is the total number of ports of the DUT
>> >     -configure port 1 and port 2 in the same vlan X, port 3 and
>> >      port 4 in the same vlan Y, etc., and port n-1 and port n in
>> >      the same vlan ZZZ
>> >     This snake test provides the capability to test line rate for
>> >     Layer 2 and Layer 3 per RFC 2544/3918 in instances where a
>> >     traffic generator with only two ports is available. Latency
>> >     and jitter are not to be considered with this test.
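>> >
>> > To make the cabling plan unambiguous, here is a small Python
>> > sketch (my own illustration, assuming an even port count n; it is
>> > not part of the draft text):
>> >
>> >     def snake_plan(n):
>> >         # Tester attaches to port 1 and port n. Loop cables pair
>> >         # the remaining ports back to back: 2-3, 4-5, ..., up to
>> >         # (n-2)-(n-1).
>> >         cables = [(p, p + 1) for p in range(2, n - 1, 2)]
>> >         # Each vlan spans one forwarding hop: (1,2), (3,4), ...,
>> >         # (n-1,n), so traffic snakes through every port once.
>> >         vlans = [(p, p + 1) for p in range(1, n, 2)]
>> >         return cables, vlans
>> >
>> >     cables, vlans = snake_plan(8)
>> >     print(cables)  # [(2, 3), (4, 5), (6, 7)]
>> >     print(vlans)   # [(1, 2), (3, 4), (5, 6), (7, 8)]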
>> >
>> > section 2.3 added the IMIX Genome and fixed some packet size examples.
>> >
>> > The pattern for testing can be expressed using RFC 6985 [IMIX Genome:
>> >     Specification of Variable Packet Sizes for Additional Testing]
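>> >
>> > For example (a sketch assuming the RFC 6985 letter-to-size table;
>> > the RFC is authoritative), a genome string encodes the repeating
>> > frame-size sequence of the test stream:
>> >
>> >     # Letter-to-size mapping in the style of the RFC 6985 IMIX
>> >     # Genome; consult the RFC for the authoritative table.
>> >     GENOME_SIZES = {"a": 64, "b": 128, "c": 256, "d": 512,
>> >                     "e": 1024, "f": 1280, "g": 1518}
>> >
>> >     def expand_genome(genome):
>> >         # "aaafg" -> three 64-byte frames, one 1280-byte frame,
>> >         # and one 1518-byte frame, repeated for the test run.
>> >         return [GENOME_SIZES[ch] for ch in genome]
>> >
>> >     print(expand_genome("aaafg"))  # [64, 64, 64, 1280, 1518]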
>> >
>> > Also: "-for packet drops, they MUST be expressed in packet count
>> > value and SHOULD be expressed in % of line rate" was changed to
>> > '% of total transmitted frames' instead, for more clarity.
>> >
>> > section 3.1: "To measure the size of the buffer of a DUT under all
>> > conditions." It's not realistic, so we changed "all" to
>> > typical|many|multiple.
>> >
>> > Thank you for taking the time to read this.
>> > Looking forward to reading your comments.
>> >
>> > Cheers,
>> > Lucien