Re: [bmwg] Martin Duke's Discuss on draft-ietf-bmwg-b2b-frame-03: (with DISCUSS and COMMENT)

"Scott O. Bradner" <sob@sobco.com> Wed, 16 December 2020 11:44 UTC

From: "Scott O. Bradner" <sob@sobco.com>
Date: Wed, 16 Dec 2020 06:43:48 -0500
Cc: Martin Duke <martin.h.duke@gmail.com>, The IESG <iesg@ietf.org>, "draft-ietf-bmwg-b2b-frame@ietf.org" <draft-ietf-bmwg-b2b-frame@ietf.org>, "bmwg-chairs@ietf.org" <bmwg-chairs@ietf.org>, "bmwg@ietf.org" <bmwg@ietf.org>
To: "MORTON, ALFRED C (AL)" <acm@research.att.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/l3ThAm-W2BDFj0MdZyBkmQ5OmZ8>


> On Dec 15, 2020, at 5:04 PM, MORTON, ALFRED C (AL) <acm@research.att.com> wrote:
> 
> Hi Scott,
> 
> Please see my replies below, marked [acm], with a couple of questions.
> I hope I'm not missing something obvious, so trying to be very clear in all replies!
> But I could be overlooking something, and if so I will learn something very soon...
> 
>> -----Original Message-----
>> From: Scott O. Bradner [mailto:sob@sobco.com]
>> Sent: Tuesday, December 15, 2020 9:06 AM
>> To: MORTON, ALFRED C (AL) <acm@research.att.com>
>> Cc: Martin Duke <martin.h.duke@gmail.com>; The IESG <iesg@ietf.org>;
>> draft-ietf-bmwg-b2b-frame@ietf.org; bmwg-chairs@ietf.org; bmwg@ietf.org
>> Subject: Re: [bmwg] Martin Duke's Discuss on draft-ietf-bmwg-b2b-frame-03:
>> (with DISCUSS and COMMENT)
>> 
>> I basically understood that but it seemed to me that using a fixed (2
>> second) extra time, which is unrelated
>> to whatever time that the burst might have taken to be sent seemed risky
>> since I could
>> imagine cases where the play out speed was less than the receive speed
> 
> [acm] 
> I guess I don't understand your example, where (buffer?) play-out speed plays a role in the results, and how play-out speed could be less than the receive speed in the multi-second time scale of the buffer-bloat example. I think (buffer) play-out speed and receive speed should be nominally the same.


I expect that generally they would be about the same, but it's all software and different routines would
handle input & output so one cannot be sure - in addition the system could be adding keep-alive packets,
routing updates etc. to the output stream (not that they would take much time to send)
> 
> Although RFC 2544 Throughput definition is based on offered load delivered loss-free to the receiver, we use it here as the best approximation available for packet header processing rate (equal to playout rate from the buffer?), egress from the DUT, and the speed at which the test system receives packets. 
> 
> So, in our diagram from the memo:
> 
>                        |------------ DUT --------|
>   Generator -> Ingress -> Buffer -> HeaderProc -> Egress -> Receiver
> 
> Is your play-out speed the HeaderProc speed, or Egress speed?
> 
> And how can the (buffer) play-out speed be less than the speed at a subsequent interface (for very long)?

"very long" is a relative term :-)

I agree that there should not be any issue if 2 seconds is long relative to the burst length but
maybe not so if the burst length is long relative to 2 seconds (e.g. it takes a minute or two
to fill the buffer)

Scott
> 
> help me understand the mechanics I'm overlooking, my friend!
> 
>> 
>> but if you are convinced that the 2 seconds extra time would cover all
>> possible cases then go to it
> [acm] 
> 
> Well, we say "at least 2 seconds" and allow for customization if necessary.
> 
> As you know, I've conducted LOTS of production network testing, where we have used static waiting times to distinguish packet loss from long delay, and prescribed the same in IPPM RFCs, etc. A static waiting time "Tmax" has served us well.
> 
> Here, we have the added stability of the Isolated Test Environment (ITE, as Kevin Dubray called it), and the three time-component definition of trial duration, where we wait 2 seconds after the last packet seen on egress (it is more like a cool-down interval between trials). I think all the adaptation we need comes from explicit recognition that the time for the Test Receiver to receive the entire burst depends on the buffer size, the DUT header processing rate, the actual interface speed, etc. IOW, all the unknown variables.
> 
> Thanks again for your time, Scott!
> Al
> 
>> 
>> Scott
>> 
>> 
>>> On Dec 14, 2020, at 7:24 PM, MORTON, ALFRED C (AL)
>> <acm@research.att.com> wrote:
>>> 
>>> Hi Scott, thanks for helping with this discussion.
>>> 
>>> I'm trying to formulate adaptive extra time based on the time it takes
>> to *receive* the burst, with the additional "at least 2 seconds"  waiting
>> time to be sure we received all the packets that might arrive.  Let me try
>> drawing the timeline that's in my mind, and I'll use a buffer-bloat case
>> example of a 1 second buffer (which dominates all other buffers in the
>> DUT).
>>> 
>>> One of the key contributions of this memo is recognizing that the buffer
>> is being emptied while the burst of back-to-back frames is simultaneously
>> trying to fill the buffer.
>>> 
>>> Assume that the RFC 2544 Throughput is only half of the back-to-back
>> frame rate for the frame size used.
>>> 
>>> From the draft:
>>>  4.  A helpful concept is the buffer filling rate, which is the
>>>      difference between the Max Theoretical Frame Rate (ingress) and
>>>      the Measured Throughput (HeaderProc on egress).  If the actual
>>>      buffer size in frames was known, the time to fill the buffer
>>>      during a measurement can be calculated using the filling rate as
>>>      a check on measurements.  However, the Buffer in the model
>>>      represents many buffers of different sizes in the DUT data path.
>>> 
>>> So (danger: calculating while typing and drawing!), a 1 second burst of
>> B2B frames only raises the buffer occupation to 50%, and another second of
>> transmission is needed before reaching 100% occupation.
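The fill-rate arithmetic above can be sketched as follows. This is only an illustration: the frame rates and buffer size are hypothetical round numbers rather than measurements, the function names are not from the draft, and the sketch assumes the buffer drains at the measured Throughput for as long as the burst is arriving.

```python
def buffer_fill_rate(max_frame_rate, throughput):
    """Net rate (frames/s) at which the buffer fills during a burst."""
    return max_frame_rate - throughput


def time_to_fill(buffer_frames, max_frame_rate, throughput):
    """Seconds of back-to-back transmission needed to fill the buffer."""
    rate = buffer_fill_rate(max_frame_rate, throughput)
    if rate <= 0:
        raise ValueError("buffer never fills: throughput >= ingress rate")
    return buffer_frames / rate


# Hypothetical values chosen to match the example in the text:
# Throughput is half the back-to-back rate, and the buffer holds
# one second's worth of frames at the back-to-back rate.
MAX_RATE = 1_000_000          # frames/s, assumed
THROUGHPUT = MAX_RATE / 2     # frames/s, assumed
BUFFER_FRAMES = 1_000_000     # frames, assumed

fill_s = time_to_fill(BUFFER_FRAMES, MAX_RATE, THROUGHPUT)
occupancy_after_1s = buffer_fill_rate(MAX_RATE, THROUGHPUT) * 1.0 / BUFFER_FRAMES

print(fill_s, occupancy_after_1s)
```

With these assumed numbers, one second of burst leaves the buffer half full and a second second fills it.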
>>> 
>>> Trial
>>> Time, sec: 0          1          2          3          4          5          6
>>> 
>>> Sender:    |==========|==========|
>>> Receiver:  |= = = = = |= = = = = |= = = = = |= = = = = |
>>> Waiting Time                                           |          |          |
>>>                                                                     Trial Ends
>>> 
>>> In the ideal example timeline above, the back-to-back burst stopped
>> exactly when the buffer reached capacity, so there is no loss. The buffer
>> fill rate is half the back-to-back rate. Also, it takes 2 seconds to
>> deplete the buffer and for frames to stop arriving at the receiver. Only
>> then do we start the 2 second waiting time to ensure no more frames will
>> arrive!
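The timeline described above can be reproduced with a short calculation. This is a sketch under simplifying assumptions that are not stated in the draft: no frame loss, DUT latency ignored, and egress running at the Throughput rate from time zero. The function name and the numeric inputs are hypothetical.

```python
def trial_timeline(burst_frames, max_frame_rate, throughput, wait_s=2.0):
    """Return (send_end, receive_end, trial_end) in seconds from trial start.

    Assumes a loss-free burst, zero DUT latency, and egress at the
    Throughput rate starting at t = 0.
    """
    send_end = burst_frames / max_frame_rate   # sender finishes the burst
    receive_end = burst_frames / throughput    # last frame leaves the DUT
    trial_end = receive_end + wait_s           # waiting time starts only then
    return send_end, receive_end, trial_end


# Hypothetical numbers: a 2 second burst at 1,000,000 frames/s,
# with Throughput at half that rate.
send_end, receive_end, trial_end = trial_timeline(
    burst_frames=2_000_000, max_frame_rate=1_000_000, throughput=500_000
)
print(send_end, receive_end, trial_end)
```

Under these assumptions the sender stops at 2 seconds, frames stop arriving at 4 seconds, and the trial ends at 6 seconds.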
>>> 
>>> While we're here, let's look at a calculation from the memo:
>>> 
>>>  Corrected DUT Buffer Time =
>>>                         /                                         \
>>>          Implied DUT    |Implied DUT       Measured Throughput    |
>>>       =  Buffer Time -  |Buffer Time * -------------------------- |
>>>                         |              Max Theoretical Frame Rate |
>>>                         \                                         /
>>>       =  2 - [ 2 * 0.5 ] seconds
>>>       =  1 second
>>> 
>>> and we avoid the error of calculating buffer time based on the sender's
>> burst duration alone.
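As a rough check on the calculation above, here is a minimal sketch of the same formula. The function and parameter names are illustrative only, and the input values are the hypothetical ones from this example (a 2 second burst with Throughput at half the Max Theoretical Frame Rate).

```python
def implied_buffer_time(burst_frames, max_frame_rate):
    """Buffer time implied by the burst length alone (seconds)."""
    return burst_frames / max_frame_rate


def corrected_buffer_time(implied_s, throughput, max_frame_rate):
    """Implied buffer time minus the share drained during the burst."""
    return implied_s - implied_s * (throughput / max_frame_rate)


# Hypothetical inputs matching the worked example.
implied = implied_buffer_time(burst_frames=2_000_000, max_frame_rate=1_000_000)
corrected = corrected_buffer_time(implied, throughput=500_000,
                                  max_frame_rate=1_000_000)
print(implied, corrected)
```

The correction removes the frames that left the buffer while the burst was still arriving, so the result is smaller than the sender's burst duration whenever the Throughput is nonzero.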
>>> 
>>> hope this helps,
>>> Al
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Scott O. Bradner [mailto:sob@sobco.com]
>>>> Sent: Saturday, December 12, 2020 5:18 PM
>>>> To: MORTON, ALFRED C (AL) <acm@research.att.com>
>>>> Cc: Martin Duke <martin.h.duke@gmail.com>; The IESG <iesg@ietf.org>;
>>>> draft-ietf-bmwg-b2b-frame@ietf.org; bmwg-chairs@ietf.org; bmwg@ietf.org
>>>> Subject: Re: [bmwg] Martin Duke's Discuss on draft-ietf-bmwg-b2b-frame-
>> 03:
>>>> (with DISCUSS and COMMENT)
>>>> 
>>>> this would seem to work if 2 seconds is significantly longer than it
>> takes
>>>> to send the burst - but if it takes 2 seconds to send the burst
>>>> then 2 seconds extra buffer could easily lose packets - seems to me that
>>>> the extra time should be related to the time it takes to send the burst
>>>> 
>>>> e.g. 50% of the burst time but not less than 2 seconds
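The suggestion above, a waiting time tied to the burst time with a 2 second floor, could be sketched like this. The 50% fraction and the 2 second floor come from the message; the function name and the sample burst times are hypothetical.

```python
def extra_wait(burst_time_s, fraction=0.5, floor_s=2.0):
    """Waiting time scaled to the burst time, never below the floor."""
    return max(floor_s, fraction * burst_time_s)


# Short bursts fall back to the floor; long bursts scale the wait up.
short_wait = extra_wait(1.0)     # burst well under 2 seconds
long_wait = extra_wait(120.0)    # burst of two minutes
print(short_wait, long_wait)
```

The floor keeps the short-burst case unchanged, while the scaled term only matters once the burst time exceeds 4 seconds.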
>>>> 
>>>> Scott
>>>> 
>>>> 
>>>>> On Dec 12, 2020, at 10:18 AM, MORTON, ALFRED C (AL)
>>>> <acm@research.att.com> wrote:
>>>>> 
>>>>> Hi Martin, thanks for your review and comment,
>>>>> please see my reply, [acm] below,
>>>>> Al
>>>>> 
>>>>>> -----Original Message-----
>>>>> ...
>>>>>> 
>>>>>> ----------------------------------------------------------------------
>>>>>> DISCUSS:
>>>>>> ----------------------------------------------------------------------
>>>>>> 
>>>>>> Thank you for engaging with the TSVART review. Despite the
>> wordsmithing
>>>> that
>>>>>> has gone on, I am not sure that we have captured the correct text.
>>>>>> 
>>>>>> The proposed change is:
>>>>>>> I clarified:
>>>>>>> The duration of the trial MUST include at least 2 seconds in
>> addition
>>>> to the time
>>>>>>> required to send and receive each burst of frames, to ensure that DUT
>>>>>>> buffers deplete.
>>>>>>> and I'll add:
>>>>>>> The upper search limit for the time to send each burst MUST be
>>>> configurable as
>>>>>>> high as 30 seconds (buffer time results
>>>>>>> reported at the configured upper limit are likely invalid, and the
>>>> test MUST
>>>>>>> be repeated with a higher search limit).
>>>>>> 
>>>>>> But IIUC it's the additional time that needs to scale up.
>>>>> [acm]
>>>>> 
>>>>> In the revised text where David and I reached agreement, we identified
>> 3
>>>> time components of the trial duration, making the duration variable: no
>>>> longer static at "at least 2 seconds".
>>>>> 
>>>>> 1. the time to send the burst of frames (at the back-to-back rate),
>>>> determined by the search algorithm
>>>>> 2. the time to receive the transferred burst of frames (at the RFC2544
>>>> Throughput rate), possibly truncated by buffer overflow, but certainly
>>>> including the latency of the DUT with or without buffer-bloat
>>>>> 3. at least 2 seconds in addition to the time to receive the burst
>> (2.),
>>>> to ensure that DUT buffers have depleted.
>>>>> 
>>>>> So, both components 1. and 2. are variables, and the burst receive
>> time
>>>> component (2.) compensates for large buffers, non-back-to-back burst
>>>> egress, and anything else that contributes to DUT latency. The final
>> "at
>>>> least 2 seconds" is simply about making sure the trial is really over
>>>> before moving on in an automated test - we won't make an error if
>> frames
>>>> trickle-out very late for some unfortunate reason.
>>>>> 
>>>>>> A layman's reading of
>>>>>> the document, IMO, suggests that the burst length has a binary search
>>>> but the 2
>>>>>> seconds of waiting can be fixed.
>>>>> [acm]
>>>>> Yes, that's right, plus all the other factors above.
>>>>> 
>>>>> So, let's try this, but I'm trying not to extend or complicate the
>>>> buffer time << 2 seconds testing for the sake of the buffer-bloat case:
>>>>> 
>>>>> -=-=-=-=-=-=-
>>>>> 
>>>>> The duration of the trial includes three REQUIRED components:
>>>>> 
>>>>> 1. the time to send the burst of frames (at the back-to-back rate),
>>>> determined by the search algorithm
>>>>> 2. the time to receive the transferred burst of frames (at the RFC2544
>>>> Throughput rate), possibly truncated by buffer overflow, and certainly
>>>> including the latency of the DUT
>>>>> 3. at least 2 seconds not overlapping the time to receive the burst
>>>> (2.), to ensure that DUT buffers have depleted.
>>>>> 
>>>>> The upper search limit for the time to send each burst MUST be
>>>> configurable as high as 30 seconds (buffer time results reported at or
>>>> near the configured upper limit are likely invalid, and the test MUST
>> be
>>>> repeated with a higher search limit).
>>>>> 
>>>>> -=-=-=-=-=-=-=-=-
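One way a test tool might flag results that land at or near the configured upper search limit is sketched below. The proposed text does not define "near", so the 5% margin here is purely an assumption, as are the function and parameter names.

```python
def result_is_suspect(buffer_time_s, search_limit_s, margin=0.05):
    """True when a result sits at or near the upper search limit.

    The margin (5% of the limit by default) is an assumed value;
    the proposed text does not say how close counts as "near".
    """
    return buffer_time_s >= search_limit_s * (1.0 - margin)


# Hypothetical results against a 30 second upper search limit.
near_limit = result_is_suspect(29.5, 30.0)
well_below = result_is_suspect(10.0, 30.0)
print(near_limit, well_below)
```

A result flagged this way would be rerun with a higher search limit rather than reported.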
>>>>> 
>>>>> Does that wording do it?
>>>>> 
>>>>>> 
>>>>>> ----------------------------------------------------------------------
>>>>>> COMMENT:
>>>>>> ----------------------------------------------------------------------
>>>>>> 
>>>>>> Other than that, this is a well-written document. Thanks!
>>>>> [acm]
>>>>> Thank you!
>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>> 
>