Re: [bmwg] Tsvart last call review of draft-ietf-bmwg-b2b-frame-03

"MORTON, ALFRED C (AL)" <acm@research.att.com> Wed, 25 November 2020 18:20 UTC

From: "MORTON, ALFRED C (AL)" <acm@research.att.com>
To: "Black, David" <David.Black@dell.com>, "tsv-art@ietf.org" <tsv-art@ietf.org>
CC: "bmwg@ietf.org" <bmwg@ietf.org>, "draft-ietf-bmwg-b2b-frame.all@ietf.org" <draft-ietf-bmwg-b2b-frame.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>
Thread-Topic: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
Thread-Index: AQHWwnY1WN9Sgd3Tb0eRqGiakAjX6qnXbW2ggACasYD///ACEIABRvIA//++lwA=
Date: Wed, 25 Nov 2020 18:19:50 +0000
Message-ID: <4D7F4AD313D3FC43A053B309F97543CF014764F08F@njmtexg5.research.att.com>
References: <160623159725.20249.9390987464844223889@ietfa.amsl.com> <4D7F4AD313D3FC43A053B309F97543CF014764C4F2@njmtexg5.research.att.com> <MN2PR19MB4045E52E2E321662721891C683FB0@MN2PR19MB4045.namprd19.prod.outlook.com> <4D7F4AD313D3FC43A053B309F97543CF014764EAA2@njmtexg5.research.att.com> <MN2PR19MB4045F0FF2FC651C799B61C4C83FA0@MN2PR19MB4045.namprd19.prod.outlook.com>
In-Reply-To: <MN2PR19MB4045F0FF2FC651C799B61C4C83FA0@MN2PR19MB4045.namprd19.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bmwg/ks_sziNxsnLpcrGOqN973rVL1d0>
Subject: Re: [bmwg] Tsvart last call review of draft-ietf-bmwg-b2b-frame-03

David,

Allow me to state my points more clearly:

- I'm relying on the Binary Search aspect of the procedure to measure unexpectedly large buffer times; searches are an integral part of the procedure.

- These are benchmarking measurements in an isolated test environment, where we will control every source of variability possible to foster repeatability.

- The trial duration has two components: time to conduct sending + receiving, plus additional waiting time for buffer depletion. The test equipment orchestrates this easily, because it is a single device.

I clarified:
The duration of the trial MUST include at least 2 seconds in addition to
the time required to send and receive each burst of frames, to ensure that DUT buffers deplete.

and I'll add:
The upper search limit for the time to send each burst MUST be configurable as high as 30 seconds (buffer time results reported at the configured upper limit are likely invalid, and the test MUST be repeated with a higher search limit).
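
As a rough illustration of how the two sentences above interact with the Binary Search (a sketch only, not proposed draft text): the function names, the 1 ms search resolution, and the loses_frames callback standing in for one real trial are all assumptions made for this example.

```python
from typing import Callable, Tuple

DEPLETION_WAIT_S = 2.0   # extra wait after sending/receiving each burst
UPPER_LIMIT_S = 30.0     # configurable upper search limit on burst send time


def trial_duration(burst_time_s: float, wait_s: float = DEPLETION_WAIT_S) -> float:
    """Trial duration = time to send and receive the burst + depletion wait."""
    return burst_time_s + wait_s


def search_longest_lossless_burst(
    loses_frames: Callable[[float], bool],   # hypothetical: runs one trial
    upper_limit_s: float = UPPER_LIMIT_S,
    resolution_s: float = 0.001,             # assumed search resolution
) -> Tuple[float, bool]:
    """Binary-search the longest burst time (seconds) with zero frame loss.

    Returns (burst_time_s, valid). valid is False when the burst at the
    upper limit is still lossless: the result then sits at the configured
    limit and the test has to be repeated with a higher limit.
    """
    if not loses_frames(upper_limit_s):
        return upper_limit_s, False
    # Invariant: a burst of `lo` seconds is lossless, `hi` seconds loses frames.
    # A zero-length burst is assumed to be lossless.
    lo, hi = 0.0, upper_limit_s
    while hi - lo > resolution_s:
        mid = (lo + hi) / 2
        if loses_frames(mid):
            hi = mid
        else:
            lo = mid
    return lo, True
```

With a simulated DUT such as `lambda t: t > 1.6`, the search settles just below 1.6 seconds and the corresponding trial lasts about 3.6 seconds; with the limit set to 2 seconds on a DUT that buffers 10 seconds, the result comes back flagged as invalid.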

Al


> -----Original Message-----
> From: Black, David [mailto:David.Black@dell.com]
> Sent: Wednesday, November 25, 2020 9:35 AM
> To: MORTON, ALFRED C (AL) <acm@research.att.com>; tsv-art@ietf.org
> Cc: bmwg@ietf.org; draft-ietf-bmwg-b2b-frame.all@ietf.org;
> last-call@ietf.org; Black, David <David.Black@dell.com>
> Subject: RE: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
> 
> Al,
> 
> We appear to not be communicating on a key aspect of the concern:
> 
> > A burst of 1.7 seconds falls well within the coverage of a 2 second
> > trial to count the frames and determine that loss has occurred. We
> > don't need a big margin time-wise. Also, the trial duration increases
> > with the search parameters: IOW, if the tester is still sending frames
> > after 2 seconds, then obviously we can't allow the trial to terminate
> > prematurely.
> 
> The 1.6 seconds value is *not* an upper bound - it was measured in
> practice on an operational conference network as part of what appears to
> have been a just-in-time slide preparation exercise.  Hence that 1.6
> seconds of buffering ought to be viewed as being somewhere in the midst of
> the expected distribution of buffer times.  For that reason, the upper
> bound on test duration needs to be set with a safety factor to ensure that
> even if a bizarre DUT has 10 seconds of buffering, the B2B test depletes
> that buffering so that the B2B test is testing the forwarding path as
> intended, not the buffering capacity.
> 
> > I want to avoid over-reacting to the buffer bloat case. We all agree
> > that bloated buffers are bad, and knowing a DUT has buffering >=1
> > second may be enough. I certainly do not think that trial durations
> > for the B2B Frame benchmark need to be as long as you originally
> > suggested, where the overwhelming number of test conditions can use a
> > <<1 second burst and 2 second buffer depletion time:
> 
> I agree with this approach provided that there is a test procedure (in the
> B2B test and/or its prerequisites) that will discover the behavior of the
> bizarre DUT with 10 seconds of buffering and adjust the B2B test time to
> ensure that this excessive buffering is depleted.
> 
> > I hope the approach of the single sentence above works for you.
> 
> Based on this discussion, in addition to that sentence (which I think has
> to contain a "MUST" for situations in which excessive buffering is found),
> I would like to see a brief explanation of how the procedures in the
> draft, along with additional test procedures pulled in from RFC 2544 or
> elsewhere, avoid a 2 second test duration on a DUT with 10 seconds of
> buffering.
> 
> Thanks, --David
> 
> > -----Original Message-----
> > From: MORTON, ALFRED C (AL) <acm@research.att.com>
> > Sent: Wednesday, November 25, 2020 12:00 AM
> > To: Black, David; tsv-art@ietf.org
> > Cc: bmwg@ietf.org; draft-ietf-bmwg-b2b-frame.all@ietf.org;
> > last-call@ietf.org
> > Subject: RE: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
> >
> >
> > Hi David, some more clarity below.
> > Al
> >
> >
> > > -----Original Message-----
> > > From: Black, David [mailto:David.Black@dell.com]
> > > Sent: Tuesday, November 24, 2020 3:02 PM
> > > To: MORTON, ALFRED C (AL) <acm@research.att.com>; tsv-art@ietf.org
> > > Cc: bmwg@ietf.org; draft-ietf-bmwg-b2b-frame.all@ietf.org;
> > > last-call@ietf.org; Black, David <David.Black@dell.com>
> > > Subject: RE: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
> > >
> > > Al,
> > >
> > > > [acm] I agree with your observation that there are cases where trial
> > > > duration should be increased to accommodate the buffering encountered
> > > > in the DUT, but not as a mandate for all testing. I have four factors
> > > > in mind:
> > > > 1. Some of the virtual network DUTs we are testing now have very
> > > > small buffers, and the B2B stream of frames is quite short -- less
> > > > than 2000 frames@10GE in some cases -- so 2 seconds is fully
> > > > sufficient.
> > > > 2. The trial duration is a factor in total test duration, where each
> > > > trial is one step in the Binary Search. We need to manage the tension
> > > > between the time needed to reach a search result and confidence that
> > > > we have depleted the queues.
> > > > 3. The RFC 2544 Latency benchmark will tell us if bufferbloat is
> > > > present.
> > > > 4. The current text says "at least 2 seconds".
> > > >
> > > > So I suggest adding the following text:
> > > >
> > > >     The duration of the trial MUST be at least 2 seconds, to allow
> > > >     DUT buffers to deplete. When RFC2544 Latency measurements
> > > >     indicate that large buffers are present in the DUT, the trial
> > > >     duration SHOULD be increased to ensure that buffer depletion
> > > >     takes place, without unduly extending the total test time.
> > >
> > > The overall approach of collecting evidence that there is a problem
> > > before increasing the 2 second minimum duration is fine, but the
> > > details appear to need more attention:
> > >
> > > 1. Obtaining the RFC 2544 Latency measurements would need to be added
> > > to Section 4 (Prerequisites) of this draft to ensure that the buffer
> > > size information is available.
> > [acm]
> > Section 26.2 Latency is a very primitive procedure by today's
> > standards. It is common for testers to measure delay on every packet,
> > and report a true average. This is enough to alert the operator to the
> > presence of bufferbloat, but there's no assurance that the Latency
> > benchmark measures the extent of the buffer time.
> >
> > >
> > > 2. I did not see any requirements in the RFC 2544 Latency test
> > > (Section 26.2) to deplete buffers.  Did I miss something?
> > [acm]
> > It's an overall requirement in Section 23, after running the trial:
> >
> >    d) Wait for two seconds for any residual frames to be received.
> >
> > >
> > > 3. I would think that the trial duration MUST be increased, not just
> > > SHOULD be increased if there is evidence of large buffer size, as
> > > buffer depletion appears to be a necessary characteristic of this B2B
> > > measurement.
> > [acm]
> >
> > We need to keep in mind that the sentence(s) in this discussion
> > pertain to the trials of the B2B Frame Benchmark testing, in the
> > section that begins:
> >
> >   Each trial in the test requires the tester to send a burst of frames
> >   (after idle time) with the minimum inter-frame gap, and to count the
> >   corresponding frames forwarded by the DUT.
> >
> > Let's say we have a DUT that offers 1.6 seconds of buffering. The
> > Binary Search will increase the length of the burst until the buffer
> > drops frames, and then try to find the longest burst where frame loss
> > is zero. This simple procedure always waits the minimum trial duration
> > for frames to exit the DUT.
> >
> > A burst of 1.7 seconds falls well within the coverage of a 2 second
> > trial to count the frames and determine that loss has occurred. We
> > don't need a big margin time-wise. Also, the trial duration increases
> > with the search parameters: IOW, if the tester is still sending frames
> > after 2 seconds, then obviously we can't allow the trial to terminate
> > prematurely.
> >
> > So, I suggest we use this wording instead:
> >
> >        The duration of the trial MUST be at least 2 seconds in
> >        addition to the time to send each burst of frames, to allow DUT
> >        buffers to deplete.
> >
> > and we establish an adaptive waiting time for extreme cases,
> > consistent with RFC2544 section 23 item d). For a DUT with 1.6 second
> > buffers, the Trial duration would be 3.6 seconds.
> >
> > BTW, one tester issue we are trying to fix here is the case when the
> > frame header processing rate is sufficiently high that no buffer limit
> > is measurable at the large frame sizes, and we implore people not to
> > waste time with the B2B Frame Benchmark at those frame sizes. One
> > commercial tester sent bursts up to 30 seconds in length, and then
> > happily reported the impossible result of 30 second buffers.
> >
> > I want to avoid over-reacting to the buffer bloat case. We all agree
> > that bloated buffers are bad, and knowing a DUT has buffering >=1
> > second may be enough. I certainly do not think that trial durations
> > for the B2B Frame benchmark need to be as long as you originally
> > suggested, where the overwhelming number of test conditions can use a
> > <<1 second burst and 2 second buffer depletion time:
> >
> > > > > Hence, the 2 second minimum duration ought to be increased by at
> > > > > least a factor of 10.  I'd suggest changing it to 30 seconds or
> > > > > 60 seconds...
> >
> > I hope the approach of the single sentence above works for you.
> >
> >
> >
> > >
> > > It also looks like the link to the entire slide deck didn't make it
> > > into my original review correctly - that slide deck is at:
> > > https://urldefense.com/v3/__http://www.taht.net/*d/lca_tcp3.odp__;fg!!
> > > BhdT !0oWabwvUrtVlGQY7ZKnFnMa-
> > un_e95tfCJcRKIWG7wECiJkVISb4sjd9Salwehw$
> > > .  In addition to slide 6 from this slide deck (Figure 1 in the APNIC
> > > blog), slide 14 is also relevant to this discussion.
> > >
> > > Thanks, --David
> > >
> > > > -----Original Message-----
> > > > From: MORTON, ALFRED C (AL) <acm@research.att.com>
> > > > Sent: Tuesday, November 24, 2020 11:11 AM
> > > > To: Black, David; tsv-art@ietf.org
> > > > Cc: bmwg@ietf.org; draft-ietf-bmwg-b2b-frame.all@ietf.org;
> > > > last-call@ietf.org
> > > > Subject: RE: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
> > > >
> > > >
> > > > Hi David, Thanks for your review and comment!
> > > >
> > > > Please see a proposed resolution below, [acm] Al
> > > >
> > > > > -----Original Message-----
> > > > > From: David Black via Datatracker [mailto:noreply@ietf.org]
> > > > > Sent: Tuesday, November 24, 2020 10:27 AM
> > > > > To: tsv-art@ietf.org
> > > > > Cc: bmwg@ietf.org; draft-ietf-bmwg-b2b-frame.all@ietf.org;
> > > > > last-call@ietf.org
> > > > > Subject: Tsvart last call review of draft-ietf-bmwg-b2b-frame-03
> > > > >
> > > > > Reviewer: David Black
> > > > > Review result: Ready with Issues
> > > > >
> > > > ...
> > > > >
> > > > > This draft updates the back-to-back frame testing procedure in RFC
> > > > > 2544 to take account of experience.
> > > > >
> > > > > The draft is in good shape, with one notable exception in
> > > > > Section 5.2:
> > > > >
> > > > >    The duration of the trial MUST be at least 2 seconds, to allow
> > > > >    DUT buffers to deplete.
> > > > >
> > > > > That duration of 2 seconds has been carried forward from RFC 2544
> > > > > without change.  A 2 second duration may have been sufficient to
> > > > > deplete buffers in 1999, but that is no longer reliably the case.
> > > > > For example, on-site measurement of the network for the 2020 Linux
> > > > > Conference in Australia indicated at least 1.6 seconds of
> > > > > buffering, as indicated by Figure 1 at
> > > > >
> > > https://urldefense.com/v3/__https://blog.apnic.net/2020/01/22/bufferbl
> > > oat-
> > > may-be-solved-but-its-not-over-
> > > yet/__;!!BhdT!wOuQE0NajXs4dT7tdIMVQU5FFpb0JiU0-
> > > yK2DOVVn0ecoYjf7mFEABLmlwDk$ ,
> > > > > which is slide 6 from the complete slide deck at:
> > > > >
> > > https://urldefense.com/v3/__https://blog.apnic.net/2020/01/22/bufferbl
> > > oat-
> > > may-be-solved-but-its-not-over-
> > > yet/__;!!BhdT!wOuQE0NajXs4dT7tdIMVQU5FFpb0JiU0-
> > > yK2DOVVn0ecoYjf7mFEABLmlwDk$
> > > > > .  Experience with bufferbloat suggests that one network device
> > > > > was primarily responsible.  Also, see slide 14 in that slide deck.
> > > > >
> > > > > That 1.6 seconds measured on an actual network is entirely too
> > > > > close to 2 seconds for confidence that buffers will be depleted in
> > > > > any tested device.
> > > > > Hence, the 2 second minimum duration ought to be increased by at
> > > > > least a factor of 10.  I'd suggest changing it to 30 seconds or 60
> > > > > seconds as convenient round numbers, and providing the rationale
> > > > > that increased buffering in WiFi devices, e.g., home "routers," as
> > > > > indicated by experience with bufferbloat measurements, is the
> > > > > reason for the increased duration.
> > > > >
> > > > [acm] I agree with your observation that there are cases where trial
> > > > duration should be increased to accommodate the buffering encountered
> > > > in the DUT, but not as a mandate for all testing. I have four factors
> > > > in mind:
> > > > 1. Some of the virtual network DUTs we are testing now have very
> > > > small buffers, and the B2B stream of frames is quite short -- less
> > > > than 2000 frames@10GE in some cases -- so 2 seconds is fully
> > > > sufficient.
> > > > 2. The trial duration is a factor in total test duration, where each
> > > > trial is one step in the Binary Search. We need to manage the tension
> > > > between the time needed to reach a search result and confidence that
> > > > we have depleted the queues.
> > > > 3. The RFC 2544 Latency benchmark will tell us if bufferbloat is
> > > > present.
> > > > 4. The current text says "at least 2 seconds".
> > > >
> > > > So I suggest adding the following text:
> > > >
> > > >     The duration of the trial MUST be at least 2 seconds, to allow
> > > >     DUT buffers to deplete. When RFC2544 Latency measurements
> > > >     indicate that large buffers are present in the DUT, the trial
> > > >     duration SHOULD be increased to ensure that buffer depletion
> > > >     takes place, without unduly extending the total test time.
> > > >
> > > > I hope this suggestion resolves your issue; thanks for highlighting
> > > > it!
> > > > Al