[bmwg] My comments on draft-ietf-bmwg-sip-bench-meth-09

William Cerveny <bmwg@wjcerveny.com> Mon, 07 April 2014 15:06 UTC

Message-Id: <1396883172.2235.103678713.65DD6B82@webmail.messagingengine.com>
From: William Cerveny <bmwg@wjcerveny.com>
To: bmwg@ietf.org
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
Date: Mon, 07 Apr 2014 11:06:12 -0400
Archived-At: http://mailarchive.ietf.org/arch/msg/bmwg/U1QFrgbBhzTRWfmPIfex7gkEboI
Subject: [bmwg] My comments on draft-ietf-bmwg-sip-bench-meth-09
Precedence: list

Below are my high-level comments on draft-ietf-bmwg-sip-bench-meth-09,
denoted by /* comment */; I'm sending more detailed (mostly grammatical)
comments directly to the authors. For each comment, I've identified the
section and approximate line number at which it is made.

Of note, I had trouble following the variables in the pseudocode, but I
may not have been paying close enough attention to the code.

Bill Cerveny

<begin comments on draft-ietf-bmwg-sip-bench-meth-09>

desktop-10-36:scratch wcerveny$ cat grep-output-meth-2014-04-17.txt
1. Abstract:

30-   objective comparison of the capacity of SIP devices.  Test setup
31-   parameters and a methodology are necessary because SIP allows a wide
32-   range of configuration and operational conditions that can influence
33:/* In my opinion, the sentence beginning with "A standard terminology ..."
34:   is assumed, and I'm not sure it should be in the abstract */
35-   performance benchmark measurements.  A standard terminology and
36-   methodology will ensure that benchmarks have consistent definition
37-   and were obtained following the same procedures.
--
2. Introduction:
--
200-   only.
201-
202-   The device-under-test (DUT) is a SIP server, which may be any SIP
203:/* Capitalization in "Benchmarks can be ..." is inconsistent */
204-   conforming [RFC3261] device.  Benchmarks can be obtained and compared
205-   for different types of devices such as SIP Proxy Server, Session
206-   Border Controllers (SBC), SIP registrars and SIP proxy server paired
--
--
208-
209-   The test cases provide metrics for benchmarking the maximum 'SIP
210-   Registration Rate' and maximum 'SIP Session Establishment Rate' that
211:/* Is "extended period" defined? */
212-   the DUT can sustain over an extended period of time without failures.
213-   Some cases are included to cover Encrypted SIP.  The test topologies
214-   that can be used are described in the Test Setup section.  Topologies
--
--
219-
220-   SIP permits a wide range of configuration options that are explained
221-   in Section 4 and Section 2 of [I-D.sip-bench-term].  Benchmark values
222:/* Is associated media defined? */
223-   could possibly be impacted by Associated Media.  The selected values
224-   for Session Duration and Media Streams per Session enable benchmark
225-
--
3. Benchmarking Topologies
--
259-
260-   There are two test topologies; one in which the DUT does not process
261-   the media (Figure 1) and the other in which it does process media
262:/* EA defined? */
263-   (Figure 2).  In both cases, the tester or EA sends traffic into the
264-   DUT and absorbs traffic from the DUT.  The diagrams in Figure 1 and
265-   Figure 2 represent the logical flow of information and do not dictate
--
4.3 Associated Media
--
346-4.3.  Associated Media
347-
348-   Some tests require Associated Media to be present for each SIP
349:/* Is this redundant? */
350-   session.  The test topologies to be used when benchmarking DUT
351-   performance for Associated Media are shown in Figure 1 and Figure 2.
352-
--
4.6 Session Duration
--
370-
371-   The value of the DUT's performance benchmarks may vary with the
372-   duration of SIP sessions.  Session Duration MUST be reported with
373:/* I'm not sure if this sentence ("A Session Duration ...") is properly
374-formed (it might be), but I had difficulty following the logic of the
375:sentence */
376-   benchmarking results.  A Session Duration of zero seconds indicates
377-   transmission of a BYE immediately following successful SIP
378-   establishment indicate by receipt of a 200 OK.  An infinite Session
--
4.8 Benchmarking algorithm
--
409-
410-   During the Candidate Identification phase, the test runs until n
411-   sessions have been attempted, at session attempt rates, r, which vary
412:/* Are upper case N and lower case n different variables? Same with "R" */
413-   according to the algorithm below, where n is also a parameter of test
414-   and is a relatively large number, but an order of magnitude smaller
--
--
415-   than N. If no errors occur during the time it takes to attempt n
416-   sessions, we increment r according to the algorithm.  If errors are
417-   encountered during the test, we decrement r according to the
418:/* sentence "The algorithm provides ..." needs clarification
419:Is the word "how" unnecessary? */
420-   algorithm.  The algorithm provides a variable, G, that allows us to
421-   control how the accuracy, in sessions per second, that we require of
422-   the test.
--
--
422-   the test.
423-
424-   After this candidate rate has been discovered, the test enters the
425:/* Is N consistent with N in pseudocode? */
426-   Steady State phase.  In the Steady State phase, N session Attempts
427-   are made at the candidate rate.  The goal is to find a rate at which
428-   the DUT can process calls "forever" with no errors and the test
--
--
432-   the steady-state phase is entered again until a final (new) steady-
433-   state rate is achieved.
434-
435:/* Would this process be clearer if presented as a list? */
436-   The iterative process itself is defined as follows: A starting rate
437-   of r = 100 sessions per second is used and we place calls at that
438-   rate until n = 5000 calls have been placed.  If all n calls are
--
--
436-   The iterative process itself is defined as follows: A starting rate
437-   of r = 100 sessions per second is used and we place calls at that
438-   rate until n = 5000 calls have been placed.  If all n calls are
439:/* sps defined? This said, it's easy to figure out */
440-   successful, the rate is increased to 150 sps and again we place calls
441-   at that rate until n = 5000 calls have been placed.  The attempt rate
442-   is continuously ramped up until a failure is encountered before n =
--
--
449-   between the rate at which failures occurred and the last successful
450-   rate.  Continuing in this way, an attempt rate without errors is
451-   found.  The tester can specify a margin of error using the parameter G,
452:/* units? */
453-   measured in units of sessions per second.
454-
455-   The pseudo-code corresponding to the description above follows.
--
--
478-      G  := 5      ; granularity of results - the margin of error in
479-                   ; sps
480-      C  := 0.05   ; calibration amount: How much to back down if we
481:/* using "s" before definition, in my opinion; consider "found candidate rate
482:s but cannot send at s" (Still not right, though ... */
483-                   ; have found candidate s but cannot send at rate s
484-                   ; for time T without failures
485-
--
--
487-      ; ---- Initialization of flags, candidate values and upper bounds
488-
489-      f  := false  ; indicates a success after the upper limit
490:/* Capital F never used in pseudocode */
491-      F  := false  ; indicates that test is done
492-      c  := 0      ; indicates that we have found an upper limit
493-
--
--
499-                                   ; characteristics until n
500-                                   ; requests have been sent
501-             if (all requests succeeded) {
502:/* undefined variable r'?  does this matter? */
503-                r' := r ; save candidate value of metric
504-                if ( c == 0 ) {
--
6.3 Session Establishment Rate with Media not on DUT
--
649-          be recorded using any pertinent parameters as shown in the
650-          reporting format of Section 5.1.
651-
652:/* Long sentence in general, but at minimum the last part of the sentence
653:doesn't conclude */
654-   Expected Results:  Session Establishment Rate results obtained with
655-      Associated Media with any number of media streams per SIP session
656-      are expected to be identical to the Session Establishment Rate

<end comments on draft-ietf-bmwg-sip-bench-meth-09>
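
For readers who, like me, found the variables hard to follow, the
iterative process quoted under Section 4.8 above (draft lines 436-453)
can be sketched roughly as below. This is only an illustration under
stated assumptions, not the draft's pseudocode: it covers the Candidate
Identification phase only and omits the Steady State phase and the
calibration amount C. The names find_candidate_rate, run_ok, growth and
max_rate are hypothetical. The 1.5x ramp is inferred from the 100 sps to
150 sps example, and taking the midpoint is one reading of "between the
rate at which failures occurred and the last successful rate".

```python
from typing import Callable, Optional


def find_candidate_rate(
    run_ok: Callable[[float, int], bool],  # hypothetical: True if all n attempts at rate r succeed
    r: float = 100.0,          # starting rate in sessions per second (from the quoted text)
    n: int = 5000,             # attempts per trial (from the quoted text)
    G: float = 5.0,            # granularity / margin of error in sps (from the quoted text)
    growth: float = 1.5,       # assumption: ramp factor inferred from 100 -> 150 sps
    max_rate: float = 1.0e6,   # assumption: safety cap so a never-failing DUT terminates
) -> float:
    """Return the highest error-free rate found, to within G sps."""
    last_good = 0.0                    # highest rate at which all n attempts succeeded
    first_bad: Optional[float] = None  # lowest rate at which a failure was seen

    while first_bad is None or first_bad - last_good > G:
        if run_ok(r, n):
            last_good = r
        else:
            first_bad = r

        if first_bad is None:
            # No failure seen yet: keep ramping the attempt rate up.
            if r >= max_rate:
                return last_good
            r = min(r * growth, max_rate)
        else:
            # A failure bounds the answer: try halfway between the bounds.
            r = (last_good + first_bad) / 2.0

    return last_good


if __name__ == "__main__":
    # Simulated DUT that handles up to 333 sps without errors.
    print(find_candidate_rate(lambda rate, attempts: rate <= 333.0))
```

With the simulated DUT above, the returned rate lands within G sps below
the simulated capacity. Keeping only two bounds (last_good, first_bad)
is a deliberate simplification of the flags f, F and c in the draft.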
