Re: [tsvwg] Slides to support discussion of draft-ietf-tsvwg-rfc6040shim-update

Jonathan Morton <> Wed, 08 April 2020 16:30 UTC

> On 8 Apr, 2020, at 4:54 pm, Bob Briscoe <> wrote:
> I've produced 4 slides that might be useful to support the fragment reassembly discussion on draft-ietf-tsvwg-rfc6040shim-update

I'd like to get out ahead of the main discussion over the fragment reassembly semantics, because these slides contained some analysis of that problem which I believe to be erroneous.  I was going to mention this during the meeting, but David Black rightly pushed this topic off to mailing list discussion.

It's also worth remembering that this discussion is technically off-topic for rfc6040shim-update, since the latter is supposed to be dropping fragment reassembly semantics as out of scope.  But I think I should present an alternative analysis now, so that we can converge on the truth more quickly.

The analysis as presented appears to proceed along the following lines:

1: CE marking is modelled as a uniform, steady-state probability over all packets, regardless of packet size or AQM type.

2: Traffic sources utilising different packet sizes are contrasted as to fragmentation behaviour when encountering a tunnel, and their packet rate for constant throughput, both on the tunnel path and after reassembly.  Specifically, 1500 byte packets are fragmented into full and runt packets, 1480 byte packets are not fragmented and have the same origin packet rate, and 750 byte packets are not fragmented but have twice the packet rate.

3: There is a well-known equation relating Reno average cwnd to marking probability, and another relating segment size, cwnd (in segments) and RTT to flow throughput.  This is applied to find the relative behaviour of the above traffic sources when experiencing the same, uniform marking probability, somewhere on the tunnel path.

4: Another scenario involving an FQ-AQM is presented, in which the throughput of each flow is equalised and the marking rate and probability calculated from that.  It is pointed out, unsurprisingly, that these end up being different for each flow.

5: The conclusion is explicitly drawn that RFC-3168's existing rule for preserving CE marks on fragment reassembly "is broken".  Implied is an assertion that it needs to be materially changed.
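
For reference, the two relations mentioned in step 3 can be sketched as below.  This is only an illustration: it assumes the common square-root approximation for Reno, cwnd ~= sqrt(3/(2p)), which may differ by a constant factor from the form used in the slides, and it treats the whole packet size as the segment size.  The marking probability and RTT are made-up values.

```python
import math

def reno_cwnd(p):
    """Approximate steady-state Reno cwnd, in segments, for marking probability p.

    Assumes the usual square-root approximation, cwnd ~= sqrt(3 / (2p)).
    """
    return math.sqrt(3.0 / (2.0 * p))

def throughput_bps(segment_bytes, cwnd_segments, rtt_s):
    """Flow throughput in bit/s: one window of segments delivered per RTT."""
    return segment_bytes * 8 * cwnd_segments / rtt_s

# Hypothetical inputs, chosen only for illustration.
P_MARK = 0.01   # uniform marking probability, as assumed in step 1
RTT_S = 0.05    # 50 ms round-trip time

for size in (1500, 1480, 750):
    w = reno_cwnd(P_MARK)
    mbps = throughput_bps(size, w, RTT_S) / 1e6
    print(f"{size:5d} bytes: cwnd = {w:.2f} segments, throughput = {mbps:.2f} Mbit/s")
```

Under a uniform marking probability every flow arrives at the same cwnd in segments, so throughput scales directly with segment size.  That is the behaviour which steps 2 and 3 rely upon.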

Steps 2, 3, and 4 appear (at first glance) to be sound.  However, the modelling assumption in step 1 is flawed, and I think this torpedoes the conclusion.

The central assumption is that all AQMs mark packets with uniform probability, including those which calculate a timebase marking schedule (eg. Codel) instead of a marking probability (eg. RED).  The analysis as presented is valid only for the latter case.

When timebase marking is in use, the relative sizes of the packets being considered for marking become relevant.  A runt fragment of 40 bytes occupies the head position in the queue (where Codel does its marking) for a different length of time than the full 1500-byte fragments either side of it, and this strongly influences the relative probability of it being marked.  Likewise, each packet in a stream of 750-byte packets spends only half as long at the queue head as each packet in an otherwise similar stream of 1500-byte packets, so its probability of being marked is halved under the same timebase schedule.

Therefore, it is incorrect to convert a timebase marking rate to a uniform marking probability, when packets of significantly different sizes are involved.  A different analysis must be run to establish the effect of a shared timebase AQM on the tunnel path.
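
The 750-byte versus 1500-byte comparison above can be illustrated with a toy model.  This is a deliberately simplified sketch, not Codel itself: it assumes a saturated link carrying a single stream of equal-sized packets, and a fixed marking interval, whereas real Codel adjusts its interval while in the marking state.  The link rate, interval and duration are made-up values, and the function name is hypothetical.

```python
def marked_fraction(pkt_bytes, link_bps=10_000_000,
                    mark_interval_us=10_000, duration_us=12_000_000):
    """Fraction of packets marked by a fixed-interval (time-based) schedule.

    Toy model: packets of one size are sent back to back, each occupying
    the queue head for one serialisation time.  Whichever packet is at the
    head when the next scheduled mark falls due receives that mark.
    Times are integer microseconds to avoid rounding drift.
    """
    head_time_us = pkt_bytes * 8 * 1_000_000 // link_bps
    sent = 0
    marked = 0
    t = 0                          # time this packet reaches the queue head
    next_mark = mark_interval_us   # next scheduled marking time
    while t + head_time_us <= duration_us:
        sent += 1
        if next_mark <= t + head_time_us:
            marked += 1
            next_mark += mark_interval_us
        t += head_time_us
    return marked / sent

for size in (1500, 750):
    print(f"{size} bytes: fraction marked = {marked_fraction(size):.3f}")
```

With these assumed numbers both streams receive the same number of marks per second, but the 750-byte stream carries twice as many packets, so its per-packet marking fraction is half that of the 1500-byte stream (about 0.06 against 0.12).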

Another implicit assumption is that AQMs apply a static marking rate (in whichever paradigm they choose) over a timescale of many seconds.  In fact, they typically react to changes in queue depth over timescales of milliseconds.  This has the effect of concentrating marking events on the flows with highest throughput at the moment congestion occurs, at the peak of each flow's sawtooth.  This is a further factor which may invalidate the assumption of uniform marking probability, even for probabilistic AQMs.

I hope the above will help to inform the main discussion about RFC-3168 semantics, when that is taken up.

 - Jonathan Morton