[aqm] Codel's count variable and re-entering dropping state at small time intervals

Roland Bless <roland.bless@kit.edu> Mon, 20 July 2015 16:49 UTC

Return-Path: <roland.bless@kit.edu>
X-Original-To: aqm@ietfa.amsl.com
Delivered-To: aqm@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 05EEA1ACE21 for <aqm@ietfa.amsl.com>; Mon, 20 Jul 2015 09:49:31 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.15
X-Spam-Level:
X-Spam-Status: No, score=-1.15 tagged_above=-999 required=5 tests=[BAYES_50=0.8, HELO_EQ_DE=0.35, RCVD_IN_DNSWL_MED=-2.3] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FzOl82L9Y0DT for <aqm@ietfa.amsl.com>; Mon, 20 Jul 2015 09:49:27 -0700 (PDT)
Received: from iramx2.ira.uni-karlsruhe.de (iramx2.ira.uni-karlsruhe.de [141.3.10.81]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id A34DD1ACE46 for <aqm@ietf.org>; Mon, 20 Jul 2015 09:49:27 -0700 (PDT)
Received: from i72vorta.tm.uni-karlsruhe.de ([141.3.71.26] helo=i72vorta.tm.kit.edu) by iramx2.ira.uni-karlsruhe.de with esmtp port 25 iface 141.3.10.81 id 1ZHEFq-0006dX-95 for <aqm@ietf.org>; Mon, 20 Jul 2015 18:49:26 +0200
Received: from [IPv6:::1] (localhost [127.0.0.1]) by i72vorta.tm.kit.edu (Postfix) with ESMTPS id 2CD6DB00611 for <aqm@ietf.org>; Mon, 20 Jul 2015 18:49:26 +0200 (CEST)
Message-ID: <55AD2695.8050605@kit.edu>
Date: Mon, 20 Jul 2015 18:49:25 +0200
From: Roland Bless <roland.bless@kit.edu>
Organization: Institute of Telematics, Karlsruhe Institute of Technology (KIT)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.1) Gecko/20060111 Thunderbird/1.5 Mnenhy/0.7.3.0
MIME-Version: 1.0
To: aqm@ietf.org
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
X-ATIS-AV: ClamAV (iramx2.ira.uni-karlsruhe.de)
X-ATIS-Timestamp: iramx2.ira.uni-karlsruhe.de 1437410966.
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/G8mlmUcSNAjQGncYB_3IKLsBrls>
Subject: [aqm] Codel's count variable and re-entering dropping state at small time intervals
X-BeenThere: aqm@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Discussion list for active queue management and flow isolation." <aqm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/aqm>, <mailto:aqm-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/aqm/>
List-Post: <mailto:aqm@ietf.org>
List-Help: <mailto:aqm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/aqm>, <mailto:aqm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 20 Jul 2015 16:49:31 -0000

Dear All,

we (Polina and I) have two questions concerning the behavior of the
`count` variable in Codel which can be summarized as:

1. after exiting dropping state, count is usually reset, unless "the
next drop state is entered too close to the previous one". This special
case is not explained in the text of the codel draft, but is present in
all implementations, and, currently, there are at least three different
versions of it (see below). We feel that implementers need more guidance
here...

2. is 'count' supposed to be reset or saturated on overflow and what
should be its maximum value (it makes a difference whether you are using
16-, 32-, or 64-bit counter)?

Regarding the first question:

Here are references to code that we looked to:
1) reference implementation in the draft:
https://tools.ietf.org/html/draft-ietf-aqm-codel-01#page-20
2) ns-2 code:
https://github.com/hbatmit/ns2.35/blob/master/queue/codel.cc#L144
3) linux code:
https://github.com/torvalds/linux/blob/110bc76729d448fdbcb5cdb63b83d9fd65ce5e26/include/net/codel.h
4) cake code: https://github.com/dtaht/sch_cake/blob/master/codel5.h

When codel encounters congestion it enters drop state, counts the number
the of packets that were dropped since congestion was encountered, and
drops packets in intervals of "interval" / sqrt (count). When there is
no more congestion, the drop state is exited and this counter is reset
unless "the next drop state is entered too close to the previous one".

This "too close to the previous one" is a part that is not well
documented and differs in implementations:
1) In the ns-2 code it is in lines 187-200. The most important part is
the comment that says that this control law is bad, but there is no
better one available.
2) In the reference code these lines are the last lines on page 20. It
has the same code (except // kmn decay tests line), but doesn't say that
this solution is temporary.
3) In Linux code the lines in question are 334-348 and 352; first it
uses 16 instead of 8, but this is consistent with comments, second it
sets the count to something else that wasn't comprehensible (It doesn't
look like the difference between previous count and count before that as
the names would suggest, but rather a some improvement of count - 2,
where 2 can be 3 or 1 or something ...)
4) For Cake the lines in question are 401-411, and in particular 404-405.

What is the "official" recommended version or is there any guidance on
how to select one?

Regarding the second question:

In the reference implementation of the draft, count is 32-bit integer,
but one cannot find any comments about the overflow behavior.
In Linux there is a comment to ignore overflow since there is no
division, and due to the sqrt^-1 approximation it won't break at 0.
According to these two emails on the Cake archive:
https://lists.bufferbloat.net/pipermail/cake/2015-June/000301.html,
https://lists.bufferbloat.net/pipermail/cake/2015-June/000302.html , it
seems that the original intention of count was to be reset by overflow
at 2^16, but according to experiments it is better to saturate it.

So, what is the desired overflow behavior for count: should it overflow
or saturate, and if the latter applies, at which maximal value?

Best Regards,
 Roland and Polina (currently implementing stuff)