Re: [aqm] Questioning each PIE heuristic

Bob Briscoe <in@bobbriscoe.net> Tue, 28 March 2017 13:15 UTC

To: aqm@ietf.org
References: <9ddba389-e368-9050-3b14-aa235c99fcb8@bobbriscoe.net> <D4FDD717.2636D%ropan@cisco.com>
From: Bob Briscoe <in@bobbriscoe.net>
Message-ID: <372077b2-d194-6fbe-ca1e-fb4e3e5e3d3d@bobbriscoe.net>
Date: Tue, 28 Mar 2017 11:41:12 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0
MIME-Version: 1.0
In-Reply-To: <D4FDD717.2636D%ropan@cisco.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/aqm/uvWrbwCChsMJ5fy1U-wrgqIGddY>
Subject: Re: [aqm] Questioning each PIE heuristic

Rong,

Some comments inline. And one remaining question at the end...

On 28/03/17 02:04, Rong Pan (ropan) wrote:
> Bob,
>
> Sorry for the late reply. I have been traveling. Please see inline...
>
> Rong
>
> On 3/23/17, 5:01 PM, "aqm on behalf of Bob Briscoe" <aqm-bounces@ietf.org
> on behalf of ietf@bobbriscoe.net> wrote:
>
>> Rong, Preethi, Greg, Fred, and others involved in PIE,
>>
>> You may recall that when we wrote PI2 we didn't include any of PIE's
>> heuristics. Mostly because PI2 solved the issues they addressed
>> intrinsically. But we left some until we had checked their benefit,
>> which is what I'm doing now...
>>
>> My first question is about this heuristic in PIE:
>>
>>           //Safeguard PIE to be work conserving
>>           if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
>>                 || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
>>                return ENQUE;
>>           }
>>
>> If it tests true, this block doesn't stop the calculation of drop_prob_
>> evolving, but it disables it being able to lead to any random packet drop.
>>
>> I can understand why you want to disable packet drop when the queue is
>> no more than 2 packets.
>> My question is about the first half of the logical OR. The drop_prob_ <
>> 20% test will be true under normal non-overloaded conditions. So I have
>> just realized that the qdelay_old_ < QDELAY_REF/2 test will turn off
>> random drop very often. I would expect this to radically impact the
>> behaviour of PIE. It seems to be overriding the PI controller as if you
>> are thinking "actually we don't really trust the PI controller to leave
>> it to do its thing, so we've overridden it a lot of the time." For
>> instance, whenever a single long-running TCP flow with RTT about the
>> same as the target delay is saw-toothing, this test will disable random
>> drop completely during the lower half of every saw-tooth in the queue.
>> Maybe that's OK, but...
>>
>> Without this test, the PI controller should reduce drop probability as
>> the queue sawtooths down anyway. If another flow causes the queue to
>> rise rapidly while it is under half the target, the PI controller is
>> designed to detect such an increase and translate it into drop. But this
>> heuristic suppresses any drop until the queue has exceeded half the
>> target.
>>
>> So my questions are:
>>
>> Q1. What were the reasons for introducing such a frequent suppression of
>> the PI algorithm (the RFC just says what this code does, not why)?
>
> To be work conserving and avoid any unnecessary drops are the main reasons
> behind it.
> Cisco had a not so successful algorithm before that is not work
> conserving. So we are
> extra cautious about being work conserving...
[BB] There is only a work-conservation problem if drop_early() is 
applied at enqueue. That's because, at enqueue, you don't yet know 
whether another packet will arrive to take the place of the packet you 
are deciding to drop.

We're shifting drop_early() to dequeue {Note 1}. So to be 
work-conserving we can rely solely on the test on the other side of the 
logical OR above that suppresses any drop if "backlog < 2 MTU".  That's 
the only heuristic that we are keeping so far, although I'm undecided 
about the "< QDELAY_REF/2" test, which (as you say) might be beneficial 
for other reasons than work conservation. But we have no tests that show 
that yet.

{Note 1}: We're using sojourn time to measure the queue, so if 
we were still dropping on enqueue, each congestion signal would be 
delayed twice by the queue.
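
As a rough illustration of the difference discussed above, the sketch below puts the quoted PIE safeguard next to a dequeue-side decision that keeps only the backlog test. It is not taken from any PIE or PI2 source: the struct, the function names, and the constants (a 15 ms target delay and a 1500-byte mean packet size) are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical controller state; field names are illustrative only.
struct AqmState {
    double drop_prob;   // drop probability from the PI controller, in [0, 1]
    double qdelay_old;  // last stored queue-delay estimate, in seconds
};

constexpr double kQdelayRef = 0.015;        // assumed target delay (15 ms)
constexpr std::size_t kMeanPktSize = 1500;  // assumed mean packet size, bytes

// Safeguard as in the quoted PIE snippet: true means "skip random drop".
bool pie_bypass(const AqmState& s, std::size_t backlog_bytes) {
    return (s.qdelay_old < kQdelayRef / 2 && s.drop_prob < 0.2)
        || backlog_bytes <= 2 * kMeanPktSize;
}

// Dequeue-side variant that keeps only the backlog test.
bool dequeue_bypass(std::size_t backlog_bytes) {
    return backlog_bytes <= 2 * kMeanPktSize;
}

// Drop decision at dequeue; u is a uniform random number in [0, 1)
// supplied by the caller so that the function stays deterministic.
bool drop_at_dequeue(const AqmState& s, std::size_t backlog_bytes, double u) {
    assert(u >= 0.0 && u < 1.0);
    if (dequeue_bypass(backlog_bytes)) {
        return false;  // nearly empty queue: never drop
    }
    return u < s.drop_prob;
}
```

With these assumed constants, a queue at 5 ms delay and 5% drop probability is exempt from drop under pie_bypass() however large the backlog, whereas dequeue_bypass() exempts it only once the backlog is at or below 3000 bytes.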

>
>
>> Q2. Why use qdelay_old_ in the test? This seems to drive suppression of
>> drop using stale state.
> qdelay_old_ is the latency state currently stored. This is for
> implementation considerations as we don't want to calculate qdelay_ on
> per packet basis.
[BB] Understood.
We're using sojourn time per packet for the shifted FIFO scheduler 
anyway, so no extra cost.
>
>> Q3. Having said that it looks like this heuristic will significantly
>> alter PIE's behaviour, in tests under a very wide range of traffic
>> conditions, link rates, mixed RTTs, traffic models etc, we have found
>> that removing the heuristics makes no measurable difference to PIE's
>> performance. So if you added this heuristic for a specific scenario,
>> please describe it, so we can test for it.
> Again, to be work conserving and avoid drops are our goal. I don't
> think it would be hurtful to add those safeguards.

[BB]: So that raises just one remaining question:
Q: Do you have tests showing any benefit, specifically comparing with 
and without this "< QDELAY_REF/2" heuristic?

Given the point of a (non-ECN) AQM is to introduce the right level of 
random drops, it seems strange to suppress some of them with an 
additional arbitrary rule.
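
For a feel of how much suppression is at stake in the single-flow saw-tooth case raised earlier in the thread, here is a toy calculation. It assumes an idealised saw-tooth in which queue delay ramps linearly from a minimum to a maximum and then falls back instantly, and it assumes drop_prob_ stays below 0.2 throughout. The function name and the model are hypothetical, not measurements of PIE.

```cpp
#include <algorithm>
#include <cassert>

// Fraction of one idealised linear saw-tooth cycle during which the queue
// delay is below `threshold`. The delay is assumed to ramp linearly from
// d_min to d_max and then drop back instantly (a modelling assumption).
double fraction_below(double d_min, double d_max, double threshold) {
    assert(d_max > d_min);
    double f = (threshold - d_min) / (d_max - d_min);
    return std::min(1.0, std::max(0.0, f));  // clamp to [0, 1]
}
```

Under this model, a saw-tooth spanning 0 to QDELAY_REF with the threshold at QDELAY_REF/2 gives 0.5, matching the "lower half of every saw-tooth" described earlier; a saw-tooth spanning 0 to 2*QDELAY_REF gives 0.25.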


Thanks for your replies so far though. They have helped me realize more 
reasons why PIE needs these heuristics, but PI2 might not.


Bob

>
>
>>
>> Cheers
>>
>>
>> Bob
>>
>>
>>
>>
>> -- 
>> ________________________________________________________________
>> Bob Briscoe                               http://bobbriscoe.net/
>>
>> _______________________________________________
>> aqm mailing list
>> aqm@ietf.org
>> https://www.ietf.org/mailman/listinfo/aqm

-- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/