Re: [aqm] Questioning each PIE heuristic

"Rong Pan (ropan)" <ropan@cisco.com> Tue, 28 March 2017 14:25 UTC

From: "Rong Pan (ropan)" <ropan@cisco.com>
To: Bob Briscoe <in@bobbriscoe.net>, "aqm@ietf.org" <aqm@ietf.org>
Date: Tue, 28 Mar 2017 14:24:53 +0000
Message-ID: <D4FFE952.26567%ropan@cisco.com>
References: <9ddba389-e368-9050-3b14-aa235c99fcb8@bobbriscoe.net> <D4FDD717.2636D%ropan@cisco.com> <372077b2-d194-6fbe-ca1e-fb4e3e5e3d3d@bobbriscoe.net>
In-Reply-To: <372077b2-d194-6fbe-ca1e-fb4e3e5e3d3d@bobbriscoe.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/aqm/oBjYlYGjtDzMxmzcMN1VZI2u--w>

[BB]: So that begs just one remaining question:
Q: Do you have tests showing any benefit, specifically comparing with
and without this  "< QDELAY_REF/2" heuristic?


If we set QDELAY_REF too low, we have seen throughput loss. This is related
to your question regarding qdelay_old_. You are using sojourn time, while
we keep the state in qdelay_old_, which can become stale if update_interval
is long.

Rong  
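
One way to read the staleness remark above is sketched below. This is only an illustration, not PIE's actual code: the constant value, the linear queue model and the helper names true_delay and stored_delay are all assumptions made for the example. It applies the first half of the safeguard quoted further down the thread to a fresh delay value and to a value that is refreshed only once per update interval.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical target delay in seconds, chosen only for illustration.
constexpr double QDELAY_REF = 0.015;

// First half of the safeguard quoted later in this thread: random drop is
// suppressed while the delay estimate is below half the target and the
// drop probability is below 20%.
bool suppress_drop(double qdelay, double drop_prob) {
    return qdelay < QDELAY_REF / 2 && drop_prob < 0.2;
}

// Assumed toy model: queue delay changes linearly over time, never below 0.
double true_delay(double t, double start, double slope) {
    return std::max(0.0, start + slope * t);
}

// Delay as it would be stored if the state were refreshed only at
// multiples of update_interval (a stand-in for qdelay_old_).
double stored_delay(double t, double update_interval,
                    double start, double slope) {
    assert(update_interval > 0.0);
    // Time of the most recent periodic update at or before t.
    double last_update = std::floor(t / update_interval) * update_interval;
    return true_delay(last_update, start, slope);
}
```

With a queue draining from 20 ms at 0.5 s of delay per second, the fresh delay at t = 29 ms is 5.5 ms, below QDELAY_REF/2 = 7.5 ms, so suppress_drop(true_delay(0.029, 0.020, -0.5), 0.05) is true. With a 30 ms update interval the stored value is still 20 ms at that point, so suppress_drop(stored_delay(0.029, 0.030, 0.020, -0.5), 0.05) is false and drops could continue on a queue that has already drained.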

On 3/28/17, 6:41 AM, "aqm on behalf of Bob Briscoe" <aqm-bounces@ietf.org
on behalf of in@bobbriscoe.net> wrote:

>Rong,
>
>Some comments inline. And one remaining question at the end...
>
>On 28/03/17 02:04, Rong Pan (ropan) wrote:
>> Bob,
>>
>> Sorry for the late reply. I have been traveling. Please see inline...
>>
>> Rong
>>
>> On 3/23/17, 5:01 PM, "aqm on behalf of Bob Briscoe" <aqm-bounces@ietf.org
>> on behalf of ietf@bobbriscoe.net> wrote:
>>
>>> Rong, Preethi, Greg, Fred, and others involved in PIE,
>>>
>>> You may recall that when we wrote PI2 we didn't include any of PIE's
>>> heuristics. Mostly because PI2 solved the issues they addressed
>>> intrinsically. But we left some until we had checked their benefit,
>>> which is what I'm doing now...
>>>
>>> My first question is about this heuristic in PIE:
>>>
>>>           //Safeguard PIE to be work conserving
>>>           if ( (PIE->qdelay_old_ < QDELAY_REF/2 && PIE->drop_prob_ < 0.2)
>>>                 || (queue_.byte_length() <= 2 * MEAN_PKTSIZE) ) {
>>>                return ENQUE;
>>>           }
>>>
>>> If it tests true, this block doesn't stop drop_prob_ from evolving,
>>> but it prevents drop_prob_ from leading to any random packet drop.
>>>
>>> I can understand why you want to disable packet drop when the queue is
>>> no more than 2 packets.
>>> My question is about the first half of the logical OR. The drop_prob_ <
>>> 20% test will be true under normal non-overloaded conditions. So I have
>>> just realized that the qdelay_old_ < QDELAY_REF/2 test will turn off
>>> random drop very often. I would expect this to radically impact the
>>> behaviour of PIE. It seems to be overriding the PI controller as if you
>>> are thinking "actually we don't really trust the PI controller to leave
>>> it to do its thing, so we've overridden it a lot of the time." For
>>> instance, whenever a single long-running TCP flow with RTT about the
>>> same as the target delay is saw-toothing, this test will disable random
>>> drop completely during the lower half of every saw-tooth in the queue.
>>> Maybe that's OK, but...
>>>
>>> Without this test, the PI controller should reduce drop probability as
>>> the queue sawtooths down anyway. If another flow causes the queue to
>>> rise rapidly while it is under half the target, the PI controller is
>>> designed to detect such an increase and translate it into drop. But this
>>> heuristic suppresses any drop until the queue has exceeded half the
>>> target.
>>>
>>> So my questions are:
>>>
>>> Q1. What were the reasons for introducing such a frequent suppression of
>>> the PI algorithm (the RFC just says what this code does, not why)?
>>
>> Being work conserving and avoiding any unnecessary drops are the main
>> reasons behind it.
>> Cisco previously had a not-so-successful algorithm that was not work
>> conserving, so we are extra cautious about being work conserving...
>[BB] There is only a work-conservation problem if drop_early() is
>applied at enqueue. That's because, at enqueue, you don't yet know
>whether another packet will arrive to take the place of the packet you
>are deciding to drop.
>
>We're shifting drop_early() to dequeue {Note 1}. So to be
>work-conserving we can rely solely on the test on the other side of the
>logical OR above that suppresses any drop if "backlog < 2 MTU".  That's
>the only heuristic that we are keeping so far, although I'm undecided
>about the  "< QDELAY_REF/2" test, which (as you say) might be beneficial
>for other reasons than work conservation. But we have no tests that show
>that yet.
>
>{Note 1}: We're using sojourn time to measure the queue, so if
>we were still dropping on enqueue, each congestion signal would be
>delayed twice by the queue.
>
>>
>>
>>> Q2. Why use qdelay_old_ in the test? This seems to drive suppression of
>>> drop using stale state.
>> qdelay_old_ is the latency state currently stored. This is for
>> implementation considerations, as we don't want to calculate qdelay_ on
>> a per-packet basis.
>[BB] Understood.
>We're using sojourn time per packet for the shifted FIFO scheduler
>anyway, so no extra cost.
>>
>>> Q3. Having said that it looks like this heuristic will significantly
>>> alter PIE's behaviour, in tests under a very wide range of traffic
>>> conditions, link rates, mixed RTTs, traffic models etc, we have found
>>> that removing the heuristics makes no measurable difference to PIE's
>>> performance. So if you added this heuristic for a specific scenario,
>>> please describe it, so we can test for it.
>> Again, being work conserving and avoiding drops are our goals. I don't
>> think it would be hurtful to add those safeguards.
>
>[BB]: So that begs just one remaining question:
>Q: Do you have tests showing any benefit, specifically comparing with
>and without this  "< QDELAY_REF/2" heuristic?
>
>Given the point of a (non-ECN) AQM is to introduce the right level of
>random drops, it seems strange to suppress some of them with an
>additional arbitrary rule.
>
>
>Thanks for your replies so far tho. They have helped me realize more
>reasons why PIE needs these heuristics, but PI2 might not.
>
>
>Bob
>
>>
>>
>>>
>>> Cheers
>>>
>>>
>>> Bob
>>>
>>>
>>>
>>>
>>> -- 
>>> ________________________________________________________________
>>> Bob Briscoe                               http://bobbriscoe.net/
>>>
>>> _______________________________________________
>>> aqm mailing list
>>> aqm@ietf.org
>>> https://www.ietf.org/mailman/listinfo/aqm
>
>-- 
>________________________________________________________________
>Bob Briscoe                               http://bobbriscoe.net/
>