Re: [aqm] FQ-PIE kernel module implementation

"Francini, Andrea (Andrea)" <andrea.francini@alcatel-lucent.com> Thu, 09 July 2015 13:27 UTC

From: "Francini, Andrea (Andrea)" <andrea.francini@alcatel-lucent.com>
To: "Agarwal, Anil" <Anil.Agarwal@viasat.com>, Polina Goltsman <polina.goltsman@student.kit.edu>, "Bless, Roland (TM)" <roland.bless@kit.edu>, "Fred Baker (fred)" <fred@cisco.com>, Toke Høiland-Jørgensen <toke@toke.dk>
Thread-Topic: [aqm] FQ-PIE kernel module implementation
Thread-Index: AQHQuIvh7zcVUT1kXUKS4PSVeb85XZ3P+wwAgACMaYD//9BJEIAAtH0AgAC9LVCAAVyWgP//848w
Date: Thu, 09 Jul 2015 13:26:48 +0000
Message-ID: <1BFAC0A1D7955144A2444E902CB628F865B0499E@US70TWXCHMBA12.zam.alcatel-lucent.com>
References: <D1961A16.1087%hokano@cisco.com> <5577FBD3.5000804@student.kit.edu> <97EDD2D8-CC0A-4AFA-9A74-3F2C282CF5C2@cisco.com> <87mvzem9i9.fsf@alrua-karlstad.karlstad.toke.dk> <7E6C797B-EE6F-4390-BC8F-606FDD8D5195@cisco.com> <559659A8.9030104@student.kit.edu> <87fv55mtpz.fsf@alrua-karlstad.karlstad.toke.dk> <559674B7.5050004@kit.edu> <7A2801D5E40DD64A85E38DF22117852C70AD0859@wdc1exchmbxp05.hq.corp.viasat.com> <559B889B.4060409@kit.edu> <559B9724.6090902@student.kit.edu> <7A2801D5E40DD64A85E38DF22117852C70AD366D@wdc1exchmbxp05.hq.corp.viasat.com> <1BFAC0A1D7955144A2444E902CB628F865B043CD@US70TWXCHMBA12.zam.alcatel-lucent.com> <7A2801D5E40DD64A85E38DF22117852C70AD3F48@wdc1exchmbxp05.hq.corp.viasat.com> <1BFAC0A1D7955144A2444E902CB628F865B04673@US70TWXCHMBA12.zam.alcatel-lucent.com> <7A2801D5E40DD64A85E38DF22117852C70AD5890@wdc1exchmbxp05.hq.corp.viasat.com>
In-Reply-To: <7A2801D5E40DD64A85E38DF22117852C70AD5890@wdc1exchmbxp05.hq.corp.viasat.com>
Archived-At: <http://mailarchive.ietf.org/arch/msg/aqm/hU77hXms1cxkLPeUtPpMktb5ae8>
Cc: "Hironori Okano -X (hokano - AAP3 INC at Cisco)" <hokano@cisco.com>, AQM IETF list <aqm@ietf.org>
Subject: Re: [aqm] FQ-PIE kernel module implementation

The packet drop rate is very different for the TCP and UDP queues: 0.017% and 12.7% are the values measured in the 100ms RTT case (with PIE drop probability at about 16%). The random generator for the drop probability would indeed drop at the 16% rate, but whenever a TCP packet arrives at the FQ-PIE queue (10.4% of the cases), the drop probability is scaled down quite drastically. In other words, the full 16% packet drop probability only applies to a fraction of the incoming packets, which yields a lower total packet drop rate (11.3%).

This is an intrinsic property of FQ-PIE with drop probability derived from aggregate state: the effective drop rate is systematically lower than the drop probability set by the algorithm, because the scaling by queue length ratio produces a drop probability value that is never larger than the one produced by the control equation. I wonder if this discrepancy should be a reason for concern from a control-theory perspective.
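
As a rough cross-check of the figures quoted in this message, the total drop rate can be approximated as the arrival-weighted average of the two per-queue drop rates. This is only a sketch: it assumes that the 10.4% figure is the TCP share of arriving packets and that all remaining arrivals are UDP. The function and variable names are illustrative and do not come from the simulation code.

```python
def aggregate_drop_rate(tcp_share, tcp_drop, udp_drop):
    """Arrival-weighted average of the per-queue drop rates."""
    udp_share = 1.0 - tcp_share  # assumption: only TCP and UDP packets arrive
    return tcp_share * tcp_drop + udp_share * udp_drop

# Values quoted in the message for the 100ms RTT case
total = aggregate_drop_rate(tcp_share=0.104, tcp_drop=0.00017, udp_drop=0.127)
print(f"aggregate drop rate ~ {total:.1%}")
```

With these rounded inputs the estimate comes out near 11.4%, close to the measured 11.3% and well below the 16% PIE drop probability.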

I also wonder if there is an equation that accurately relates the long-term CUBIC drop rate to the long-term CUBIC throughput when the CUBIC drop probability oscillates with the CUBIC queue length.

Regards,

Andrea


From: Agarwal, Anil [mailto:Anil.Agarwal@viasat.com]
Sent: Thursday, July 09, 2015 5:31 AM
To: Francini, Andrea (Andrea); Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Andrea,

This is good info.

Slightly surprising how CUBIC manages to keep cwnd large in the presence of such a large packet discard probability.

Might be useful to try and compare with the per-queue packet drop probability algorithm mentioned in the DOCSIS report.

Regards,
Anil


From: Francini, Andrea (Andrea) [mailto:andrea.francini@alcatel-lucent.com]
Sent: Wednesday, July 08, 2015 1:52 PM
To: Agarwal, Anil; Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Anil,

I have run a few experiments in my ns2 environment, using my own FQ version of a rather old PIE module. I used 16ms for both the tUpdate and qdelay_ref parameters.

DISCLAIMER: Please do not take the results as indicative of the actual FQ-PIE (and plain PIE) behavior, but rather of a multi-queue (and single-queue) AQM scheme that instantiates some of the PIE principles.

The link has 100Mbps capacity. The buffer size is rather large (at least 6x BDP) so that buffer overflow occurs only if the AQM drop decisions are not sufficient to keep the queue length from growing uncontrolled.

The UDP traffic comes in at the constant rate of 101 Mbps. TCP traffic is from a CUBIC source (also a very old ns2 model). I used 50ms and 100ms RTT for the TCP traffic.

The experiments run for 300s of simulated time. The collection of statistics starts at time 100s, so all averages are computed over a 200s period. The scenario may be far from realistic, but it helps highlight basic properties of the AQM scheme.

With 50ms RTT, the CUBIC flow gets 32.3% of its 50Mbps fair share (2.15% with plain PIE, a 15x drop). The PIE drop probability (which applies directly to UDP traffic, while it is scaled down for the shorter TCP queue) settles around 16%.
With 100ms RTT, the CUBIC flow gets 23.6% of its 50Mbps fair share (1.33% with plain PIE). The PIE drop probability also settles around 16%, but with wider oscillations.
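
For reference, the percentages above can be converted to absolute throughput as follows. This is a minimal sketch that only restates the numbers given in this message; the helper name `summarize` is illustrative.

```python
FAIR_SHARE_MBPS = 50.0  # half of the 100Mbps link, as stated in the message

# RTT in ms -> (% of fair share with FQ-PIE, % of fair share with plain PIE)
results = {50: (32.3, 2.15), 100: (23.6, 1.33)}

def summarize(fq_pct, pie_pct):
    """Return (FQ-PIE Mbps, plain PIE Mbps, FQ-PIE / plain PIE ratio)."""
    fq_mbps = FAIR_SHARE_MBPS * fq_pct / 100.0
    pie_mbps = FAIR_SHARE_MBPS * pie_pct / 100.0
    return fq_mbps, pie_mbps, fq_pct / pie_pct

for rtt_ms, (fq_pct, pie_pct) in results.items():
    fq_mbps, pie_mbps, ratio = summarize(fq_pct, pie_pct)
    print(f"RTT {rtt_ms}ms: FQ-PIE {fq_mbps:.2f} Mbps, "
          f"plain PIE {pie_mbps:.2f} Mbps, ratio {ratio:.1f}x")
```

In absolute terms the CUBIC flow gets roughly 16 Mbps (50ms RTT) and 12 Mbps (100ms RTT) with the FQ version, versus about 1 Mbps or less with the single queue.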

The attached plot shows a 25s window in the evolution of the aggregate queue length and congestion window size for the CUBIC source (100ms RTT case). The 100% dashed line marks the cwnd size that yields 100% of the fair share.

The drop probability is much larger than 1%. This is because the TCP traffic adds to the 101% input load of UDP. The queue length oscillates at very high frequency, loosely modulated by the CUBIC cwnd. As soon as the length of the CUBIC queue accumulates a few units, the TCP drop probability given by the FQ-PIE formula becomes sufficiently large to cause the loss of a TCP packet and push the TCP queue back to zero occupancy.

TCP traffic suffers because the TCP drop probability is tied to the overall buffer occupancy, which UDP keeps trying to increase. We agree that it would be much better if the drop probability were defined exclusively by the state of the TCP queue, but this is not easy to realize in practice, at least according to the explanation given in the May 2014 CableLabs document for the decision made there to use aggregate state for setting the drop probability (the explanation overlaps in part with the issues you identify in your message). The aggregate approach could be improved with a better formula for the per-queue drop probability, but I doubt it will be easy to find one that fits all use cases and types of TCP source well.

Regards,

Andrea

From: Agarwal, Anil [mailto:Anil.Agarwal@viasat.com]
Sent: Tuesday, July 07, 2015 9:27 PM
To: Francini, Andrea (Andrea); Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Andrea,

I am quite sure FQ-PIE with aggregate queue AQM will have some advantages over PIE with a single queue,
although not by much in the use case described by Polina.
In this case, assume that the unresponsive traffic is at a rate just 1% over the link rate.
PIE will converge to a drop probability of around 1%.
The TCP connection will also experience ~1% packet drop rate.
At that drop rate, the TCP goodput will be quite small - ~160 kbps.

I suspect that the advantages will show up in cases with multiple responsive flows and
better fairness and delay properties across flows.

Also, we have not discussed any advantages of FQ-PIE with aggregate queue vs FQ-PIE with per-queue AQM.
I am sure there are some.
One thought is that the aggregation will result in less "noise" in the algorithm input variables and
more stability in the state variable values.
Imagine per-queue AQM having to deal with individual short-lived TCP connections and
slow starts for each connection and very few RTTs to adapt connection rates. How well will it
control delays and aggregate buffer usage? (Better than single queue with tail-drops, but
that is a very low bar). Perhaps, we will need help from techniques such as TCP Hybrid Slow Start.
Some analysis or simulations of FQ-PIE with aggregate queue vs FQ-PIE with per-queue AQM
would be useful.
Note that FQ-Codel with aggregate queue AQM is not a viable option.

Regards,
Anil

From: Francini, Andrea (Andrea) [mailto:andrea.francini@alcatel-lucent.com]
Sent: Tuesday, July 07, 2015 3:05 PM
To: Agarwal, Anil; Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: draft-ietf-aqm-pie@tools.ietf.org<mailto:draft-ietf-aqm-pie@tools.ietf.org>; Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: RE: [aqm] FQ-PIE kernel module implementation

Hi Anil,

One comment about the first point of your summary:

While FQ-PIE does drop TCP throughput compared to the fair share, a single-queue AQM will do even worse in the same scenario where the input rate of the UDP flow exceeds the output rate of the queue (no TCP throughput at all). I also suspect that, if the FQ-PIE experiment is repeated with a smaller RTT, closer to the PIE delay target, we may see some improvement for TCP (and more so with CUBIC vs. Reno).

FQ-AQM with per-queue state (including the case of a fixed tail-drop threshold per queue) does succeed in enforcing the fair share, but if the drop threshold is oversized compared to the flow RTT the price to pay is a large self-inflicted queuing delay.

It is true that any scheme that uses aggregate state (typically the overall buffer occupancy or queuing delay) to make drop decisions will lose flow isolation/protection to some extent. However, there are important quantitative differences that may emerge depending on the way the FQ-AQM uses the aggregate state.

Regards,

Andrea


From: aqm [mailto:aqm-bounces@ietf.org] On Behalf Of Agarwal, Anil
Sent: Tuesday, July 07, 2015 1:31 PM
To: Polina Goltsman; Bless, Roland (TM); Fred Baker (fred); Toke Høiland-Jørgensen
Cc: draft-ietf-aqm-pie@tools.ietf.org<mailto:draft-ietf-aqm-pie@tools.ietf.org>; Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: Re: [aqm] FQ-PIE kernel module implementation

Polina, Roland,

This is good info.
So, here is a short summary of our analysis -
For FQ-PIE with aggregate-queue AQM -


1. In the presence of unresponsive flows, FQ-PIE has properties similar to those of single-queue AQMs: the responsive flows are squeezed down to use leftover bandwidth, if any. FQ-AQM with per-queue AQM performs better.

2. In the presence of flows that do not use their fair share (temporarily or permanently), FQ-PIE has properties similar to those of single-queue AQMs: the flows that do not use their fair share experience non-zero packet drops. FQ-AQM with per-queue AQM performs better.

3. In the presence of flows that do not use their fair share (temporarily or permanently), the queue size and queuing delay of flows that use their fair share can grow above the desired target value.

#2 and #3 are probably not major issues - especially in a network bottleneck with a large number of diverse flows.
But it is worth pointing out and documenting these properties (somewhere).

Regards,
Anil



From: Polina Goltsman [mailto:polina.goltsman@student.kit.edu]
Sent: Tuesday, July 07, 2015 5:09 AM
To: Bless, Roland (TM); Agarwal, Anil; Fred Baker (fred); Toke Høiland-Jørgensen
Cc: draft-ietf-aqm-pie@tools.ietf.org<mailto:draft-ietf-aqm-pie@tools.ietf.org>; Hironori Okano -X (hokano - AAP3 INC at Cisco); AQM IETF list
Subject: Re: [aqm] FQ-PIE kernel module implementation

Hello all,

Here are my thoughts about interaction of AQM and fair-queueing system.

I think I will start with a figure. I started a TCP flow with netperf and, 15 seconds later, an unresponsive UDP flow with iperf at a send rate slightly above the bottleneck link capacity. Both flows run together for 50 seconds.
This figure plots the throughput of the UDP flow as reported by the iperf server. (Apparently netperf doesn't produce any output if the throughput is below some value, so I can't plot the TCP flow.) The bottleneck is 100Mb/s and the RTT is 100ms. All AQMs were configured with their default values and the noecn flag.
[Figure (attached image001.png): throughput of the UDP flow as reported by the iperf server]

Here is my example in theory. A link with capacity C is shared between two flows: a non-application-limited TCP flow and an unresponsive UDP flow with send rate 105%C. Both flows send max-sized packets, so round robin can be used instead of a fair-queueing scheduler.

By the definition of max-min fairness, both flows are supposed to get 50% of the link capacity.

(1) Taildrop queues:
UDP packets will be dropped when the UDP queue is full, and TCP packets will be dropped when the TCP queue is full. As long as there are packets in the TCP flow queue, TCP should receive its fair share. (As far as I understand, this depends on the size of the queue.)

(2) AQM with state per queue:
Drop probability of UDP flow will always be non-zero and should stabilize around approximately 0.5.
Drop probability of TCP flow will be non-zero only when it starts sending above 50%C. Thus, while TCP recovers from packet drops, it should not receive another drop.
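
The "approximately 0.5" value in (2) can be reproduced with a simple fluid-model calculation: in steady state, the AQM of a queue must drop the fraction of arrivals that exceeds the service rate of that queue. The sketch below is an idealization that assumes the UDP queue is served at exactly 50%C (that is, the TCP flow uses its full share); the function name and the TCP send rate of 45%C are hypothetical.

```python
def steady_state_drop(send_rate, service_rate):
    """Fraction of arrivals that must be dropped so the carried rate equals service_rate."""
    if send_rate <= service_rate:
        return 0.0  # the queue drains faster than it fills: no drops needed
    return 1.0 - service_rate / send_rate

C = 1.0  # link capacity, normalized

# UDP sends at 105%C but its queue is served at the 50%C fair share
udp_drop = steady_state_drop(send_rate=1.05 * C, service_rate=0.5 * C)

# Hypothetical TCP send rate below the fair share (while recovering from a loss)
tcp_drop = steady_state_drop(send_rate=0.45 * C, service_rate=0.5 * C)

print(f"UDP drop fraction: {udp_drop:.3f}")
print(f"TCP drop fraction: {tcp_drop:.3f}")
```

The UDP value is about 0.52, and the TCP value is zero whenever TCP sends below 50%C, which matches the reasoning in (2).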

(3) AQM with state per aggregate:
UDP flow always creates a standing queue, so drop probability of aggregate is always non-zero. Let's call it p_aqm.
The share of TCP packets in the aggregate p_tcp = TCP send rate / (TCP send rate + UDP send rate) and the probability of dropping a TCP packet is p_aqm * p_tcp. This probability is non-zero unless TCP doesn't send at all.

In (3) the drop probability is at least different. I assume that it is larger than in (2), which will cause more packet drops for the TCP flow, and as a result the flow will reduce its sending rate below its fair share.
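
The expression in (3) can be evaluated for a few TCP send rates to see that it only vanishes when TCP stops sending. This is a sketch of the formula exactly as written above; the value of p_aqm and the list of TCP rates are hypothetical, and rates are expressed as multiples of C.

```python
def tcp_drop_term(p_aqm, tcp_rate, udp_rate):
    """p_aqm * p_tcp, where p_tcp is the TCP share of the aggregate send rate."""
    total = tcp_rate + udp_rate
    if total == 0:
        return 0.0
    p_tcp = tcp_rate / total
    return p_aqm * p_tcp

P_AQM = 0.05     # hypothetical aggregate drop probability
UDP_RATE = 1.05  # UDP send rate (105%C), from the example

for tcp_rate in (0.0, 0.1, 0.3, 0.5):
    value = tcp_drop_term(P_AQM, tcp_rate, UDP_RATE)
    print(f"TCP rate {tcp_rate:.1f}C -> p_aqm * p_tcp = {value:.4f}")
```

Unlike case (2), the value stays above zero even when TCP sends well below 50%C.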

Regards,
Polina
On 07/07/2015 10:06 AM, Bless, Roland (TM) wrote:

Hi,

thanks for your analysis. Indeed, Polina came up with a similar analysis for an unresponsive UDP flow and a TCP flow. Flow queueing can achieve link share fairness despite the presence of unresponsive flows, but is ineffective if the AQM is applied to the aggregate and not to the individual flow queue. Polina used the FQ-PIE implementation to verify this behavior (post will follow).

Regards,
 Roland

On 04.07.2015 at 22:12, Agarwal, Anil wrote:

Roland, Fred,

Here is a simple example to illustrate the differences between FQ-AQM with AQM per queue vs AQM per aggregate queue.

Let's take 2 flows, each mapped to separate queues in a FQ-AQM system.
   Link rate = 100 Mbps
   Flow 1 rate = 50 Mbps, source rate does not go over 50 Mbps
   Flow 2 rate >= 50 Mbps, adapts based on AQM.

FQ-Codel, AQM per queue:
   Flow 1 delay is minimal
   Flow 1 packet drops = 0
   Flow 2 delay is close to target value

FQ-Codel, AQM for aggregate queue:
   Does not work at all
   Packets are dequeued alternately from queue 1 and queue 2
   Packets from queue 1 experience very small queuing delay
   Hence, CoDel does not enter dropping state, queue 2 is not controlled :(

FQ-PIE, AQM per queue:
   Flow 1 delay is minimal
   Flow 1 packet drops = 0
   Flow 2 delay is close to target value

FQ-PIE, AQM for aggregate queue:
   Flow 1 delay and queue 1 length are close to zero.
   Flow 2 delay is close to 2 * target_del :(
           qlen2 = target_del * aggregate_depart_rate
   Flow 1 experiences almost the same number of drops or ECNs as flow 2 :(
           Same drop probability and almost same packet rate for both flows
   (If flow 1 drops its rate because of packet drops or ECNs, the analysis gets slightly more complicated).
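
The "2 * target_del" result for the aggregate FQ-PIE case follows from the qlen2 relation above: the AQM holds the aggregate backlog at target_del times the aggregate departure rate, that backlog sits almost entirely in queue 2, and queue 2 drains at only half the link rate. A minimal sketch of the arithmetic, assuming an illustrative target delay of 20ms (the example does not state one) and hypothetical names:

```python
LINK_RATE_BPS = 100e6   # aggregate departure rate, from the example
FLOW2_RATE_BPS = 50e6   # flow 2 is served at its fair share
TARGET_DEL_S = 0.020    # assumption: illustrative value, not given in the example

def flow2_queue_and_delay(target_del_s, aggregate_rate_bps, flow2_rate_bps):
    """Backlog of queue 2 (bits) and the resulting queuing delay of flow 2 (s)."""
    # Queue 1 is nearly empty, so the whole standing backlog sits in queue 2.
    qlen2_bits = target_del_s * aggregate_rate_bps
    delay2_s = qlen2_bits / flow2_rate_bps
    return qlen2_bits, delay2_s

qlen2_bits, delay2_s = flow2_queue_and_delay(TARGET_DEL_S, LINK_RATE_BPS, FLOW2_RATE_BPS)
print(f"qlen2 = {qlen2_bits / 8 / 1e3:.0f} kB, flow 2 delay = {delay2_s * 1e3:.0f} ms")
```

With these values queue 2 holds about 250 kB and flow 2 sees about 40ms of queuing delay, twice the assumed target.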



See if this makes sense.

If the analysis is correct, then it illustrates that flow behaviors are quite different between AQM per queue and AQM per aggregate queue schemes.
In FQ-PIE for aggregate queue,
   - The total number of queued bytes will slosh between queues depending on the nature and data rates of the flows.
   - Flows with data rates within their fair share value will experience non-zero packet drops (or ECN marks).
   - Flows that experience no queuing delay will increase queuing delay of other flows.
   - In general, the queuing delay for any given flow will not be close to target_delay and can be much higher