Re: [aqm] New Version Notification for draft-baker-aqm-sfq-implementation-00.txt

Daniel Havey <dhavey@yahoo.com> Tue, 24 June 2014 19:46 UTC

Message-ID: <1403639157.90844.YahooMailBasic@web141606.mail.bf1.yahoo.com>
Date: Tue, 24 Jun 2014 12:45:57 -0700
From: Daniel Havey <dhavey@yahoo.com>
To: RichardScheffenegger <rs@netapp.com>, "Fred Baker (fred)" <fred@cisco.com>
In-Reply-To: <4A20A0BD-1DF4-49AC-856A-B0FD73AADFC1@cisco.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: http://mailarchive.ietf.org/arch/msg/aqm/ft4Cf6tCYhtQV0FVhL9RKvZFL_c
Cc: "aqm@ietf.org" <aqm@ietf.org>, grenville armitage <garmitage@swin.edu.au>
Subject: Re: [aqm] New Version Notification for draft-baker-aqm-sfq-implementation-00.txt

Yup, this is what my experiments are showing.  I'm not doing head/tail drop, but, I am dropping/marking at the device before and after the bloated device.  That would be counterclockwise or clockwise of the bloated (slow and overqueued) device.  This is essentially the same thing.

What I am observing is that if I drop/mark after (clockwise) the system is snappy and manages the queue quickly.  Dropping counterclockwise (before) the bloated device is a slightly different story.  If I let the flow run for a few seconds the bloat builds up in the queue at the slow device.  Switching AQM on at this point causes "overshoot" and burns the connection.

I think what is happening is that it takes so long for the drop signal to propagate that by the time the retransmitted packet arrives it's time to drop again so we lose the retransmission and have to wait for an RTO.  Whatever.  Baaaad mojo.  Don't let this happen.

Anyways, as long as the AQM system is on before the flow starts then the queuing is managed and the system remains snappy.

So IMHO it really doesn't matter except in the weird corner case where a running flow has already bloated the queue and then we switch on the AQM.


BTW...On a related note: I tried to address that problem by using ECN.  If I don't drop a packet then there will be no RTX so...Well anyways, I switched on the ECN on my Linux boxes and the thing doesn't work!  Is ECN in Linux broken?  Here are the details:

I switched ECN on using the /proc filesystem, setting it to request ECN and to accept ECN requests.  However, examining the Wireshark trace shows that the flow is marked Not ECN capable.  I'm running 3.12 kernels.
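
For anyone who wants to reproduce the check, here is a minimal sketch. It assumes the /proc setting in question is net.ipv4.tcp_ecn (the text above does not name the file), and the helper names are made up for illustration. The value meanings follow the kernel's ip-sysctl documentation, and the codepoint table follows RFC 3168.

```python
from pathlib import Path

# Assumed sysctl path; the message above only says "the /proc filesystem".
TCP_ECN_SYSCTL = Path("/proc/sys/net/ipv4/tcp_ecn")

# Meaning of net.ipv4.tcp_ecn values per the Linux ip-sysctl documentation.
TCP_ECN_MODES = {
    0: "ECN disabled: never requested, never accepted",
    1: "ECN requested on outgoing connections and accepted on incoming ones",
    2: "ECN accepted on incoming connections only, not requested on outgoing ones",
}

# ECN codepoints: the two low-order bits of the IPv4 TOS / IPv6 Traffic Class
# byte (RFC 3168). Wireshark shows 0b00 as "Not ECN-Capable Transport".
ECN_CODEPOINTS = {
    0b00: "Not-ECT",
    0b01: "ECT(1)",
    0b10: "ECT(0)",
    0b11: "CE",
}


def describe_tcp_ecn(value: int) -> str:
    """Translate a tcp_ecn sysctl value into a human-readable description."""
    return TCP_ECN_MODES.get(value, f"unrecognised value {value}")


def ecn_codepoint(tos_byte: int) -> str:
    """Return the ECN codepoint name carried in a TOS / Traffic Class byte."""
    return ECN_CODEPOINTS[tos_byte & 0b11]


def read_tcp_ecn(path: Path = TCP_ECN_SYSCTL) -> int:
    """Read the current sysctl value (only meaningful on a Linux host)."""
    return int(path.read_text().strip())


if __name__ == "__main__":
    if TCP_ECN_SYSCTL.exists():
        print(describe_tcp_ecn(read_tcp_ecn()))
    print(ecn_codepoint(0x00))
```

One thing worth ruling out when reading the trace: RFC 3168 says the SYN and SYN-ACK must not carry an ECT codepoint, and ECN is negotiated through the ECE and CWR TCP flags, so a Not-ECT marking on the handshake packets alone is not evidence either way. The data segments are the ones to look at.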

I find it difficult to believe that ECN is broken in Linux, but, it's definitely not working. Uhhhh, seriously?  ECN is broken in Linux?  You gotta be kidding me?!?  Did I do something wrong?  Anybody else notice this?

*** I don't care that much about it because my experiments don't actually require that I use ECN, but, how on earth can we expect people to activate ECN when it doesn't even work?  Errrrggggghhh!





--------------------------------------------
On Tue, 6/24/14, Fred Baker (fred) <fred@cisco.com> wrote:

 Subject: Re: [aqm] New Version Notification for draft-baker-aqm-sfq-implementation-00.txt
 To: "Scheffenegger, Richard" <rs@netapp.com>
 Cc: "aqm@ietf.org" <aqm@ietf.org>, "grenville armitage" <garmitage@swin.edu.au>
 Date: Tuesday, June 24, 2014, 12:41 AM
 
 
 On Jun 23, 2014, at 6:32 AM, Scheffenegger, Richard <rs@netapp.com> wrote:
 
 > <as individual>
 >
 > Hi Fred,
 >
 > thank you for writing this down; one aspect that gets referred to, but not made completely explicit in sections 3.2 and 3.3 is the interaction of the AQM / Queue signals with the transport control loop.
 >
 > IMHO, it should be made very clear, when the AQM action is done before the queueing, that the AQM signal is delayed for the outer control loop; obviously in a defensive loss situation, this will always be the case. In comparison, when the Queue prepends the AQM action, the AQM signal is delayed less to the outer control loop.
 >
 > Depending on the depth of the queue / departure rate, that timing difference can be significant...
 >
 > I don't know how to put that into better words that would fit into your draft though.
 >
 > Best regards,
 >
 > Richard Scheffenegger
 
 I can go into that if you want.
 
 My logic here goes something like this.
 
 You can think of a TCP session as a control loop stuck into the middle of a larger stream. Imagine I’m moving a gigabyte file from here to there, the MSS is 1440 bytes (an IPv6 packet containing it is 1500 bytes), the bottleneck link between “here” and “there” is some specific rate, and the propagation delay between “here” and “there” is some non-trivial value. The least effective window that would maximize throughput is the number of segments that could fully use the bottleneck capacity; any additional quantum that is there sits in a queue and increases the RTT. So at any given point in time, we can think of the transfer as having several components:
 
      K segments that are actually “in flight” somewhere
      An additional K segments that have been received and whose acks are in flight.
      cwnd-2*K segments sitting in a queue, probably at the bottleneck.
      Some number of bytes that haven’t been transmitted yet
           of those, the next cwnd segments will be transmitted as acks arrive.
      Some number of bytes that have already been received at the far end but not delivered yet to the application
      Zero or more segments that have been received out of order and are being held pending retransmission
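
As a rough worked example of the first three components, the sketch below splits a window into K data segments in flight, K acks in flight, and cwnd-2*K queued segments. The function name, the 10 Mbit/s rate, the 30 ms one-way delay and the cwnd of 100 are illustrative assumptions, not figures from this thread; only the 1500-byte packet size comes from the text. It also assumes a symmetric path and a window large enough to fill the pipe.

```python
def decompose_window(cwnd_segments, bottleneck_bps, one_way_delay_ms, packet_bytes=1500):
    """Split cwnd into data in flight, acks in flight, and queued segments.

    Integer arithmetic is used for K so the result does not depend on
    floating-point rounding.
    """
    packet_bits = packet_bytes * 8
    # K: whole segments that fit on the path in one direction at the bottleneck rate.
    k = (bottleneck_bps * one_way_delay_ms) // (packet_bits * 1000)
    # Whatever the window holds beyond 2*K has nowhere to go but the queue.
    queued = max(cwnd_segments - 2 * k, 0)
    # Extra round-trip time contributed by draining that standing queue.
    queue_delay_ms = queued * packet_bits * 1000 / bottleneck_bps
    return {
        "in_flight": k,
        "acks_in_flight": k,
        "queued": queued,
        "queue_delay_ms": queue_delay_ms,
    }


if __name__ == "__main__":
    # Hypothetical path: 10 Mbit/s bottleneck, 30 ms each way, cwnd of 100 segments.
    print(decompose_window(100, 10_000_000, 30))
```

With these made-up numbers K is 25, so half of the 100-segment window sits in the queue and adds 60 ms to the RTT.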
 
 Now, let’s mark a specific segment; I’ll call it segment N. I don’t really care what the value of N is. But it is the segment that the AQM algorithm will select and drop, by whatever algorithm it decides. In a tail-drop case, for the sake of argument, we can assert that the entire cwnd segments are between segment N and the receiver or are represented in acknowledgments on their way back. In a head-drop case, there are cwnd-2*K segments sitting in the queue after segment N, K segments between it and the receiver, and K acks in flight.
 
   +------+      +--------+                      +--------+
   |      |      | Queue  |Data    K segments    |        |
   |      +----->+        +----> - - - - ------->+        |
   |Sender|      |Router 1|                      |Receiver|
   |      +<-----+        +<---- - - - - <-------+        |
   |      |      |        |Acks    K acks        |        |
   +------+      +--------+                      +--------+
 
 So now, the whole thing rotates clockwise.
 1) K segments already in flight arrive and are acknowledged while K acks arrive at the sender and trigger new transmissions. The queue still has cwnd-2*K segments in it.
 2) depending on whether it was head drop or tail drop, somewhere between zero and cwnd-2*K segments arrive and are acknowledged while as many acks are received at the sender and trigger new transmissions. The queue still has cwnd-2*K segments in it.
 3) the missed packet is detected by the receiver, who starts responding with duplicate acks. However, there are still K acks in flight, so the sender is going to send another K new packets before he even sees the first dupack. The queue still has cwnd-2*K segments in it.
 4) we now get a long stream of dupacks, and the sender presumably retransmits the dropped packet. AT THIS POINT, SENDER REDUCES CWND. If more than one packet got dropped, let’s hope the SACK logic retransmits it as well.
 5) At long last, the retransmission arrives at the receiver, who sends a giant ack and starts sending a stream of acks, triggering new transmissions.
 
 To determine the difference between head-drop and tail-drop, we have to ask ourselves how big cwnd-2*K is. If we are using traditional too-full-drop-something without AQM, it might be a largish number (it could be the size of the memory allocated to the queue if there is no competing traffic), and it is probably fair to say that head-drop would get the event back to the sender more rapidly than tail-drop.
 
 If we are using any AQM technology - RED, WRED, ARED, PIE, CoDel, Blue, AVQ, or whatever else, the fundamental purpose of the logic is to keep the queue relatively shallow, even if it has a large memory system behind it. In RED terms, we would estimate that the mean queue depth will approximate min-threshold or less; each of the other technologies has its counterpart to that and will similarly keep the latency and/or queue depth down.
 
 Hence, cwnd-2*K is a function of the mean queue depth at the bottleneck, and is a relatively small number. Hence, if we’re using an AQM algorithm - any AQM algorithm as long as it works - the real difference between head-drop and tail-drop is a relatively small number.
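
To put rough numbers on that argument, here is a simplified sketch of how long the first duplicate ack takes to reach the sender under head-drop versus tail-drop. The model, the function name and all figures (10 Mbit/s, 30 ms one-way delay, queue depths of 1000 and 5 segments) are assumptions for illustration. It ignores per-packet serialization and competing traffic, and treats tail-drop as costing one full queue drain more than head-drop.

```python
def first_dupack_delay_ms(queued_segments, bottleneck_bps, one_way_delay_ms,
                          head_drop, packet_bytes=1500):
    """Approximate time from the drop until the sender sees the first dupack."""
    # The gap has to travel to the receiver, and the dupack has to travel back.
    delay = 2 * one_way_delay_ms
    if not head_drop:
        # With tail-drop, the cwnd-2*K segments queued ahead of the gap must
        # drain through the bottleneck before the receiver can notice it.
        delay += queued_segments * packet_bytes * 8 * 1000 / bottleneck_bps
    return delay


if __name__ == "__main__":
    rate, owd = 10_000_000, 30  # hypothetical bottleneck rate and one-way delay
    for label, depth in (("bloated queue", 1000), ("AQM-managed queue", 5)):
        head = first_dupack_delay_ms(depth, rate, owd, head_drop=True)
        tail = first_dupack_delay_ms(depth, rate, owd, head_drop=False)
        print(f"{label}: head-drop {head} ms, tail-drop {tail} ms")
```

Under these assumptions the bloated queue separates the two cases by more than a second, while the AQM-managed queue separates them by a few milliseconds.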
 
 Which makes me ask - what are we really talking about? Does it actually matter?