[tcpm] Re: draft-iyengar-burst-mitigation-01.txt

Janardhan Iyengar <iyengar@mail.eecis.udel.edu> Mon, 20 February 2006 21:05 UTC

Date: Mon, 20 Feb 2006 15:47:52 -0500
From: Janardhan Iyengar <iyengar@mail.eecis.udel.edu>
X-X-Sender: iyengar@stimpy.eecis.udel.edu
To: Sally Floyd <floyd@icir.org>
In-Reply-To: <200601300622.k0U6Mvvq065454@cougar.icir.org>
Message-ID: <Pine.GSO.4.62.0602021114540.589@stimpy.eecis.udel.edu>
References: <200601300622.k0U6Mvvq065454@cougar.icir.org>
Cc: Ethan Blanton <eblanton@cs.purdue.edu>, tcpm@ietf.org, Mark Allman <mallman@icir.org>
Subject: [tcpm] Re: draft-iyengar-burst-mitigation-01.txt

Sally,

(Response to:
http://www1.ietf.org/mail-archive/web/tcpm/current/msg01671.html)

Thank you very much for spending time on this draft, and thank you for 
your comments! I am sorry that I am quite late in replying. I hope this
email addresses your concerns or, if not, generates further discussion.


> * The network's tolerance of burstiness: [...]

I have added some text (based on your suggestions) that now seems more 
appropriate in the Discussion section (Section 4). I have also changed the 
tone somewhat... I do not stress the fact that increased deployment of AQM 
is happening (which I think was one of your points, correct me if I read 
wrong). The reason I changed the tone was to maintain focus on micro-burst 
mitigation, while mentioning the parallel effects of AQM. Let me know if 
you think the following captures it, or if you'd suggest mods.

     We emphasize that the question of whether or not micro-bursts need
     mitigation remains open. While use of micro-burst mitigation can
     reduce stress on network queues, congested routers may, in parallel,
     use some form of Active Queue Management (AQM) to better handle
     stress on network queues. Use of AQM in a router allows for better
     congestion control through better congestion signals (i.e., signals
     other than packet drops) and enables queues to accommodate transient
     bursts in the aggregate incoming traffic. Micro-burst mitigation may
     still provide some respite by reducing the amount of transient
     burstiness in the traffic aggregate at a queue.


> * Rate-based pacing:
> [...] I think it would be worth it to say that there are many cases 
> where rate-based pacing should be a big win over CWL, or over MaxBurst, 
> in terms of traffic dynamics. [...]

I have modified the para in the Related Work Section (Section 3) to 
reflect your suggestions. Also see the following argument.

       * Rate-Based Pacing [VH97] imposes a limitation on the rate of
         sending, and prevents bursts by pacing data into the network
         until the ACK clock is established.  Although this solution can
         be very effective in burst mitigation in some cases, it requires
         a new timer and parameters for pacing out the data segments.
         Further, as shown in [AB05], there are cases where there is no
         natural "lull" in the connection into which segments can be
         nicely paced, and as shown in [AST00], the tradeoffs involved in
         using pacing are not clear.  While the exact application of
         pacing clearly requires more research, there are several cases
         where pacing can be expected to mitigate bursts better than CWL
         (or MaxBurst or UI/LI). For instance, where CWL (also MaxBurst and
         UI/LI) will not be able to handle bursts caused by slow start,
         pacing could work well and prevent these transient
         bursts. Pacing can potentially also have a bigger impact on
         traffic aggregates, possibly significantly reducing burstiness in
         the aggregate at several timescales.
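
As a rough illustration of the idea behind pacing, the sketch below spreads
one congestion window of segments evenly across one RTT instead of sending
them back-to-back. This is not the algorithm of [VH97]; the function names,
the even-spacing rule, and the use of a single RTT estimate are all
assumptions made for illustration.

```python
def pacing_interval(rtt_s: float, cwnd_segments: int) -> float:
    """Gap in seconds between segments when cwnd is spread over one RTT.

    Assumption: even spacing (rtt / cwnd). Real pacing schemes may differ.
    """
    if cwnd_segments <= 0:
        raise ValueError("cwnd_segments must be positive")
    return rtt_s / cwnd_segments


def paced_send_times(start_s: float, rtt_s: float,
                     cwnd_segments: int, n_segments: int) -> list[float]:
    """Scheduled send time of each of n_segments, starting at start_s."""
    interval = pacing_interval(rtt_s, cwnd_segments)
    return [start_s + i * interval for i in range(n_segments)]
```

Note that the schedule depends directly on the RTT estimate, so a poorly
seeded RTT would stretch or compress the spacing accordingly.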

> E.g., in a short connection with only BLimit + 1 more packets to send, 
> but when only one ack will be arriving, due to ACK losses.

I do not use this example, because I am not sure how rate-based pacing
(RBP) will behave in short connections, i.e., I am not clear on the
tradeoffs here---for instance,
what happens if the RTT value is seeded incorrectly in such a small 
connection? But I believe the larger point you wanted to convey has been 
captured in the para above. Let me know if not.

> But my suggestion would be to be more explicit about the potential 
> advantages of rate-based pacing, and to say something about it a little 
> earlier in the document.

I hope that para was able to capture it, but I'm not sure about saying 
something earlier in the document. This section is the first place that 
the other algorithms are mentioned and the pros and cons discussed. Do you 
think the entire section should appear earlier?


> * Security:
> I think that the security issues raised in Section 5 are pretty
> important, and could use more discussion.  This is another case
> where rate-based pacing would be more robust than CWL, I think.

I'll try to add more discussion on that point. Do you have anything 
specific in mind that you might want to suggest?

While I suspect that rate-based pacing may be more robust than CWL under 
ack loss, it is difficult for me to make a strong comparative statement 
without knowing what exactly would be used for rate-based pacing. 
Depending on the exact details, I suspect that rate-based pacing *could 
also possibly* be (differently) susceptible to issues such as ack loss. 
So, I am not sure that we should say anything about how the mentioned 
security issue applies to rate-based pacing.

> * Causes of micro-bursts: ACK-compression can also be a cause of 
> micro-bursts.  (That is, ACK-compression can cause micro-bursts as well 
> as macro-bursts.)

How so? I am thinking about it, but cannot figure out how that could be. 
A micro-burst is defined as a burst sent in response to one ACK. In terms
of timing, I suppose ACK-compression could cause bursts at the same
timescale. In other words, some macro-burst phenomena could occur at
timescales short enough that the bursts resemble micro-bursts. Maybe we
should state that explicitly in the discussion about macro-bursts.
Something like

"Note that some macro-bursts could occur at the timescales of 
micro-bursts---for instance, back-to-back ACKs due to extreme 
ACK-compression could cause the consequent data burst to resemble a 
micro-burst from the network's point of view."

What do you think?


> * Section 2, step (1):
> "the only case where a micro-burst will not occur" ->
> "the only case where a micro-burst will not occur, if steps (2)
>  and (3) aren't used".

I am thinking of taking that entire sentence out---it does not add much 
value, but the cost of parsing it is high (and is getting higher). Unless 
someone sees enough value in it, that is.


> * "BLimit SHOULD be chosen such that bursts are no larger than those
> allowed by [RFC3390]."
>
> Why is this?  It is not obvious to me.

I was thinking about a fixed burst limit that one could apply for 
line-rate bursts, and it seemed reasonable to use 3390 as a basis. My 
(simplistic) reasoning was that since bursts from starting TCP connections 
should be absorbable by the network, 3390-defined bursts should be 
acceptable in general.
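
For reference, the burst size that RFC 3390 allows at connection start can
be computed from its initial-window formula, min(4*MSS, max(2*MSS, 4380
bytes)). The sketch below evaluates it; the function names are illustrative,
and treating a 1500-byte packet as an MSS of 1460 bytes assumes 40 bytes of
IPv4 and TCP headers with no options.

```python
def rfc3390_initial_window(mss: int) -> int:
    """Upper bound on the initial window in bytes, per RFC 3390."""
    return min(4 * mss, max(2 * mss, 4380))


def rfc3390_burst_segments(mss: int) -> int:
    """Number of full-sized segments that fit in the RFC 3390 initial window."""
    return rfc3390_initial_window(mss) // mss


# 1500-byte packets, assuming MSS = 1460: 4380 bytes, i.e. three segments.
print(rfc3390_initial_window(1460), rfc3390_burst_segments(1460))
```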

> For example, four-packet bursts, for 1500-byte packets, might be 
> perfectly acceptable, for a connection that has a sufficiently large 
> congestion window to send that many packets. And there is a cost to 
> BLimit being at most three 1500-byte packets.

I'm not sure I'm following. A large cwnd does not mean more segments can 
be sent (at line-rate) in response to _each ack_. BLimit is meant to 
impose that limit, not to limit the overall cwnd size. IMHO, having a 
large cwnd does not mean that a large burst is acceptable.
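
To make the per-ACK point concrete, here is a minimal sketch of the two
limiters. It assumes that CWL clamps cwnd to FlightSize + BLimit when an ACK
arrives, and that MaxBurst caps the number of segments sent per ACK while
leaving cwnd untouched. These formulations, the function names, and the use
of segments as the unit are assumptions for illustration, not text from the
draft.

```python
def cwl_on_ack(cwnd: int, flight_size: int, blimit: int) -> tuple[int, int]:
    """Assumed CWL rule: clamp cwnd so at most blimit new segments fit.

    Returns (new_cwnd, segments_sendable_now), all in segments.
    """
    new_cwnd = min(cwnd, flight_size + blimit)
    return new_cwnd, max(0, new_cwnd - flight_size)


def maxburst_on_ack(cwnd: int, flight_size: int, max_burst: int) -> tuple[int, int]:
    """Assumed MaxBurst rule: cap segments per ACK, leave cwnd unchanged.

    Returns (cwnd, segments_sendable_now), all in segments.
    """
    allowed = max(0, cwnd - flight_size)
    return cwnd, min(allowed, max_burst)
```

With cwnd = 20, FlightSize = 5, and a limit of 3, both rules permit three
segments in response to the ACK. The difference is that CWL reduces cwnd to
8, whereas MaxBurst leaves it at 20.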

Yes, I agree that there is a cost to having BLimit at what it is. But is 
there a situation that demands larger line-rate bursts?

What do you think is a reasonable limit (static/dynamic)? I may not be 
following what you are saying; if so, let me know.


> * Maxburst:
> "An additional drawback of MaxBurst ...".
> What was the first drawback?  Introducing a separate control?  But
> the earlier sentence says that "using two different controls may
> make sense".

New text:
         However, a drawback with
         MaxBurst is that adding a second control brings with it the
         possibility of the two transmission controllers interacting
         poorly to cause undesirable side effects.


> "the two transmission controllers may interact poorly, causing
> undesirable side effects."
> Do you have any examples of this?  Or citations?  Or anything
> else to back this up?

Not really. The idea was to suggest that there is the possibility of poor 
interaction due to the introduction of a new control. Maybe the new text 
(see above) captures this more effectively?


> "When BLimit == MaxBurst, CWL and MaxBurst perform similarly [AB05]."
> I don't see how CWL and MaxBurst can perform similarly, and at
> the same time you can prefer one over the other.  I think it would
> be useful to look a little more closely for performance differences,
> or to say that there aren't any.

(I'm checking with my co-authors on this one. Will get back to you asap.)

> Depending on how it is implemented, MaxBurst could be made to work
> quite well with re-ordering, where each duplicate ack packet (possibly
> from reordering of data packets) could be responded-to with up to
> MaxBurst data packets (if allowed by the congestion window).
> (E.g., Aggressive MaxBurst from [AB05].)
> Can something like this be done with CWL as well?

Hmm. Offhand, I do not see how this can be done with CWL, but I will think 
about it more. This is a good point though. Maybe we should mention this 
in the related work section where we discuss pros and cons of MaxBurst.

Thanks again!
- jana

--------------------
Janardhan R. Iyengar
http://www.cis.udel.edu/~iyengar
Protocol Engineering Lab, University of Delaware

