Re: [tcpm] Ordering of SACK blocks, flushing of reassembly queue after inactivity

Joshua Blanton <> Tue, 22 January 2008 15:54 UTC

Date: Tue, 22 Jan 2008 10:42:07 -0500
From: Joshua Blanton <>
To: Andre Oppermann <>
Subject: Re: [tcpm] Ordering of SACK blocks, flushing of reassembly queue after inactivity

I don't have a good answer to your first question (other than to
mention that, if you always send the most-recently-modified SACK
regions, you ensure that they're sent multiple times - which is the
only quasi-reliability you can create on the ACK path), but I would
like to address the second.
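
As an aside on that first point, here is a rough sketch of what
"most-recently-modified first" ordering could look like.  The function
names, the half-open (start, end) ranges, and the limit of three blocks
per ACK are assumptions made for illustration, and sequence-number
wraparound is ignored; this is not taken from any particular stack.

```python
# Assumption: three SACK blocks fit in an ACK (e.g. when timestamps are on).
MAX_SACK_BLOCKS = 3


def update_sack_blocks(blocks, seg_start, seg_end):
    """Return a new block list with the just-modified block first.

    blocks is a list of disjoint, non-adjacent (start, end) half-open
    ranges, ordered most recently modified first.  Wraparound of
    sequence numbers is deliberately ignored in this sketch.
    """
    merged_start, merged_end = seg_start, seg_end
    untouched = []
    for start, end in blocks:
        if start <= merged_end and merged_start <= end:
            # Overlapping or adjacent: fold it into the new block.
            merged_start = min(merged_start, start)
            merged_end = max(merged_end, end)
        else:
            untouched.append((start, end))
    # The block that just changed goes to the front; the others keep
    # their relative (recency) order behind it.
    return [(merged_start, merged_end)] + untouched


def sack_option(blocks):
    """Blocks to place in the next ACK: the most recent ones only."""
    return blocks[:MAX_SACK_BLOCKS]
```

Because each new arrival pushes older blocks back by one slot, a block
is reported in several consecutive ACKs before it falls off the end,
which is the repetition described above.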

Andre Oppermann wrote:
> The second is how long to hold onto data in the reassembly queue.
> The general theme here is resource exhaustion be it through malicious
> activity or just end points that drop off the net.  I think we can
> all agree that holding onto reassembly queue data until the session
> times out (if ever) is not really useful considering the overall
> resource constraints.  The question now is after what time to flush
> the reassembly queue (and to send an appropriate ACK)?  A range of
> options are available.  On the wide side we have a flush timeout
> of something like 2 times MSL.  On the small side we can go down to
> the current calculated retransmit timeout value as seen from our side.
> Also of importance is from where the timeout is calculated.  From the
> time the first segment arrived in the reassembly queue (resetting when
> rcv_nxt is advanced), or from the arrival time of the most recent
> segment.  For the moment and testing I've chosen the former at four
> times retransmit timeout as something that probably marks the boundary
> between spurious network losses or partitioning and longer-term
> disconnect or malicious activity.  Is any empirical data available on
> abandoned sessions with data in the reassembly queue?  What is your
> opinion and rationale on this?

Well, I actually disagree that holding onto reassembly queue data is
a lost cause, even after long periods of inactivity - so perhaps we
don't all agree :-).  Certainly you could tell *after* the fact that
holding such data was a fool's errand, if the connection is
terminated; until that point, there's no reason to necessarily
assume that the lack of progress in the connection is permanent.  In
general, I would expect an operating system to hold on to reassembly
data forever, assuming that there's no memory resource concern
that makes the buffers valuable...  To flush data simply because
wall-clock time has elapsed doesn't make sense to me, since I've
seen many traces where "long" time periods have elapsed and then
connections suddenly resume.  If there's no global "we're running
out of memory" trigger available for a given OS, a stack could set a
timer to fire at some arbitrary time (4*RTO, for instance) and check
for memory pressure - *if* it exists, go ahead and flush the data.
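
To make that concrete, here is a minimal sketch of a timer that only
flushes under memory pressure.  The Connection class, its field names,
and the under_memory_pressure callback are hypothetical stand-ins for
whatever a real stack provides; the 4*RTO interval is just the
arbitrary value mentioned above.

```python
from dataclasses import dataclass, field

RTO_MULTIPLIER = 4  # the arbitrary 4*RTO interval from the text


@dataclass
class Connection:
    """Hypothetical per-connection state, for illustration only."""
    rto: float                                   # current RTO, seconds
    reassembly_queue: list = field(default_factory=list)
    next_check: float = 0.0                      # when the timer fires


def arm_timer(conn, now):
    """Schedule the next memory-pressure check 4*RTO from now."""
    conn.next_check = now + RTO_MULTIPLIER * conn.rto


def on_timer(conn, now, under_memory_pressure):
    """Flush out-of-order data only if the timer fired AND memory is tight.

    under_memory_pressure is a caller-supplied callable (an assumption:
    real stacks expose this differently, if at all).
    Returns the number of queued segments that were dropped.
    """
    if now < conn.next_check:
        return 0                  # timer has not fired yet
    arm_timer(conn, now)          # elapsed time alone never flushes
    if not conn.reassembly_queue or not under_memory_pressure():
        return 0                  # keep the data; just check again later
    flushed = len(conn.reassembly_queue)
    conn.reassembly_queue.clear()
    return flushed
```

The point of the design is that an idle connection with plenty of free
memory keeps its out-of-order data indefinitely; the timer is only an
opportunity to look at memory, not a reason to flush.  A stack that
does flush would presumably also stop advertising the dropped ranges
in its SACK blocks.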

I don't have any data showing how much reassembly data is left
hanging around when a session is abandoned, but I have looked at
quite a few traces trying to find SACK renegs (which would be the
result of your data flushing).  In general, I believe that a scheme
such as you're proposing is not used; other than some traces that
I've found that p0f identifies as being FreeBSD receivers, there
doesn't appear to be a solid link between connection progress
timeouts and reneging.  I don't know the FreeBSD stack well enough
to say that, in its current implementations, it definitely flushes
reassembly queue data based on a timer - but I suspected that it
did, and your question reinforces my suspicion.  If I am correct,
and FreeBSD currently (5.x and 6.x) times out reassembly data as
you're proposing, I've seen traces where this actually impedes a
connection's recovery - so I vote against such a scheme.

Again, I have no problem with stacks flushing out-of-order data in
the face of a low-memory condition.  Beyond that, I'd have to see
some pretty convincing data that "long pauses == connection that
will terminate without finishing," which is what I feel you're
