Re: TCP and/or sockets vs. the last message in full-duplex applications

Henrik Frystyk Nielsen <frystyk@w3.org> Wed, 03 March 1999 15:13 UTC

Message-Id: <3.0.5.32.19990303101333.02ed2100@localhost>
X-Sender: frystyk@localhost
X-Mailer: QUALCOMM Windows Eudora Pro Version 3.0.5 (32)
Date: Wed, 03 Mar 1999 10:13:33 -0500
To: spreitze@parc.xerox.com, tcp-impl@lerc.nasa.gov
From: Henrik Frystyk Nielsen <frystyk@w3.org>
Subject: Re: TCP and/or sockets vs. the last message in full-duplex applications
In-Reply-To: <99Mar1.114532pst."105927"@augustus.parc.xerox.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: owner-tcp-impl@lerc.nasa.gov
Precedence: bulk

At 11:45 3/1/99 PST, spreitze@parc.xerox.com wrote:

>The proposed fix is to make the server: send the clean shutdown,
>close the sending half of the TCP connection, consume data until
>EOF or timeout, then close the receiving half.  This seems like an
>inappropriate amount of bother just to reliably send a message and
>close.

This is old news in the HTTP world. In fact (although I can't find any mail
in the HTTP archives to show this), I believe I first ran into this
problem in early 1995 when experimenting with transatlantic HTTP/1.0 PUT
requests between MIT and CERN. The problem that I saw for HTTP/1.0 PUT is
the following:

* Client A sends a PUT request with a large body to server B.
* B sends back a short 401 Unauthorized response and closes the
  connection in both directions.
* A receives the response but, because of the RTT across the
  Atlantic, has already sent a large part of the body.
* A ACKs the response, but the ACK is queued behind the data
  already sent to B.
* B sees data still arriving on the closed connection and sends
  a RST to A.
* A's TCP gets the RST and passes it up to the application
  immediately, dropping the HTTP response because the seq counter
  hasn't been updated.
* The application on A sees a RST but no HTTP response, so it
  doesn't know what happened.

At the time I talked with Dave Clark about it and he pointed out that it
was not a bug but normal behavior.

The problem can also occur with HTTP/1.1 pipelining, as we found when
implementing pipelining in the libwww HTTP code. It is described in section 8
of an IETF draft by Jim Gettys and Alan Freier that was never finished [1]:
	
   In simple request/response protocols (e.g. HTTP/1.0), a server can go
   ahead and close both receive and transmit sides of its connection
   simultaneously whenever it needs to. A pipelined or streaming
   protocol (e.g. HTTP/1.1) connection, is more complex [Frystyk et.
   al.], and an implementation which does so can create major problems.

   The scenario is as follows: an HTTP/1.1 client talking to a HTTP/1.1
   server starts pipelining a batch of requests, for example 15 on an
   open TCP connection.  The server decides that it will not serve more
   than 5 requests per connection and closes the TCP connection in both
   directions after it successfully has served the first five requests.
   The remaining 10 requests that are already sent from the client will
   along with client generated TCP ACK packets arrive on a closed port
   on the server. This "extra" data causes the server's TCP to issue a
   reset which makes the client TCP stack pass the last ACK'ed packet to
   the client application and discard all other packets. This means that
   HTTP responses that are either being received or already have been
   received successfully but haven't been ACK'ed will be dropped by the
   client TCP. In this situation the client does not have any means of
   finding out which HTTP messages were successful or even why the
   server closed the connection. The server may have generated a
   "Connection: Close" header in the 5th response but the header may
   have been lost due to the TCP reset. Servers must therefore close
   each half of the connection independently.

In HTTP/1.1 rev 6 [2] this has been moved to section 10.4:

	If the client is sending data, a server implementation
	using TCP SHOULD be careful to ensure that the client
	acknowledges receipt of the packet(s) containing the
	response, before the server closes the input connection.
	If the client continues sending data to the server after
	the close, the server's TCP stack will send a reset packet
	to the client, which may erase the client's unacknowledged
	input buffers before they can be read and interpreted by
	the HTTP application.

I believe that all modern HTTP servers do in fact perform the half-close.
However, this is not without problems, as the server also has to be able to
protect itself against clients that never close their side. It therefore has
to have some notion of a "reasonable lingering time" for the connection,
after which it gives up and closes the receiving half as well.
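
To make the half-close with a bounded lingering time concrete, here is a
minimal Python sketch. It is only an illustration under stated assumptions:
the function name lingering_close, the 2-second default and the buffer size
are hypothetical and not taken from any particular server.

```python
import socket
import time


def lingering_close(conn, response, linger_seconds=2.0, bufsize=4096):
    """Send `response`, half-close, drain until EOF or deadline, then close.

    Returns the number of bytes of late client data that were discarded.
    All names and defaults here are illustrative assumptions.
    """
    discarded = 0
    try:
        conn.sendall(response)
        # Half-close: send our FIN but keep the receiving half open, so
        # data still in flight from the client does not trigger a RST.
        conn.shutdown(socket.SHUT_WR)
        deadline = time.monotonic() + linger_seconds
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # lingered long enough; the server protects itself
            conn.settimeout(remaining)
            try:
                chunk = conn.recv(bufsize)
            except socket.timeout:
                break  # no more data within the lingering time
            if not chunk:
                break  # EOF: the client has closed its sending half
            discarded += len(chunk)
    except OSError:
        pass  # peer reset or went away; nothing more to drain
    finally:
        conn.close()  # now close the receiving half as well
    return discarded
```

A server would call this in place of a plain close() when it rejects a
request early, e.g. lingering_close(conn, b"HTTP/1.0 401 Unauthorized\r\n\r\n").
The right lingering time is a judgment call between giving slow clients a
chance to read the response and not tying up server resources.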

Henrik

[1] http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt
[2] http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-v11-spec-rev-06.txt
--
Henrik Frystyk Nielsen,
World Wide Web Consortium
http://www.w3.org/People/Frystyk