Re: [TLS] Sending fatal alerts over TCP

Peter Gutmann <> Wed, 21 December 2011 02:29 UTC

Florian Weimer <> writes:

>What is the best way to deal with this? 

If it's the other side that does it then there's not much you can do.  If
you're doing it then you need to wander into a twisty little maze of
shutdown()/close() passages, all different, buggy, and incompatible:

/* Close a connection.  Safely handling closes is extremely difficult due to 
   a combination of the way TCP/IP (and TCP stacks) work and various bugs 
   and quirks in implementations.  After a close (and particularly if short-
   timeout non-blocking writes are used) there can still be data left in 
   TCP send buffers, and also as unacknowledged segments on the network.  At 
   this point there's no easy way for the TCP stack to know how long it 
   should hang around trying to get the data out and waiting for acks to 
   come back.  If it doesn't wait long enough, it'll end up discarding 
   unsent data.  If it waits too long, it could potentially wait forever in 
   the presence of network outages or crashed peers.  What's worse, since 
   the socket is now closed, there's no way to report any problems that may 
   occur at this point back to the caller.

   We try and handle this with a combination of shutdown() and close(), but 
   due to implementation bugs/quirks and the TCP stack issues mentioned 
   above this doesn't work all of the time.  The details get very 
   implementation-specific.  For example the glibc manpage says that 
   setting SO_LINGER causes shutdown() not to return until queued messages 
   are sent (which is wrong, and non-glibc implementations like PHUX and 
   Solaris specifically point out that only close() is affected), but that 
   shutdown() discards unsent data.  glibc in turn is dependent on the 
   kernel it's running on top of: under Linux shutdown() returns immediately 
   but data is still sent regardless of the SO_LINGER setting.

   BSD Net/2 and later (which many stacks are derived from, including non-
   Unix systems like OS/2) returned immediately from a close() but still 
   sent queued data on a best-effort basis.  With SO_LINGER set and a zero 
   timeout the close was abortive (which Linux also implemented starting 
   with the 2.4 kernel), and with a non-zero timeout it would wait until all 
   the data was sent, which meant that it could block almost indefinitely 
   (minutes or even hours, this is the worst-case behaviour mentioned 
   above).  This was finally fixed in 4.4BSD (although a lot of 4.3BSD-
   derived stacks ended up with the indefinite-wait behaviour), but even 
   then there was some confusion as to whether the wait time was in machine-
   specific ticks or seconds (Posix finally declared it to be seconds).  
   Under Winsock, close() simply discards queued data while shutdown() has 
   the same effect as under Linux, sending enqueued data asynchronously 
   regardless of the SO_LINGER setting.

   This is a real mess to sort out safely.  The best that we can do is to 
   perform a shutdown() followed later by a close().  Messing with SO_LINGER 
   is too risky, and something like performing an ioWait() doesn't work 
   either because it just results in whoever initiated the shutdown being 
   blocked for the I/O wait time.  Waiting for a recv() of 0 bytes isn't 
   safe because the higher-level code may need to read back a shutdown ack 
   from the other side, which a recv() performed at this point would 
   interfere with.  Under Windows we could handle it by waiting for an 
   FD_CLOSE to be posted, but this requires the use of a window handle which 
   we don't have access to, and which may not even exist for some classes of
   applications. */
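
As a rough illustration of the "shutdown() followed later by a close()"
sequence described in the comment, here is a minimal POSIX-only sketch.  It
is not the code the comment was taken from: close_connection_sketch() and
demo_graceful_close() are hypothetical names made up for this example, and
the demo runs over a local AF_UNIX socket pair, so it only shows the order
of the calls and none of the TCP-stack quirks discussed above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (an assumption, not from the original post):
   half-close the sending side so that the stack can queue an end-of-stream
   indication behind any buffered data, then release the descriptor.
   SO_LINGER is deliberately left at its default.  Returns 0 on success,
   -1 if close() failed */
static int close_connection_sketch(int fd)
{
    assert(fd >= 0);

    /* Errors from shutdown() (for example ENOTCONN if the peer has already
       gone away) are ignored, there's nothing useful to do with them */
    (void) shutdown(fd, SHUT_WR);

    /* In real code the higher-level processing (such as reading back the
       peer's shutdown ack) would sit here, before the close() */
    return close(fd);
}

/* Hypothetical demo: send msg over a local socket pair, close the sender
   with the helper above, and count the bytes that the receiver gets before
   it sees end-of-stream.  Returns the byte count, or -1 on error */
static long demo_graceful_close(const char *msg)
{
    int sv[2];
    char buf[256];
    long total = 0;
    ssize_t n;
    size_t len = strlen(msg);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (write(sv[0], msg, len) != (ssize_t) len) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (close_connection_sketch(sv[0]) != 0) {
        close(sv[1]);
        return -1;
    }

    /* Drain the receiving side until read() reports end-of-stream (0) */
    while ((n = read(sv[1], buf, sizeof buf)) > 0)
        total += n;
    close(sv[1]);
    return (n == 0) ? total : -1;
}
```

Calling demo_graceful_close() with a short message should report that the
whole message reached the other end of the pair before end-of-stream.  Over
a real TCP connection the same call sequence is only a best effort, for the
reasons given in the comment.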