From: Kobby Carmona <kobby.Carmona@qlogic.com>
To: "tcpm@ietf.org" <tcpm@ietf.org>
Date: Thu, 4 Aug 2016 09:08:58 +0000
Message-ID: <MWHPR11MB1374A50BC599B093EA09668984070@MWHPR11MB1374.namprd11.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/Y-64mSrPhdWYusLKMZLnoZFiUWM>
X-Mailman-Approved-At: Thu, 04 Aug 2016 10:43:38 -0700
Subject: [tcpm] Possible deadlock scenario with retransmission on both sides
 at the same time
List-Id: TCP Maintenance and Minor Extensions Working Group <tcpm.ietf.org>

Hi all,
While running a bidirectional scenario with random drops in a network simulator of our (QLogic's NIC) TCP stack, we found a case that looks like a deadlock in the TCP protocol: both sides keep sending pure ACKs until the RTO expires multiple times and a RST is sent to close the connection.
The scenario is as follows (each stage includes an example with numbers, assuming that the MSS and every packet are 1000 bytes):
1. Both sides are transmitting data. A single packet is dropped in each direction, and the packet that follows it is received properly:
	Side A - SND.MAX=3000, SND.NXT=3000, SND.UNA=1000, RCV.NXT=11000, out-of-order block 12000-13000
	Side B - SND.MAX=13000, SND.NXT=13000, SND.UNA=11000, RCV.NXT=1000, out-of-order block 2000-3000
2. The RTO timer expires on both sides:
	Side A - SND.MAX=3000, SND.NXT=1000, SND.UNA=1000, RCV.NXT=11000, out-of-order block 12000-13000
	Side B - SND.MAX=13000, SND.NXT=11000, SND.UNA=11000, RCV.NXT=1000, out-of-order block 2000-3000
3. Both sides transmit a single packet to the peer:
	A->B - pkt.seq=1000, pkt.ack=11000, len=1000
	B->A - pkt.seq=11000, pkt.ack=1000, len=1000
4. Both sides receive the packet and update their receive context:
	Side A - SND.MAX=3000, SND.NXT=2000, SND.UNA=1000, RCV.NXT=13000
	Side B - SND.MAX=13000, SND.NXT=12000, SND.UNA=11000, RCV.NXT=3000
5. Both sides send another segment:
	A->B - pkt.seq=2000, pkt.ack=13000, len=1000
	B->A - pkt.seq=12000, pkt.ack=3000, len=1000
6. Neither side accepts the packet (or updates SND.UNA), since the sequence number of the packet is less than RCV.NXT (the sequence number check on page 69 of RFC 793); each side sends a pure ACK instead:
	A->B - pkt.seq=2000, pkt.ack=13000, len=0 (pure ACK)
	B->A - pkt.seq=12000, pkt.ack=3000, len=0 (pure ACK)
7. This continues forever (until the connection is terminated by a RST), since every packet that ends before RCV.NXT (even a retransmission from SND.UNA) is dropped.
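
The receive-side checks in the sequence above can be sketched in a few lines of Python. This is only an illustration of the RFC 793 acceptability test and of the in-order delivery step, applied to the numbers in the example. It is not code from any real stack. The function names and the 65535-byte receive window are assumptions (the window size is not stated above), and sequence-number wraparound is ignored.

```python
# Hypothetical helper names; RCV_WND is an assumed value, not taken from the report.
RCV_WND = 65535


def segment_acceptable(seg_seq, seg_len, rcv_nxt, rcv_wnd):
    """Acceptability test from RFC 793 (sequence wraparound not handled)."""
    def in_window(seq):
        return rcv_nxt <= seq < rcv_nxt + rcv_wnd

    if seg_len == 0:
        # Zero-length segment: with a closed window only SEG.SEQ == RCV.NXT passes.
        return seg_seq == rcv_nxt if rcv_wnd == 0 else in_window(seg_seq)
    if rcv_wnd == 0:
        return False  # data is never acceptable while the window is closed
    # Data segment: its first or its last octet must fall inside the window.
    return in_window(seg_seq) or in_window(seg_seq + seg_len - 1)


def deliver_in_order(rcv_nxt, seg_seq, seg_len, ooo_blocks):
    """Advance RCV.NXT for an in-order segment, then absorb buffered
    out-of-order blocks (start, end) that now connect to it."""
    if seg_seq != rcv_nxt:
        return rcv_nxt
    rcv_nxt += seg_len
    for start, end in sorted(ooo_blocks):
        if start <= rcv_nxt < end:
            rcv_nxt = end
    return rcv_nxt


# Retransmitted segment arrives: RCV.NXT jumps past the buffered block.
rcv_nxt_b = deliver_in_order(1000, 1000, 1000, [(2000, 3000)])
rcv_nxt_a = deliver_in_order(11000, 11000, 1000, [(12000, 13000)])
print("RCV.NXT after retransmission: A =", rcv_nxt_a, " B =", rcv_nxt_b)

# Second retransmitted segment: lies entirely before RCV.NXT on both sides.
print("A->B data seq=2000 acceptable:",
      segment_acceptable(2000, 1000, rcv_nxt_b, RCV_WND))
print("B->A data seq=12000 acceptable:",
      segment_acceptable(12000, 1000, rcv_nxt_a, RCV_WND))

# Pure ACKs carrying the sequence numbers listed in the example.
print("A->B pure ACK seq=2000 acceptable:",
      segment_acceptable(2000, 0, rcv_nxt_b, RCV_WND))
print("B->A pure ACK seq=12000 acceptable:",
      segment_acceptable(12000, 0, rcv_nxt_a, RCV_WND))
```

Under these assumptions, both data segments and both pure ACKs fail the test, so a receiver that follows RFC 793 literally drops them before looking at the ACK field.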

Has anyone encountered this issue before? Is there anything we missed in this sequence?
If this is indeed a real deadlock, there might be several solutions to it, which would require a modification to the receive processing of RFC 793. But I would like to know whether you think this is a real issue before dealing with solutions.
Thanks,

	Kobby


