[tcpm] Possible deadlock scenario with retransmission on both sides at the same time
Kobby Carmona <kobby.Carmona@qlogic.com> Thu, 04 August 2016 09:09 UTC
From: Kobby Carmona <kobby.Carmona@qlogic.com>
To: "tcpm@ietf.org" <tcpm@ietf.org>
Thread-Topic: Possible deadlock scenario with retransmission on both sides at the same time
Thread-Index: AdHth2zqZmjYio0mRrKVRRoD8V9Fcg==
Date: Thu, 04 Aug 2016 09:08:58 +0000
Message-ID: <MWHPR11MB1374A50BC599B093EA09668984070@MWHPR11MB1374.namprd11.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/tcpm/Y-64mSrPhdWYusLKMZLnoZFiUWM>
Hi all,

While running a bidirectional scenario with random drops in a network simulator of our (QLogic's NIC) TCP stack, we found a case that looks like a deadlock in the TCP protocol: the connection keeps sending pure ACKs from both sides until the RTO expires multiple times and a RST is sent to close the connection.

The scenario is as follows. Each stage includes an example with numbers, assuming the MSS and each packet are 1000B.

1. Both sides are transmitting data, a single packet is dropped in each direction, and the next two packets are received properly.
   Side A - SND.MAX=3000, SND.NXT=3000, SND.UNA=1000, RCV.NXT=11000, out-of-order block 12000-13000
   Side B - SND.MAX=13000, SND.NXT=13000, SND.UNA=11000, RCV.NXT=1000, out-of-order block 2000-3000

2. The RTO timer expires on both sides.
   Side A - SND.MAX=3000, SND.NXT=1000, SND.UNA=1000, RCV.NXT=11000, out-of-order block 12000-13000
   Side B - SND.MAX=13000, SND.NXT=11000, SND.UNA=11000, RCV.NXT=1000, out-of-order block 2000-3000

3. Both sides transmit a single packet to the peer:
   A->B - pkt.seq=1000, pkt.ack=11000, len=1000
   B->A - pkt.seq=11000, pkt.ack=1000, len=1000

4. Both sides receive the packets and update the receive context:
   Side A - SND.MAX=3000, SND.NXT=2000, SND.UNA=1000, RCV.NXT=13000
   Side B - SND.MAX=13000, SND.NXT=12000, SND.UNA=11000, RCV.NXT=3000

5. Both sides send another segment:
   A->B - pkt.seq=2000, pkt.ack=13000, len=1000
   B->A - pkt.seq=12000, pkt.ack=3000, len=1000

6. Neither side accepts the packet (or updates SND.UNA), since the sequence number on the packet is less than RCV.NXT (the sequence number check on page 69 of RFC793), and each side sends a pure ACK instead:
   A->B - pkt.seq=2000, pkt.ack=13000, len=0 (pure ACK)
   B->A - pkt.seq=12000, pkt.ack=3000, len=0 (pure ACK)

7. This continues forever (until the connection is terminated by a RST), since every packet that ends before RCV.NXT (even a retransmit from SND.UNA) is dropped.

Has anyone encountered this issue before? Is there anything we missed in this sequence?

If this is indeed a real deadlock, there might be several solutions, each of which would require a modification to the receive processing of RFC793. But I would like to know whether you think this is a real issue before dealing with solutions.

Thanks,
Kobby
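
For readers who want to replay the numbers, here is a minimal Python sketch of the strict reading described above, in which a segment that fails the RFC793 sequence number check is dropped before its ACK field is examined. It is an illustration only, not code from the QLogic stack: the names Segment, Endpoint, acceptable, receive and run_scenario are hypothetical, the receive window of 65535 is an assumption (the message does not give one), and sequence number wraparound is ignored. It starts from the state reached once both retransmitted packets have been received.

```python
from dataclasses import dataclass

RCV_WND = 65535  # assumed receive window; not stated in the original message


@dataclass
class Segment:
    seq: int
    ack: int
    length: int


@dataclass
class Endpoint:
    snd_una: int
    snd_nxt: int
    snd_max: int
    rcv_nxt: int


def acceptable(rcv_nxt, seg, rcv_wnd=RCV_WND):
    """RFC793 acceptability test for a non-zero window (no wraparound)."""
    if seg.length == 0:
        return rcv_nxt <= seg.seq < rcv_nxt + rcv_wnd
    last = seg.seq + seg.length - 1
    first_in_window = rcv_nxt <= seg.seq < rcv_nxt + rcv_wnd
    last_in_window = rcv_nxt <= last < rcv_nxt + rcv_wnd
    return first_in_window or last_in_window


def receive(ep, seg):
    """Strict reading: drop an unacceptable segment without looking at
    its ACK field. Returns True only if SND.UNA advanced."""
    if not acceptable(ep.rcv_nxt, seg):
        return False
    if ep.snd_una < seg.ack <= ep.snd_max:
        ep.snd_una = seg.ack
        return True
    return False


def run_scenario():
    # State after both sides have received the retransmitted packet.
    a = Endpoint(snd_una=1000, snd_nxt=2000, snd_max=3000, rcv_nxt=13000)
    b = Endpoint(snd_una=11000, snd_nxt=12000, snd_max=13000, rcv_nxt=3000)
    exchanges = [
        (b, Segment(2000, 13000, 1000)),  # A->B data segment
        (a, Segment(12000, 3000, 1000)),  # B->A data segment
        (b, Segment(2000, 13000, 0)),     # A->B pure ACK
        (a, Segment(12000, 3000, 0)),     # B->A pure ACK
        (b, Segment(1000, 13000, 1000)),  # A->B retransmit from SND.UNA
    ]
    advanced = [receive(ep, seg) for ep, seg in exchanges]
    return a, b, advanced


if __name__ == "__main__":
    a, b, advanced = run_scenario()
    print("SND.UNA advanced per segment:", advanced)
    print("Side A SND.UNA:", a.snd_una, "Side B SND.UNA:", b.snd_una)
```

In this model every segment fails the check because it ends before the receiver's RCV.NXT, so the ACK values 13000 and 3000 never reach the code that would advance SND.UNA, even though each lies above the receiver's current SND.UNA.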