RE: [Ipoverib] Please read - proposed WG termination
Bernard King-Smith <wombat2@us.ibm.com> Thu, 01 September 2005 00:11 UTC
In-Reply-To: <506C3D7B14CDD411A52C00025558DED60893AD3C@mtlex01.yok.mtl.com>
Subject: RE: [Ipoverib] Please read - proposed WG termination
To: Dror Goldenberg <gdror@mellanox.co.il>
X-Mailer: Lotus Notes Release 6.0.2CF1 June 9, 2003
Message-ID: <OFEAD798BC.5BD12489-ON8525706E.00447E83-8525706F.0000FE88@us.ibm.com>
From: Bernard King-Smith <wombat2@us.ibm.com>
Date: Wed, 31 Aug 2005 20:10:51 -0400
X-MIMETrack: Serialize by Router on D01MLC96/01/M/IBM(Build V70_M6_06302005 Beta 4 HF4|August 24, 2005) at 08/31/2005 20:11:04
MIME-Version: 1.0
Content-type: text/plain; charset="US-ASCII"
X-Spam-Score: 0.0 (/)
X-Scan-Signature: 5d7a7e767f20255fce80fa0b77fb2433
Cc: margaret@thingmagic.com, kashyapv@us.ibm.com, "H.K. Jerry Chu" <Jerry.Chu@eng.sun.com>, ipoverib@ietf.org, Bill_Strahm@McAfee.com, ipoverib-bounces@ietf.org
Having IPoIB-CM is a very important feature for making IB a viable interconnect in clustered systems. Without IPoIB-CM, HPC clusters and commercial clusters using IB to SAN need two networks for good total cluster performance: one for IB (non-IP traffic) and a high-performance IP network such as GigE. This means that IB is not as cost effective as a GigE (or 10 GigE) network, which handles both types of traffic reasonably well.

In the HPC world most clusters use the cluster fabric (and IB is the future direction) for both MPI and IP traffic. The IP traffic is usually for parallel file systems and for system management and control. This high-bandwidth IP network is required in most production HPC clusters.

With the current IPoIB using only UD, the performance is dismal. Our simulations using the small packet MTU of IB show that the parallel file systems (GPFS, PVFS, Lustre, etc.) can get only 25% of a 4X IB link today, and at 12X it will be about 10%. The problem is that the IP drivers are single threaded per adapter. Also, the CPU utilization of TCP/IP at the MTU of the IB link is very high because of the per-packet stack processing.

Going to IPoIB-CM means we can cut the number of TCP/IP stack traversals from 32 to 1 for a 60K IP packet. This means that you have 30 times as much data transmitted per device driver call. This will enable IP to show similar bandwidth with multiple sockets as other protocols that can use RC or fragment within the device driver.

For commercial clusters, if IB is used for storage, then you save a network by having fast IP performance, and you can use the IB network for both. Why use IB and another network for the commercial cluster when the other network supports similar bandwidth for storage and IP?

Implementing IPoIB-CM makes IB viable in the HPC cluster and in some commercial clusters. Otherwise I don't think it competes economically with other network technologies.

Regards.
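As a rough illustration of the traversal arithmetic above, here is a minimal sketch. It assumes a nominal 2048-byte IB link MTU and ignores encapsulation headers; the function name and constants are hypothetical and are not taken from any IPoIB draft or driver.

```python
import math

# Assumption: nominal 2048-byte IB MTU, with header overhead ignored.
IB_UD_MTU_BYTES = 2048


def stack_traversals(message_bytes: int, mtu_bytes: int) -> int:
    """Number of MTU-sized packets (one stack traversal each) needed
    to carry a message of the given size."""
    return math.ceil(message_bytes / mtu_bytes)


# UD mode: the stack is traversed once per link-MTU-sized packet.
ud_60k = stack_traversals(60 * 1024, IB_UD_MTU_BYTES)
ud_64k = stack_traversals(64 * 1024, IB_UD_MTU_BYTES)

# Connected mode (assumed here to allow an MTU at least as large as
# the message): the whole message goes down in a single traversal.
cm_60k = stack_traversals(60 * 1024, 64 * 1024)

print(ud_60k, ud_64k, cm_60k)
```

Under these assumptions a 60K message takes 30 traversals and a 64K message takes 32, which is where the "30 times" and "32 to 1" figures roughly come from.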
Bernie King-Smith
IBM Corporation
Server Group
Cluster System Performance
wombat2@us.ibm.com (845)433-8483
Tie. 293-8483 or wombat2 on NOTES

"We are not responsible for the world we are born into, only for the
world we leave when we die. So we have to accept what has gone before
us and work to change the only thing we can, -- The Future."
William Shatner

Dror Goldenberg <gdror@mellanox.co.il>
Sent by: ipoverib-bounces@ietf.org
08/30/2005 09:32 AM

To: kashyapv@us.ltcfwd.linux.ibm.com, "H.K. Jerry Chu" <Jerry.Chu@eng.sun.com>
cc: margaret@thingmagic.com, ipoverib@ietf.org, Bill_Strahm@McAfee.com
Subject: RE: [Ipoverib] Please read - proposed WG termination

> From: Vivek Kashyap [mailto:kashyapv@us.ibm.com]
> Sent: Tuesday, August 30, 2005 8:39 AM
>
> On Mon, 29 Aug 2005, H.K. Jerry Chu wrote:
>
> <snip>
>
> > 1. IPoIB connected mode draft-ietf-ipoib-connected-mode-00.txt
> >    updated recently
>
> Well, in recent days there has been a discussion going on
> based on Dror's input. I also made some updates after some
> discussion on OpenIB (not on IETF though). This draft itself
> became a working group draft this february after some lively
> discussion just before that. It appears to me that we
> should be possible to finalise this draft soon enough.
>
> 20th sept. might be long enough to know one way or the other...
>
> vivek
>

We would like to see IPoIB-CM being finalized in IETF. We see great
value in having a standard for connected mode which effectively
increases the MTU. We are willing to contribute to the standardization
effort. We're also looking at the implementation of IPoIB-CM in Linux.

-Dror

_______________________________________________
IPoverIB mailing list
IPoverIB@ietf.org
https://www1.ietf.org/mailman/listinfo/ipoverib