RE: [Ipoverib] Please read - proposed WG termination
"H.K. Jerry Chu" <Jerry.Chu@eng.sun.com> Thu, 01 September 2005 18:19 UTC
Received: from localhost.localdomain ([127.0.0.1] helo=megatron.ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EAteX-0000Ec-Nc; Thu, 01 Sep 2005 14:19:09 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by megatron.ietf.org with esmtp (Exim 4.32) id 1EAteT-0000ER-KX; Thu, 01 Sep 2005 14:19:08 -0400
Received: from ietf-mx.ietf.org (ietf-mx [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id OAA18442; Thu, 1 Sep 2005 14:19:02 -0400 (EDT)
Received: from brmea-mail-4.sun.com ([192.18.98.36]) by ietf-mx.ietf.org with esmtp (Exim 4.43) id 1EAtgR-0007K7-HH; Thu, 01 Sep 2005 14:21:10 -0400
Received: from jurassic.eng.sun.com ([129.146.17.57]) by brmea-mail-4.sun.com (8.12.10/8.12.9) with ESMTP id j81IITTW007786; Thu, 1 Sep 2005 12:18:29 -0600 (MDT)
Received: from sweethome (punchin-hkchu.SFBay.Sun.COM [192.9.61.13]) by jurassic.eng.sun.com (8.13.5.Beta0+Sun/8.13.5.Beta0) with SMTP id j81IIPJV251312; Thu, 1 Sep 2005 11:18:28 -0700 (PDT)
Message-Id: <200509011818.j81IIPJV251312@jurassic.eng.sun.com>
Date: Thu, 01 Sep 2005 11:19:23 -0700
From: "H.K. Jerry Chu" <Jerry.Chu@eng.sun.com>
Subject: RE: [Ipoverib] Please read - proposed WG termination
To: wombat2@us.ibm.com, gdror@mellanox.co.il, krause@cup.hp.com
MIME-Version: 1.0
Content-Type: TEXT/plain; charset="us-ascii"
Content-MD5: PZixTRKD39S6gG5FvHC76g==
X-Mailer: dtmail 1.3.0 @(#)CDE Version 1.6 SunOS 5.10 sun4u sparc
X-Spam-Score: 0.0 (/)
X-Scan-Signature: d890c9ddd0b0a61e8c597ad30c1c2176
Cc: margaret@thingmagic.com, kashyapv@us.ibm.com, Bill_Strahm@McAfee.com, ipoverib-bounces@ietf.org, ipoverib@ietf.org
X-BeenThere: ipoverib@ietf.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: "H.K. Jerry Chu" <Jerry.Chu@eng.sun.com>
List-Id: IP over InfiniBand WG Discussion List <ipoverib.ietf.org>
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/ipoverib>, <mailto:ipoverib-request@ietf.org?subject=unsubscribe>
List-Post: <mailto:ipoverib@ietf.org>
List-Help: <mailto:ipoverib-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/ipoverib>, <mailto:ipoverib-request@ietf.org?subject=subscribe>
Sender: ipoverib-bounces@ietf.org
Errors-To: ipoverib-bounces@ietf.org
[co-chair hat off]

...

<snip>

>These performance problems are primarily implementation-specific and have
>little to do with IB technology itself. In addition, nearly all IB
>solutions use a 2KB not the smallest MTU to transfer data - no different
>than Ethernet.

Ethernet is adopting jumboframe to get more firing power. Where is IB's
equivalent of jumboframe?

>As I and others have raised over the years, the enablement
>of IP over IB to perform well is a local HCA issue not a standards
>issue. Addition of checksum off-load support to the HCA is rather trivial
>and does not require standardization (this is what is done for Ethernet
>today and is non-standard). Addition of large send off-load support is a
>local HCA issue not a standards issue and effectively provides the same
>benefit as connected mode.

Yes, LSO (or TSO as some call it) is relatively easy. But LRO (large receive
offload) is a heck of a lot more difficult. IB connected transports already
have all the silicon to do it. Why not just use it?

>The use of multiple QP to spread work across
>CPU for both send / receive ala the multi-queue support I've worked with
>various Ethernet IHV to get in place is again a local HCA issue (does not
>have to be visible as part of the layer 2 address resolution). One can
>construct a very nice performing IP over IB solution but there hasn't been
>much public progress to implement these de facto capabilities found in
>Ethernet solutions on IB. Getting these into a HCA implementation is a
>heck of a lot easier and faster to do than to develop a standard and
>getting all of the OS changes made (the HCA implementation issues can all
>be done underneath the IP stack just like with Ethernet so no real OS impacts).

I don't understand the large MTU issue to the OS (requiring contiguous
physical addresses). Isn't all decent hardware capable of scatter/gather
these days?
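For a rough sense of the MTU and large-send-offload arguments above, here is a minimal Python sketch (not part of the original thread). It counts the packets needed to move a payload at several MTUs and shows the kind of splitting an LSO/TSO engine performs on one large send. The 40-byte header size (IPv4 plus TCP, no options), the 9000-byte jumbo frame size, the 64 KB "large MTU", and the function names are all assumptions for illustration; only the 2 KB figure comes from the discussion.

```python
import math
from typing import List

# Assumption: 20-byte IPv4 header + 20-byte TCP header, no options.
HEADER_BYTES = 40


def packets_needed(payload_bytes: int, mtu: int,
                   header_bytes: int = HEADER_BYTES) -> int:
    """Return how many MTU-sized packets are needed to carry payload_bytes."""
    per_packet = mtu - header_bytes
    if per_packet <= 0:
        raise ValueError("MTU must be larger than the header size")
    return math.ceil(payload_bytes / per_packet)


def segment(payload: bytes, mtu: int,
            header_bytes: int = HEADER_BYTES) -> List[bytes]:
    """Split one large send into per-packet payload chunks.

    This mimics, in software, the splitting that a large-send-offload
    engine would do in hardware. Headers are not generated here.
    """
    per_packet = mtu - header_bytes
    if per_packet <= 0:
        raise ValueError("MTU must be larger than the header size")
    return [payload[i:i + per_packet]
            for i in range(0, len(payload), per_packet)]


if __name__ == "__main__":
    one_mib = 1024 * 1024
    # Hypothetical MTU values; only "2 KB" is taken from the thread.
    for name, mtu in [("2 KB (IB, per the thread)", 2048),
                      ("9000 (assumed jumbo frame)", 9000),
                      ("64 KB (assumed large MTU)", 64 * 1024)]:
        print(f"{name}: {packets_needed(one_mib, mtu)} packets per MiB")
```

Under these assumptions a 1 MiB transfer takes 523 packets at a 2 KB MTU and 118 at 9000 bytes. Either a larger MTU or an offload engine that does the splitting below the stack reduces the number of times the host has to touch a packet, which is the trade-off being argued here.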
What's more hairy to the OS stack is the per-destination MTU and different
MTU for multicast than for unicast inherited in IPoIB CM.

Jerry

>
>
>>For commercial clusters, if IB is used for storage, then you save a network
>>by having fast IP performance and can use the IB network for both. Why use
>>IB and another network for the commercial cluster, when the other network
>>supports similar bandwidth for storage and IP.
>
>There will always be Ethernet in any cluster so the fabric is there. The
>question is whether it is just for low-bandwidth / management services or
>for applications. For storage, need to separate the discussion into
>whether it is block or file. For block, IB gateways to Fibre Channel, etc.
>can and are being used today quite nicely. Performance is reasonable and
>the ecosystem costs, target availability, customer "pain", etc. are much
>lower than attempting to move to native IB storage. The same applies to
>file based where IB gateways to Ethernet which then attaches to file
>servers works quite nicely. In fact, the original vision of IB was that of
>an I/O fabric to create modular server solutions. The addition of IPC came
>later in the process when it was found to be relatively low cost to
>define. So, IB is successful in the HPC world and slowly entering some
>commercial solutions. To state that its future relies on getting an IP
>over IB RC solution is perhaps blowing it a bit out of proportion. The
>easier path for all is to simply use the techniques I and others have
>advocated for years now and solve the problems within the HCA
>implementation. Much lower costs and will result in delivering a good
>performance solution.
>
>BTW, RNIC / Ethernet solutions implement these techniques today. With the
>arrival of 10 GbE and the lower prices of RNIC and 10 GbE switch ports,
>lower latency switches (competitive enough with IB for commercial and many
>HPC clusters), etc. the success of IB must lie elsewhere and not on an IETF
>spec.
>This was noted at the recent IEEE Hot Interconnects conference as
>well so isn't just my opinion.
>
>Mike
>
>>Implementing IPoIB-CM makes IB viable in the HPC cluster and some
>>commercial clusters. Otherwise I don't think it competes economically with
>>other network technologies.
>>
>>Regards.
>>
>>Bernie King-Smith
>>IBM Corporation
>>Server Group
>>Cluster System Performance
>>wombat2@us.ibm.com (845)433-8483
>>Tie. 293-8483 or wombat2 on NOTES
>>
>>"We are not responsible for the world we are born into, only for the world
>>we leave when we die.
>>So we have to accept what has gone before us and work to change the only
>>thing we can,
>>-- The Future." William Shatner
>>
>>
>>Dror Goldenberg <gdror@mellanox.co.il>
>>Sent by: ipoverib-bounces@ietf.org
>>08/30/2005 09:32 AM
>>
>>To: kashyapv@us.ltcfwd.linux.ibm.com, "H.K. Jerry Chu" <Jerry.Chu@eng.sun.com>
>>cc: margaret@thingmagic.com, ipoverib@ietf.org, Bill_Strahm@McAfee.com
>>Subject: RE: [Ipoverib] Please read - proposed WG termination
>>
>>
>> > From: Vivek Kashyap [mailto:kashyapv@us.ibm.com]
>> > Sent: Tuesday, August 30, 2005 8:39 AM
>> >
>> > On Mon, 29 Aug 2005, H.K. Jerry Chu wrote:
>> >
>>
>><snip>
>>
>> > > 1. IPoIB connected mode draft-ietf-ipoib-connected-mode-00.txt
>> > > updated recently
>> >
>> > Well, in recent days there has been a discussion going on
>> > based on Dror's input. I also made some updates after some
>> > discussion on OpenIB (not on IETF though). This draft itself
>> > became a working group draft this february after some lively
>> > discussion just before that. It appears to me that we
>> > should be possible to finalise this draft soon enough.
>> >
>> > 20th sept. might be long enough to know one way or the other...
>> >
>> > vivek
>> >
>>
>>
>>We would like to see IPoIB-CM being finalized in IETF. We see
>>great value in having a standard for connected mode which effectively
>>increases the MTU.
>>We are willing to contribute to the standardization
>>effort. We're also looking at the implementation of IPoIB-CM in Linux.
>>
>>
>>-Dror
>>_______________________________________________
>>IPoverIB mailing list
>>IPoverIB@ietf.org
>>https://www1.ietf.org/mailman/listinfo/ipoverib
>>
>>
>>_______________________________________________
>>IPoverIB mailing list
>>IPoverIB@ietf.org
>>https://www1.ietf.org/mailman/listinfo/ipoverib

_______________________________________________
IPoverIB mailing list
IPoverIB@ietf.org
https://www1.ietf.org/mailman/listinfo/ipoverib