[Tsvwg] Re: [rddp] RE:I-D ACTION:draft-williams-iwarp-ift-00.txt
"Vadim Makhervaks" <VADIK@il.ibm.com> Wed, 06 November 2002 22:54 UTC
To: "Williams, Jim" <Jim.Williams@emulex.com>
Cc: Giora Biran <GBIRAN@il.ibm.com>, Julian Satran <Julian_Satran@il.ibm.com>, "'Culley, Paul'" <Paul.Culley@hp.com>, rddp@ietf.org, rddp-admin@ietf.org, tsvwg@ietf.org
Message-ID: <OF5E28E10F.DDAEE059-ONC2256C69.007C01E1@telaviv.ibm.com>
From: Vadim Makhervaks <VADIK@il.ibm.com>
Date: Thu, 07 Nov 2002 00:52:10 +0200
Subject: [Tsvwg] Re: [rddp] RE:I-D ACTION:draft-williams-iwarp-ift-00.txt
I'll try to put forward some pros for markers. First of all, I need to admit that I'm not a fan of markers; they are not an easy thing to implement, but right now I do not see any other solution for the problem that I'm going to present below.

First, a few words about the host interface and its ability to handle bursts from the wire. I believe that the DDP concept is 10 Gb oriented, and thus the host interface that should be considered is not PCI-X, but QDR-PCIX, RapidIO, a direct interface to the memory controller, etc. It's not clear that such an interface would have any problem handling inbound traffic at wire speed. It should be mostly write operations on the bus, and assuming an efficient bridge/MC implementation, we should not have big problems with bursts of writes.

Even assuming that an RNIC vendor decides to pass the received stream via on-board/on-chip memory before placing it in host memory, markers would still be very helpful. I'll give just one example to emphasize this:

Without markers, the location of the next DDP segment is identified by the Length field in the header of the previous DDP segment. If a DDP header was lost, then without markers the RNIC would need to wait until this header is received before it could place the data in the destination buffers on the host.

However, if markers are supplied by the transmitter, the receiver has several choices:
a) use a fast path (cut-through) and place received segments directly in the destination buffers;
b) pass the received stream through memory, but still be able to place the payload in the destination buffers out of order;
c) use classic reassembly buffers, and place the data in the destination buffers in order.

So why should we force the RNIC vendor to choose option c), rather than leave this as another feature open for competition?
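
To illustrate the point about locating segments after a lost header, here is a minimal sketch in Python. It assumes a simplified model, not the wire format of any draft: a marker sits at every multiple of 512 bytes in the stream (the interval mentioned elsewhere in this thread), and each marker carries a back-pointer to the header of the segment that contains it. The names `marker_positions`, `first_locatable_header`, `back_pointer_at`, and the `HEADERS` offsets are all hypothetical, and the bytes occupied by the markers themselves are ignored.

```python
import bisect

MARKER_INTERVAL = 512  # assumed marker spacing, in bytes of stream


def marker_positions(start, end, interval=MARKER_INTERVAL):
    """Return the stream offsets in [start, end) at which a marker sits."""
    first = -(-start // interval) * interval  # round start up to a multiple
    return range(first, end, interval)


def first_locatable_header(start, end, back_pointer_at):
    """Return the offset of the first segment header that a receiver can
    locate inside the received range [start, end), or None.

    back_pointer_at(pos) is a hypothetical callback that returns the
    distance from the marker at `pos` back to the header of the segment
    containing that marker.
    """
    for pos in marker_positions(start, end):
        header = pos - back_pointer_at(pos)
        if header >= start:  # header lies inside the data we hold
            return header
    return None


# Hypothetical stream layout: segment headers at these offsets.
HEADERS = [0, 700, 1500, 2300]


def back_pointer_at(pos):
    """Simulate what the transmitter would have written into the marker."""
    header = HEADERS[bisect.bisect_right(HEADERS, pos) - 1]
    return pos - header


# Bytes [1000, 2400) arrive while bytes [0, 1000) are still missing.
# The marker at 1024 points back to 700 (not yet received), but the
# marker at 1536 points back to 1500, so placement can resume there.
resume_at = first_locatable_header(1000, 2400, back_pointer_at)
```

In this model the receiver does not need the header at offset 700 to find the segment that starts at 1500; without markers it would have to wait for the missing bytes before it could walk the Length fields.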
Vadim


"Williams, Jim" <Jim.Williams@emulex.com>
Sent by: rddp-admin@ietf.org
06/11/02 11:07 PM

To:      "'Culley, Paul'" <Paul.Culley@hp.com>, Vadim Makhervaks/Haifa/IBM@IBMIL
cc:      rddp@ietf.org, tsvwg@ietf.org, Giora Biran/Haifa/IBM@IBMIL, Julian Satran/Haifa/IBM@IBMIL
Subject: [rddp] RE:I-D ACTION:draft-williams-iwarp-ift-00.txt


> -----Original Message-----
> From: Culley, Paul [mailto:Paul.Culley@hp.com]
>
> You have done an excellent analysis of the differences and
> issues at hand.

I would echo that sentiment.

> -----Original Message-----
> From: Vadim Makhervaks [mailto:VADIK@il.ibm.com]
> I probably late on this discussion, but I just recently went thru the
> differences between both drafts.
> It seems (to me) that the changes proposed by Jim are not tightly coupled,
> and I would propose to discuss them separately:

Generally true. There is some coupling, though, which I note.

> ddp header CRC: Sounds as a reasonable approach to me. I can imagine why
> such CRC would help to RNIC implementation. On the other side, I don't see
> a damage of adding such CRC to the DDP segment, beside another word of
> overhead per DDP segment. I also think that such CRC should protect the DDP
> header only, and RDMA extensions (for Read Request and Terminate) should be
> considered as a payload. Anyway this disclaimer is out-of-the-scope of MPA
> definition, and probably is not relevant too much for this discussion.
>
> <prc> As discussed in prior emails, the MPA authors (after
> much discussion) felt that this is not necessary.
> You need a separate header CRC if designing a "Flow through NIC":
>   * But Need for elasticity buffer anyway makes "flow through" an
>     un-needed extra complexity.
>   * But NIC designers want to check L2 CRC and IP checksum first
>     anyway, requires whole packet.
> You need a separate header CRC if Placing partial FPDUs (can place 1st
> part without whole FPDU):
>   * As long as alignment is correct, this is not needed.
>       o Alignment is Expected to be the usual case in the data center,
>         at least, and common elsewhere.
>       o If lost alignment is an uncommon case, extra wire overhead and
>         extra complexity of saving headers, while placing data doesn't
>         seem worth the small buffer saving
> <prc>

I agree that a single CRC makes sense if alignment is always guaranteed. The IFT proposal does not guarantee alignment, therefore two CRCs make sense. SCTP does guarantee alignment, so its single CRC makes sense.

Standard TCP does not provide alignment. TSVWG is considering an experimental variant of TCP that does provide alignment. But the RDDP work group charter is clear that RDDP must be standardized on standard TCP. Hence the motivation for the IFT proposal.

Also, the benefit of the second CRC for the unaligned case seems much greater than the cost of an extra CRC in the aligned case.

> padding: It's definitely easier to implement CRC logic when everything is
> word aligned.

This is not an issue for which I have a whole lot of energy. But my experience and those of the other hardware designers I have worked with directly contradict the above. Unaligned CRCs are trivial to implement. Logic necessary to insert padding is not trivial. Either way works. My vote is no padding, but I'm not terribly concerned about conceding this point if a majority feel otherwise.

> It's not too much overhead - upto 3 bytes per DDP segment. I
> also agree that this is not a MUST requirement, and RNIC vendor should be
> able to handle not-aligned CRC generation/validation.
> <prc> Padding also eliminates the problem of how to deal with
> markers that fall in the middle of the CRC. If everything is
> aligned on 4 byte breaks, then markers also end up on the
> same breaks (tied into the 512 byte marker interval).
> <prc>

True. IFT does not propose markers, so this point is moot for IFT.

> <prc> [...] I was personally of the opinion that we should
> allow or even require packing, but was in a minority opinion ;-(.
> <prc>

I suspect that packing may ultimately be required by the IETF. Either that, or some detailed explaining as to why network traffic patterns are not negatively affected.

> markers: We may argue whether this feature is helpful or not, and both side
> could give plenty examples showing cons and pros of this feature.

I volunteer for the cons. :)

There are some basic architectural issues needing to be resolved as to whether it is permissible for DDP to do out of order placement when running over an ordered transport. I personally am not religious about this either way. But unless this can be resolved, markers make no sense whatsoever. I do believe the benefits of out of order placement are sufficiently questionable that the simplicity of the in order model should not be compromised if at all avoidable.

I am a bit religious in feeling that markers are an ugly, complicated, and technically embarrassing way to solve the problem of alignment. It has been clearly stated that primary problem the RDDP group needs to solve is layering RDDP on standard TCP, and layering on an experimental aligned version of TCP is secondary. Markers appears to be an effort to optimize for the secondary case at the expense of the primary case.

I believe that alignment can be done without markers and without complicating or burdening the primary objective of standard TCP. I will attempt to explore this in future drafts. draft-ietf-tsvwg-tcp-ulp-frame-01.txt (expired), is an example of such an approach.
This type of approach is far more consistent with the RDDP group charter of treating standard TCP as the primary objective, and an experimental aligning version of TCP as a secondary objective.

_______________________________________________
rddp mailing list
rddp@ietf.org
https://www1.ietf.org/mailman/listinfo/rddp
_______________________________________________
tsvwg mailing list
tsvwg@ietf.org
https://www1.ietf.org/mailman/listinfo/tsvwg
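
On the padding point in the quoted thread (at most 3 bytes per DDP segment, and markers landing on the same 4-byte breaks), the arithmetic can be sketched as follows. This is only an illustration under assumed names: `WORD`, `MARKER_INTERVAL`, `pad_len`, and `padded_len` are hypothetical, and the 4-byte word and 512-byte interval are the figures quoted in the thread rather than values taken from a specification.

```python
WORD = 4               # alignment unit in bytes, as quoted in the thread
MARKER_INTERVAL = 512  # marker spacing in bytes, as quoted in the thread


def pad_len(length, word=WORD):
    """Return how many pad bytes bring `length` up to a multiple of `word`."""
    return -length % word


def padded_len(length, word=WORD):
    """Return `length` rounded up to the next multiple of `word`."""
    return length + pad_len(length, word)


# The pad never exceeds word - 1 bytes, i.e. 3 bytes for 4-byte words.
worst_case_pad = max(pad_len(n) for n in range(1, 65))

# Because 512 is a multiple of 4, a marker position is always a multiple
# of 4, so it cannot fall inside a 4-byte field that starts on a word
# boundary (such as a CRC placed after padding).
markers_word_aligned = MARKER_INTERVAL % WORD == 0
```

The same arithmetic shows the cost side of the argument: padding adds between 0 and 3 bytes per segment, depending on the segment length.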