Re: [storm] iSER draft

Tom Talpey <ttalpey@microsoft.com> Fri, 15 July 2011 18:16 UTC

From: Tom Talpey <ttalpey@microsoft.com>
To: Michael Ko <Michael@huaweisymantec.com>, Mallikarjun Chadalapaka <cbm@chadalapaka.com>, 'Caitlin Bestler' <cait@asomi.com>
Thread-Topic: [storm] iSER draft
Date: Fri, 15 Jul 2011 18:16:38 +0000
Message-ID: <F83812DF4B59B9499C1BC978336D91745EE03EAD@TK5EX14MBXC111.redmond.corp.microsoft.com>
References: <BANLkTi=GVHUBdsZiwsRXxETd_rU=F8f85w@mail.gmail.com> <F83812DF4B59B9499C1BC978336D91745EDE207F@TK5EX14MBXC118.redmond.corp.microsoft.com> <971779FD6B314A40B1A3658E521C0D4A@china.huawei.com> <BANLkTik0rG2=orD0S0L3s62gBSJsLapA+w@mail.gmail.com> <F83812DF4B59B9499C1BC978336D91745EE0279D@TK5EX14MBXC111.redmond.corp.microsoft.com> <B443B9D1-754C-4D74-B43A-858A6031850A@asomi.com> <1680F0051D794095A86AF16343CAD400@china.huawei.com> <F83812DF4B59B9499C1BC978336D91745EE039F9@TK5EX14MBXC111.redmond.corp.microsoft.com> <70CC5C8CCDD74EA0A9CEFF6DDC8145ED@china.huawei.com> <SNT131-ds105B54D3D6E874C31C27A9A0490@phx.gbl> <D74343B13CC640229AB1AFCDEE9F9883@china.huawei.com>
In-Reply-To: <D74343B13CC640229AB1AFCDEE9F9883@china.huawei.com>
Cc: "storm@ietf.org" <storm@ietf.org>
Subject: Re: [storm] iSER draft
List-Id: Storage Maintenance WG <storm.ietf.org>

This discussion is starting to fork, but I'll do my best to reply to the top of the thread.

I'm glad we're making progress, but I don't think we're on the same page yet. In particular, I disagree with your statement that "for iWARP, the Steering Address is set to zero." The choice of addressing mode is not made on a per-protocol basis; it is a local issue on the initiator. If ZBVA mode is used, then the addressing mode is zero-based. However, local verbs support several other modes, and the "steering address" is not at all constrained to be an "address". Depending on several factors, it can be a virtual address, a physical address, or a completely arbitrary value.

To my reading, the new draft is proposing that an additional value be sent in the Control PDU header, to be added to the BufferOffset of the individual SCSI control PDU payload when the target performs RDMA Read or RDMA Write. To me, this value is not an address, and has no semantics beyond being an initiator-provided value needed to direct the expected RDMA when it is performed by the target. Personally, I think it would be better defined as a "Base Offset", or "Tagged Base Offset", to which the BufferOffset is added to form the resulting TaggedOffset. The only requirement is that the target have the two values to add together.
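
As a rough illustration of the arithmetic described above, the following Python sketch shows a target adding an initiator-supplied base value to the BufferOffset to form the TaggedOffset. The names `tagged_offset`, `base_offset`, and `buffer_offset` are hypothetical and chosen only for this example; none of this is text from the draft.

```python
# Illustrative sketch only: the function and parameter names are assumptions,
# not definitions taken from the iSER draft or RFC 5046.


def tagged_offset(base_offset: int, buffer_offset: int) -> int:
    """Return the TaggedOffset a target would use in an RDMA Read or Write.

    base_offset is the opaque initiator-provided value from the control PDU.
    buffer_offset is the offset carried for the individual data transfer.
    The target attaches no meaning to base_offset; it only adds.
    """
    if base_offset < 0 or buffer_offset < 0:
        raise ValueError("offsets must be non-negative")
    return base_offset + buffer_offset


# A zero base leaves the BufferOffset unchanged (the zero-based case).
zero_based = tagged_offset(0, 4096)

# A non-zero base is simply added, whatever it represents at the initiator.
non_zero_based = tagged_offset(0x7F0000001000, 4096)
```

In this sketch the target needs no knowledge of the addressing mode in use at the initiator; the same addition covers both cases.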

I think that any proposed addition of a connection attribute for "iWARP" or "offset" to the normative discussion is incorrect, for the reasons above. While it's true that iWARP may support addressing modes that another RDMA protocol may not, it's neither necessary nor desirable to require iSER or iSCSI to change behavior depending on the lower RDMA layer. iSER is appropriately a transport adaptation layer, and should be defined in such a way as to be agnostic to the needs of the lower layer.

So again, I would like to see:

- Virtual Address renamed to an agnostic term, perhaps Base Offset

- Clearly stated processing at the target to compute the necessary TaggedOffset

- Avoidance of any discussion of addressing mode requirements or local verb semantics

From: Michael Ko [mailto:Michael@huaweisymantec.com]
Sent: Thursday, July 14, 2011 9:50 PM
To: Mallikarjun Chadalapaka; Tom Talpey; 'Caitlin Bestler'
Cc: storm@ietf.org
Subject: Re: [storm] iSER draft

Mallikarjun,

Nice to hear from you!

I have embedded my comments in your response.

Mike
----- Original Message -----
From: Mallikarjun Chadalapaka <cbm@chadalapaka.com>
To: 'Michael Ko' <Michael@huaweisymantec.com>; Tom Talpey <ttalpey@microsoft.com>; 'Caitlin Bestler' <cait@asomi.com>
Cc: storm@ietf.org
Sent: Thursday, July 14, 2011 6:26 PM
Subject: RE: [storm] iSER draft

I suggest we have to be explicit about the responsibilities of each protocol
layer because the phrase "initiator side needs to translate the Tagged
Offset" sounds vague to me.

[mk] That statement is only meant to convey what the initiator would do if it uses an address for referencing the buffer rather than a buffer offset, and the Steering Address is not sent to the target, as currently defined in RFC 5046.  It does not appear in the spec.

Also, it may be worth formalizing the notion of
"if Virtual Address was used for this command" via a negotiated connection
attribute.

[mk] The use of the Steering Address is up to the RCaP.  For iWARP, the Steering Address is set to zero.  By adding the explanation (as I indicated in my previous note) on how the Tagged Offset is computed from the Steering Address and the Buffer Offset, it is up to the RCaP to determine what they want to do.

If I followed this thread right, it seems like we are talking roughly about
the following:

1) A shared iSCSI/iSER-level connection attribute (connection-scoped text
key?) that indicates whether Steering Address ("Address") or Tagged Offset
("Offset") is required by the RCaP in the RDMA requests on the wire for that
connection.  How the RCaP API exposes this capability to its local iSER
layer is implementation-dependent, and we do not need to spec it.

[mk] Both the Steering Address and the Tagged Offset will be required.  When the Steering Address is set to zero, we have the iWARP mode of behavior as defined in RFC 5046.

2) If the connection attribute = "Offset":

[mk] I don't think a new connection attribute is needed.  The RCaP just needs to set the Steering Address to the appropriate value.  So for iWARP, set the Steering Address to zero.

a) the semantic requirement on the RCaP layer on the initiator is
that it treats the incoming RDMA memory references as offsets, and does the
appropriate base+offset local translation for RDMA Reads and RDMA Writes.
b) The initiator iSER layer must only use ZB-VA, and must set the
Steering Address to 0 in appropriate control-type PDUs.
c) The target iSER layer must assume that zero TO for the advertised
STag points to the beginning of the initiator I/O Buffer in all the RDMA
Writes and RDMA Reads that it issues (same as now).
3) If the connection attribute = "Address":
a)  the semantic requirement on the RCaP layer on the initiator is
that it treats incoming RDMA memory references as complete Steering
Addresses in RDMA Reads and RDMA Writes.
b) The initiator iSER layer must use non-ZB-VA, and must set the
Steering Address to the RCaP-API-returned Virtual Address in appropriate
control-type PDUs.
c) The target iSER layer must assume that the Steering Address for
the advertised STag points to the beginning of the initiator I/O Buffer, and
compute the Tagged Offset in all the RDMA Writes and RDMA Reads that it
issues by adding the Steering Address to the Buffer Offset received from the
Data_Descriptor of Datamover API.
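
The two cases outlined in items 2) and 3) above can be sketched roughly as follows. This is only an illustration of the proposal as stated in this message, not an agreed design: the attribute values "Offset" and "Address" come from the text above, while the names `AddressingAttr`, `Advertisement`, `initiator_advertise`, and `target_tagged_offset` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class AddressingAttr(Enum):
    """Hypothetical connection attribute with the two values discussed."""
    OFFSET = "Offset"
    ADDRESS = "Address"


@dataclass(frozen=True)
class Advertisement:
    """What the initiator places in a control-type PDU (illustrative only)."""
    stag: int
    steering_address: int


def initiator_advertise(attr: AddressingAttr, stag: int,
                        rcap_virtual_address: int) -> Advertisement:
    # Items 2b and 3b: zero for "Offset", the RCaP-API-returned
    # Virtual Address for "Address".
    if attr is AddressingAttr.OFFSET:
        return Advertisement(stag, 0)
    return Advertisement(stag, rcap_virtual_address)


def target_tagged_offset(adv: Advertisement, buffer_offset: int) -> int:
    # Items 2c and 3c: Steering Address plus Buffer Offset. With a zero
    # Steering Address this reduces to the Buffer Offset alone.
    return adv.steering_address + buffer_offset


offset_adv = initiator_advertise(AddressingAttr.OFFSET, 0x1234, 0xABC000)
address_adv = initiator_advertise(AddressingAttr.ADDRESS, 0x1234, 0xABC000)
offset_to = target_tagged_offset(offset_adv, 512)
address_to = target_tagged_offset(address_adv, 512)
```

In this sketch the target-side computation is the same single addition in both cases; only the value the initiator advertises differs.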


Mallikarjun








From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf Of Michael Ko
Sent: Thursday, July 14, 2011 1:53 PM
To: Tom Talpey; Caitlin Bestler
Cc: storm@ietf.org
Subject: Re: [storm] iSER draft

For clarity, I need to update in the next revision of the spec everywhere
where "Tagged Offset", "Steering Tag", etc. are mentioned to specify how the
Tagged Offset is computed from the "Steering Address" and the buffer
offset.

So for example, in the example you cited in section 7.3.5, the target
computes the Tagged Offset using the Steering Address and the Buffer
Offset. For iWARP, the Steering Address is zero, and so the Tagged Offset
is the Buffer Offset in the SCSI Data-in PDU as defined in RFC 5046.

Mike
----- Original Message -----
From: Tom Talpey
To: Michael Ko ; Caitlin Bestler
Cc: Alexander Nezhinsky ; storm@ietf.org
Sent: Thursday, July 14, 2011 1:16 PM
Subject: RE: [storm] iSER draft

Changing the name doesn't address the core issues here. There are no
processing rules defined in the draft to support the arithmetic you describe
below, or even to use the Steering/Virtual Address.

For example, section 7.3.5 states the target MUST use the Buffer Offset of
the Data-In PDU as the Tagged Offset of an RDMA Write. Where does the new
Steering Address in the control PDU header get folded into the Tagged
Offset? And what protocol state would trigger the target to do so, including
perform the arithmetic? Same question for several other sections.



From: Michael Ko [mailto:Michael@huaweisymantec.com]
Sent: Thursday, July 14, 2011 2:39 PM
To: Caitlin Bestler; Tom Talpey
Cc: Alexander Nezhinsky; storm@ietf.org
Subject: Re: [storm] iSER draft

When "Virtual Address" is not used, as in iWARP, the initiator side needs to
translate the "Tagged Offset", which is just the offset into the buffer, into
a usable address by adding it to the starting base address of the buffer.
If the starting base address is communicated to the target side, as is done
in some RCaPs, then the "Tagged Offset" is the usable address itself at the
initiator where data is to be fetched or stored. Alex suggested that
perhaps we can change the name "Virtual Address" to "Steering Address".
Then it can be defined in the glossary without tripping over the common term
"virtual address".

Mike
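
The two cases described in the note above can be compared from the initiator's side with a small Python sketch. The function name `resolve_local_address`, its parameters, and the example values are all hypothetical; the sketch only restates the arithmetic in the note and does not describe any particular RCaP implementation.

```python
def resolve_local_address(buffer_base: int, incoming_to: int,
                          base_sent_to_target: bool) -> int:
    """Map the Tagged Offset of an incoming RDMA request to a local address.

    If the base address was communicated to the target, the incoming value
    is already the usable address. Otherwise it is an offset into the
    buffer and the initiator adds the base itself.
    """
    if base_sent_to_target:
        return incoming_to
    return buffer_base + incoming_to


BUFFER_BASE = 0x10000   # hypothetical starting address of the I/O buffer
BUFFER_OFFSET = 0x200   # hypothetical offset of this transfer in the buffer

# Case 1: base not sent; the target puts the bare offset on the wire.
offset_mode = resolve_local_address(BUFFER_BASE, BUFFER_OFFSET, False)

# Case 2: base sent; the target has already added it to the offset.
address_mode = resolve_local_address(
    BUFFER_BASE, BUFFER_BASE + BUFFER_OFFSET, True)
```

Both cases resolve to the same location in the initiator's buffer; they differ only in which side performs the addition.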
----- Original Message -----
From: Caitlin Bestler
To: Tom Talpey
Cc: Alexander Nezhinsky ; Michael Ko ; storm@ietf.org
Sent: Wednesday, July 13, 2011 9:49 PM
Subject: Re: [storm] iSER draft


On Jul 13, 2011, at 8:12 AM, Tom Talpey wrote:

> Yes, understood on the ZBVA/InfiniBand issue. Regarding the VA term's
> presence in an earlier draft, I did not see it in RFC 5046 and I did not
> review the expired draft, so consider these comments to be on the overall
> change, not the last revision.
>
> Let me try a different approach to perhaps make myself clearer.
>
> First, it seems to me that a "virtual address" has no real meaning to the
> network protocol. It begins, perhaps, as some value in an address space on
> the initiator, but it's not meaningful on the target in this way; it's only
> used to request that the remote RNIC perform a transfer to that originally
> registered remote address. So calling it a Virtual Address as an iSER
> protocol object seems, to me, an artificial and somewhat leading convention.
>
> Second, the Virtual Address goes out from the initiator to the target in a
> Control PDU, but where does it come back? The RDMA Read or Write as depicted
> in (xx) shows only a Tagged Offset. So, it's not clear what its protocol
> meaning is.
>
> Third, I don't ever see a Tagged Buffer described by a fully qualified
> four-tuple. I see it appearing as either { STag, TO, length } or { STag, VA,
> length }, depending on the addressing mode.
>
> I think the non-ZBVA mode is really just a special case of the existing
> one, but where the meaning of TO has changed from a small offset to some
> other token, generated and managed only at the initiator. So, it seems
> artificial to define it as distinct, and document it as possessing some new
> properties. Isn't it just a Tagged Offset, still?
>
> Tom.

I agree; so far we have not seen a protocol justification for the need to
add "Virtual Address" to the glossary as something distinct from Tagged
Offset for the purpose of defining an IETF protocol.

I am sympathetic to the interoperability issues raised, but I don't think
those can justify something that has *no* justification in the protocol.

It would help if someone could cite a class of implementation where there is
a real need for this distinction in an iSER adapter, but as far as I can see
the adapters have to be able to translate to TO *anyway* in order to use an
RDMA Write or RDMA Read.

Local Interface compatibility with IB can make a *lot* of sense, but why
does it have to make its way into the *wire* protocol?

--
Caitlin Bestler
cait@asomi.com
http://www.asomi.com/CaitlinBestlerResume.html