RE: [Ipoverib] Comments on IPoIB Connected Mode Connection Establishment

Vivek Kashyap <kashyapv@us.ibm.com> Thu, 18 August 2005 06:43 UTC

Date: Wed, 17 Aug 2005 23:35:38 -0700
From: Vivek Kashyap <kashyapv@us.ibm.com>
X-X-Sender: kashyapv@localhost.localdomain
To: Dror Goldenberg <gdror@mellanox.co.il>
Subject: RE: [Ipoverib] Comments on IPoIB Connected Mode Connection Establishment
In-Reply-To: <506C3D7B14CDD411A52C00025558DED608749581@mtlex01.yok.mtl.com>
Message-ID: <Pine.LNX.4.62.0508172327090.4324@localhost.localdomain>
References: <506C3D7B14CDD411A52C00025558DED608749581@mtlex01.yok.mtl.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"; format="flowed"
Cc: IPoverIB <ipoverib@ietf.org>

On Wed, 17 Aug 2005, Dror Goldenberg wrote:

>
>
>> From: Vivek Kashyap [mailto:kashyapv@us.ibm.com]
>> Sent: Monday, August 15, 2005 9:41 AM
>>
>> On Wed, 10 Aug 2005, Dror Goldenberg wrote:
>>
>>> Reviewing the ipoib-cm draft
>>> (http://www1.ietf.org/mail-archive/web/ipoverib/current/msg01372.html)
>>> I have three comments about the connection establishment of
>>> IPoIB-CM.
>>>
>>> (1) Service ID Reserved Bits
>>>
>>> Section 3.3. defines the Service-ID as having reserved bits which must
>>> be transmitted as zeroes and ignored on receive.
>>> It doesn't sound right to ignore those bits on the receive side.
>>> Ignoring those bits on receive essentially means that the IPoIB-CM
>>> has to listen on a range of service IDs.
>>> I believe that it's better to define a service ID without reserved
>>> bits such that the passive side listens on a single service ID.
>>> For example, service IDs defined for SDP, don't have any reserved bits.
>>
>> Is it not possible to mask the bits marked as reserved (or
>> unused)? The draft requires that the reserved bits must be
>> zeroes on send. If it is a limitation of the connection manager
>> that it cannot mask the bits and will have to listen to the
>> 'range', then I agree that the service-ID will have to be
>> specified so that the reserved bits are received as zeroes
>> instead of being ignored.
>>
>
> The CM interface is not defined in the IB spec; it is therefore
> implementation dependent. My understanding is that in OpenIB, the Linux
> CM is capable of masking the Service ID, but the Windows CM is not. I
> personally think that we should define those reserved bits as zero on
> send and receive, unless we can come up with an example of how a future
> extension can leverage this feature.

OK, we might therefore have to check for zeros. Does anyone have
experience with Solaris, HP-UX, or other CMs?

[Future... well, one could think of a service-ID that is tied to, say, a
protocol/port pair... but then there might be other ideas.]
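
To make the two receive policies concrete, here is a minimal sketch. The
bit layout is purely hypothetical (it is NOT the layout from the draft):
a 64-bit Service ID with bits 24..31 reserved. The names RESERVED_MASK,
matches_ignoring_reserved and matches_requiring_zero are illustrative
assumptions, not any CM's API.

```python
# Hypothetical layout for illustration only (not the draft's layout):
# a 64-bit Service ID in which bits 24..31 are "reserved".
RESERVED_MASK = 0x0000_0000_FF00_0000
ALL_ONES = (1 << 64) - 1


def matches_ignoring_reserved(listen_id: int, received_id: int) -> bool:
    """'Ignore on receive': compare only the non-reserved bits.

    This needs a CM that can mask the Service ID when matching a REQ.
    """
    keep = ALL_ONES & ~RESERVED_MASK
    return (listen_id & keep) == (received_id & keep)


def matches_requiring_zero(listen_id: int, received_id: int) -> bool:
    """'Check for zeros': reserved bits must be zero, then exact match.

    This works with a CM that can only listen on a single Service ID.
    """
    return (received_id & RESERVED_MASK) == 0 and listen_id == received_id


def ids_covered_when_ignoring() -> int:
    """Number of distinct Service IDs one listener accepts when ignoring."""
    return 1 << bin(RESERVED_MASK).count("1")
```

With 8 reserved bits, a CM that cannot mask would have to listen on 256
Service IDs to honour "ignore on receive", which is the 'range' problem
discussed above.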
>
>>>
>>>
>>> (2) Active/Active Model Support
>>>
>>> 2.3 implies that it is desirable to have one connection between peers,
>>> i.e. one QP on each side. I believe that active-active connection
>>> establishment model is more appropriate here. Otherwise, there may be
>>> cases of simultaneous access that will end up having two connections
>>> between the same two nodes (one for each direction).
>>> The current Service ID assignment, makes it impossible to use the
>>> active-active model. The main reason is that the service ID has to be
>>> identical at both ends. In other words, active-active requires that
>>> both REQ messages be used with the same Service ID. The current
>>> Service ID is different (it has the QP number in it).
>>>
>>> A model that enables active-active, for example, is using a Service ID
>>> which is a derivative of the PKey (no QP# in the Service ID). However,
>>> it will not support the model of more than one UD QP on the same HCA
>>> on the same PKey. This model is important.
>>>
>>> Another model that enables active-active is to have both local and
>>> remote QP numbers in the service ID. However, it will require both
>>> ends to listen on many Service IDs.
>>>
>>> Another alternative is to stay with active-passive connection
>>> establishment, but add an interface through which one of the
>>> connections can be quiesced and closed. In the case where you
>>> happened to open two connections between the hosts, you can
>>> close one of them afterwards.
>>>
>>>
>>> (3) Identification of Remote Peer in Passive Side
>>>
>>> It is unclear to me how the passive side can tell which is the remote
>>> peer. It must be able to tell that, so that it can use the other half
>>> of the connection to send messages back to the active side (instead
>>> of creating a new connection).
>>> The identification of the peer is based on the Link-Layer Address,
>>> which is {GID,QPN}. The CM REQ message only includes the GID. Maybe
>>> the QPN (the UD QP) should be passed through the REQ private data?
>>
>> ok..thinking about the above points, how about:
>>
>> 1. Include the local QPN in the request
>> 2. The receiver of CM REQ will have:
>>  	- local QPN (by way of ServiceID requested)
>>  	- remote QPN (in private data as above)
>>  	- remote GID (in REQ)
>
> Sounds good. It will fully solve item (3) in my original mail.
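
As a minimal sketch of how the passive side could then identify the peer:
the encoding below is an assumption made only for illustration (the UD
QPN carried in the low 24 bits of the first 4 bytes of the REQ private
data, network byte order, and the local QPN in the low 24 bits of the
Service ID). Peer, peer_from_req and local_qpn_from_service_id are
hypothetical names, not part of any CM interface.

```python
import struct
from typing import NamedTuple

QPN_MASK = 0xFFFFFF  # QP numbers are 24 bits wide


class Peer(NamedTuple):
    gid: bytes  # 16-byte GID taken from the CM REQ
    qpn: int    # remote UD QPN taken from the REQ private data


def local_qpn_from_service_id(service_id: int) -> int:
    # Assumption: the low 24 bits of the Service ID carry the local QPN.
    return service_id & QPN_MASK


def peer_from_req(remote_gid: bytes, private_data: bytes) -> Peer:
    # Assumption: private data starts with a 32-bit big-endian word whose
    # low 24 bits are the sender's UD QPN.
    if len(remote_gid) != 16:
        raise ValueError("GID must be 16 bytes")
    if len(private_data) < 4:
        raise ValueError("private data too short to carry a QPN")
    (word,) = struct.unpack_from("!I", private_data, 0)
    return Peer(gid=remote_gid, qpn=word & QPN_MASK)
```

The resulting Peer tuple is the {GID,QPN} link-layer identity, so it can
be used directly as a lookup key for an existing connection.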
>
>> 3. In case of the active-active connection the request from
>> the numerically smaller MAC Address (QPN+GID or probably just
>> the GID) is terminated. The error code 'stale connection' is a
>> possibility here.
>>
>> The active-active situation is known when it is determined
>> that there is an outstanding request to the remote peer (as
>> determined by GID+QPN) while a request from the same peer is
>> received on the local ServiceID.
>
> In IB, active-active is only determined by equal Service IDs. See IB
> spec 1.2 vol1 12.9.7.1, the Active CM state machine goes to
> Peer-Compare state based solely on the Service ID.
> Therefore, if you want to have active-active, then you must ensure that
> both service IDs are identical. In the current draft, the Service IDs are
> different, as they include a remote QP number.
>

Yes, if we are determining it based on the CM. However, I was suggesting
the above as what the IPoIB component will do when it is informed by the
CM. It can return a suitable error to the CM when it determines that a
particular connection shouldn't be allowed. Is that not possible?
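
A rough sketch of that IPoIB-level decision, under stated assumptions:
addresses are compared as QPN (3 bytes) followed by GID (16 bytes), and
the names Address and on_incoming_req are hypothetical. The strings
"accept" and "reject" stand in for whatever the CM interface actually
offers; this is not a real CM API.

```python
from typing import NamedTuple, Set


class Address(NamedTuple):
    qpn: int    # 24-bit UD QPN
    gid: bytes  # 16-byte GID

    def key(self) -> bytes:
        # Assumption: compare as QPN followed by GID, byte by byte.
        return self.qpn.to_bytes(3, "big") + self.gid


def on_incoming_req(local: Address, remote: Address,
                    outstanding: Set[Address]) -> str:
    """Decide what IPoIB tells the CM about a REQ from `remote`.

    `outstanding` holds the peers to which we have sent a REQ that has
    not completed yet.
    """
    if remote not in outstanding:
        return "accept"  # no crossing requests
    if remote.key() < local.key():
        # Crossing requests: the one from the numerically smaller
        # address (the remote's) is terminated.
        return "reject"
    # We are the smaller address: accept the peer's request and expect
    # our own outstanding request to be rejected by the peer.
    return "accept"
```

Because both ends apply the same comparison, exactly one of the two
crossing requests survives, without requiring the two Service IDs to be
equal.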

>>
>> 4. However, there might be a need to establish multiple
>> connections between the same two peers. Therefore, if a
>> connection already exists between the remote GID+QPN on the
>> local ServiceID, it is up to the receiver to accept this
>> connection. The implementation accepting multiple connections
>> and the one requesting multiple connections on the same
>> ServiceID must manage the load; it is beyond the scope of this spec.
>
> I agree.
>
>>
>> If two connection requests cross then case 3. above applies.
>
>
> They must have the same Service ID in order to cross, from the IB
> perspective.
>
>>
>> thoughts?
>>
>> Vivek
>>
>>>
>>> -Dror
>>>
>>
>
