Re: [storm] iSER - what to do

<david.black@emc.com> Tue, 17 July 2012 01:20 UTC

From: david.black@emc.com
To: nezhinsky@gmail.com, storm@ietf.org
Date: Mon, 16 Jul 2012 21:20:44 -0400
Thread-Topic: [storm] iSER - what to do
Thread-Index: Ac1iW1TGtSebRbE7RA28PPzcQVgFZwBU/mAQ
Message-ID: <8D3D17ACE214DC429325B2B98F3AE71208DD882E@MX15A.corp.emc.com>
References: <8D3D17ACE214DC429325B2B98F3AE71208C14966@MX15A.corp.emc.com> <CAP_=6d+-VfyBOOP4pudwqZxy6dtRD=OPzeZ=W3br=KkPJGfnoQ@mail.gmail.com> <8D3D17ACE214DC429325B2B98F3AE71208C14A8F@MX15A.corp.emc.com> <8D3D17ACE214DC429325B2B98F3AE71208D3B0CA@MX15A.corp.emc.com> <CAEkHY=esUJWgBoosDVLaoBnQLy0BH-v0gb0+z60AuoimXtCKag@mail.gmail.com>
In-Reply-To: <CAEkHY=esUJWgBoosDVLaoBnQLy0BH-v0gb0+z60AuoimXtCKag@mail.gmail.com>
Subject: Re: [storm] iSER - what to do
X-BeenThere: storm@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: Storage Maintenance WG <storm.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/storm>, <mailto:storm-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/storm>
List-Post: <mailto:storm@ietf.org>
List-Help: <mailto:storm-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/storm>, <mailto:storm-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Jul 2012 01:20:24 -0000

Alexander,

Thank you for the response.

> All in all, I suggest that we bite the bullet, complete the spec and head
> towards fully spec-compliant implementations of both initiator and target
> as soon as possible.
>
> On practical grounds we can address the distro maintainers to employ all
> possible means to distribute compliant updates sooner than later,
> as those will represent a special, critical change.

We can update the spec now - what do you think a reasonable timeframe
is to push out that code?

Running through your suggested approach:

> 1. iSERHelloRequired remains defined as is, with default=No.

Ok.

> 2. It becomes *mandatory* for a fully-spec compliant initiator
>    implementation to communicate iSERHelloRequired=Yes.
>    * If this key is not sent then the "new" target knows that it has
>      encountered an "old" initiator.
>    * If the initiator sends iSERHelloRequired=No, it means it chooses
>      (for some bizarre reasons) to behave as an "old" one - while
>      such behavior is strongly discouraged.
>      I guess the requirement that:
>      "the initiator SHOULD send iSERHelloRequired=Yes"
>      reflects the situation, correct me if I'm wrong.

Ok, that seems correct ... rewriting using my own words ...

Specifically, a spec-compliant initiator SHOULD begin the negotiation
with a "Yes" value.  In addition, a spec-compliant target SHOULD
respond to iSERHelloRequired=Yes with iSERHelloRequired=Yes.

A spec-compliant target MUST treat non-receipt of
iSERHelloRequired and receipt of iSERHelloRequired=No in the same
fashion - they both indicate an old initiator. [old = does not use
iSER Hello].

Further, a spec-compliant initiator SHOULD be prepared to receive
both iSERHelloRequired=NotUnderstood and iSERHelloRequired=No responses,
and MUST treat both responses in the same fashion - they both
indicate an old target [old = does not use iSER Hello].
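
The rules above can be sketched roughly as follows in Python. This is
only an illustration: the function names are hypothetical, and using
None to stand for "key not received" is an assumption of the sketch,
not something taken from the draft.

```python
from typing import Optional


def target_sees_old_initiator(received: Optional[str]) -> bool:
    """Target side: a missing key (None) and an explicit "No" are
    treated in the same fashion; both indicate an old initiator."""
    return received is None or received == "No"


def initiator_sees_old_target(response: str) -> bool:
    """Initiator side: "NotUnderstood" and "No" are treated in the
    same fashion; both indicate an old target."""
    return response in ("NotUnderstood", "No")
```

In both directions only a "Yes" value leads to the iSER Hello exchange
being used.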


> 3. "New" initiator will recognize an "old" target by receiving
>    "NotUnderstood" in response to iSERHelloRequired=Yes.

Or by receiving iSERHelloRequired=No in response.

>    Then it can either refuse to deal with it, or to employ a range of
>    tricky means used until now.
>    We can describe those means as the guidelines, e.g. :
>    * posting one or better MaxOutstandingUnexpectedPDUs buffers
>    * to be really on the safe side, having those buffers at least 8KB long.

I would say "SHOULD post" one or more buffers that are sized
for negotiation (and not say 8KB explicitly), explain that this reduces
the possibility that early unsolicited PDUs will cause the RCaP connection
to close, and state that this technique is known to work with existing
InfiniBand iSER targets.
 
>    As we are trying to neutralize the shortcomings of the existing
>    targets, the initiator can bet that the target won't send split
>    login responses, as it regularly does not do so today.

Ok, this does need to be stated as something that existing targets
do not do.

> 4. "New" target will recognize an "old" initiator by having received
>    iSERHelloRequired=No either implicitly or explicitly.

"implicitly or explicitly" means either iSERHelloRequired was not
received or iSERHelloRequired=No was received.  Both cases MUST be
treated in the same fashion.

>    Then it must
>    ignore the iSERHello absence 

Those aren't the right words.  The target knows that iSERHello will
not be used - it can choose to terminate the negotiation without setting
up iSER.  If it does not ...

>    and may also take some precautions,
>    like:

... then the target SHOULD do one of the following:

>    * delaying sending any "unexpected" PDUs until the first PDU is
>      received from the initiator after the final login response
>      has been sent
>    * taking a reasonable timeout, say a second (the exact value
>      does not matter as the initiator can't count on it anyway and
>      no value will solve the problem in full, theoretically).
>    * doing both, that is waiting for the first incoming PDU and
>      taking a timer to start sending NOP-INs in case no PDUs arrived
>      during the timeout period, to be able to detect silent connection
>      failures.

I believe the second and third bullets ought to be combined, in that
receipt of a PDU is sufficient reason to end the timeout period before
it would otherwise expire.
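
A minimal sketch of the combined behavior, assuming a threaded target
implementation in Python. The class name, method names and the
one-second default are hypothetical and only illustrate the idea that
the first received PDU ends the wait early.

```python
import threading


class PostLoginGate:
    """Holds back unsolicited target PDUs after the final login
    response when the initiator is known to be "old"."""

    def __init__(self, timeout_s: float = 1.0) -> None:
        self._first_pdu = threading.Event()
        self._timeout_s = timeout_s

    def on_pdu_received(self) -> None:
        # Called by the receive path for the first PDU after login.
        self._first_pdu.set()

    def wait_until_safe(self) -> bool:
        """Block until a PDU has arrived or the timeout expires.

        Returns True if a PDU arrived (the wait ended early) and
        False if the timeout expired with nothing received.
        """
        return self._first_pdu.wait(self._timeout_s)
```

When wait_until_safe() returns False, the target could then start
sending NOP-Ins to detect a silent connection failure, as in the third
bullet quoted above.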

> 5. "New" target and "new" initiator will count on iSERHello as the
>    guarantee of proper buffer posting
> 
> 6. "Old" target and "old" initiator will work as they do now, in their
>    double bliss of ignorance.

We also need to issue a warning about the latter combination risking
RCaP session termination if unsolicited PDUs show up from the target
before the initiator is ready.
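
The four old/new combinations discussed in points 3 through 6 can be
summarized in a small Python sketch. The function name and the result
strings are made up for illustration; they paraphrase this thread and
are not proposed draft text.

```python
def hello_outcome(initiator_is_new: bool, target_is_new: bool) -> str:
    """Summarize what each old/new pairing relies on."""
    if initiator_is_new and target_is_new:
        # Point 5: iSER Hello guarantees proper buffer posting.
        return "hello-exchange"
    if initiator_is_new:
        # Point 3: old target; the initiator pre-posts buffers sized
        # for negotiation, or refuses to deal with the target.
        return "initiator-pre-posts-buffers"
    if target_is_new:
        # Point 4: old initiator; the target delays unsolicited PDUs,
        # or terminates the negotiation without setting up iSER.
        return "target-delays-unsolicited-pdus"
    # Point 6: neither side protects against early unsolicited PDUs.
    return "unprotected"
```

Only the last case needs the warning, since nothing in that
combination prevents early unsolicited PDUs from terminating the
RCaP session.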

If the above looks close to correct, we can start working on text for
the draft ...

Thanks,
--David

> -----Original Message-----
> From: Alexander Nezhinsky [mailto:nezhinsky@gmail.com]
> Sent: Sunday, July 15, 2012 3:27 AM
> To: storm@ietf.org; Black, David; Mike Ko; Paul Koning; Mallikarjun
> Chadalapaka; Or Gerlitz; Mike Christie
> Subject: Re: [storm] iSER - what to do
> 
> Hi, all
> 
> Sorry for a late answer (again).
> 
> I have been hesitating over this issue for a long time, and came
> close to just agreeing with the latest set of suggestions.
> But then I realized there is a simple counter-argument which
> complicates things even more.
> 
> When the initiator sends its final Login Request it is not guaranteed
> that the next Login Response it receives is the "final" one, too.
> If the target has more text data to send than the hardcoded 8KB, it
> will split it into two (or more) PDUs by setting the Continue bit in all
> its responses except the last.
> 
> This is a rare event, but it means that to be fully compliant and
> foolproof the initiator can't just post another N buffers to anticipate
> all "unexpected" PDUs from the target.
> 
> It posts one 8KB buffer for the next Login Response, but it should be
> ready for the case where the response contains C=1. In such case
> it would post another 8KB buffer and answer ok to continue.
> 
> Regular initiator rx-buffers are much smaller than 8KB.
> Implementation-wise they are usually allocated from a separate pool or
> some other kind of discrimination is made between the login and
> full-featured-phase buffers.
> 
> As there is no acceptable way to reclaim the buffers after they have
> been posted, the only way out is to post a few 8KB buffers, but it will
> make the implementation even more complicated and cumbersome.
> 
> All in all, I suggest that we bite the bullet, complete the spec and head
> towards fully spec-compliant implementations of both initiator and target
> as soon as possible.
> On practical grounds we can address the distro maintainers to employ all
> possible means to distribute compliant updates sooner than later,
> as those will represent a special, critical change.
> 
> To minimize the damage I suggest taking the following path:
> 
> 1. iSERHelloRequired remains defined as is, with default=No.
> 
> 2. It becomes *mandatory* for a fully-spec compliant initiator
>    implementation to communicate iSERHelloRequired=Yes.
>    * If this key is not sent then the "new" target knows that it has
>      encountered an "old" initiator.
>    * If the initiator sends iSERHelloRequired=No, it means it chooses
>      (for some bizarre reasons) to behave as an "old" one - while
>      such behavior is strongly discouraged.
>      I guess the requirement that:
>      "the initiator SHOULD send iSERHelloRequired=Yes"
>      reflects the situation, correct me if I'm wrong.
> 
> 3. "New" initiator will recognize an "old" target by receiving
>    "NotUnderstood" in response to iSERHelloRequired=Yes.
>    Then it can either refuse to deal with it, or to employ a range of
>    tricky means used until now.
>    We can describe those means as the guidelines, e.g. :
>    * posting one or better MaxOutstandingUnexpectedPDUs buffers
>    * to be really on the safe side, having those buffers at least 8KB long.
> 
>    As we are trying to neutralize the shortcomings of the existing
>    targets, the initiator can bet that the target won't send split
>    login responses, as it regularly does not do so today.
> 
> 4. "New" target will recognize an "old" initiator by having received
>    iSERHelloRequired=No either implicitly or explicitly. Then it must
>    ignore the iSERHello absence and may also take some precautions,
>    like:
>    * delaying sending any "unexpected" PDUs until the first PDU is
>      received from the initiator after the final login response
>      has been sent
>    * taking a reasonable timeout, say a second (the exact value
>      does not matter as the initiator can't count on it anyway and
>      no value will solve the problem in full, theoretically).
>    * doing both, that is waiting for the first incoming PDU and
>      taking a timer to start sending NOP-INs in case no PDUs arrived
>      during the timeout period, to be able to detect silent connection
>      failures.
> 
> 5. "New" target and "new" initiator will count on iSERHello as the
>    guarantee of proper buffer posting
> 
> 6. "Old" target and "old" initiator will work as they do now, in their
>    double bliss of ignorance.
> 
> By the way, the initiator patch alleviating the problem by posting one
> additional login buffer was submitted relatively recently, and all previously
> deployed implementations of the initiator are exposed.
> Eventually, the new, better code is making its way to the users of all distros.
> This is a common situation encountered by the Linux kernel community
> quite often. Let's take this as a working example, make the spec fool-proof
> and advise the implementors how to minimize the damage with the old
> software, while keeping everything as simple as possible under these
> already over-complicated circumstances.
> 
> Alexander