Re: [secdir] Use of StringPrep/Unicode (was Re: SecDir and AppsDir review of draft-ietf-storm-iscsi-cons-06)

"Black, David" <david.black@emc.com> Thu, 11 October 2012 13:35 UTC

From: "Black, David" <david.black@emc.com>
To: Alexey Melnikov <alexey.melnikov@isode.com>, Mallikarjun Chadalapaka <cbm@chadalapaka.com>
Date: Thu, 11 Oct 2012 09:35:03 -0400
Cc: "Black, David" <david.black@emc.com>, "iesg@ietf.org" <iesg@ietf.org>, "draft-ietf-storm-iscsi-cons.all@tools.ietf.org" <draft-ietf-storm-iscsi-cons.all@tools.ietf.org>, "secdir@ietf.org" <secdir@ietf.org>

Alexey,

Thanks for splitting this out into a separate thread.  More comments inline.

> Sorry for the delay with this reply.
> 
> On 09/10/2012 04:42, Mallikarjun Chadalapaka wrote:
> >     all iSCSI implementations complying with this document. Protocol
> >     behavior defined in this Section MUST be exhibited by iSCSI
> >     implementations on an iSCSI session when they negotiate the
> >     TaskReporting (Section 13.23) key to "FastAbort" on that session.
> >     The execution of ABORT TASK SET, CLEAR TASK SET, LOGICAL UNIT
> >     RESET, TARGET WARM RESET, and TARGET COLD RESET TMF Requests
> >     consists of the following sequence of actions in the specified
> >     order on the specified party.
> >
> > In Section 4.2.7.1
> >
> >        iSCSI names are composed only of displayable characters.
> >
> > What does "displayable" mean here? "ASCII printable"? "Unicode
> > printable"?
> >
> > [Mallikarjun:] The latter. Note that all normative statements make it clear
> > that we refer to Unicode-encoding - especially look at Section 4.2.7.2 where
> > it discusses the stringprep-normalization. This section (4.2.7.1) is just a
> > non-normative discussion of the user-visible characteristics.
>
> At this point I haven't read 4.2.7.2 and it is not clear that this
> section is non normative. I wish you would use more standard terminology
> and/or forward references.

"Unicode printable" is intended - the details are in 4.2.7.2 and the
stringprep profile in RFC 3722.  I would change "displayable characters"
to "printable ASCII and Unicode characters".  Yes, I know that ASCII is a
subset of Unicode, but not every reader of this RFC will have that level
of Unicode familiarity.

> >>           iSCSI
> >>           names allow the use of international character sets but are
> >>           not case sensitive.
> >>
> >> What does this mean exactly? Do you mean ASCII case sensitivity? Unicode
> >> case sensitivity?

There's been a separate email exchange with Pete Resnick about this text.
The conclusion is that the words "case sensitive" above are misleading,
so the above sentence needs to be rewritten and expanded as follows:

	iSCSI names allow the use of international characters
	but uppercase characters are prohibited.  The iSCSI stringprep
	profile [RFC3722] maps uppercase characters to lowercase and
	SHOULD be used to prepare iSCSI names from input that may
	include uppercase characters.
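
For illustration only, here is a minimal Python sketch of that
uppercase-to-lowercase mapping step.  It uses the standard library's
stringprep module, which exposes the RFC 3454 tables.  The helper name
fold_case is hypothetical, and the sketch applies only the table B.2
mapping; it is not the full RFC 3722 profile (no normalization and no
prohibited-character checks).

```python
import stringprep

def fold_case(name):
    """Map each character through RFC 3454 table B.2 (hypothetical helper)."""
    return "".join(stringprep.map_table_b2(ch) for ch in name)

# Uppercase input is mapped to lowercase; digits and separators pass through.
print(fold_case("IQN.2001-04.COM.EXAMPLE:Storage"))
```

A real implementation would run the complete profile rather than this
single step.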

> >>           No whitespace characters are used in
> >>           iSCSI names.
> >>
> >> What is "whitespace". U+0020? (ASCII space) Something else?
> Any comments on this?

Something else - see RFC 3722, which specifies the explicit prohibitions
beyond U+0020; specifically, it prohibits the space characters in tables
C.1.1 and C.1.2 of RFC 3454.  If it helps, we could add ", see [RFC3722]
for details" at the end of that sentence.
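
As a rough sketch of what "whitespace" covers under those two tables,
the Python standard library's stringprep module provides membership
tests for C.1.1 and C.1.2.  The helper names below are hypothetical,
and the sketch checks only these two tables, not the other
prohibitions in RFC 3722.

```python
import stringprep

def is_prohibited_space(ch):
    """True if ch is in RFC 3454 table C.1.1 (ASCII space) or C.1.2 (non-ASCII spaces)."""
    return stringprep.in_table_c11(ch) or stringprep.in_table_c12(ch)

def find_spaces(name):
    """Return U+XXXX labels for every C.1.1/C.1.2 space character found in name."""
    return [f"U+{ord(ch):04X}" for ch in name if is_prohibited_space(ch)]

# U+0020 is caught by C.1.1; NO-BREAK SPACE and IDEOGRAPHIC SPACE by C.1.2.
print(find_spaces("disk 1"))
print(find_spaces("disk\u00a01\u3000"))
```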

> >> 4.2.7.2. iSCSI Name Encoding
> >>
> >>     The stringprep process is described in [RFC3454]; iSCSI's use of
> >>     the stringprep process is described in [RFC3722]. Stringprep is a
> >>     method designed by the Internationalized Domain Name (IDN) working
> >>     group to translate human-typed strings into a format that can be
> >>     compared as opaque strings. Strings MUST NOT include punctuation,
> >>     spacing, diacritical marks, or other characters that could get in
> >>     the way of readability.
> >>
> >> This MUST NOT is not well defined. You need to define specific Unicode
> >> codepoints or character classes.
> >>
> > [Mallikarjun:] Given the number of scripts that Unicode supports, this does
> > not strike me as a trivial exercise. I suspect that's why the original ips WG
> > left it as such, and this level of specificity has worked well for over a
> > decade. However, I am not a Unicode expert at all, so I may be overlooking
> > something obvious you have in mind?
>
> Unicode Consortium has defined classes of characters (e.g. "letters",
> "digits", "punctuation", etc.). You can just reference such classes.
> Happy to try to suggest some text or find a person who can help you
> reword this.
> 
> Without specific details your MUST NOT is not implementable and not
> enforceable. Worse, I suspect nobody has implemented it anyway. I think
> your choices are: (a) delete the sentence as it is not useful or (b) fix
> it so that it actually ensures interoperability.

Alexey is correct about this one - that "MUST NOT" is too strong and not
implementable, even though it's a "good idea" in principle.

This text was originally intended as usage guidance on avoidance of
characters that result in different strings that look the same or similar
to humans.  This is already a concern in ASCII (e.g., courtesy of l, 1, 0
and O), but is much more of a concern for Unicode.
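
To make the Unicode concern concrete, here is a small Python sketch
with two names that render alike but differ in one code point.  The
example names and the helper functions are hypothetical, and
lowercasing plus NFKC is used only as a stand-in for case mapping and
normalization, not as the RFC 3722 profile.

```python
import unicodedata

latin = "iqn.2001-04.com.example:storage"
# Same visible text, but the "a" in "storage" is CYRILLIC SMALL LETTER A (U+0430).
mixed = "iqn.2001-04.com.example:stor\u0430ge"

def normalize(name):
    """Lowercase, then apply NFKC (a stand-in, not the RFC 3722 profile)."""
    return unicodedata.normalize("NFKC", name.lower())

def non_ascii_names(name):
    """Unicode character names of every non-ASCII character in name."""
    return [unicodedata.name(ch) for ch in name if ord(ch) > 127]

# The two names stay distinct after normalization, although they look alike.
print(normalize(latin) == normalize(mixed))
print(non_ascii_names(mixed))
```

Case mapping and normalization do not merge the two strings, so only
usage guidance can steer administrators away from such names.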

The sentence needs a rewrite as guidance that avoids using "MUST NOT", e.g.:

	iSCSI names are expected to be used by administrators for purposes
	such as system configuration - for this reason, characters that may
	lead to human confusion among different iSCSI names (e.g., punctuation,
	spacing, diacritical marks) should be avoided, even when such
	characters are allowed as stringprep processing output by [RFC3722].

The mandatory requirements on allowed vs. prohibited characters are in
RFC 3722, and (IMHO) it's best to not expand on them here.

> >>     The stringprep process also converts
> >>     strings into equivalent strings of lower-case characters.
> >>
> >>
> >>     The stringprep process does not need to be implemented if the
> >>     names are only generated using numeric and lower-case (any
> >>     character set) alphabetic characters.
> >>
> >> Lower-case is not well defined, unless you say you mean ASCII or
> >> something else.

Actually it is well-defined in RFC 3722 via its reference to table B.2
in RFC 3454.

> >> Also, "any character set"? This really doesn't look correct.

OTOH, that's definitely not correct - "(any character set)" should just
be deleted.  This text was intended to refer to what Unicode calls a
"script" and it's probably better to just not mention that concept here.

> > [Mallikarjun:] The entire iSCSI Name discussion is in the context of
> > Unicode, and it is called out in normative sentences in appropriate places.
> > This is non-normative descriptive text. OTOH, if you have specific
> > suggestions, they are welcome.
>
> I think you meant to write:
>
>     The stringprep process does not need to be implemented if the
>     names are only generated using ASCII numeric and Unicode characters with
>     "Lowercase" property.

Unfortunately, that's not quite right, because the actual specification
is what RFC 3722 allows via the tables in RFC 3454.  Those RFC 3454 tables
are definitive, as opposed to the Unicode specifications of character
properties (which were used to generate those tables).

The following text should get this right:

     The stringprep process does not need to be implemented if the
     names are generated using only characters allowed as output by
     the stringprep processing specified in [RFC3722].  Those allowed
     characters include all ASCII lowercase and numeric characters,
     as well as lowercase Unicode characters as specified in [RFC3722].
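
One way to picture "characters allowed as output" is that such a name
is left unchanged by the preparation step.  The Python sketch below is
a simplified approximation under stated assumptions: it applies only
the RFC 3454 table B.2 mapping and NFKC normalization and rejects
C.1.1/C.1.2 spaces, so the remaining RFC 3722 mapping and prohibition
steps are omitted.  Both function names are hypothetical.

```python
import stringprep
import unicodedata

def prepare_sketch(name):
    """Simplified stand-in: table B.2 case mapping, then NFKC. Not the full profile."""
    mapped = "".join(stringprep.map_table_b2(ch) for ch in name)
    return unicodedata.normalize("NFKC", mapped)

def looks_prepared(name):
    """True if name has no C.1.1/C.1.2 spaces and the simplified step leaves it unchanged."""
    has_space = any(
        stringprep.in_table_c11(ch) or stringprep.in_table_c12(ch) for ch in name
    )
    return not has_space and prepare_sketch(name) == name

print(looks_prepared("iqn.2001-04.com.example:storage1"))
print(looks_prepared("IQN.2001-04.com.example:storage1"))
```

A generator that only ever emits names passing a full version of this
check would not need to run stringprep itself.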

The PRECIS work should allow a future update that uses the Unicode
"Lowercase" property (and other Unicode properties).

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david.black@emc.com        Mobile: +1 (978) 394-7754
----------------------------------------------------