Re: [secdir] Use of StringPrep/Unicode (was Re: SecDir and AppsDir review of draft-ietf-storm-iscsi-cons-06)

Alexey Melnikov <alexey.melnikov@isode.com> Thu, 11 October 2012 13:57 UTC

Message-ID: <5076D036.3060602@isode.com>
Date: Thu, 11 Oct 2012 14:57:10 +0100
From: Alexey Melnikov <alexey.melnikov@isode.com>
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:13.0) Gecko/20120614 Thunderbird/13.0.1
To: "Black, David" <david.black@emc.com>
References: <4FBFAE5F.8010305@gmail.com> <506C43AA.9010206@isode.com> <E160851FCED17643AE5F53B5D4D0783A4C410C95@BL2PRD0610MB361.namprd06.prod.outlook.com> <5076AEDB.7030900@isode.com> <8D3D17ACE214DC429325B2B98F3AE7120DFAE9D4@MX15A.corp.emc.com>
In-Reply-To: <8D3D17ACE214DC429325B2B98F3AE7120DFAE9D4@MX15A.corp.emc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Mallikarjun Chadalapaka <cbm@chadalapaka.com>, "iesg@ietf.org" <iesg@ietf.org>, "draft-ietf-storm-iscsi-cons.all@tools.ietf.org" <draft-ietf-storm-iscsi-cons.all@tools.ietf.org>, "secdir@ietf.org" <secdir@ietf.org>
Subject: Re: [secdir] Use of StringPrep/Unicode (was Re: SecDir and AppsDir review of draft-ietf-storm-iscsi-cons-06)

On 11/10/2012 14:35, Black, David wrote:
> Alexey,
Hi David,
> Thanks for splitting this out into a separate thread.  More comments inline.
>
>> Sorry for the delay with this reply.
>>
>> On 09/10/2012 04:42, Mallikarjun Chadalapaka wrote:
>>>      all iSCSI implementations complying with this document. Protocol
>>>      behavior defined in this Section MUST be exhibited by iSCSI
>>>      implementations on an iSCSI session when they negotiate the
>>>      TaskReporting (Section 13.23) key to "FastAbort" on that session.
>>>      The execution of ABORT TASK SET, CLEAR TASK SET, LOGICAL UNIT
>>>      RESET, TARGET WARM RESET, and TARGET COLD RESET TMF Requests
>>>      consists of the following sequence of actions in the specified
>>>      order on the specified party.
>>>
>>> In Section 4.2.7.1
>>>
>>>         iSCSI names are composed only of displayable characters.
>>>
>>> What does "displayable" mean here? "ASCII printable"? "Unicode
>>> printable"?
>>>
>>> [Mallikarjun:] The latter. Note that all normative statements make it clear
>>> that we refer to Unicode-encoding - especially look at Section 4.2.7.2 where
>>> it discusses the stringprep-normalization. This section (4.2.7.1) is just a
>>> non-normative discussion of the user-visible characteristics.
>>
>> At this point I haven't read 4.2.7.2 and it is not clear that this
>> section is non-normative. I wish you would use more standard terminology
>> and/or forward references.
> "Unicode printable" is intended - the details are in 4.2.7.2 and the
> stringprep profile in RFC 3722.  I would change "displayable characters"
> to "printable ASCII and Unicode characters".  Yes, I know that ASCII is a
> subset of Unicode, but not every reader of this RFC will have that level
> of Unicode familiarity.
Ok.
>>>>            iSCSI
>>>>            names allow the use of international character sets but are
>>>>            not case sensitive.
>>>>
>>>> What does this mean exactly? Do you mean ASCII case sensitivity? Unicode
>>>> case sensitivity?
> There's been a separate email exchange with Pete Resnick about this text.
> The conclusion is that the words "case sensitive" above are misleading,
> so the above sentence needs to be rewritten and expanded as follows:
>
>       iSCSI names allow the use of international characters
>       but uppercase characters are prohibited.  The iSCSI stringprep
>       profile [RFC3722] maps uppercase characters to lowercase and
>       SHOULD be used to prepare iSCSI names from input that may
>       include uppercase characters.
This would be Ok, if you update the above to say ASCII or Unicode.
>>>>            No whitespace characters are used in
>>>>            iSCSI names.
>>>>
>>>> What is "whitespace". U+0020? (ASCII space) Something else?
>> Any comments on this?
> Something else - see RFC 3722, which specifies the explicit prohibitions
> beyond U+0020, specifically it prohibits the space characters in tables
> C.1.1 and C.1.2 from RFC 3454.  If it helps, we could add ", see [RFC3722]
> for details" at the end of that sentence.

Sounds good.
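
For readers who want to see what "space characters in tables C.1.1 and C.1.2" means in practice, here is a minimal sketch using Python's standard `stringprep` module, which exposes the RFC 3454 tables. The function name `find_space_characters` and the example names are illustrative only, and this checks just the two space tables, not the full set of prohibitions in RFC 3722.

```python
import stringprep


def find_space_characters(name):
    """Return the characters in `name` listed in RFC 3454 table C.1.1
    (ASCII space) or table C.1.2 (non-ASCII space characters)."""
    return [
        ch
        for ch in name
        if stringprep.in_table_c11(ch) or stringprep.in_table_c12(ch)
    ]


if __name__ == "__main__":
    # U+0020 is in C.1.1; U+00A0 (no-break space) is in C.1.2.
    for candidate in ("iqn.2012-10.com.example:disk1",
                      "iqn.2012-10.com.example:disk 1",
                      "iqn.2012-10.com.example:disk\u00a01"):
        print(repr(candidate), find_space_characters(candidate))
```

An empty result means only that no C.1.1/C.1.2 space character was found; the name may still contain other characters that RFC 3722 prohibits.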

>>>> 4.2.7.2. iSCSI Name Encoding
>>>>
>>>>      The stringprep process is described in [RFC3454]; iSCSI's use of
>>>>      the stringprep process is described in [RFC3722]. Stringprep is a
>>>>      method designed by the Internationalized Domain Name (IDN) working
>>>>      group to translate human-typed strings into a format that can be
>>>>      compared as opaque strings. Strings MUST NOT include punctuation,
>>>>      spacing, diacritical marks, or other characters that could get in
>>>>      the way of readability.
>>>>
>>>> This MUST NOT is not well defined. You need to define specific Unicode
>>>> codepoints or character classes.
>>>>
>>> [Mallikarjun:] Given the number of scripts that Unicode supports, this does
>>> not strike me as a trivial exercise. I suspect that's why the original ips WG
>>> left it as such, and this level of specificity has worked well for over a
>>> decade. However, I am not a Unicode expert at all, so I may be overlooking
>>> something obvious you have in mind?
>>
>> The Unicode Consortium has defined classes of characters (e.g. "letters",
>> "digits", "punctuation", etc.). You can just reference such classes.
>> Happy to try to suggest some text or find a person who can help you
>> reword this.
>>
>> Without specific details your MUST NOT is not implementable and not
>> enforceable. Worse, I suspect nobody has implemented it anyway. I think
>> your choices are: (a) delete the sentence as it is not useful or (b) fix
>> it so that it actually ensures interoperability.
> Alexey is correct about this one - that "MUST NOT" is too strong and not
> implementable, even though it's a "good idea" in principle.
>
> This text was originally intended as usage guidance on avoidance of
> characters that result in different strings that look the same or similar
> to humans.  This is already a concern in ASCII (e.g., courtesy of l, 1, 0
> and O), but is much more of a concern for Unicode.
>
> The sentence needs a rewrite as guidance that avoids using "MUST NOT", e.g.:
>
> 	iSCSI names are expected to be used by administrators for purposes
> 	such as system configuration - for this reason, characters that may
> 	lead to human confusion among different iSCSI names (e.g., punctuation,
> 	spacing, diacritical marks) should be avoided, even when such
> 	characters are allowed as stringprep processing output by [RFC3722].

Ok. This is not what I expected the replacement text to say, but this is 
better.
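
As an illustration of the kind of advisory check an administrative tool could apply to follow this guidance, here is a rough sketch that flags punctuation, spacing and diacritical marks using Unicode general categories. Everything in it is an assumption for illustration: the function name, the choice of categories (P*, Z*, M*), and the exemption of "-", "." and ":" because they appear structurally in iqn.-style names. It is not a check defined by the draft or by RFC 3722, and it does not catch look-alikes such as l/1 or 0/O.

```python
import unicodedata

# Hypothetical allowlist: characters used structurally in iqn.-style names.
STRUCTURAL = {"-", ".", ":"}


def confusion_prone_characters(name):
    """Return characters that are punctuation (P*), separators (Z*), or that
    carry a combining mark (M*) once canonically decomposed."""
    flagged = []
    for ch in name:
        if ch in STRUCTURAL:
            continue
        # Decompose so that a precomposed letter such as U+00E9 exposes
        # its combining accent.
        categories = [
            unicodedata.category(part)
            for part in unicodedata.normalize("NFD", ch)
        ]
        if any(cat[0] in "PZM" for cat in categories):
            flagged.append(ch)
    return flagged


if __name__ == "__main__":
    print(confusion_prone_characters("iqn.2012-10.com.example:caf\u00e9_1"))
```

Because this is guidance rather than a requirement, a tool would more plausibly warn on a non-empty result than reject the name.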

> The mandatory requirements on allowed vs. prohibited characters are in
> RFC 3722, and (IMHO) it's best to not expand on them here.
>
>>>>      The stringprep process also converts
>>>>      strings into equivalent strings of lower-case characters.
>>>>
>>>>
>>>>      The stringprep process does not need to be implemented if the
>>>>      names are only generated using numeric and lower-case (any
>>>>      character set) alphabetic characters.
>>>>
>>>> Lower-case is not well defined, unless you say you mean ASCII or
>>>> something else.
> Actually it is well-defined in RFC 3722 via its reference to table B.2
> in RFC 3454.
Well, it is only well defined if you say what it means ;-). Adding a 
reference to B.2 from RFC 3454 would address this.
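
To make the B.2 reference concrete: the table is a per-character case-folding map, and Python's standard `stringprep` module exposes it as `map_table_b2`. The sketch below (the helper name `fold_case_b2` is made up for illustration) applies only that one mapping step; it is not the whole RFC 3722 profile.

```python
import stringprep


def fold_case_b2(name):
    """Apply the RFC 3454 table B.2 case-folding map to each character.
    A single input character may map to more than one output character."""
    return "".join(stringprep.map_table_b2(ch) for ch in name)


if __name__ == "__main__":
    print(fold_case_b2("IQN.2012-10.COM.EXAMPLE:Disk1"))
    # B.2 is not limited to ASCII: U+00DF (sharp s) folds to "ss".
    print(fold_case_b2("Stra\u00dfe"))
```
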
>>>> Also, "any character set"? This really doesn't look correct.
> OTOH, that's definitely not correct - "(any character set)" should just
> be deleted.  This text was intended to refer to what Unicode calls a
> "script" and it's probably better to just not mention that concept here.
>
>>> [Mallikarjun:] The entire iSCSI Name discussion is in the context of
>>> Unicode, and it is called out in normative sentences in appropriate places.
>>> This is non-normative descriptive text. OTOH, if you have specific
>>> suggestions, they are welcome.
>>
>> I think you meant to write:
>>
>>      The stringprep process does not need to be implemented if the
>>      names are only generated using ASCII numeric and Unicode characters
>>      with the "Lowercase" property.
> Unfortunately, that's not quite right, because the actual specification
> is what RFC 3722 allows via the tables in RFC 3454.  Those RFC 3454 tables
> are definitive, as opposed to the Unicode specifications of character
> properties (which were used to generate those tables).
>
> The following text should get this right:
>
>       The stringprep process does not need to be implemented if the
>       names are generated using only characters allowed as output by
>       the stringprep processing specified in [RFC3722].  Those allowed
>       characters include all ASCII lowercase and numeric characters,
>       as well as lowercase Unicode characters as specified in [RFC3722].
This new text is better and addresses my concern. (The whole paragraph 
is sort of "if you are too lazy to implement StringPrep, this is what 
you can safely use", but as long as this is correct, I don't mind.)
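
One way to read "characters allowed as output by the stringprep processing" is that a safe name is one the processing would leave unchanged. The sketch below tests that with a deliberately partial approximation: it applies the B.1 and B.2 mappings and NFKC normalization (which I believe is what RFC 3722 specifies, but check the RFC), and rejects only the C.1.1/C.1.2 space characters discussed earlier in this message. The other prohibited tables and the bidirectional checks are omitted, and the function names are hypothetical.

```python
import stringprep
from unicodedata import ucd_3_2_0  # stringprep is defined against Unicode 3.2


def partial_prepare(name):
    """Partial approximation of stringprep processing: map, normalize,
    then reject space characters. NOT the complete RFC 3722 profile."""
    mapped = []
    for ch in name:
        if stringprep.in_table_b1(ch):
            continue  # B.1: characters commonly mapped to nothing
        mapped.append(stringprep.map_table_b2(ch))  # B.2: case folding
    normalized = ucd_3_2_0.normalize("NFKC", "".join(mapped))
    for ch in normalized:
        if stringprep.in_table_c11(ch) or stringprep.in_table_c12(ch):
            raise ValueError("space character U+%04X is prohibited" % ord(ch))
    return normalized


def is_already_prepared(name):
    """True if the partial processing above leaves `name` unchanged."""
    try:
        return partial_prepare(name) == name
    except ValueError:
        return False


if __name__ == "__main__":
    for candidate in ("iqn.2012-10.com.example:disk1",
                      "IQN.2012-10.com.example:disk1",
                      "iqn.2012-10.com.example:disk 1"):
        print(repr(candidate), is_already_prepared(candidate))
```

A True result here is necessary but not sufficient, since the omitted prohibition tables could still reject the name.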
> The PRECIS work should allow a future update that uses the Unicode
> "Lowercase" property (and other Unicode properties).
Right.