Re: [secdir] secdir review of draft-ietf-precis-framework-15

Peter Saint-Andre <stpeter@stpeter.im> Mon, 21 April 2014 14:40 UTC

Message-ID: <53552DD4.4080801@stpeter.im>
Date: Mon, 21 Apr 2014 08:40:20 -0600
From: Peter Saint-Andre <stpeter@stpeter.im>
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.4.0
MIME-Version: 1.0
To: Charlie Kaufman <charliek@microsoft.com>, "secdir@ietf.org" <secdir@ietf.org>, "iesg@ietf.org" <iesg@ietf.org>, "draft-ietf-precis-framework.all@tools.ietf.org" <draft-ietf-precis-framework.all@tools.ietf.org>
References: <23eaf4c7da5a4bbe94c9f00496716596@BL2PR03MB498.namprd03.prod.outlook.com>
In-Reply-To: <23eaf4c7da5a4bbe94c9f00496716596@BL2PR03MB498.namprd03.prod.outlook.com>
Content-Type: text/plain; charset="windows-1252"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: http://mailarchive.ietf.org/arch/msg/secdir/iMU65C5MbYct9j1OBDn3g_hteic
Subject: Re: [secdir] secdir review of draft-ietf-precis-framework-15
Precedence: list

Thanks for the review. Comments and proposed text inline.

On 4/19/14, 11:37 PM, Charlie Kaufman wrote:
> I have reviewed this document as part of the security directorate's
> ongoing effort to review all IETF documents being processed by the
> IESG.  These comments were written primarily for the benefit of the
> security area directors.  Document editors and WG chairs should treat
> these comments just like any other last call comments.
>
> This document concerns international character sets. You might
> intuitively think that international character sets would have few if
> any security considerations, but you would be wrong. Many security
> mechanisms depend on the ability to recognize that two identifiers refer
> to the same entity and inconsistent handling of international character
> sets can result in two different pieces of code disagreeing as to
> whether two identifiers match and this has led to a number of serious
> security problems.
>
> This document defines 18 categories of characters within the UNICODE
> character set, with the intention that systems that want to accept
> subsets of UNICODE characters in their identifiers specify profiles
> referencing this document, and it defines two initial classes
> (IdentifierClass and FreeformClass) that could be used directly by lots
> of protocol specifications.
>
> While I see no problems with this document, it does seem like a missed
> opportunity to specify some things that are very important in the secure
> use of international character sets. The most important of these is a
> rule for determining whether two strings should be considered to be
> equivalent.

What we have here is a failure to communicate. :-) The purpose of the 
PRECIS work is twofold:

1. determine if a given string is allowed

2. determine if two given strings are equivalent

Since that is not stated clearly enough in the document, we need to add 
some text. I propose the following change to the Introduction:

###

OLD
    ... Profiles are responsible for defining the
    handling of right-to-left characters as well as various mapping
    operations of the kind also discussed for IDNs in [RFC5895], such as
    case preservation or lowercasing, Unicode normalization, mapping of
    certain characters to other characters or to nothing, and mapping of
    full-width and half-width characters.

    It is expected that this framework will yield the following benefits:

NEW
    ... Profiles are responsible for defining the
    handling of right-to-left characters as well as various mapping
    operations of the kind also discussed for IDNs in [RFC5895], such as
    case preservation or lowercasing, Unicode normalization, mapping of
    certain characters to other characters or to nothing, and mapping of
    full-width and half-width characters.

    When an application applies a profile of a PRECIS string class, it
    can achieve the following objectives:

    a. Determine if a given string conforms to the profile (e.g., to
       determine if it is allowed for use in the relevant "slot"
       specified by an application protocol).

    b. Determine if any two given strings are equivalent (e.g., to
       make an access decision for purposes of authentication or
       authorization as further described in [RFC6943]).

    It is expected that this framework will yield the following benefits:

###
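
For illustration only, the two objectives in the proposed text could be
sketched in Python roughly as follows. The "profile" here is a toy
stand-in (case folding, NFC, then letters and digits only) and not any
real PRECIS profile; the names enforce, is_allowed, and equivalent are
hypothetical.

```python
import unicodedata

ALLOWED_CATEGORIES = ("Ll", "Lo", "Nd")  # assumption: toy rule, not a PRECIS string class


def enforce(s: str) -> str:
    """Apply the toy profile: case-fold, normalize to NFC, reject other code points."""
    out = unicodedata.normalize("NFC", s.casefold())
    if not out:
        raise ValueError("empty string")
    for ch in out:
        if unicodedata.category(ch) not in ALLOWED_CATEGORIES:
            raise ValueError(f"disallowed code point U+{ord(ch):04X}")
    return out


def is_allowed(s: str) -> bool:
    """Objective (a): does the string conform to the profile?"""
    try:
        enforce(s)
        return True
    except ValueError:
        return False


def equivalent(a: str, b: str) -> bool:
    """Objective (b): are the two strings equal after the profile is applied?"""
    return enforce(a) == enforce(b)
```

The point of the sketch is only that both questions are answered by the
same set of rules, so that two implementations of one profile cannot
disagree about either.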

The PRECIS framework document contains no examples. However, all of the 
existing profile documents contain examples:

http://tools.ietf.org/html/draft-ietf-precis-saslprepbis-07#section-4.3

http://tools.ietf.org/html/draft-ietf-precis-saslprepbis-07#section-5.3

http://tools.ietf.org/html/draft-ietf-precis-nickname-09#section-3

http://tools.ietf.org/html/draft-ietf-xmpp-6122bis-12#section-3.5

I'm a big believer in examples, so I suggest that we add some to the 
framework document, or at least point specifically to the relevant 
sections of those profile documents.

> It is very common in both IETF protocols and in operating
> system object naming to adopt a preserve case / ignore case model. That
> means that if an identifier is entered in mixed case, the mixed case is
> preserved as the identifier but if someone tries to find an object using
> an identifier that is identical except for the case of characters, it
> will find the object. Further, in instances where uniqueness of
> identifiers is enforced (e.g. user names or file names), a request to
> create a second identifier that differs only in the case of the
> characters from an existing one will fail.
>
> These scenarios require that it be well defined whether two characters
> differ only in case,

This is handled in PRECIS profile definitions by specifying case 
mapping. As we have defined the framework, that is a matter for 
profiles, not the base string classes.
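
As a rough sketch of the "preserve case / ignore case" model described
in the quoted text, assuming Python's str.casefold() as a stand-in for
whatever case mapping a given profile specifies (the Registry class and
its method names are hypothetical):

```python
class Registry:
    """Stores names as entered, but looks them up and enforces uniqueness by folded form."""

    def __init__(self):
        self._by_key = {}  # folded form -> name exactly as first registered

    @staticmethod
    def _key(name: str) -> str:
        # Assumption: casefold() stands in for the profile's case mapping rule.
        return name.casefold()

    def create(self, name: str) -> None:
        key = self._key(name)
        if key in self._by_key:
            raise KeyError(f"{name!r} collides with {self._by_key[key]!r}")
        self._by_key[key] = name

    def lookup(self, name: str):
        # Returns the preserved spelling, or None if nothing matches.
        return self._by_key.get(self._key(name))
```

With this, creating "ReadMe" and then looking up "README" finds the
original spelling, while a later attempt to create "readme" fails.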

> and while that is an easy check to make in ASCII
> with 26 letters that have upper and lower case versions, the story is
> much more complex for some international character sets. Worse, case
> mapping of even ASCII characters can change based on the “culture”. The
> most famous example is the Turkish undotted lower case ‘i’ and uppercase
> dotted ‘I’ which caused security bugs because mapping “FILE” to
> lowercase in the Turkish Locale did not result in the string “file”.
> There are also cases where two different lowercase characters are both
> mapped to the same uppercase character. It is a scary world out there.
>
> To be used safely from a security standpoint, there must be a
> standardized way to compare two strings for equivalence that all
> programs will agree on. Programs will still have bugs, but when two
> programs interpret equivalence differently it is important that it be
> possible to determine objectively which one is wrong. The ideal way to
> do this is to have a canonical form of any string such that two strings
> are equivalent if their canonical forms are identical.
>
> Section “10.4 Local Character Set Issues” acknowledges this problem, but
> offers no solution.

Actually, Section 10.4 discusses the problem of character encodings 
other than ASCII and Unicode (say, "Shift JIS" in Japan).

Localization issues (such as Turkish dotless "i") are discussed further 
in draft-ietf-precis-mappings. They are not addressed in the PRECIS 
framework because we could solve only so many problems in the framework 
and the working group decided to limit itself to internationalization, 
not also localization.
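
To make the localization problem concrete, here is a small Python
sketch. Python's str.lower() and str.upper() are not locale-sensitive,
so the Turkish behavior is imitated by a hypothetical helper,
turkish_lower, which is a deliberate simplification and not a complete
implementation of Turkish case rules.

```python
DOTLESS_I = "\u0131"      # LATIN SMALL LETTER DOTLESS I
DOTTED_CAP_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
LONG_S = "\u017f"         # LATIN SMALL LETTER LONG S


def default_lower(s: str) -> str:
    """Locale-independent lowercasing, as provided by str.lower()."""
    return s.lower()


def turkish_lower(s: str) -> str:
    """Hypothetical Turkish-style lowercasing: I -> dotless i, dotted capital I -> i."""
    return s.replace(DOTTED_CAP_I, "i").replace("I", DOTLESS_I).lower()


def same_under_default(a: str, b: str) -> bool:
    return default_lower(a) == default_lower(b)


def same_under_turkish(a: str, b: str) -> bool:
    return turkish_lower(a) == turkish_lower(b)


if __name__ == "__main__":
    # The two rules disagree about whether "FILE" matches "file".
    print(same_under_default("FILE", "file"))
    print(same_under_turkish("FILE", "file"))
    # Two different lowercase letters can share one uppercase letter.
    print("i".upper() == DOTLESS_I.upper())
    print("s".upper() == LONG_S.upper())
```

Two pieces of code that each pick one of these rules will disagree on
whether "FILE" and "file" match, which is the kind of mismatch the
review describes.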

> In section “10.6 Security of Passwords”, this document recommends that
> password comparisons not ignore case (and I agree). But for passwords in
> particular, it is vital that they be translated to a canonical form
> because they are frequently hashed and the hashes must test as
> identical. One rarely has the luxury of comparing passwords character by
> character and deciding whether the characters are “close enough”.

Indeed. This is discussed in draft-ietf-precis-saslprepbis (which does 
define a PRECIS profile for passwords). Would it help to point to that 
document here?
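
A minimal sketch of why a canonical form matters before hashing,
written in Python. The preparation steps shown (mapping non-ASCII
spaces to U+0020, then NFC, with case left untouched) are assumptions
chosen for illustration; draft-ietf-precis-saslprepbis is the place to
look for the actual rules. The function names are hypothetical, and a
bare SHA-256 is used only to show that the bytes agree; real password
storage would use a salted, deliberately slow scheme.

```python
import hashlib
import unicodedata


def canonical_password(pw: str) -> str:
    """Illustrative preparation: map space separators to U+0020, then apply NFC."""
    mapped = "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in pw)
    return unicodedata.normalize("NFC", mapped)


def password_digest(pw: str) -> str:
    """Hash the UTF-8 bytes of the canonical form (demonstration only)."""
    return hashlib.sha256(canonical_password(pw).encode("utf-8")).hexdigest()
```

Without the canonicalization step, the precomposed and decomposed
spellings of the same password would produce different digests, and
there is no opportunity afterwards to decide that they were "close
enough".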

> Section “10.5 Visually Similar Characters” discusses another hard
> problem: characters that are entirely distinct but are visually similar
> enough to mislead users. This problem occurs even without leaving ASCII
> in the form of the digit ‘0’ vs the uppercase letter ‘O’ and the triple of
> the digit ‘1’, the lowercase letter ‘l’, and the uppercase letter ‘I’.
> In some fonts, various of these are indistinguishable. International
> character sets introduce even more such collisions. To the extent that
> we expect users to look at URLs like https://www.fideIity.com and
> recognize that something is out of place,
> we have a problem. It is probably best addressed by having tables of
> “looks similar” characters and disallowing the issuance of identifiers
> that look visually similar to existing ones in places like DNS
> registries and other places where this problem arises. Having a document
> that lists the doppelganger character equivalents would be a useful
> first step towards deploying such restrictions.

The Unicode Consortium maintains such a document:

http://www.unicode.org/Public/security/latest/confusables.txt
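
As a sketch of how such a table might be used to reject look-alike
identifiers, the following Python parses a few lines written in the
"source ; target ; type # comment" layout and compares simplified
"skeletons". The sample lines are hand-written for illustration rather
than copied from the file, the function names are hypothetical, and the
per-character substitution is a simplification of the skeleton
procedure that the Unicode Consortium documents.

```python
SAMPLE = """\
# Illustrative entries only, in a "source ; target ; type # comment" layout
0049 ; 006C ; MA # LATIN CAPITAL LETTER I -> LATIN SMALL LETTER L
0031 ; 006C ; MA # DIGIT ONE -> LATIN SMALL LETTER L
0030 ; 004F ; MA # DIGIT ZERO -> LATIN CAPITAL LETTER O
"""


def _decode(field: str) -> str:
    """Turn a space-separated list of hex code points into a string."""
    return "".join(chr(int(cp, 16)) for cp in field.split())


def parse_confusables(text: str) -> dict:
    """Build a source -> target mapping, skipping comments and blank lines."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        table[_decode(fields[0])] = _decode(fields[1])
    return table


def skeleton(s: str, table: dict) -> str:
    """Simplified skeleton: replace each character by its mapped prototype."""
    return "".join(table.get(ch, ch) for ch in s)


def looks_alike(a: str, b: str, table: dict) -> bool:
    return skeleton(a, table) == skeleton(b, table)
```

A registry could run looks_alike against its existing entries before
issuing a new identifier, so that "fideIity" (with an uppercase "I") is
refused once "fidelity" exists.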

> I suppose it is too much to expect this document to address either of
> these issues, but I couldn’t resist suggesting it.

What we are talking about here is really a matter of scope. There is 
only so much that we could accomplish in the PRECIS framework itself. In 
particular, issues of localization and confusability are extremely messy 
and trying to tackle them would have significantly delayed our work.

Peter
