Re: [precis] rationale of rfc7613 decisions

Peter Saint-Andre <stpeter@stpeter.im> Mon, 17 April 2017 03:40 UTC

To: Nikos Mavrogiannopoulos <nmav@redhat.com>, precis@ietf.org
References: <1490885635.10364.10.camel@redhat.com> <5d02a0bc-5f53-a9fe-33fe-be0c66de24ee@stpeter.im> <1490948974.24162.5.camel@redhat.com>
From: Peter Saint-Andre <stpeter@stpeter.im>
Message-ID: <b1762d6f-ae65-eefe-cf8e-0a8c7a5c5a47@stpeter.im>
Date: Sun, 16 Apr 2017 21:40:54 -0600
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <1490948974.24162.5.camel@redhat.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/GnT0CVLo-6oOURDxLvPRXomxRJk>
Subject: Re: [precis] rationale of rfc7613 decisions
Precedence: list

Hi Nikos,

I haven't forgotten about your message and will reply at greater length
this week.

Peter

On 3/31/17 2:29 AM, Nikos Mavrogiannopoulos wrote:
> On Thu, 2017-03-30 at 19:45 -0600, Peter Saint-Andre wrote:
> 
>>> I'm checking both rfc7564 and rfc7613, and I cannot find the
>>> rationale of the restrictions being done. In particular:
>>>  1. why rfc7613 restricts all spaces for passwords to U+0020?
>>
>>
> [...]
>>>  2. what is the purpose of "Contextual Rule Required" in section
>>> 4.3.2 of rfc7564?
>>
>> It's complicated, but in essence PRECIS is consistent with IDNA2008
>> here (see RFC 5891, RFC 5892, and RFC 5894). In particular, the code
>> points ZERO WIDTH JOINER (U+200D) and ZERO WIDTH NON-JOINER (U+200C)
>> are necessary to produce certain combinations of characters in certain
>> scripts (e.g., Arabic, Persian, and Indic scripts) but if used in
>> other contexts can have consequences that violate the principle of
>> least user astonishment.
> 
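
To illustrate the point about joiners, here is a minimal Python sketch (not
taken from any of the RFCs; the variable and function names are made up for
this example). It shows that ZERO WIDTH NON-JOINER is a format character
(general category Cf), and that inserting it changes the UTF-8 bytes of a
string even though most renderers would show no difference in Latin text.
NFC normalization alone does not remove it.

```python
import unicodedata

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER

plain = "password"
with_zwnj = "pass" + ZWNJ + "word"  # typically renders the same as `plain`

def describe(ch: str) -> str:
    """Return code point, Unicode name, and general category of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})"

print(describe(ZWNJ))
# The two strings are different byte sequences, so a hash or key derived
# from them would differ as well.
print(plain.encode("utf-8") == with_zwnj.encode("utf-8"))
# NFC leaves the joiner in place; normalization by itself does not help here.
print(unicodedata.normalize("NFC", with_zwnj) == plain)
```

Both comparisons are expected to print False, which is one way to see why a
profile that only says "use NFC" leaves this case open.
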
> I think that such issues warrant extensive discussion in an RFC
> like 7613. It is not apparent to me, for example, why that principle
> should apply to passwords (which are not visible). I guess there are
> arguments for that, but they should be presented so that readers can
> understand them and convince people that RFC7613 is the way to go.
> 
>>>  3. why freeform class doesn't allow "Old Hangul Jamo characters"?
>>
>> As explained in §2.9 of RFC 5892:
>>
>>    Elimination of conjoining Hangul Jamo from the set of PVALID
>>    characters results in restricting the set of Korean PVALID
>>    characters just to preformed, modern Hangul syllable characters.
>>
>> Here again PRECIS is consistent with IDNA2008.
> 
> Since I am concerned mainly with passwords, my question is why this
> is done for passwords. E.g., is it because the Hangul Jamo set is a
> deprecated set which may not be in use years from now, or for another
> reason?
> 
>>>  4. why freeform class doesn't allow ignorable characters?
>>
>> These are things like soft hyphen, certain joiners, specialized code
>> points for use within Unicode itself (e.g., language tags and
>> variation selectors), and so on. They were disallowed in RFC 4013 and
>> are disallowed in IDNA2008, too.
>>
>> By saying "PRECIS is consistent with IDNA2008" I'm not appealing to
>> authority or saying that consistency is necessarily a good thing.
>> Instead, defining as few string handling methods as possible helps
>> users because strings aren't handled differently in different
>> protocols and contexts (see §5.1 of RFC 7564). This has security
>> implications, too, because the more such methods exist the easier it
>> will be for attackers to trick users.
> 
> In the context of 'passwords', I see very little applicability of such
> attacks, though I may be wrong. The main concerns I see for passwords
> used for storage are compatibility, e.g., even with legacy software
> which did not follow these rules, and simplicity, so that software can
> follow the rules with an effort that is reasonable for the task (I find
> the effort RFC7613 requires for processing UTF-8 passwords
> disproportionately large compared to the effort needed for US-ASCII
> passwords).
> 
>>> The context of that is that I am trying to understand what would be
>>> the drawbacks of recommending a fixed normalization form (e.g., NFC)
>>> for passwords, in contrast to recommending rfc7613.
>>
>> Nikos, instead of asking us why the foregoing restrictions were made,
>> ask yourself why you would want to ignore them and whether you
>> understand internationalization well enough to independently craft
>> appropriate rules and guidelines for the RFC you're updating. Because
>> you actively work on security technologies, think of it this way:
>> would you want someone who doesn't understand all the issues to "just
>> use TLS" without specifying appropriate cipher suites (ignoring RFC
>> 7525) or certificate checking procedures (ignoring RFC 5280 and RFC
>> 6125)? The issues involved with internationalization are just as
>> complex (albeit in different ways) and the whole reason we developed
>> IDNA2008 and PRECIS is so that well-meaning folks like you don't shoot
>> yourselves in the foot.
> 
> I cannot disagree with that; however, providing rationale for the
> decisions is important, especially in documents which are developed in
> isolation from many existing protocols/practices. The current practice
> for PKCS#12 and PKCS#8 encrypted files is to pass whatever you have, as
> long as it is UTF-8. Convincing developers to deploy thousands of lines
> of code for pre-processing such passwords would require spelling out
> the problems of the previous practice. RFC7613 unfortunately ignores
> that part completely, and I have no arguments when trying to convince
> people that it should be preferred.
> 
>> I strongly encourage you to use the PRECIS profile for passwords in
>> RFC7613, and we'd be happy to help you do so in the safest ways
>> possible.
> 
> I'm trying to make a list of items which make apparent why RFC7613 is
> needed. What I have now is:
> 
> "UTF-8, however, does not imply that strings conforming to it are
> unambiguously unique, since there can be various forms of the same
> string which may look identical to an observer while being
> represented by different byte strings. Some issues are the following."
> 
> [The NFC argument is the easiest to explain]
>  * Various normalization forms, which result in different data for the
> same input.
> 
> [why spaces need to be merged to 0x20 is harder to sell]
>  * The Unicode standard includes a number of space characters which
> cannot be distinguished from each other, or have no width, resulting in
> different results when switching to a different input method.
> 
> [Hangul Jamo even harder]
>  * There are deprecated alphabet sets, which are no longer in use(?)
> and may not be available as input methods in the future.
> 
> [contextual rule]
>  * Certain combinations of code points between certain scripts produce
> unexpected visible results. (The question here is why one would care
> about visible results for passwords, which are not printed.)
> 
> regards,
> Nikos
>
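
As a concrete illustration of the first two items in the list above, here is
a minimal Python sketch using only the standard unicodedata module. It covers
just two of the RFC 7613 OpaqueString steps: mapping non-ASCII spaces
(general category Zs) to U+0020, followed by NFC normalization. It does not
check the FreeformClass restrictions or the contextual rules discussed in
this thread, and the helper names are hypothetical, chosen for this example.

```python
import unicodedata

def map_spaces(s: str) -> str:
    """Replace every space separator (general category Zs) with U+0020."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch for ch in s)

def normalize_password(s: str) -> bytes:
    """Hypothetical helper: space mapping, then NFC, then UTF-8 encoding."""
    return unicodedata.normalize("NFC", map_spaces(s)).encode("utf-8")

# "e with acute" as one precomposed code point vs. "e" plus a combining accent.
precomposed = "caf\u00e9 latte"
decomposed = "cafe\u0301 latte"
# The same text typed with NO-BREAK SPACE (U+00A0) instead of U+0020.
nbsp = "caf\u00e9\u00a0latte"

# Without preprocessing, the two spellings are different byte strings.
print(precomposed.encode("utf-8") == decomposed.encode("utf-8"))
# After space mapping and NFC, all three inputs yield the same bytes.
print(normalize_password(precomposed) == normalize_password(decomposed))
print(normalize_password(precomposed) == normalize_password(nbsp))
```

The first comparison prints False and the other two print True: a password
entered through a different keyboard layout or input method can fail to match
unless both sides apply the same preprocessing before hashing or key
derivation.
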
