Re: [precis] rationale of rfc7613 decisions

Peter Saint-Andre <stpeter@stpeter.im> Mon, 17 April 2017 03:40 UTC

Return-Path: <stpeter@stpeter.im>
X-Original-To: precis@ietfa.amsl.com
Delivered-To: precis@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 766F1127A97 for <precis@ietfa.amsl.com>; Sun, 16 Apr 2017 20:40:58 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -2.721
X-Spam-Level:
X-Spam-Status: No, score=-2.721 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=stpeter.im header.b=ejyjZNgf; dkim=pass (2048-bit key) header.d=messagingengine.com header.b=JlWsMMsi
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ft5-DIJRZiDp for <precis@ietfa.amsl.com>; Sun, 16 Apr 2017 20:40:57 -0700 (PDT)
Received: from out1-smtp.messagingengine.com (out1-smtp.messagingengine.com [66.111.4.25]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id E6BAD1273E2 for <precis@ietf.org>; Sun, 16 Apr 2017 20:40:56 -0700 (PDT)
Received: from compute2.internal (compute2.nyi.internal [10.202.2.42]) by mailout.nyi.internal (Postfix) with ESMTP id C215F20702; Sun, 16 Apr 2017 23:40:55 -0400 (EDT)
Received: from frontend2 ([10.202.2.161]) by compute2.internal (MEProxy); Sun, 16 Apr 2017 23:40:55 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=stpeter.im; h= content-transfer-encoding:content-type:date:from:in-reply-to :message-id:mime-version:references:subject:to:x-me-sender :x-me-sender:x-sasl-enc:x-sasl-enc; s=fm1; bh=7ScIwMV/nD6S0fBOpk FRxXmVWQ4dImzu4XeNGi6z+w0=; b=ejyjZNgfozYvaWKDMXPIloLS5owyUpB21E roXAO/x1ndYF4f5FBf5DuDai/qxVUobsV0syK/Rzd+mAKBHeW4HXmiZEgoSI6TIS wthFCSbiaegXSKKk1teynWTjJniX6H8nvq5jUPiy8Ed4/NYJqMZw3pjeP0z5UBlg /jXC05xbYasTgwmGRM88NVnMN4fuFqxmAsgj8qPDpKB3HE1322NTdVhRuVG5Eoir MUxiGxmGiM7mdxCW5JYHG66bLlkhbl89ephv4SkwpbXotP5bNSl8Qg+82rshM+kl lKQJ3o8Bns7vdb3P/dh8c38edRwf+v/Ln7iX0slC2kCpkY96tl+w==
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-sender:x-me-sender:x-sasl-enc:x-sasl-enc; s= fm1; bh=7ScIwMV/nD6S0fBOpkFRxXmVWQ4dImzu4XeNGi6z+w0=; b=JlWsMMsi oFcPIJYbpSk/OOym+2t4lBE5b0yZlH2j1+xnkLYfWG8v+oI0piOqr9H5ZJIL8ECd vFKF6bpp0NP3eF9BqgYJ0fhT18CxSi8c1Zb2sAqcazCA1V0UKciKr8uPqbOsiitO wLYjdc2V1ztoSsEdr2HIi3X0VmgQCPsvKV1FXLpQsFKQUJ5OzD2JXHrOxgMftuGN rFlSxUNsW0D2VpGxPZwnGeFBKwxm9FZ8APDlnRyCYlXr5CdFHnnpGhZ6MsdkW4fA Nydy8KN95G5XixcE/YeO2THeu8v6+x4nPv2SCz+ILFarjNpUhd3lMrDhdGBkep5x kqB38XzKDopbgg==
X-ME-Sender: <xms:Rzn0WB6LSucI6VavbCNul7MHKd_PGkV96LziDFWxwWv1WwGfB6rBeQ>
X-Sasl-enc: tB2gvsRhgusmBoI76mNK6uGy9u6cktmEnPB2lqA6n29G 1492400455
Received: from aither.local (unknown [76.25.4.24]) by mail.messagingengine.com (Postfix) with ESMTPA id 207972469D; Sun, 16 Apr 2017 23:40:55 -0400 (EDT)
To: Nikos Mavrogiannopoulos <nmav@redhat.com>, precis@ietf.org
References: <1490885635.10364.10.camel@redhat.com> <5d02a0bc-5f53-a9fe-33fe-be0c66de24ee@stpeter.im> <1490948974.24162.5.camel@redhat.com>
From: Peter Saint-Andre <stpeter@stpeter.im>
Message-ID: <b1762d6f-ae65-eefe-cf8e-0a8c7a5c5a47@stpeter.im>
Date: Sun, 16 Apr 2017 21:40:54 -0600
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:45.0) Gecko/20100101 Thunderbird/45.8.0
MIME-Version: 1.0
In-Reply-To: <1490948974.24162.5.camel@redhat.com>
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/precis/GnT0CVLo-6oOURDxLvPRXomxRJk>
Subject: Re: [precis] rationale of rfc7613 decisions
X-BeenThere: precis@ietf.org
X-Mailman-Version: 2.1.22
Precedence: list
List-Id: Preparation and Comparison of Internationalized Strings <precis.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/precis>, <mailto:precis-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/precis/>
List-Post: <mailto:precis@ietf.org>
List-Help: <mailto:precis-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/precis>, <mailto:precis-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 17 Apr 2017 03:40:58 -0000

Hi Nikos,

I haven't forgotten about your message and will reply at greater length
this week.

Peter

On 3/31/17 2:29 AM, Nikos Mavrogiannopoulos wrote:
> On Thu, 2017-03-30 at 19:45 -0600, Peter Saint-Andre wrote:
> 
>>> I'm checking both rfc7564 and rfc7613, and I cannot find the
>>> rationale
>>> of the restrictions being done. In particular:
>>>  1. why rfc7613 restricts all spaces for passwords to U+0020?
>>
>>
> [...]
>>>  2. what is the purpose of "Contextual Rule Required" in section
>>> 4.3.2
>>> of rfc7564?
>>
>> It's complicated, but in essence PRECIS is consistent with IDNA2008
>> here
>> (see RFC 5891, RFC 5892, and RFC 5894). In particular, the code
>> points
>> ZERO WIDTH JOINER (U+200D) and ZERO WIDTH NON-JOINER (U+200C) are
>> necessary to produce certain combinatiosn of characters in certain
>> scripts (e.g., Arabic, Persian, and Indic scripts) but if used in
>> other
>> contexts can have consequences that violate the principle of least
>> user astonishment.
> 
> I think that such issues should warrant extensive discussions in an RFC
> like 7613. It is not apparent for me for example why that principle
> should apply for passwords (which are not visible). I guess there are
> arguments for that, but should be presented in order to understand and
> be able to convince people that RFC7613 is the way to go.
> 
>>  3. why freeform class doesn't allow "Old Hangul Jamo characters"?
>> As explained in §2.9 of RFC 5892:
>>
>>    Elimination of conjoining Hangul Jamo from the set of PVALID
>>    characters results in restricting the set of Korean PVALID
>> characters
>>    just to preformed, modern Hangul syllable characters.
>> Here again PRECIS is consistent with IDNA2008.
> 
> As I am mostly restricted in the context of passwords, my question is
> mostly on why is this done for the passwords. E.g., Is it because the
> Hangual Jamo set is a deprecated set which may not be in use years from
> now or another reason?
> 
>>>  4. why freeform class doesn't allow ignorable charaters?
>>
>> These are things like soft hyphen, certain joiners, specialized code
>> points for use within Unicode itself (e.g., language tags and
>> variation
>> selectors), and so on. They were disallowed in RFC 4013 and are
>> disallowed in IDNA2008, too.
>>
>> By saying "PRECIS is consistent with IDNA2008" I'm not appealing to
>> authority or saying that a consistency is necessarily a good thing.
>> Instead, defining as few string handling methods as possible helps
>> users
>> because strings aren't handled differently in different protocols and
>> contexts (see §5.1 of RFC 7564). This has security implications, too,
>> because the more such methods exist the easier it will be for
>> attackers to trick users.
> 
> In the context of 'passwords', I see very little applicability of such
> attacks, though I may be wrong. The main concern I see for passwords
> used for storage is compatibility, e.g., even with legacy software
> which did not follow these rules, and simplicity, so that software can
> follow the rules under reasonable for the task effort (I find the
> effort RFC7613 requires for processing UTF-8 passwords unproportionaly
> complex to the effort needed for US-ASCII passwords).
> 
>>> The context of that, is that I am trying to understand what would
>>> be
>>> the drawbacks from recommending a fixed normalization form (e.g.,
>>> NFC),
>>> for passwords, in contrast to recommending rfc7613.
>>
>> Nikos, instead of asking us why the foregoing restrictions were made,
>> ask yourself why you would want to ignore them and whether you
>> understand internationalization well enough to independently craft
>> appropriate rules and guidelines for the RFC you're updating. Because
>> you actively work on security technologies, think of it this way:
>> would
>> you want someone who doesn't understand all the issues to "just use
>> TLS"
>> without specifying appropriate cipher suites (ignoring RFC 7525) or
>> certificate checking procedures (ignoring RFC 5280 and RFC 6125)? The
>> issues involved with internationalization are just as complex (albeit
>> in different ways) and the whole reason we developed IDNA2008 and
>> PRECIS is so that well-meaning folks like you don't shoot yourselves
>> in the foot.
> 
> I cannot disagree with that, however, providing rationale for the
> decisions is important, especially in documents which are developed in
> disconnect with many existing protocols/practices. The current state in
> PKCS#12, PKCS#8 encrypted files, is pass there whatever you have as
> long as it is UTF-8. Convincing developers to deploy thousands lines of
> code for pre-processing such passwords, would require to underline the
> problems of the previous practice. RFC7613 unfortunately ignores that
> part completely, and I have no arguments when trying to convince people
> that this should be preferred.
> 
>> I strongly encourage you to use the PRECIS profile for passwords in
>> RFC7613, and we'd be happy to help you do so in the safest ways
>> possible.
> 
> I'm trying to make a list of items which make apparent why RFC7613 is
> needed. What I have now is:
> 
> "UTF-8 however, does not imply that strings conforming to it, are
> unambiguously unique, since there are can be various forms of the same
> string which may look identical to an observer, although being
> represented by a different byte string. Some issues are the following."
> 
> [The NFC argument is the easier to explain]
>  * Various normalization forms, which result to different data for the
> same input.
> 
> [why spaces need to be merged to 0x20 is harder to sell]
>  * The unicode standard includes a number of space characters which
> cannot be distinguished from each other, or have no width resulting to
> different results when switching to a different input method
> 
> [Hangual Jamo even harder]
>  * There are deprecated alphabet sets, which are no longer in use(?)
> and may not be available as input methods in the future.
> 
> [contextual rule]
>  * Certain combinations of code points between certain scripts produce
> unexpected visible results. (the question here is why would one care
> for visible results on passwords which are not printed)
> 
> regards,
> Nikos
>