[xmpp] review of draft-ietf-xmpp-6122bis-12

"Joe Hildebrand (jhildebr)" <jhildebr@cisco.com> Wed, 30 July 2014 21:25 UTC

To: "xmpp@ietf.org" <xmpp@ietf.org>, "precis@ietf.org" <precis@ietf.org>
Archived-At: http://mailarchive.ietf.org/arch/msg/xmpp/dWGsWasI7b5A2SLes4I8fye6XN8

The reason the precis group got a spate of questions from me today is that
I was prepping to do this review.  There are a couple of issues that the
precis folk should pay more attention to.

 > 1.  Introduction
... 

 >    Instead, this document builds upon the
 >    internationalization framework defined by the IETF's PRECIS Working
 >    Group [I-D.ietf-precis-framework], while attempting to ensure that
 >    the characters allowed in Jabber IDs under stringprep are still
 >    allowed and handled in the same way under PRECIS.

"the same way" means more backward-compatibility to me than I think we
intend here.

 > 3.1.  Fundamentals
 > 
 >       jid           = [ localpart "@" ] domainpart [ "/" resourcepart ]
 >       localpart     = 1*1023(localpoint)
 >                       ;
 >                       ; a "localpoint" is a UTF-8 encoded
 >                       ; Unicode code point that conforms to
 >                       ; the "JIDlocalIdentifierClass" profile
 >                       ; of the PRECIS IdentifierClass
 >                       ;

To me, this implies 1023 code points, not 1023 octets.  The same issue
applies to ifqdn and resourcepart.  6122 just had 1*; I think going back to
that would be fine, since we have a rule below that captures the max size.
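
To illustrate the gap, here is a minimal Python sketch.  The sample
localpart is made up; the only point is that a repetition count over code
points and a limit in UTF-8 octets can disagree.

```python
# Hypothetical localpart: 600 copies of U+00E9, which takes two octets
# in UTF-8.
localpart = "\u00e9" * 600

code_points = len(localpart)              # number of Unicode code points
octets = len(localpart.encode("utf-8"))   # number of UTF-8 octets

# Satisfies a 1*1023(localpoint) repetition count, yet exceeds the
# 1023-octet limit stated in the prose.
fits_abnf = 1 <= code_points <= 1023
fits_octet_limit = 1 <= octets <= 1023
print(code_points, octets, fits_abnf, fits_octet_limit)
```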

 > 3.2.  Domainpart
 > 
 >    The domainpart of a JID is that portion after the '@' character (if
 >    any) and before the '/' character (if any); it is the primary

I think it's often surprising to people that foo/@bar is a valid JID with
"foo" as the domainpart and "@bar" as the resourcepart.  The text above,
although pulled from 6122, might be better as:

The domainpart of a JID is that portion after the first '@' character (if
any) and before the first '/' character (if any);

and possibly adding the example.
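
A rough Python sketch of the parse I have in mind follows.  The function
name split_jid is hypothetical, and it does no preparation, validation, or
length checking.  It splits on the first '/' before looking for '@', which
is what makes foo/@bar come out as described above.

```python
def split_jid(jid):
    """Split a JID into (localpart, domainpart, resourcepart).

    Illustrative only: no PRECIS/IDNA preparation and no length checks.
    """
    # Everything after the first '/' is the resourcepart, even if it
    # contains '@' or further '/' characters.
    bare, slash, resource = jid.partition("/")
    # Within what remains, the first '@' ends the localpart.
    local, at, domain = bare.partition("@")
    if not at:
        # No '@' before the first '/': all of `bare` is the domainpart.
        local, domain = None, bare
    return local, domain, (resource if slash else None)


print(split_jid("foo/@bar"))
print(split_jid("juliet@example.com/balcony"))
```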

 >    In general, the content of a domainpart is an Internationalized
 >    Domain Name ("IDN") as described in the specifications for
 >    Internationalized Domain Names in Applications (commonly called
 >    "IDNA2008"), and a domainpart is an "IDNA-aware domain name slot" as
 >    defined in [RFC5890].  The following rules apply to a domainpart that
 >    consists of a fully-qualified domain name and MUST be applied in the
 >    following order:

When do these rules need to be applied? Only before comparison or routing?

 >    1.  The domainpart MUST contain only NR-LDH labels and U-labels as
 >        defined in [RFC5890] and MUST consist only of Unicode code points
 >        that conform to the rules specified in [RFC5892] (which includes
 >        Unicode normalization).  This implies that the domainpart MUST
 >        NOT include A-labels as defined in [RFC5890]; each A-label MUST
 >        be converted to a U-label during preparation of a domainpart, and
 >        comparison MUST be performed using U-labels, not A-labels.

This seems like an always rule, including for dumb clients.
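
For a sense of what the A-label to U-label step involves, here is a small
Python sketch.  It is an assumption-laden illustration, not a conforming
implementation: it uses the raw Punycode codec from the standard library
and performs none of the IDNA2008 validity checks that a real U-label
requires.  The function name is hypothetical.

```python
def a_labels_to_u_labels(domain):
    """Convert any "xn--" A-labels in a domain name to Unicode labels.

    Sketch only: raw Punycode decoding, with no IDNA2008 validation of
    the resulting labels.
    """
    out = []
    for label in domain.split("."):
        if label.lower().startswith("xn--"):
            # Strip the ACE prefix and decode the Punycode remainder.
            label = label[4:].encode("ascii").decode("punycode")
        out.append(label)
    return ".".join(out)


print(a_labels_to_u_labels("xn--bcher-kva.example"))
```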

 >    2.  All uppercase and titlecase code points within the domainpart
 >        MUST be mapped to their lowercase equivalents, preferably using
 >        Unicode Default Case Folding as defined in Chapter 3 of the
 >        Unicode Standard [UNICODE].

Dumb clients might get away with this and the system would still work.

 >    3.  Fullwidth and halfwidth characters within the domainpart MUST be
 >        mapped to their decomposition mappings.

Dumb clients have no shot at this one.
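
The reason is that the width mapping needs the Unicode character database.
A minimal Python sketch, assuming the unicodedata module is an acceptable
stand-in for that database (map_width is a hypothetical name):

```python
import unicodedata


def map_width(s):
    """Replace fullwidth/halfwidth characters by their decomposition mappings."""
    out = []
    for ch in s:
        decomp = unicodedata.decomposition(ch)  # e.g. "<wide> 0045"
        if decomp.startswith(("<wide>", "<narrow>")):
            # Drop the tag and rebuild the string from the hex code points.
            ch = "".join(chr(int(cp, 16)) for cp in decomp.split()[1:])
        out.append(ch)
    return "".join(out)


# Fullwidth "E" and "x" followed by ordinary ASCII.
print(map_width("\uff25\uff58ample.com"))
```

A client without access to these tables cannot perform this step at all,
whereas ASCII-only lowercasing can be approximated by hand.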

 >       Implementation Note: The foregoing order is different from the
 >       order for localparts and resourceparts as described below, to
 >       maintain consistency with the IDNA methods in both [RFC5892] and
 >       [RFC5895].
 > 
 >    After any and all normalization, conversion, and mapping of code
 >    points, 

as well as conversion to UTF-8.

 >    a domainpart MUST NOT be zero octets in length and MUST NOT
 >    be more than 1023 octets in length.  (Naturally, the length limits of
 >    [RFC1034] apply, and nothing in this document is to be interpreted as
 >    overriding those more fundamental limits.)
 > 
 > 3.3.  Localpart
 > 
 >    The localpart of a JID is an optional identifier placed before the
 >    domainpart and separated from the latter by the '@' character.
 >    Typically a localpart uniquely identifies the entity requesting and
 >    using network access provided by a server (i.e., a local account),
 >    although it can also represent other kinds of entities (e.g., a chat
 >    room associated with a multi-user chat service [XEP-0045]).  The
 >    entity represented by an XMPP localpart is addressed within the
 >    context of a specific domain (i.e., <localpart@domainpart>).
 > 
 >    A localpart MUST NOT be zero octets in length and MUST NOT be more
 >    than 1023 octets in length.  This rule is to be enforced after any
 >    normalization and mapping of code points.

and conversion to UTF-8.

 >    A localpart MUST consist only of Unicode code points that conform to
 >    the "JIDlocalIdentifierClass" profile of the "IdentifierClass" base
 >    string class defined in [I-D.ietf-precis-framework].  The
 >    JIDlocalIdentifierClass profile includes all code points allowed by
 >    the IdentifierClass base class, with the exception of the following
 >    characters that are explicitly disallowed in XMPP localparts:

(special precis focus)
I would have expected this to be phrased more similarly to step 2 of
http://tools.ietf.org/html/draft-ietf-precis-framework-17#section-5, or
for section 5 to just have a step about codepoints forbidden in a given
usage of the selected precis class.

 >    The normalization and mapping rules for the JIDlocalIdentifierClass
 >    are as follows, where the operations specified MUST be completed in
 >    the order shown:

Again, I think we need language about when these rules are applied.  The
rest of the section is about what is allowed, not about how to compare.

 >    1.  Fullwidth and halfwidth characters MUST be mapped to their
 >        decomposition mappings.
 > 
 >    2.  Uppercase and titlecase characters MUST be mapped to their
 >        lowercase equivalents, preferably using Unicode Default Case
 >        Folding as defined in Chapter 3 of the Unicode Standard
 >        [UNICODE].

Nothing about SpecialCasing?
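
As an illustration of why the choice matters, here is a short Python
sketch.  It assumes that Python's str.lower() applies full lowercase
mappings (including the unconditional entries from SpecialCasing.txt) and
that str.casefold() applies full default case folding; other environments
may behave differently.

```python
sharp_s = "\u00df"     # LATIN SMALL LETTER SHARP S
dotted_i = "\u0130"    # LATIN CAPITAL LETTER I WITH DOT ABOVE

# Lowercasing leaves sharp s alone, while case folding expands it to "ss",
# so the two operations disagree on whether these two strings match.
lower_match = ("stra" + sharp_s + "e").lower() == "STRASSE".lower()
fold_match = ("stra" + sharp_s + "e").casefold() == "STRASSE".casefold()
print(lower_match, fold_match)

# One code point can lowercase to two: U+0130 becomes "i" plus U+0307.
lowered_i = dotted_i.lower()
print([hex(ord(c)) for c in lowered_i])
```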

 >    A resourcepart MUST NOT be zero octets in length and MUST NOT be more
 >    than 1023 octets in length.  This rule is to be enforced after any
 >    normalization and mapping of code points.
 > 
 >    A resourcepart MUST consist only of Unicode code points that conform
 >    to the "JIDresourceFreeformClass" profile of the "FreeformClass" base
 >    string class defined in [I-D.ietf-precis-framework].
 > 
 >    The normalization and mapping rules for the resourcepart of a JID are
 >    as follows, where the operations specified MUST be completed in the
 >    order shown:

Again, when are the rules applied?

 >    1.  Fullwidth and halfwidth characters MAY be mapped to their
 >        decomposition mappings.

(precis)
I need a hint as to when to do this.  "MAY" isn't nearly enough.

 >    2.  Map any instances of non-ASCII space to ASCII space (U+0020).

(precis)
I was hoping either the framework doc or the mappings doc would tell me
more about which characters to map here.  RFC 3454 had table C.1.2, but I
don't see any hints about what I'm supposed to do now.  Is the rule "has a
compatibility mapping to U+0020"?  That doesn't hit U+200B, which is in
C.1.2, and neither does "has category Zs".  draft-ietf-precis-mappings says
"Therefore, the special mapping table should be based on a well-defined
mapping table for each protocol", which I don't particularly like but can
live with.  Either way, we need the table here.
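
A quick way to compare the candidate rules against the old table is
sketched below in Python.  The C_1_2 list is my hand transcription of
RFC 3454 table C.1.2 and should be checked against the RFC; the results
also depend on the Unicode version behind the unicodedata module.

```python
import unicodedata

# Table C.1.2 of RFC 3454 (non-ASCII space characters), transcribed by hand.
C_1_2 = [0x00A0, 0x1680, *range(0x2000, 0x200C), 0x202F, 0x205F, 0x3000]


def is_zs(cp):
    """Candidate rule: General_Category is Zs (space separator)."""
    return unicodedata.category(chr(cp)) == "Zs"


def compat_maps_to_space(cp):
    """Candidate rule: compatibility normalization yields exactly U+0020."""
    return unicodedata.normalize("NFKC", chr(cp)) == " "


missed_by_zs = [f"U+{cp:04X}" for cp in C_1_2 if not is_zs(cp)]
missed_by_compat = [f"U+{cp:04X}" for cp in C_1_2 if not compat_maps_to_space(cp)]
print("not Zs:", missed_by_zs)
print("no compat mapping to space:", missed_by_compat)
```

Whatever rule is chosen, listing the resulting code points explicitly
would remove the guesswork.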

 >    3.  So-called additional mappings MAY be applied, such as mapping of
 >        characters that are similar to common delimiters (such as '@',
 >        ':', '/', '+', '-', and '.', e.g., mapping of IDEOGRAPHIC FULL
 >        STOP (U+3002) to FULL STOP (U+002E)) and special handling of
 >        certain characters or classes of characters (e.g., mapping of
 >        non-ASCII spaces to ASCII space); the PRECIS mappings document
 >        [I-D.ietf-precis-mappings] describes such mappings in more
 >        detail.
 > 
 >    4.  Uppercase and titlecase characters MAY be mapped to their
 >        lowercase equivalents, preferably using Unicode Default Case
 >        Folding as defined in Chapter 3 of the Unicode Standard
 >        [UNICODE].

Again, I need more about the MAY here.

 > 6.  IANA Considerations
 > 
 >    The following completed templates provide the information necessary
 >    for the IANA to add 'JIDlocalIdentifierClass' and
 >    'JIDresourceFreeformClass' to the PRECIS Profiles Registry.

Should we also ask them to change the status of nodeprep and resourceprep
to deprecated in the stringprep profiles registry?

 > Appendix A.  Differences from RFC 6122
 > 
 >    Based on consensus derived from working group discussion,
 >    implementation and deployment experience, and formal interoperability
 >    testing, the following substantive modifications were made from RFC
 >    6122.

I think it might be nice to point out that this may have made
previously-valid JIDs no longer valid (or vice-versa), and that we suggest
careful testing before migrating user data.


-- 
Joe Hildebrand