Re: [xmpp] draft-ietf-xmpp-address

Peter Saint-Andre <stpeter@stpeter.im> Wed, 07 July 2010 23:27 UTC

Message-ID: <4C350D6F.1040503@stpeter.im>
Date: Wed, 07 Jul 2010 17:27:43 -0600
From: Peter Saint-Andre <stpeter@stpeter.im>
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.9) Gecko/20100317 Thunderbird/3.0.4
MIME-Version: 1.0
To: xmpp@ietf.org
References: <B27ACCD0-2352-4068-9358-4FDA38E273E5@nostrum.com> <alpine.CYG.2.00.1006281708470.4220@mhgneonet> <4C33CB96.8080209@stpeter.im> <4C33D36D.7000707@stpeter.im> <alpine.CYG.2.00.1007071238080.5224@mhgneonet> <4C34F2A1.6070604@stpeter.im>
In-Reply-To: <4C34F2A1.6070604@stpeter.im>
X-Enigmail-Version: 1.0.1
OpenPGP: url=http://www.saint-andre.com/me/stpeter.asc
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Subject: Re: [xmpp] draft-ietf-xmpp-address

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 7/7/10 3:33 PM, Peter Saint-Andre wrote:
> On 7/7/10 3:00 PM, Bruce Campbell wrote:
> 
>> On Tue, 6 Jul 2010, Peter Saint-Andre wrote:
> 
>>> On 7/6/10 6:34 PM, Peter Saint-Andre wrote:
>>>> On 6/29/10 11:47 AM, Bruce Campbell wrote:
>>>>
>>>>> Alternatively, the ABNF for the JID address draft could be brought into
>>>>> alignment with 3986 by specifying:
>>>>
>>>>>  2.1.
>>>>>   domain = fqdn / IPv4address / IP-literal
>>>>>            ; the "IPv4address" and "IP-literal" rules are
>>>>>            ; defined in RFC3986.
>>>>
>>>>> which would result in domainpart allowing a bare IPv4 address
>>>>> (foo@1.2.3.4/bar), and IPv6 and other literal addresses needing to be
>>>>> enclosed in '[]' as per RFC 3986.
>>>>
>>>> Although I have never seen a domainpart consisting of an IPv6 address
>>>> (and have rarely seen one consisting of an IPv4 address), I think it
>>>> would be best to be consistent with RFC 3986 on this point.
>>>
>>> Looking at this more closely, I wonder if we could re-use the ihost
>>> definition from RFC 3987. That would yield:
>>>
>>>   jid            = [ localpart "@" ] ihost [ "/" resource ]
>>>                    ; the "ihost" rule is defined in RFC 3987
>>>
>>> where:
>>>
>>>   ihost          = IP-literal / IPv4address / ireg-name
>>>   ireg-name      = *( iunreserved / pct-encoded / sub-delims )
>>>   iunreserved    = ALPHA / DIGIT / "-" / "." / "_" / "~" / ucschar
>>>   ucschar        = %xA0-D7FF / %xF900-FDCF / %xFDF0-FFEF
>>>                  / %x10000-1FFFD / %x20000-2FFFD / %x30000-3FFFD
>>>                  / %x40000-4FFFD / %x50000-5FFFD / %x60000-6FFFD
>>>                  / %x70000-7FFFD / %x80000-8FFFD / %x90000-9FFFD
>>>                  / %xA0000-AFFFD / %xB0000-BFFFD / %xC0000-CFFFD
>>>                  / %xD0000-DFFFD / %xE1000-EFFFD
>>>
>>> and where the "IP-literal" and "IPv4address" rules are as in RFC 3986.
>>>
>>> This is a step in the right direction, because it appears (!) that the
>>> ABNF in RFC 3920 doesn't allow anything but letter-digit-hyphen (i.e.,
>>> nothing outside the US-ASCII range).
>>>
>>> However, ireg-name isn't quite right either because some of its aspects
>>> (pct-encoded and sub-delims) are included for compatibility with URI
>>> syntax, and we don't need those in native JIDs.
>>>
>>> Thus we could do this:
>>>
>>>   jid            = [ localpart "@" ] domainpart [ "/" resourcepart ]
> 
>> Do we need a comment about implementations identifying the localpart,
>> domainpart and resourcepart portions before applying any
>> transformations? A quick glance shows %x2215 (division slash) just
>> lurking there for the implementer who applies ToASCII to the domainpart
>> before picking out the resourcepart.
> 
> Good catch.

In fact U+2215 DIVISION SLASH does not decompose to U+002F SOLIDUS,
although U+FE6B SMALL COMMERCIAL AT does decompose to U+0040 COMMERCIAL AT.
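
For anyone who wants to check this locally, here is a minimal sketch using
Python's standard unicodedata module. It assumes NFKC as a stand-in for the
compatibility-normalization step of stringprep (the real profiles do more
than that), and the helper name nfkc_maps_to is hypothetical. The two
fullwidth code points are my own additions for comparison, not something
raised in this thread.

```python
import unicodedata


def nfkc_maps_to(ch: str, target: str) -> bool:
    """Return True if NFKC normalization turns `ch` into exactly `target`."""
    return unicodedata.normalize("NFKC", ch) == target


# U+2215 and U+FE6B are the code points discussed above; U+FF0F and
# U+FF20 (fullwidth forms) are extra examples added for comparison.
for cp in (0x2215, 0xFE6B, 0xFF0F, 0xFF20):
    ch = chr(cp)
    print(
        f"U+{cp:04X} {unicodedata.name(ch)}: "
        f"decomposition={unicodedata.decomposition(ch)!r}, "
        f"NFKC={unicodedata.normalize('NFKC', ch)!r}"
    )
```

An empty decomposition string means the Unicode Character Database lists no
decomposition mapping for that code point.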

I've added this text:

###

      Implementation Note: When dividing a JID into its component parts,
      an implementation needs to match the separator characters '@' and
      '/' before applying any transformation algorithms, which might
      decompose certain Unicode code points to the separator characters
      (e.g., U+FE6B SMALL COMMERCIAL AT might decompose into U+0040
      COMMERCIAL AT).

###
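
To illustrate the note, here is a rough sketch in Python of splitting first
and transforming afterwards. Everything named here is an assumption for
illustration: split_jid and JidParts are hypothetical names, NFKC stands in
for the actual Nodeprep/Nameprep/Resourceprep processing, and the rule of
cutting at the first '/' and then at the first '@' of what precedes it is
my reading of how the separators should be matched, not text from the draft.

```python
import unicodedata
from typing import NamedTuple, Optional


class JidParts(NamedTuple):
    localpart: Optional[str]
    domainpart: str
    resourcepart: Optional[str]


def split_jid(jid: str) -> JidParts:
    """Split on the literal separators only; no normalization happens here."""
    # Everything after the first '/' is the resourcepart, which may itself
    # contain further '/' or '@' characters.
    head, slash, resource = jid.partition("/")
    # Within what precedes the '/', the first '@' ends the localpart.
    local, at, domain = head.partition("@")
    if not at:
        domain = head
    return JidParts(local if at else None, domain, resource if slash else None)


def split_then_normalize(jid: str) -> JidParts:
    """Match separators on the raw string, then transform each part."""
    parts = split_jid(jid)
    return JidParts(*(
        unicodedata.normalize("NFKC", p) if p is not None else None
        for p in parts
    ))


raw = "us\uFE6Ber@example.com/home"
# Separators matched first: the domainpart stays "example.com".
print(split_jid(raw))
# Transforming first manufactures a second '@' and shifts the boundary.
print(split_jid(unicodedata.normalize("NFKC", raw)))
```

In the second call the domainpart comes out as "er@example.com", which is
the failure the note is meant to prevent. A real implementation would then
reject the localpart once the profile is applied, since '@' is not allowed
there.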

>>>   domainpart     = IP-literal / IPv4address / ifqdn
>>>   ifqdn          = 1*( iunreserved )
>>>                    ; the "iunreserved" rule is defined in RFC 3987
>>>
>>> I'm not an ABNF guru, so feedback is welcome.
> 
>> Apart from personally disliking '~' and '_' in the domainpart
> 
> Yeah I don't like "_" either, and I've never seen "~"...
> 
>> , WFM.
> 
> Upon further reflection I'm still not sure that ifqdn works, because
> iunreserved excludes characters that are disallowed in URIs but not
> necessarily in internationalized domain names. I need to double-check.

Given that the Nameprep spec doesn't include ABNF, I'm inclined to punt
on this just as we have for nodeprep and resourceprep, by doing this:

      jid           = [ localpart "@" ] domainpart [ "/" resourcepart ]
      localpart     = 1*(nodepoint)
                      ; a "nodepoint" is a UTF-8 encoded Unicode code
                      ; point that satisfies the Nodeprep profile of
                      ; stringprep
      domainpart    = IP-literal / IPv4address / ifqdn
                      ; the "IPv4address" and "IP-literal" rules are
                      ; defined in RFC 3986, and the first-match-wins
                      ; (a.k.a. "greedy") algorithm described in RFC
                      ; 3986 applies to the matching process
      ifqdn         = 1*(namepoint)
                      ; a "namepoint" is a UTF-8 encoded Unicode
                      ; code point that satisfies the Nameprep
                      ; profile of stringprep
      resourcepart  = 1*(resourcepoint)
                      ; a "resourcepoint" is a UTF-8 encoded Unicode
                      ; code point that satisfies the Resourceprep
                      ; profile of stringprep
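
As a sanity check on the first-match-wins ordering for domainpart, here is
a small Python sketch. It is only an approximation under stated
assumptions: classify_domainpart is a hypothetical name, the IPv4 pattern
is my transcription of the dec-octet rule from RFC 3986, IPv6 parsing is
delegated to the standard ipaddress module (which may not agree with the
RFC 3986 grammar in every corner case), IPvFuture is not handled, and no
Nameprep check is applied to the ifqdn case.

```python
import ipaddress
import re

# dec-octet per RFC 3986: 0-255 with no leading zeros.
_DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
_IPV4 = re.compile(rf"{_DEC_OCTET}(?:\.{_DEC_OCTET}){{3}}")


def classify_domainpart(domainpart: str) -> str:
    """Return which alternative of the domainpart rule matches first."""
    # 1. IP-literal: bracketed address (IPv6 only in this sketch).
    if domainpart.startswith("[") and domainpart.endswith("]"):
        inner = domainpart[1:-1]
        if "%" in inner:
            raise ValueError("zone identifiers are not handled in this sketch")
        ipaddress.IPv6Address(inner)  # raises ValueError if malformed
        return "IP-literal"
    # 2. IPv4address: four dec-octets separated by dots.
    if _IPV4.fullmatch(domainpart):
        return "IPv4address"
    # 3. Anything else non-empty falls through to ifqdn; a real
    #    implementation would apply Nameprep here.
    if domainpart:
        return "ifqdn"
    raise ValueError("domainpart must not be empty")


for candidate in ("[::1]", "1.2.3.4", "1.2.3.256", "example.com"):
    print(candidate, "->", classify_domainpart(candidate))
```

Note that "1.2.3.256" does not match IPv4address and so falls through to
ifqdn, which is what first-match-wins implies for strings that merely look
like addresses.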

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkw1DW8ACgkQNL8k5A2w/vy9ewCfbMvZmpmmTb5mDT1fGcDYkMRC
v90An03lmofMfrYN2hP7ZpUh3DXSsRGK
=bQY4
-----END PGP SIGNATURE-----