Re: [xmpp] Unicode Version Interop Concerns in JIDs
Waqas Hussain <waqas20@gmail.com> Mon, 23 September 2019 20:11 UTC
References: <dbbb91ba-9116-50f7-fefa-09ef2bd5991d@ik.nu>
In-Reply-To: <dbbb91ba-9116-50f7-fefa-09ef2bd5991d@ik.nu>
From: Waqas Hussain <waqas20@gmail.com>
Date: Mon, 23 Sep 2019 16:11:25 -0400
Message-ID: <CALm9TZ8zba_ubSX=WvOSqFib_MMYR5P4jp+_4wc5DeRCU09Q7w@mail.gmail.com>
To: Ralph Meijer <ralphm@ik.nu>
Cc: XMPP Working Group <xmpp@ietf.org>, Alexey Melnikov <aamelnikov@fastmail.fm>, Barry Leiba <barryleiba@computer.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/xmpp/HAGTHbisrRnySzdqaOH-tEQFxgM>
Subject: Re: [xmpp] Unicode Version Interop Concerns in JIDs

On Tue, Sep 10, 2019 at 10:39 AM Ralph Meijer <ralphm@ik.nu> wrote:

> Hi,
>
> Recently, there's been a discussion in the XSF Discussion room [1] about
> interop issues in the face of different Unicode versions used for
> processing XMPP Addresses, or JIDs. That particular discussion was
> mostly focused on nicknames in Multi-User Chat (MUC) rooms, which are
> encoded in the resourcepart of a JID, but it is also a concern for other
> address handling. As I suggested giving this topic a wider audience, I
> write on behalf of those involved in the initial discussion.
>
> Since RFC 6122 was obsoleted by RFC 7622 [2], both titled "XMPP:
> Address Format", resourceprep (which was fixed to Unicode 3.2) has been
> replaced by PRECIS processing, as discussed in section 3.4. This in turn
> means that the resourcepart is an OpaqueString profile of the PRECIS
> FreeformClass as defined in RFC 7613 [3], section 4.2 and RFC 7564 [4],
> section 4.3 respectively. The idea is that in the face of newer Unicode
> versions, applications can make use of the new code points therein.
>
> RFC 7622 has extensive text on JID handling, but there is uncertainty
> over when servers, services like MUC, and clients should be liberal or
> strict when checking JIDs. Different implementations perform their
> processing based on differing versions of Unicode, implementations have
> install bases still depending on older versions of the software and thus
> the Unicode version they check against, and finally, there are
> implementations and deployments performing the obsoleted stringprep.
>
> A particular example is the following. Say a MUC service (including its
> server-to-server (s2s) handling) checks against Unicode version 12. One
> user, with a client and their server checking against Unicode >=9,
> chooses to use the nickname 'I♥🥓' (I love bacon). The MUC service
> assumes everything is fine, and the occupant JID becomes
> room@muc.server.example/I♥🥓. Both the BLACK HEART SUIT (U+2665) and
> BACON (U+1F953) are in the Symbols, Other (So) category, and thus valid
> for FreeformClass.
>
> Now another user comes along, using a server that supports Unicode 6.3.
> Since BACON wasn't defined before Unicode 9, its code point is
> unassigned. When receiving presence from the other user, what should the
> receiving server do?
>
> a) It is liberal in what it accepts from other servers, it passes
>    incoming remote stanzas on to the client.
>
> b) It is strict, and sends back a <jid-malformed/>, which likely boots
>    the recipient from the room.
>
> c) In case a), if it wants to use private messaging towards the
>    occupant JID, their own server might reject this with a similar
>    <jid-malformed/> error.
>
> The above is just an example. MIX [5] refers to RFC 7700 [6], obsoleted
> by RFC 8266 [7], for preparing nicknames, which in turn also depends on
> FreeformClass, and thus exhibits similar concerns, but not on the
> routing level.
>
> Basically the question comes down to: how do we robustly handle
> different Unicode versions in clients, services, and servers?
>
> [1] <xmpp:xsf@muc.xmpp.org>
> [2] RFC 7622: XMPP: Address Format
>     <https://tools.ietf.org/html/rfc7622>
> [3] RFC 7613: PRECIS Representing Usernames and Passwords
>     <https://tools.ietf.org/html/rfc7613>
> [4] RFC 7564: PRECIS in Application Protocols
>     <https://tools.ietf.org/html/rfc7564>
> [5] XEP-0369: Mediated Information eXchange (MIX)
>     <https://xmpp.org/extensions/xep-0369.html>
> [6] <https://tools.ietf.org/html/rfc7700>
> [7] RFC 8266: PRECIS Representing Nicknames
>     <https://tools.ietf.org/html/rfc8266>
>
> --
> ralphm
>
> _______________________________________________
> xmpp mailing list
> xmpp@ietf.org
> https://www.ietf.org/mailman/listinfo/xmpp

There's an old thread on this from 2011 on the IETF list. I don't believe
the core compatibility issue ever got resolved. See this message and the
connected thread:
https://mailarchive.ietf.org/arch/msg/xmpp/A2PT_EpDpR1swKbHC_te0NWwRgs

We also lack any form of advertisement of the supported Unicode version by
a remote entity, which is unfortunate. A stream feature and a caps hash
may be useful.

--
Waqas
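
For illustration only, here is a minimal Python sketch of the nickname example from the quoted message. It rests on several assumptions: it uses Python's standard `unicodedata` module, it substitutes the bundled Unicode 3.2 database (`unicodedata.ucd_3_2_0`, the version stringprep was fixed to) for the "older" peer because the standard library ships no Unicode 6.3 tables, and the helper names `unassigned_code_points` and `check_resourcepart` are hypothetical. It is not a PRECIS implementation; it only shows how the same resourcepart can look valid under one Unicode version and contain an unassigned code point under another.

```python
import unicodedata

# The nickname 'I♥🥓' from the example: U+0049, U+2665, U+1F953.
NICKNAME = "I\u2665\U0001F953"


def unassigned_code_points(text, ucd):
    """Return the code points in `text` that `ucd` reports as unassigned (Cn)."""
    return [f"U+{ord(ch):04X}" for ch in text if ucd.category(ch) == "Cn"]


def check_resourcepart(text, ucd, strict):
    """Hypothetical policy helper, not part of any XMPP library.

    A strict receiver rejects a resourcepart that contains code points its
    Unicode database does not know; a liberal receiver passes it on.
    """
    if strict and unassigned_code_points(text, ucd):
        return "jid-malformed"
    return "accept"


current = unicodedata           # whatever Unicode version this Python ships
legacy = unicodedata.ucd_3_2_0  # Unicode 3.2 database bundled with Python

for ucd in (current, legacy):
    print(ucd.unidata_version, unassigned_code_points(NICKNAME, ucd))

print(check_resourcepart(NICKNAME, legacy, strict=True))
print(check_resourcepart(NICKNAME, legacy, strict=False))
```

On any Python whose Unicode database is version 9 or later, the current database knows BACON, while the 3.2 database treats U+1F953 as unassigned, so the strict check returns `jid-malformed` and the liberal one returns `accept`. Note that neither side can tell in advance which outcome the other will produce, which is where advertising the supported Unicode version would help.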