Re: [EAI] Unicode's PRI #185 & EAI: should the UBA be revised to handle bidi email addresses better too?

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Fri, 29 July 2011 10:36 UTC

Message-ID: <4E328CFC.2010502@it.aoyama.ac.jp>
Date: Fri, 29 Jul 2011 19:35:40 +0900
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.9) Gecko/20100722 Eudora/3.0.4
MIME-Version: 1.0
To: "Aharon (Vladimir) Lanin" <aharon@google.com>
References: <CA+FsOYasAd7w21vpX3gv0ZiHsmkiVCKfSa9hz+98THQGshmnsA@mail.gmail.com> <4E1EEFFA.1080007@gulbrandsen.priv.no> <CA+FsOYaFhXhD4LW4cmfZ3nz4AhEiBJ+E_TEHcE_rBetWpa_N1A@mail.gmail.com> <C480FC7B47C5BC56FF265009@PST.JCK.COM> <CA+FsOYY0ghirGe3ahrqM6DuXw51morONF6By0OfZ48f16KgvFA@mail.gmail.com>
In-Reply-To: <CA+FsOYY0ghirGe3ahrqM6DuXw51morONF6By0OfZ48f16KgvFA@mail.gmail.com>
Content-Type: text/plain; charset="UTF-8"; format="flowed"
Content-Transfer-Encoding: 7bit
Cc: Arnt Gulbrandsen <arnt@gulbrandsen.priv.no>, ima@ietf.org
Subject: Re: [EAI] Unicode's PRI #185 & EAI: should the UBA be revised to handle bidi email addresses better too?
Precedence: list

[This is a copy of a comment that I submitted via the Unicode Review 
Form and posted on the member-only Unicode mailing list. I'm sending it 
here to have a public record, because this is the mailing list where most 
of the discussion about this draft in the IETF happened, as far as I'm 
aware.]


Context
=======

I'm an individual Unicode member, but I'll paste this into the 
reporting form because that's easier. Please make a 'document' out of it 
(or more than one, if that helps to better address the issues raised 
here). I apologize for being late with my comments.


Substantive Comments
====================

On substance, although I don't agree with every detail of what Jonathan 
Rosenne, Behdad Esfahbod, Aharon Lanin and others have said, I agree with 
them in general. If their documents/messages have not been properly 
submitted, I include them herewith by reference.

The proposal is an enormous change in the Bidi algorithm, changing its 
nature in huge ways. Whatever the details eventually may look like, it 
won't be possible to get everything right in one step, and probably 
countless tweaks will follow (not that they necessarily will make things 
better, though). Also, dealing with IRIs will increase the 
appetite/pressure for dealing with various other syntactical constructs 
in texts.

The introduction of the new algorithm will create numerous compatibility 
issues (and attack surfaces for phishing, the main thing the proposal 
tries to address) for a long period of time. Given that the Unicode 
Consortium has been working hard to address (compared to this issue) 
even extremely minor compatibility issues re. IDNs in TR46, it's 
difficult for me to see how this fits together.


Taking One Step Back
====================

As one of the first people involved with what's now called IDNs and 
IRIs, I know that the problem of such Bidi identifiers is extremely 
hard. The IETF, as the standards organization responsible for 
(Internationalized) Domain Names and for URIs/IRIs, has taken some steps 
to address it (there's a Bidi section in RFC 3987 
(http://tools.ietf.org/html/rfc3987#section-4), and for IDNs, there is 
http://tools.ietf.org/html/rfc5893).

I don't think these are necessarily sufficient. And I don't 
think that the proposal at hand is completely useless. However, the 
proposal touches many aspects (e.g. recognizing IRIs in plain text, ...) 
that are far better suited to definition in another standards 
organization, or where high-bandwidth coordination with such an 
organization is crucial (roughly speaking, first on the feasibility of 
various approaches, then on how to split up the work between the 
relevant organizations, then on coordination of the details). Without 
such a step back and high-bandwidth coordination, there is a strong 
chance of producing something highly suboptimal.

(Side comment on a detail: It would be better for the document to use 
something like
http://tools.ietf.org/html/rfc3987#section-2.2 rather than the totally 
obscure and no longer maintained 
http://rfc-ref.org/RFC-TEXTS/3987/chapter2.html, in the same way the 
Unicode Consortium would probably prefer to have its own Web site 
referenced for its work rather than some third-party Web site.)


Taking Another Step Back
========================

I mentioned 'high-bandwidth' above. The Unicode "Public Review" process is 
definitely not suited for this. It has various problems:
- The announcements are often very short, formalistic, and cryptic.
   (I can dig up examples if needed.)
- The announcements go to a single list; forwarding them to other
   relevant places is mostly a matter of chance. This should be improved
   by identifying the relevant parties and contacting them directly.
- To find the Web form, one has to traverse several links.
- The submission is via a Web form, without any confirmation that the
   comment has been received.
- The space for comments on the form is very small.
- There is no way to make a comment public (except for publishing it
   separately).
- There is no official response to a comment submitted to the Web form.
   One finds out about what happened by chance or not at all.
   (Compare the W3C process, where WGs are required to address each
    comment formally, and most comments, including the responses, are
    public.)
- The turnaround is slow. Decisions get made (or postponed) at UTC
   meetings only.
Overall, from an outsider's point of view, the review process and the 
review form feel like the eye of a needle connected to a black hole.

[I very much understand that part of the reason the UTC works the way it 
works is because of its collaboration with ISO/IEC committees. And I 
don't think any other standards organization has a perfect process. But 
what's appropriate for one part of the UTC's work may not be appropriate 
for other parts of its work (such as the matter at hand).]



Conclusion
==========

I herewith very strongly recommend that the UTC, besides using the 
upcoming meeting to advance discussion on the technical issues that the 
proposal raises,
a) Postpone the decision to adopt any of the proposed changes, 
independent of details, until such time as point b) is implemented and 
executed.
b) Swiftly take the necessary steps for a much better, high-bandwidth 
coordination of this topic and the various issues it encompasses, both 
using the existing liaison mechanism and using public discussion on an 
appropriate forum (e.g. one of the relevant IETF mailing lists 
(idna/eai/iri)).
c) Seriously work on improving the process for soliciting and addressing 
comments from the public and relevant stakeholders.


Regards,    Martin.
