Re: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)

John C Klensin <john-ietf@jck.com> Sat, 02 June 2018 14:16 UTC

Date: Sat, 02 Jun 2018 10:15:51 -0400
From: John C Klensin <john-ietf@jck.com>
To: Nico Williams <nico@cryptonector.com>
cc: IETF general list <ietf@ietf.org>, art-ads@ietf.org
Subject: Re: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/ocGCzbdVpIlXL7O9w9HzlD0GxbA>

--On Friday, June 1, 2018 19:09 -0500 Nico Williams
<nico@cryptonector.com> wrote:

>...
> Please be specific.  What failures have we had.
>...

>> * Mentions of UTR#46 in general I18N contexts may be close to
> 
> Finally.  900 words and this is the one specific thing, but
> not that specific either:
>...
>  Is this an elliptical way of alluding to confusables issues?

No.  "Confusables" are, IMO, at most a symptom.  See below for a
bit more.

>...
 
> Can you boil things down to a small set of bullet items?
> 
>  - IDNs and confusables?
> 
>  - I18N and human rights?
> 
>    (what's that got to do with Internet protocols?  sure we
>    have to make sure they internationalize, but we already
>    knew that long before anyone made the connection to human
>    rights)

And that is consistent with, and complementary to, what I said
in my note to Niels.

>  - something else?  what?
> 
> Please also list any high-profile failures of the IETF in this
> space.
> 
> Keep it brief out of consideration for your readers' time.
>...

I'm sorry.  You are asking for details and asking that I/we be
brief.  If we are brief, you denounce us for being non-specific.
It is hard to satisfy contradictory requests.

   ----

More generally, I'm sorry but I had assumed, given your
extensive comments and claim of having developed expertise, that
you had been following these activities.  Without exception,
every item on the list below has been identified on at least the
most relevant one of the IAB i18n list (i18n-discuss@iab.org),
the former IDNABIS, EAI, or PRECIS lists, or this (IETF discuss)
list.  Many, if not most, have had attention drawn to them on
multiple lists in the hope of alerting people who are not
following all of them.  There have also been extensive
discussions on several of these issues, including lack of
progress, in the ICANN SSAC, Universal Acceptance, ccTLD/CCNSO,
and other contexts, and at IGF and other more political forums,
but those are not IETF venues.   

If you, as an expert, have been unaware of those issues and
discussions and there are a significant number of other experts
like you, then the hypothesis that we lack sufficient expertise
may be false and should be replaced by a hypothesis that we have
not been successful in getting a sufficient number of the
available experts to engage.  That would probably not change the
need to do something, but should be a much easier problem to
solve.

In any event, and with the understanding that this is almost
certainly not a complete list, we've had:

* Identification of an incorrect assumption, made in both the
IDNA2003 and IDNA2008 designs, about the scope and effectiveness
of normalization in Unicode, creating security risks as well as
heightened risk for user-perceived errors in comparisons.  That
issue was initially identified differently, with the original
identification turning out to be a subset or symptom of the
larger one.   The problem generated two IAB statements, one of
which essentially acknowledges that no progress had been made, a
BOF, and a few iterations of an I-D which we have not been able
to progress.

* The freeze Patrik identified, which is, in some respects, just
another symptom of our inability to engage on the above issue.

* A draft clarifying the responsibilities of zone administrators
("registries") under IDNA2008 which we have been unable to
process.

* A draft proposing dealing with a number of issues by creating
a "troublesome characters" list, which we have not been able to
process, even to the extent of a serious discussion of whether
such a list under IETF auspices is desirable and can be
maintained.

* Difficulty bringing the EAI/SMTPUTF8 and PRECIS work to a
conclusion in which everyone has high confidence, due to the WGs
running out of energy and of active, informed, and contributing
participation.

* Difficulty getting adequate input and review for work being
done in LAMPS on non-ASCII characters in X.509 certificates.

* The "many IDNA standards" problem Patrik and I mentioned
earlier in the context of UTR#46.   Given the variety of
implementations and interpretations, it seems clear that the
IAB/IETF should be considering how to get the message out, but
it has not been possible to even start that discussion.  For
this issue and the next one, it is possible to claim that the
IETF's responsibility stops when a specification is put out
there and dealing with variations, alternate interpretations,
and plain non-conformance is someone else's problem.   I see
that as an abdication of responsibility, especially when IETF
may not have done a good enough job of explaining the reasons
for its decisions.  YMMD.

* Similarly, the EAI WG, after running experiments after its
first-round work, came to the conclusion that attempts to
convert local parts in transit (similar to IDNA), especially
using Punycode encoding, were risky, unlikely to work well in
many important cases, and just plain a bad idea.  Yet we've seen
heavily-promoted implementations that do just that, claiming
that it works well because their clients and users interact well
with their servers and mail stores.   We could be doing a better
job of explaining why that is a bad idea.

* As another element of the UTR#46-related problems, IDNA2008's
rule structure prohibits the use of symbols in domain name
labels.   We've seen some top-level registries violate that rule
and sell names containing some of those code points,
particularly emoji, and UTR#46 specifically authorized emoji.
Whether they do so or not, ICANN can put some pressure on
second-level registrations but, by design and their own
decisions, cannot say very much about zones further down the
tree.  Should IETF or IAB be following up some of the
explanations that have been made in ICANN or on assorted mailing
lists with a document explaining more broadly why the IDNA2008
rules are the way they are (including the rules for lookup-time
checking, which UTR#46 has been interpreted as discouraging) and
why people should be paying attention to them?  I don't know,
but I believe that the discussion has not occurred and that, if
work were done on developing such a description, there would be
no way to discuss and progress it.  That is a problem.
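
To make three of the items above concrete, here is a minimal
sketch in Python, using only the standard library.  It is an
illustration under stated assumptions, not the content of any of
the drafts mentioned: the U+08A1 pair is a commonly cited case
for the normalization item, the category set only approximates
the LetterDigits rule of RFC 5892 rather than the full IDNA2008
derivation, and punycode_local_part is a hypothetical stand-in
for the in-transit rewriting the EAI WG advised against.

```python
import unicodedata

# Approximation of the "LetterDigits" rule in RFC 5892, section 2.1.
# The real IDNA2008 derivation has further rules and exceptions.
LETTER_DIGIT_CATEGORIES = {"Ll", "Lu", "Lo", "Nd", "Lm", "Mn", "Mc"}


def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def is_letter_digit(ch: str) -> bool:
    """True if the code point's General_Category is a letter, digit, or mark."""
    return unicodedata.category(ch) in LETTER_DIGIT_CATEGORIES


def punycode_local_part(local_part: str) -> str:
    """Hypothetical in-transit rewrite of a mailbox local part.

    Shown only to make the transformation visible; the delivery host
    alone defines what a local part means, so the rewritten string
    names a mailbox that host may never have heard of.
    """
    return local_part.encode("punycode").decode("ascii")


# Canonically equivalent sequences compare equal after NFC:
# "e" + COMBINING ACUTE ACCENT versus precomposed U+00E9.
print(nfc_equal("e\u0301", "\u00e9"))

# BEH + HAMZA ABOVE versus U+08A1 (BEH WITH HAMZA ABOVE): the two
# render alike, but U+08A1 has no canonical decomposition.
print(nfc_equal("\u0628\u0654", "\u08a1"))

# A letter passes the category test; an emoji (category "So") does not.
print(is_letter_digit("a"), is_letter_digit("\U0001F600"))

# The rewritten local part is a different, ASCII-only string.
print(punycode_local_part("\u7528\u6237"))
```

The second comparison is the uncomfortable one: both strings
render as the same letter, yet NFC leaves them distinct, so a
comparison that relies on normalization alone treats them as
different labels.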

Again, not a complete list.   I suggest that each of those is a
significant problem and that the combination is more so, at
least unless one doesn't believe that addressing i18n usability,
security, and interoperability issues is important.  I also
suggest that many of those issues should be considered important
enough to justify the discussion proposed for this BOF even if
none of the others were out there, which I think is Patrik's
point.   YMMD.

Patrik has addressed your category list question.  I think he is
right although the more detailed list above should fill in some
of the blanks on the protocols side.   I've explained my
position on the human rights side.   And, btw, I think the
confusion issue is actually two separate sets of problems
depending on whether the confusion is deliberate or not.

I've read your comments about confusion.  In the interest of not
making this note even longer and because I don't want to stray
into trying to solve specific problems, I'm not going to respond
except to suggest that there is ample evidence that parts of
your analysis are just incorrect.

    best,
      john