Re: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)

John C Klensin <john-ietf@jck.com> Fri, 01 June 2018 21:14 UTC

Date: Fri, 01 Jun 2018 17:14:08 -0400
From: John C Klensin <john-ietf@jck.com>
To: IETF general list <ietf@ietf.org>
cc: art-ads@ietf.org
Subject: Re: Possible BofF question -- I18n (was: Re: Possible OBF question -- I18n)
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/927rN09vWx7XWwCiYItrUYt_kjk>

Hi.

First a status report.  

Rather than wait until the last minute, I went ahead and pasted
a template description into the BOF wiki page.  The entry there
is a slightly shortened and rewritten version of my initial note
in this thread.
https://trac.tools.ietf.org/bof/trac/wiki/WikiStart

I've listed Patrik (without asking him) and myself as
"proponents" for this, but would be happy to add names (or have
others do so).  Spencer, I assumed it would not be
appropriate to add you there, but you'd certainly be welcome if
I got that wrong.

Because I had to choose and because most of the i18n-specific
WGs have been in Apps/ART, I've put it into that area, but I
believe that the IESG needs to figure out how to handle this.  I
could make a case for General (on the theory that many areas are
covered or that procedural changes might be needed) or for
Security (because the most recent major i18n interactions that I
know about have been with a WG there), but, again, the IESG
should decide.

   --------

Now a few comments on recent postings (after I went to work on
the template).  

* I think it would be a just dandy idea to require an "I18N
Considerations" section in every I-D or RFC.  I see only two
problems with it.  One is that we've had such a requirement for
documents that do have i18n issues for 20+ years and it hasn't
helped.  We've also had suggestions for including a statement
about i18n issues in Shepherd templates.   If the IESG were to
decide to enforce that requirement by changing the I-D checker
and refusing to consider any document that doesn't have at least
a clear "no issues" statement, we might make some progress, but
no one has yet convinced them it would be appropriate (that has
been tried several times, but mostly in individual or
directorate discussions).  The other is that, while I rather
like the idea of some sort of formal process or review by
experts, we explored that some years ago too.  It requires
committed experts and the difficulty in finding those is part of
the problem, not an obvious solution.  Speaking as a likely
stuckee, if the idea doesn't have a clear structure and a
commitment from the IESG and the community that it will be taken
seriously rather than as miscellaneous review input (i.e., more
like MIB Doctors input in their area of expertise than notes
from whoever drew the short straw, regardless of expertise, on
the Foo Area Review Team), staffing that effort and keeping it
staffed is likely a non-starter.  So, even if we agreed that a
required I-D/RFC section would solve the problem, we would
still benefit from this BOF to start sorting out those details.

* Niels, having spent months, probably years, of my life trying
to explain to assorted well-intentioned people in the social
justice, human rights, and "getting more people on the Internet
in a way that is fair and appropriate to their cultures" spaces
why these are hard problems and that superficial solutions make
things worse, no, making this a human rights issue wouldn't
help.  I'm sorry to single out the documents to which you
pointed for special consideration, but RFC 8280 is an example of
a document that should have gotten serious i18n review and
apparently didn't.  The second question in Section 6.2.5 should
have set off alarms because character coding is only a tiny part
of the issues and not even the most important part.   Yes,
support for multiple "charsets" is a bad idea, but, because the
mappings from other CCSs and back are usually (but not always)
completely deterministic and reversible, a far less serious one
than a long series of other issues.  For a web-oriented
illustration of another set of important problems, I suggest
that you have a look at
https://w3c.github.io/typography/gap-analysis/language-matrix.html
.  I'm also very concerned that the last paragraph of that
section seems to argue for relaxing the "protocol identifier"
constraint while showing no evidence of understanding it beyond
what might be described as "multilingualism good".  If the Human
Rights Review Team were to start complaining about the absence
of Internationalization Consideration sections, I think that
would be positive, but the expertise needed to adequately
consider Human Rights issues and that needed to consider i18n
ones are very different, and RFC 8280 may actually be proof of
that.
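
To make the "deterministic and reversible" point about charsets
concrete, here is a minimal sketch in Python.  It is an illustration
only, not something taken from RFC 8280: the helper name, the sample
strings, and the three charsets are assumptions chosen for the example.
It checks that bytes in a legacy charset survive a trip into Unicode
and back unchanged.

```python
# Illustrative sketch: the helper name, samples, and charsets are assumptions.

def legacy_round_trip(data: bytes, charset: str) -> bool:
    """Return True if legacy bytes -> Unicode -> legacy bytes is lossless."""
    as_unicode = data.decode(charset)          # legacy CCS into Unicode
    return as_unicode.encode(charset) == data  # and back again


# Sample text per charset, encoded first to obtain "legacy" byte strings.
samples = {
    "iso-8859-1": "déjà vu",
    "koi8-r": "привет",
    "shift_jis": "こんにちは",
}

for charset, text in samples.items():
    legacy_bytes = text.encode(charset)
    print(charset, legacy_round_trip(legacy_bytes, charset))
```

A check like this covers only the character-coding step.  It says
nothing about the other issues mentioned above, which is why coding is
the less serious part of the problem.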

* Crypto expert mindsets -- I agree with John Levine on this.
FWIW, a few of us have made the observation that this i18n area
is easy to learn about for one class of people.  Not impossible
for others, but seemingly hard to grasp.   That class of people
consists of those who had strong exposure to at least three
languages, using at least two scripts (ideally ones that are not
closely related, i.e., Greek, Latin, and Cyrillic may only count
as one... or not), before the age of about seven (connected to
the Chomsky hypothesis about roughly the point after which most
people can no longer acquire languages and language intuition as
a child does).  As Peter points out, "competence" is easier, but one
still has to get fairly deep into the characteristics of several
unrelated (at least in the last three or four thousand years or
so) writing systems to reach that point.  And even that ignores
issues with spoken language, which much of the linguistics
community would claim is the only thing that is important.

* Mentions of UTR#46 in general I18N contexts may be close to
mentioning Hitler under Godwin's Law.  It is strictly about IDNs
and not the general problem; its current version makes
assumptions equivalent to domain names not needing to be treated
as having important properties of identifiers; etc.  As I said,
I don't think this BOF effort is about particular protocol
issues.  However, there is a non-protocol issue (and maybe the
point John Levine was trying to make): due in no small measure to
UTR#46 and the homebrew updates to IDNA2003 it has encouraged,
we have a large number (perhaps dozens) of interpretations of
what IDNA is "really" about and what the requirements "really"
are.  That is an operational and user predictability problem, a
security problem, etc.

"No urgency"?  Only if one wants to see the IDN situation
deteriorate to the point that systems that are concerned about
privacy, malware, etc., will simply filter out all of them as a
good tradeoff against the risks.

And, again, I see IDNs as one issue here among many, not the
only or even the key issue.

   john