Re: [I18nrp] Additional input needed for i18nRP BOF

Nico Williams <nico@cryptonector.com> Thu, 07 June 2018 03:18 UTC

Date: Wed, 6 Jun 2018 22:17:53 -0500
From: Nico Williams <nico@cryptonector.com>
To: Peter Saint-Andre <stpeter@mozilla.com>
Cc: Adam Roach <adam@nostrum.com>, John C Klensin <john-ietf@jck.com>, i18nrp@ietf.org
Message-ID: <20180607031752.GS14446@localhost>
References: <f997170c-3062-0241-e58d-7a3415fba983@nostrum.com> <CE6F76BB323F1555D6B217A5@PSB> <9ecf8b7a-d086-1c56-03fb-6773aed332c6@nostrum.com> <4DA478C4C99396556E1B3EF1@PSB> <a31e91ff-c78c-6a7c-fe8c-70b9563312f7@nostrum.com> <8774afa2-4d3f-bc08-69af-f88e229f547a@mozilla.com> <07356789-b93f-b1a2-21d6-bef704b7c0b0@nostrum.com> <a6b7bf5c-3f37-e97b-7e44-c9e648bdbcef@mozilla.com> <ba6339f3-eb5f-4d14-51fb-256d6682f37e@nostrum.com> <c6d2a8d7-301b-c017-34ac-44da954c0b46@mozilla.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <c6d2a8d7-301b-c017-34ac-44da954c0b46@mozilla.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18nrp/LGyqH_unHry64q242Lok1LV1E9k>
Subject: Re: [I18nrp] Additional input needed for i18nRP BOF

On Wed, Jun 06, 2018 at 05:27:05PM -0600, Peter Saint-Andre wrote:
> As stated in BCP 18:
> 
>    Internationalization is for humans. This means that protocols are not
>    subject to internationalization; text strings are.
> 
> If there are no human-readable text strings in a protocol, then no i18n
> review is needed. In the IETF, that significantly narrows the field. My
> sense is that few INT or RTG area specs, a somewhat higher proportion of
> OPS, SEC, and TSV area specs, and a preponderance of ART area specs
> would at least need a spot check to determine if review is required.

Almost.  Consider SSHv2...

SSHv2 carries UTF-8 strings in some of its messages, and supports
language negotiation for localization of messages to users.  This is
very nice and maybe was fairly forward-thinking at the time.
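For concreteness, here is a minimal sketch of how those language preferences look on the wire.  SSHv2 carries them in the key exchange init message as "name-lists" (RFC 4251: a uint32 length followed by comma-separated US-ASCII names).  The function names and the example tags are mine, purely for illustration; this is not taken from any particular SSH implementation.

```python
import struct


def encode_name_list(names):
    """Encode names as an SSH name-list: uint32 length + comma-separated ASCII."""
    for name in names:
        # RFC 4251: a name must be non-empty and must not contain a comma.
        if not name or "," in name:
            raise ValueError(f"invalid name: {name!r}")
    payload = ",".join(names).encode("ascii")
    return struct.pack(">I", len(payload)) + payload


def decode_name_list(data):
    """Decode a name-list found at the start of data; returns a list of names."""
    (length,) = struct.unpack(">I", data[:4])
    payload = data[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated name-list")
    return payload.decode("ascii").split(",") if payload else []


# Hypothetical preference order, most preferred first.
wire = encode_name_list(["es-AR", "es", "en"])
```

Note that RFC 4253 lets both sides ignore these lists entirely, so this is closer to advertisement than to negotiation.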

SSHv2 easily could have not carried any such strings -- at least if we
ignore password and interactive authentication methods, which I will for
the sake of the argument I'm about to make.

   [TL;DR?  My point is this: even a protocol which carries no strings,
    but which does carry session text data, or documents, might need to
    support language or locale negotiation.
    
    Related point: it's not just I18N we should think about, but also
    L10N (localization).  In the IETF we tend to include L10N in I18N,
    but elsewhere I've seen G11N (globalization) used as encompassing
    I18N + L10N.]

But the whole point of SSHv2 is to carry "shell session" user data,
including user _interactive_ session data, which means: text -- text
that the protocol carries as-is no matter the choice of codeset, so
there are no internationalization issues in SSHv2 regarding that text.  But
there are still _localization_ issues regarding that text: specifically,
negotiation of server-side locale for user sessions.

So even if SSHv2 had carried no strings in its various messages, it
would still have needed to support locale negotiation.

The protocol only supports language negotiation, but for the sessions it
establishes it kinda needed to support codeset negotiation as well, and
maybe other locale attributes.  Users make do by just sending LANG
and/or LC_* POSIX environment variables without any kind of negotiation,
hoping for the best.
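A rough sketch of that "send and hope" approach, assuming a client that simply picks LANG and any LC_* variables out of its own environment and forwards them (in SSHv2 this would be via "env" channel requests; OpenSSH exposes it as SendEnv/AcceptEnv).  The helper name is made up for illustration.

```python
import os


def locale_env_to_forward(environ=None):
    """Return the LANG / LC_* variables a client might forward to the server.

    There is no negotiation here: the server may not have the named locale,
    or may refuse the variables altogether.
    """
    env = os.environ if environ is None else environ
    return {
        name: value
        for name, value in env.items()
        if (name == "LANG" or name.startswith("LC_")) and value
    }


# Hypothetical client environment.
forwarded = locale_env_to_forward(
    {"LANG": "fr_FR.UTF-8", "LC_TIME": "en_GB.UTF-8", "LC_ALL": "", "PATH": "/bin"}
)
```

Nothing in this exchange tells the client whether the server actually honored the values, which is the gap a real locale negotiation would close.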

Note also that the IETF has no standard notion of locale (unless that
has changed recently), which is partly why SSHv2 doesn't either.  In our
protocols we speak of language tags and codesets completely separately from each
other, and we try hard to only use Unicode and UTF-8.  But operating
systems tend to deal with {language, codeset} as units -- locales.
(There's a bit more to a locale than language + codeset, but let's
ignore that for a moment.)
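To illustrate the "unit" point, here is a small sketch that splits a locale name of the common language[_territory][.codeset][@modifier] form into its parts.  The regular expression and the type name are my own assumptions; actual locale names are implementation-defined beyond "C" and "POSIX", so treat this as a convention, not a grammar.

```python
import re
from typing import NamedTuple, Optional


class LocaleName(NamedTuple):
    language: str
    territory: Optional[str]
    codeset: Optional[str]
    modifier: Optional[str]


# language[_territory][.codeset][@modifier]
_LOCALE_RE = re.compile(
    r"^(?P<language>[A-Za-z]+)"
    r"(?:_(?P<territory>[A-Za-z0-9]+))?"
    r"(?:\.(?P<codeset>[^@]+))?"
    r"(?:@(?P<modifier>.+))?$"
)


def parse_locale_name(name):
    """Split a locale name into language, territory, codeset and modifier."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized locale name: {name!r}")
    return LocaleName(**match.groupdict())


parsed = parse_locale_name("fr_CA.UTF-8")
```

A protocol that negotiates a language tag and a charset independently has to reassemble something like this on the server side before it can set up the session.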

It might be useful for the IETF to adopt (a subset of?) the POSIX notion
of locale.  Admittedly, though, we have very few protocols where we'd
need it, so lacking a notion of locale has not been noticeably problematic.

Nico
--