Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)

Pete Resnick <resnick@episteme.net> Thu, 30 April 2020 23:24 UTC

From: "Pete Resnick" <resnick@episteme.net>
To: "John C Klensin" <john-ietf@jck.com>
Cc: i18ndir@ietf.org, "John R Levine" <johnl@taugh.com>
Date: Thu, 30 Apr 2020 18:24:06 -0500
X-Mailer: MailMate (1.13.1r5683)
Message-ID: <0C7783A5-831D-4704-96ED-21D3FD374743@episteme.net>
In-Reply-To: <477C5A18357719590D6336D9@PSB>
References: <20200430014516.01551188B50A@ary.qy> <33a39102-0385-e235-1cdc-57cf6dad4f4b@ix.netcom.com> <7AD06F46449F354499AC2E24@PSB> <ACB0D0AB-2271-409D-A9A1-DFFD5A1AEE93@episteme.net> <alpine.OSX.2.22.407.2004301241440.26342@ary.qy> <8CE808C7-DF4F-45A9-9C17-2D82A8B78A9E@episteme.net> <477C5A18357719590D6336D9@PSB>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"; format=flowed
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/ZMC9dCFzz-NPnW4Hm_SEOK5Gh20>
Subject: Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)

[Secretary/chair hat off. Technical commenter hat on.]

On 30 Apr 2020, at 14:47, John C Klensin wrote:

> (1) We all interpreted it as "check the 'charset' parameter
> first".  The question was what was to be done next.

No, 4329 says check the charset first. As Levine pointed out, the draft 
says:

1a First check if it's a Module (an out-of-band determination); if so, 
it's UTF-8
1b Second check for UTF-8 or UTF-16 BOM; if you find one of them, trust 
it
2  Third check the charset; if you understand it, use it.
3  If none of the above, assume UTF-8.
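
As a rough illustration only, the ordering above could be sketched in 
Python as follows. The function name, the is_module flag (standing in 
for the out-of-band determination), and the set of charsets the receiver 
"understands" are all assumptions made for this sketch, not text from 
the draft.

```python
import codecs
from typing import Optional

# Charsets this hypothetical receiver "understands" (an assumption).
KNOWN_CHARSETS = {"utf-8", "utf-16", "utf-16be", "utf-16le"}


def detect_encoding(data: bytes, is_module: bool,
                    charset: Optional[str] = None) -> str:
    """Pick an encoding using the step order summarized above."""
    # 1a: a Module (determined out of band) is always UTF-8.
    if is_module:
        return "utf-8"
    # 1b: a UTF-8 or UTF-16 BOM, if present, is trusted.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16be"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16le"
    # 2: otherwise use the charset parameter, if it is understood.
    if charset is not None:
        name = charset.strip().lower()
        if name in KNOWN_CHARSETS:
            return name
    # 3: if none of the above applies, assume UTF-8.
    return "utf-8"
```

Note that in this ordering a BOM overrides a conflicting charset 
parameter, which is the reverse of the "check the charset first" reading.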

> (2) At least two of us expressed concern about the use of a file
> name suffix as a classifier.  Even if that is a carry-forward
> from 4329...

First, the draft registers .mjs, but the only thing it says about it is 
that environments that use file extensions will treat .mjs to mean it is 
a module. It has no normative text claiming that in a MIME context it 
should be used as a classifier. Second, .mjs is not a carryover from 
4329. It is true that .js is a carryover, but the only place it appears 
is in the MIME registration for the file extension associated with the 
MIME type. I don't see that as problematic.

> (3) Other than the statement "Source text is expected to be in
> Unicode Normalization Form C", there is apparently no
> requirement that the underlying CCS be Unicode.

See Levine's last message on this.

> (5) It is not an i18n issue specifically, but as a co-author of
> RFC 6838 / BCP 13, I find it deeply troubling that this document
> is put forward as a media type registration when what it appears
> to do is to (i) allow the charset parameter but make it optional
> and specifically provide that it is to be ignored if present for
> Module Goals (Section 4.1) and then provide that, even for
> Script Goals, it can be ignored if heuristics are applied that
> suggest the charset (and encoding form) in use is something
> else.

I suspect there is some HTTP behavior assumption in here. But I think 
your first sentence is correct; that should probably be labeled as "Not 
an i18n issue, but this is Not Good ™."

> (6) The text and section organization that I described as
> "convoluted" is very troublesome.  It was bad in 4329; the
> changes in the current I-D certainly do not fix the problem
> and may make it worse.  When four of us (Patrik, Asmus, John
> Levine, and myself) each with considerable experience in reading
> technical specifications, can read the same spec and come to
> four different conclusions as to what it says to do, that is a
> deep, fundamental, problem independent of any details.

Yep. This should probably be treated like (5).

> What is even more troublesome is that they could rather easily
> dig themselves out of most (sadly, not all) of this mess by, as
> Asmus more or less put it, joining the 21st century.    For
> example, well-placed requirements that state clearly:
>
> (i) For Module goal sources, the information MUST be in Unicode,
> using encoding form UTF-8.  A charset parameter SHOULD NOT be
> specified, if it is, its value MUST be "UTF-8".

Yes. It almost says that now, but that would be crystal clear.

> (ii) For Script goal sources, a charset parameter MUST be
> specified and MUST be one of "UTF-8", "UTF-16BE", or "UTF-16LE".
> If it is omitted, the receiving system MAY dig itself into as
> deep a hole as it prefers, possibly using BOM heuristics if
> there is an explicit "MUST use Unicode" requirement for Script
> goals.

As Asmus said at some point, checking for a BOM at the front of the data 
is probably a good idea to confirm that the charset is correct, but 
other than that, I agree.
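
To make the combined proposal concrete, here is a minimal Python sketch 
of rules (i) and (ii) plus the BOM confirmation. The function name and 
the "module"/"script" goal strings are hypothetical, and treating a 
missing Script charset as an error is this sketch's own choice, since 
the proposal leaves that case to the receiver.

```python
import codecs
from typing import Optional

# Charsets permitted for Script goal sources under proposed rule (ii).
ALLOWED_SCRIPT_CHARSETS = {"utf-8", "utf-16be", "utf-16le"}

# BOM byte sequences for each permitted encoding.
BOMS = {
    "utf-8": codecs.BOM_UTF8,
    "utf-16be": codecs.BOM_UTF16_BE,
    "utf-16le": codecs.BOM_UTF16_LE,
}


def check_charset(goal: str, charset: Optional[str],
                  data: bytes = b"") -> str:
    """Return the encoding to use, or raise ValueError on a violation."""
    name = charset.strip().lower() if charset else None
    if goal == "module":
        # Rule (i): charset SHOULD NOT be given; if given, MUST be UTF-8.
        if name not in (None, "utf-8"):
            raise ValueError("Module charset must be UTF-8 if present")
        return "utf-8"
    if goal != "script":
        raise ValueError("unknown goal: %r" % goal)
    # Rule (ii): charset MUST be specified and be one of the three values.
    # (Rejecting a missing charset is an assumption of this sketch.)
    if name is None:
        raise ValueError("Script sources must specify a charset")
    if name not in ALLOWED_SCRIPT_CHARSETS:
        raise ValueError("unsupported Script charset: %r" % charset)
    # BOM confirmation: a BOM for a different encoding contradicts the label.
    for encoding, bom in BOMS.items():
        if data.startswith(bom) and encoding != name:
            raise ValueError("BOM indicates %s, charset says %s"
                             % (encoding, name))
    return name
```

Here the BOM is used only to confirm the declared charset, never to 
override it.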

> and getting all requirements on the spec itself moved  out of
> the Security Considerations section and stated as requirements,
> without relying on requirements or recommendations of documents
> like RFC 3629 that are somewhat outdated and/or don't say what
> the I-D claims or implies that they say and/or are not
> applicable to encoding forms (and non-Unicode CCSs) that the I-D
> allows.

You're just referring to the two paragraphs that mention 3629, correct? 
I didn't see anything else in section 5 that looked like a requirement.

[Tech reviewer hat off / secretary/chair hat back on]

> _Recommendations_
>
> (a) We have said in multiple places, most recently in what is
> now RFC 8753, that this i18n stuff requires a collaborative
> effort by people whose expertise comes from a variety of
> different perspectives.  The comments from Patrik, Asmus, John
> Levine, and myself illustrate the reasons for that.  So either
> no review should ever go out unless it either reflects multiple
> sets of eyes and consensus (at least among those who were
> willing to look) or it should bear a much stronger disclaimer
> than is typical for "area review team" review assignments.  The
> latter might say something like "while this review was assigned
> to me by the i18n directorate, it represents my opinion only and
> not consensus among the experts who make up that directorate,
> even consensus that my summary of their discussion is accurate".

While I agree with the concern, I think this pushes in the wrong 
"responsibility" direction. I would hope that if I had poorly triaged 
this as "just a boring little i18n review" and not mentioned it on the 
list but simply assigned it to Levine, he would have called me out and 
said, "Whoa, there's a lot going on here. We should have a list 
discussion about this." If both he and I screwed up and didn't think a 
conversation was worth having and you all had to yell out at IETF LC, 
that would be a (painful) learning experience for John and me, and a 
bummer for the WG in question, but I think that is the correct way to grow. 
If we need to depend on all of the experts to weigh in on every 
document, that's simply not sustainable. We have to individually be able 
to make calls and know (or learn) when we have to bring out the troops. 
Yes, that means there will be errors that will lead to well-egged faces 
every so often. I know of no other way to make things better in the long 
run.

pr
-- 
Pete Resnick https://www.episteme.net/
All connections to the world are tenuous at best