Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)
John C Klensin <john-ietf@jck.com> Thu, 30 April 2020 19:47 UTC
Date: Thu, 30 Apr 2020 15:47:40 -0400
From: John C Klensin <john-ietf@jck.com>
To: Pete Resnick <resnick@episteme.net>
cc: i18ndir@ietf.org, John R Levine <johnl@taugh.com>
Message-ID: <477C5A18357719590D6336D9@PSB>
In-Reply-To: <8CE808C7-DF4F-45A9-9C17-2D82A8B78A9E@episteme.net>
References: <20200430014516.01551188B50A@ary.qy> <33a39102-0385-e235-1cdc-57cf6dad4f4b@ix.netcom.com> <7AD06F46449F354499AC2E24@PSB> <ACB0D0AB-2271-409D-A9A1-DFFD5A1AEE93@episteme.net> <alpine.OSX.2.22.407.2004301241440.26342@ary.qy> <8CE808C7-DF4F-45A9-9C17-2D82A8B78A9E@episteme.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/A83qz-DCjnzPMFqyG5REDghpe5c>
Pete,

I do not have time (especially today or tomorrow) to engage with this further or try to explain again, but parts of John's summary are, I believe, inconsistent with how Patrik, Asmus, and I interpreted the document. Specifically,

(1) We all interpreted it as "check the 'charset' parameter first". The question was what was to be done next.

(2) At least two of us expressed concern about the use of a file name suffix as a classifier. Even if that is a carry-forward from 4329, it is a major step back from the reason why both email and the web adopted media type labeling and, if we are counting deployment and running code, I think it is safe to suggest that those two applications (or, if you prefer, sets of applications) are somewhat more broadly deployed than these so-called "scripting media types".

(3) Other than the statement "Source text is expected to be in Unicode Normalization Form C", there is apparently no requirement that the underlying CCS be Unicode. The statement "Implementations are required to support the UTF-8 character encoding scheme" does not impose that requirement either; it just makes UTF-8 support mandatory to implement... if it does that, because normative requirements of that type are normally not buried in Security Considerations sections. This I-D claims authority for insisting on NFC from the Security Considerations Section of RFC 3629, but that Section does not discuss normalization forms at all: instead, it discusses "the same thing" issue and then says that the problems are "amenable to solutions" based on normalization (not just NFC) and UAX15. Even if that was good advice based on our best understanding in 2003, it certainly is neither necessary nor sufficient now.
More generally, the current conventional wisdom (or, if you will, "best practice") is that, except where special circumstances apply (as they do with IDNA), normalization should occur, if needed, at processing (especially comparison) time, not at storage or transmission time. Yet this document specifies NFC and does so without explanation other than, as discussed above, blaming RFC 3629, which is not only not guilty but is not applicable to any charset other than UTF-8.

And, more broadly and probably more important:

(5) It is not an i18n issue specifically, but as a co-author of RFC 6838 / BCP 13, I find it deeply troubling that this document is put forward as a media type registration when what it appears to do is to (i) allow the charset parameter but make it optional and specifically provide that it is to be ignored if present for Module Goals (Section 4.1) and then provide that, even for Script Goals, it can be ignored if heuristics are applied that suggest the charset (and encoding form) in use is something else. In particular, one can conform to the SHOULD in that Section by specifying, for example, 'charset="ISO-8859-6"' and, since the document doesn't specify what to do with Script goal charsets one does not recognize, support for that drags in all of the bidi and troublesome sequence issues associated with Arabic without any of the support that Unicode and the documents surrounding it provide.

(6) The text and section organization that I described as "convoluted" is very troublesome. It was bad in 4329; the changes in the current I-D certainly do not fix the problem and may make it worse. When four of us (Patrik, Asmus, John Levine, and myself), each with considerable experience in reading technical specifications, can read the same spec and come to four different conclusions as to what it says to do, that is a deep, fundamental problem independent of any details.
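To make the "normalize at comparison time, not at storage or transmission time" point above concrete, here is a minimal Python sketch. It is only an illustration: the helper name `same_text` is hypothetical, and the choice of NFC as the comparison form is an assumption made for the example, not something the I-D or RFC 3629 prescribes.

```python
import unicodedata


def same_text(a: str, b: str) -> bool:
    """Compare two strings by normalizing temporary copies (NFC is an assumption).

    The stored or transmitted strings themselves are left untouched.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two spellings of the same word: precomposed vs. base letter + combining accent.
composed = "caf\u00e9"
decomposed = "cafe\u0301"

raw_equal = composed == decomposed                 # code-point comparison
normalized_equal = same_text(composed, decomposed)  # comparison-time normalization
```

The two inputs differ code point by code point, yet compare equal once both sides are normalized for the comparison only; neither stored string is rewritten.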
What is even more troublesome is that they could rather easily dig themselves out of most (sadly, not all) of this mess by, as Asmus more or less put it, joining the 21st century. For example, well-placed requirements that state clearly:

(i) For Module goal sources, the information MUST be in Unicode, using encoding form UTF-8. A charset parameter SHOULD NOT be specified; if it is, its value MUST be "UTF-8".

(ii) For Script goal sources, a charset parameter MUST be specified and MUST be one of "UTF-8", "UTF-16BE", or "UTF-16LE". If it is omitted, the receiving system MAY dig itself into as deep a hole as it prefers, possibly using BOM heuristics if there is an explicit "MUST use Unicode" requirement for Script goals.

and getting all requirements on the spec itself moved out of the Security Considerations section and stated as requirements, without relying on requirements or recommendations of documents like RFC 3629 that are somewhat outdated and/or don't say what the I-D claims or implies that they say and/or are not applicable to encoding forms (and non-Unicode CCSs) that the I-D allows.

_Recommendations_

(a) We have said in multiple places, most recently in what is now RFC 8753, that this i18n stuff requires a collaborative effort by people whose expertise comes from a variety of different perspectives. The comments from Patrik, Asmus, John Levine, and myself illustrate the reasons for that. So either no review should ever go out unless it reflects multiple sets of eyes and consensus (at least among those who were willing to look) or it should bear a much stronger disclaimer than is typical for "area review team" review assignments. The latter might say something like "while this review was assigned to me by the i18n directorate, it represents my opinion only and not consensus among the experts who make up that directorate, even consensus that my summary of their discussion is accurate". Consider what might happen in this case without one or the other.
A review goes off that talks about the concerns of the directorate and John's summary of those concerns ("We understand it to say..."). The WG addresses those issues and the document goes to IETF LC. Some of Patrik, Asmus, and me (and maybe others) respond to the IETF LC pointing out the issues raised in our earlier notes and above, strongly suggesting that the WG should have known about most of this, that they are depending on documents that don't say what the WG claims they say and that violate the letter and spirit of assorted BCPs. We point out to IANA that this document is not a proper Media Type registration and that 4329 wasn't either. The WG responds with dismay because all of this is new to them. And the ART ADs (who, I believe, are on this list) end up with egg on their faces, as does the whole directorate and its leadership.

(b) Let's respond to the WG with the issue I think those of us who have looked at the document are all agreed about: it is _really_ hard to figure out just what the document specifies and hence to comment on it in an authoritative way. If they are assuming Unicode, they need to make that a requirement, not hope the reader figures it out. The notorious Section 4 may need to be split up into separate subsections for Module goals and Script goals or otherwise structured to be sure it is clear what one is to do in each case and with and without charset parameters. And probably (less important for this iteration since no one else mentioned it, but I predict an extra iteration if it is not done), normative requirements on the spec must not appear only in the Security Considerations section, and they had better check the applicability of their references. Only when they fix enough of those things that we can all agree about what the document says are they going to get a review of substantive i18n issues.
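As an illustration of how little machinery the two example requirements (i) and (ii) suggested earlier in this message would need, here is a minimal Python sketch of a label check. The function name `check_label`, the goal strings "module" and "script", and the four status strings are all hypothetical, introduced only for this example; the case-insensitive comparison reflects the usual treatment of charset names.

```python
from typing import Optional

# Charsets permitted for Script goal sources under example requirement (ii).
SCRIPT_CHARSETS = {"UTF-8", "UTF-16BE", "UTF-16LE"}


def check_label(goal: str, charset: Optional[str]) -> str:
    """Classify a (goal, charset parameter) pair as a producer would label it.

    Returns one of the hypothetical statuses "ok", "discouraged",
    "missing", or "invalid".
    """
    value = charset.upper() if charset is not None else None
    if goal == "module":
        if value is None:
            return "ok"  # (i): parameter SHOULD NOT be specified
        # (i): if specified anyway, its value MUST be "UTF-8"
        return "discouraged" if value == "UTF-8" else "invalid"
    if goal == "script":
        if value is None:
            return "missing"  # (ii): parameter MUST be specified
        return "ok" if value in SCRIPT_CHARSETS else "invalid"
    raise ValueError(f"unknown goal: {goal!r}")
```

Under this sketch, `check_label("script", "ISO-8859-6")` is rejected outright, which is the case the current text leaves unspecified.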
Disgustedly,
    john


--On Thursday, April 30, 2020 12:30 -0500 Pete Resnick <resnick@episteme.net> wrote:

> On 30 Apr 2020, at 12:22, John R Levine wrote:
>
>>> the WG to take some action? If I don't hear from anyone,
>>> I'll start accosting people privately.
>>
>> Nooo, not the Private Accosting.
>
> Obviously you have never experienced my full-out private
> accosting. :-)
>
>> Summary:
>>
>> The i18n directorate has some concerns about character set
>> handling in draft-ietf-dispatch-javascript-mjs-07.
>>
>> We understand it to say that if a javascript MIME element
>> does not have a name that ends with .mjs, a consumer ignores
>> the declared charset and looks at the first few bytes of the
>> content for a byte order mark (BOM.) If it finds one, it
>> uses the charset implied by the BOM, which can be UTF-16BE,
>> UTF-16LE, or UTF-8. If there's no BOM, it uses the declared
>> charset unless there isn't one, in which case it defaults to
>> UTF-8.
>>
>> We are unaware of any other MIME type that uses this sort of
>> trick to work around mislabelled content, and are concerned
>> that it leads to failures in general MIME code that doesn't
>> handle this special case. We also don't know how important
>> the workaround is in practice, e.g., how many MIME producers
>> still mislabel UTF-16 as UTF-8 or vice versa.
>>
>> For better interoperation it could say something like
>> producers MUST put the correct charset on any media (same as
>> any other media type) and that consumers SHOULD use the
>> declared charset but MAY do the BOM trick for backward
>> compatibility in certain cases.
>>
>> It also says the BOM must be removed from the decoded text.
>> That's confusing since ECMAscript treats a BOM as a space
>> which would be harmless at the start of a block of code.
>
> Thanks for taking up the pen John. If folks think something
> needs to be elaborated or added, or if you have some
> wordsmithing, do speak up.
>
> I'll check with Barry whether he wants this on the official
> review form. If so, I'll assign the review to you in the
> datatracker. Otherwise, you can just email the dispatch list
> and sign it "John, stuckee for the directorate" or some such.
>
> pr
> --
> Pete Resnick https://www.episteme.net/
> All connections to the world are tenuous at best
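For concreteness, the reading given in the quoted summary ("We understand it to say...") can be sketched in Python as below. This reflects only that summary's reading, which the message above disputes, and it is not taken from the I-D itself. The function name `decode_source` is hypothetical, and the handling of names ending in .mjs (decode as UTF-8) is an assumption, since the summary does not say what happens in that case.

```python
import codecs
from typing import Optional, Tuple

# BOM byte sequences and the charsets they imply, per the quoted summary.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_source(name: str, declared: Optional[str], data: bytes) -> Tuple[str, str]:
    """Return (charset used, decoded text) following the summary's reading."""
    if name.endswith(".mjs"):
        # Assumption: the summary is silent here; UTF-8 is used for illustration.
        return "utf-8", data.decode("utf-8")
    for bom, charset in BOMS:
        if data.startswith(bom):
            # A BOM overrides the declared charset and is removed from the text.
            return charset, data[len(bom):].decode(charset)
    # No BOM: use the declared charset, defaulting to UTF-8 if none was given.
    charset = declared or "utf-8"
    return charset, data.decode(charset)


# Content labeled UTF-8 that actually carries a UTF-16LE BOM.
mislabeled = codecs.BOM_UTF16_LE + "x = 1".encode("utf-16-le")
used, text = decode_source("a.js", "utf-8", mislabeled)
```

In this example the declared "utf-8" label is overridden by the BOM, which is exactly the special case that general MIME code handling the declared charset would not reproduce.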