Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)
John C Klensin <john-ietf@jck.com> Wed, 29 April 2020 22:30 UTC
Return-Path: <john-ietf@jck.com>
X-Original-To: i18ndir@ietfa.amsl.com
Delivered-To: i18ndir@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A16A53A08FD for <i18ndir@ietfa.amsl.com>; Wed, 29 Apr 2020 15:30:49 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.897
X-Spam-Level:
X-Spam-Status: No, score=-1.897 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id FgwaJUrhXoOi for <i18ndir@ietfa.amsl.com>; Wed, 29 Apr 2020 15:30:48 -0700 (PDT)
Received: from bsa2.jck.com (bsa2.jck.com [70.88.254.51]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 753543A08B2 for <i18ndir@ietf.org>; Wed, 29 Apr 2020 15:30:46 -0700 (PDT)
Received: from [198.252.137.10] (helo=PSB) by bsa2.jck.com with esmtp (Exim 4.82 (FreeBSD)) (envelope-from <john-ietf@jck.com>) id 1jTvE1-000BKm-To; Wed, 29 Apr 2020 18:30:41 -0400
Date: Wed, 29 Apr 2020 18:30:35 -0400
From: John C Klensin <john-ietf@jck.com>
To: Patrik Fältström <patrik@frobbit.se>, Pete Resnick <resnick@episteme.net>
cc: Internationalization Directorate <i18ndir@ietf.org>
Message-ID: <E67F0F68A403F5E4E5D8F476@PSB>
In-Reply-To: <A9854982-3696-46FF-AD5C-8088CFCDD8FC@frobbit.se>
References: <E552C138-7938-42BD-B2B2-26AD8AA43516@nostrum.com> <A93B38FC-7D55-4D06-80AE-F165F242F259@episteme.net> <31CF68D680D76D7F45FAB3E2@PSB> <A9854982-3696-46FF-AD5C-8088CFCDD8FC@frobbit.se>
X-Mailer: Mulberry/4.0.8 (Win32)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline
X-SA-Exim-Connect-IP: 198.252.137.10
X-SA-Exim-Mail-From: john-ietf@jck.com
X-SA-Exim-Scanned: No (on bsa2.jck.com); SAEximRunCond expanded to false
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/R84k1CQ-c5y8hR8MQIW5fqvDWbo>
Subject: Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)
X-BeenThere: i18ndir@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Internationalization Directorate <i18ndir.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/i18ndir>, <mailto:i18ndir-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/i18ndir/>
List-Post: <mailto:i18ndir@ietf.org>
List-Help: <mailto:i18ndir-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/i18ndir>, <mailto:i18ndir-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Apr 2020 22:30:50 -0000
--On Wednesday, April 29, 2020 20:58 +0200 Patrik Fältström
<patrik@frobbit.se> wrote:

> On 29 Apr 2020, at 3:41, John C Klensin wrote:
>
>> Since some hours have gone by without a response to your
>> message and I was in need of an excuse to delay getting to an
>> unpleasant task...
>
> Now I have read the draft as well.
>
>> Moreover, if I correctly understand what seems like
>> unnecessary convoluted text (in both versions) a BOM is
>> ignored in further processing if the character encoding
>> scheme is determined to be UTF-8 in 4.2(2) or 4.2(3) but not
>> ignored if charset="UTF-8" is present and the BOM occurs
>> anyway (something clearly allowed by RFC 3629).  That doesn't
>> appear to make sense.
>
> I think I see the same thing as you, which is that even if the
> charset parameter states the encoding is UTF-8, if the data
> itself starts with a BOM, then the text is to be treated as
> UTF-16.

Actually, not what I noticed, and that reinforces my view that
even if one ignores the specific i18n issues, the text is just
too convoluted.  I had read the text as suggesting that, if the
charset was labeled (part 1), the checks in parts 2 and 3 just
don't get made.  Whether that is smart or not -- whether the
protocol or application should make a sanity check on whether
something labeled as charset="UTF-8" is actually conforming
UTF-8 and/or whether, if a BOM is present, it is consistent with
UTF-8 -- I don't really know, except that it should probably be
clear.  But you are right: if the text is identified with
charset="UTF-8", then it really, really, better be UTF-8 and, if
there is a BOM that suggests it is something else, then the spec
should say something quite definite about that.

> That is just so wrong.

I was more concerned about something else (with the
understanding that it isn't my only issue).  As I read the spec,
the plan is, approximately:

(1) Apply step 1; if there is no charset parameter present, go
    to step 2.

(2) Apply step 2, i.e., go looking for a BOM fingerprint.  If
    there isn't one present, go to step 3.

(3) Step 3: decide it is UTF-8.

Now the (or at least one) problem with that is that, absent a
rule that says "MUST use Unicode in some known encoding form",
there is no practical and reliable way to distinguish UTF-8 from
any part of ISO/IEC 8859 or, for that matter, any proprietary
code page.  UTF-16, with or without the BOM heuristics of your
choice, is better, but not much better.

> I went to the ECMA spec and see they use UTF-16 all over the
> place, and have to bend over backwards to get things right. It
> feels like reading a BER encoding spec (again). :-)

BER is at least precise about what it is talking about.  This
spec isn't.

> This "problem" do already exist in RFC 4329...
>
> But, if they update RFC 4329 I think they should clean this
> up, and my suggestion would be:
>
> The encoding must be what is actually labeled. If the encoding
> is UTF-16 (which it seems it often is), then it should be
> tagged as UTF-16, not UTF-8 with BOM.

Absolutely.  The easiest way out of both the problem you saw and
the one I saw is to get rid of steps 2 and 3 and insist on
labeling and conformance to what is labeled.  If that is
impractical for some reason, much more specificity is needed,
starting with a firm Unicode requirement.  And, unless they
intend to confine themselves to the BMP, they probably need to
talk about surrogates and their implications (or include such an
explanation by reference).

As to whether they "should" fix a problem left over from 4329,
they have changed the text in that area and this is a Known
Technical Omission or Defect.

best,
   john
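
Editorial illustration (not part of the original message): a minimal
sketch of the three-step procedure as it is read above, next to the
"label it and conform to the label" alternative. The function names
(`sniff_bom`, `detect_as_read`, `decode_strict`), the set of BOMs
checked, and the set of accepted labels are all assumptions made for
illustration; the draft's actual text may differ.

```python
import codecs

# BOMs checked in step 2. Which BOMs the draft really considers is an assumption.
_BOMS = [
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
]

# Labels accepted by the strict variant (hypothetical "firm Unicode requirement").
_ALLOWED = {"UTF-8", "UTF-16", "UTF-16BE", "UTF-16LE"}


def sniff_bom(data: bytes):
    """Return the encoding name suggested by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def detect_as_read(charset_param, data: bytes) -> str:
    """The three steps as read in the message above (hypothetical rendering)."""
    if charset_param is not None:
        # Step 1: a label is present, so steps 2 and 3 are never reached.
        return charset_param.upper()
    bom = sniff_bom(data)
    if bom is not None:
        # Step 2: no label, but a BOM fingerprint was found.
        return bom
    # Step 3: no label and no BOM, so decide it is UTF-8.
    return "UTF-8"


def decode_strict(charset_param, data: bytes) -> str:
    """Alternative: require a Unicode label and conformance to that label."""
    if charset_param is None:
        raise ValueError("charset parameter is required")
    name = charset_param.upper()
    if name not in _ALLOWED:
        raise ValueError(f"unsupported charset label: {charset_param}")
    bom = sniff_bom(data)
    if bom is not None and not bom.startswith(name):
        raise ValueError(f"labeled {name} but BOM indicates {bom}")
    # Strict decoding: non-conforming bytes raise UnicodeDecodeError.
    return data.decode(name)
```

With these definitions, unlabeled ISO/IEC 8859-1 bytes such as
`"é".encode("latin-1")` fall through to step 3 and are declared UTF-8
even though they are not valid UTF-8, and data labeled charset="UTF-8"
that begins with a UTF-16 BOM is accepted as UTF-8 by `detect_as_read`
but rejected by `decode_strict`.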