Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)
Asmus Freytag <asmusf@ix.netcom.com> Wed, 29 April 2020 23:54 UTC
To: i18ndir@ietf.org
References: <E552C138-7938-42BD-B2B2-26AD8AA43516@nostrum.com> <A93B38FC-7D55-4D06-80AE-F165F242F259@episteme.net> <31CF68D680D76D7F45FAB3E2@PSB> <A9854982-3696-46FF-AD5C-8088CFCDD8FC@frobbit.se> <E67F0F68A403F5E4E5D8F476@PSB>
From: Asmus Freytag <asmusf@ix.netcom.com>
Message-ID: <0c3f5982-108d-81e0-29a4-ce67e7685f2e@ix.netcom.com>
Date: Wed, 29 Apr 2020 16:54:21 -0700
In-Reply-To: <E67F0F68A403F5E4E5D8F476@PSB>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/fOJxBaOYKt_XOyxPhFLKeIvGQpY>
Don't have time to delve into this deeply, but am reading along with the discussion here.

+1 on the text being too convoluted - I know my first reading was wrong.

+2 on seeing whether we can't use the revised spec to move to the 21st century in terms of character encoding. (Their step 3 really attempts that, by making UTF-8 the default / fallback when the content is unlabeled and has no BOM signature, but it could be more explicit that you can only get legacy behavior if such is labeled.)

+3 on stronger language about detecting mislabeled / inconsistent cases. Some conditions are clear cut enough to warrant rejection. Ideally, you'd always reject and not convert anything to U+FFFD. But, again, existing legacy may make that not possible.

+4 Aren't the UTF-16 BOMs illegal UTF-8 byte sequences, or am I misremembering? If so, a UTF-16 BOM should override or flag as invalid any UTF-8 declaration.

A./

On 4/29/2020 3:30 PM, John C Klensin wrote:
>
> --On Wednesday, April 29, 2020 20:58 +0200 Patrik Fältström
> <patrik@frobbit.se> wrote:
>
>> On 29 Apr 2020, at 3:41, John C Klensin wrote:
>>
>>> Since some hours have gone by without a response to your
>>> message and I was in need of an excuse to delay getting to an
>>> unpleasant task...
>> Now I have read the draft as well.
>>
>>> Moreover, if I correctly understand what seems like
>>> unnecessarily convoluted text (in both versions) a BOM is
>>> ignored in further processing if the character encoding
>>> scheme is determined to be UTF-8 in 4.2(2) or 4.2(3) but not
>>> ignored if charset="UTF-8" is present and the BOM occurs
>>> anyway (something clearly allowed by RFC 3629). That doesn't
>>> appear to make sense.
>> I think I see the same thing as you, which is that even if the
>> charset parameter states the encoding is UTF-8, if the data
>> itself starts with a BOM, then the text is to be treated as
>> UTF-16.
> Actually, not what I noticed, and that reinforces my view that
> even if one ignores the specific i18n issues, the text is just
> too convoluted. I had read the text as suggesting that, if the
> charset was labeled (part 1), the checks in parts 2 and 3
> just don't get made. Whether that is smart or not -- whether
> the protocol or application should make a sanity check on
> whether something labeled as charset="UTF-8" is actually
> conforming UTF-8 and/or whether there is a BOM present and, if
> so, whether it is consistent with UTF-8 -- I don't really know
> except that it should probably be clear. But, you are right: if
> the text is identified with charset="UTF-8", then it really,
> really, better be UTF-8 and, if there is a BOM that suggests it
> is something else, then the spec should say something quite
> definite about that.
>
>> That is just so wrong.
> I was more concerned about something else (with the
> understanding that it isn't my only issue). As I read the spec,
> the plan is, approximately:
> (1) Apply step one, if there is no charset parameter present,
> go to step 2
> (2) Apply step 2, i.e., go looking for a BOM fingerprint. If
> there isn't one present, go to step 3.
> (3) Step 3: decide it is UTF-8.
>
> Now the (or at least one) problem with that is that, absent a
> rule that says "MUST use Unicode in some known encoding form",
> there is no practical and reliable way to distinguish UTF-8 from
> any part of ISO/IEC 8859 or, for that matter, any proprietary
> code page. UTF-16, with or without the BOM heuristics of your
> choice is better, but not much better.
>
>> I went to the ECMA spec and see they use UTF-16 all over the
>> place, and have to bend over backwards to get things right. It
>> feels like reading a BER encoding spec (again). :-)
> BER is at least precise about what it is talking about. This
> spec isn't.
>
>> This "problem" does already exist in RFC 4329...
>>
>> But, if they update RFC 4329 I think they should clean this
>> up, and my suggestion would be:
>>
>> The encoding must be what is actually labeled. If the encoding
>> is UTF-16 (which it seems it often is), then it should be
>> tagged as UTF-16, not UTF-8 with BOM.
> Absolutely. The easiest way out of both the problem you saw and
> the one I saw is to get rid of steps 2 and 3 and insist on
> labeling and conformance to what is labeled. If that is
> impractical for some reason, much more specificity is needed,
> starting with a firm Unicode requirement. And, unless they
> intend to confine themselves to the BMP, they probably need to
> talk about surrogates and their implications (or include such an
> explanation by reference). As to whether they "should" fix a
> problem left over from 4329, they have changed the text in that
> area and this is a Known Technical Omission or Defect.
>
> best,
> john
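
For illustration only (this is not text from the draft, and the draft's actual rules may differ): a minimal Python sketch of the three-step procedure as summarized in the quoted message (charset label, then BOM, then UTF-8 default), combined with the stricter behavior argued for above, i.e. rejecting a label/BOM conflict and rejecting malformed input rather than substituting U+FFFD. The names `decode_script`, `sniff_bom` and `EncodingConflict` are hypothetical.

```python
import codecs


class EncodingConflict(ValueError):
    """Hypothetical error: the charset label and the BOM disagree."""


# BOM signatures. The bytes 0xFE and 0xFF never occur in well-formed
# UTF-8, so a UTF-16 BOM can never be the start of valid UTF-8 text.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def sniff_bom(data):
    """Return (encoding, bom_length) for a leading BOM, or (None, 0)."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def decode_script(data, charset=None):
    """Decode script source bytes, rejecting inconsistent input."""
    bom_encoding, bom_len = sniff_bom(data)
    if charset is not None:
        # Step 1: an explicit label wins, but only if the content agrees.
        labeled = codecs.lookup(charset).name  # e.g. "UTF-8" -> "utf-8"
        if labeled.startswith("utf-") and bom_encoding is not None:
            compatible = labeled == bom_encoding or (
                labeled == "utf-16" and bom_encoding.startswith("utf-16-")
            )
            if not compatible:
                raise EncodingConflict(
                    f"labeled {labeled!r} but BOM indicates {bom_encoding!r}"
                )
            encoding, skip = bom_encoding, bom_len
        else:
            encoding, skip = labeled, 0
    elif bom_encoding is not None:
        # Step 2: unlabeled, so trust the BOM signature.
        encoding, skip = bom_encoding, bom_len
    else:
        # Step 3: unlabeled and no BOM, so assume UTF-8.
        encoding, skip = "utf-8", 0
    # errors="strict" raises UnicodeDecodeError instead of emitting U+FFFD.
    return data[skip:].decode(encoding, errors="strict")
```

Under these assumptions, content labeled charset="UTF-8" that starts with a UTF-16 BOM is rejected outright, and unlabeled ISO/IEC 8859 text containing non-ASCII bytes will usually fail the strict UTF-8 decode instead of being silently mangled. That check is a heuristic, not a guarantee: some legacy byte sequences happen to be well-formed UTF-8.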