Re: [I18ndir] Review volunteer needed (Fwd: [dispatch] WGLC of draft-ietf-dispatch-javascript-mjs-07)

John C Klensin <> Thu, 30 April 2020 04:18 UTC

Date: Thu, 30 Apr 2020 00:17:47 -0400
From: John C Klensin <>
To: Asmus Freytag <>, John Levine <>

John and Asmus,

It seems to me that we are straying into doing the design (or
redesign) work that is the responsibility of the WG.  Explaining
the problem to the WG and suggesting fixes would be entirely
reasonable.  Trying to convince each other is, IMO, a lot less so.


--On Wednesday, April 29, 2020 19:37 -0700 Asmus Freytag
<> wrote:

> On 4/29/2020 6:45 PM, John Levine wrote:
>> It looks like step 1 is saying that if the text starts with a
>> BOM, you ignore the declared charset and sniff the BOM
>> instead, which sounds to me like an ancient workaround that
>> is perhaps no longer needed.
> If data are preceded by ("start with") a BOM, you do want to
> strip it; you never want to keep it (and the chance that a
> legitimate text starts with a BOM that has an actual function
> in the text is perfectly negligible).
> If you have data that carries a UTF-16 BOM it cannot be UTF-8,
> no matter what the charset declaration says (FF can't be a
> byte in UTF-8).
> Therefore, you always want to look for all three.
> If it's UTF-8 you confirm you have the right character set
> and remove the BOM.
> If it's one of the UTF-16 ones, you switch the charset to
> UTF-16 (of the proper endianness) and remove the BOM. (Or
> reject the input?)
>> Given that they are deprecating all of the existing
>> javascript media types and reviving text/javascript which
>> 4329 declared obsolete, this might be a good time to say if
>> you're going to use our lovely new (old) media type, declare
>> the correct character set so consumers can believe it and
>> stop doing byte sniffing kludges.
> There are two issues here. One centers around the fact that
> BOMs are "invisible". You can work hard at avoiding them, but
> they may be added by some "helpful" tool.
> The other is that they serve as a useful consistency check
> when present.
> You may write the spec to reject data with initial BOM, but
> then you'd still need to check for them. You definitely don't
> want to admit data without checking. Since you already need to
> check for them, you are better off giving clear instructions
> of how to make use of the fact that you've detected one.
> A./
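
For anyone explaining this to the WG, the check Asmus describes in the
quoted text above can be sketched roughly as follows. This is only an
illustration in Python, not text from the draft: the function name
sniff_bom and the lowercase charset labels are assumptions made for the
example, and it looks only for the three BOMs mentioned in the thread
(UTF-32 is not considered).

```python
import codecs

# The three BOMs discussed in the thread, paired with the charset each implies.
# The charset labels are illustrative choices, not taken from the draft.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"), # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"), # FF FE
)


def sniff_bom(data: bytes, declared: str):
    """Return (charset, payload) for the given bytes.

    If the data starts with a known BOM, the BOM decides the charset
    (overriding the declared one) and is stripped from the payload.
    Otherwise the declared charset and the data are returned unchanged.
    """
    for bom, charset in _BOMS:
        if data.startswith(bom):
            return charset, data[len(bom):]
    return declared, data


# A UTF-16LE BOM overrides a (wrong) utf-8 declaration and is removed.
charset, payload = sniff_bom(b"\xff\xfev\x00", "utf-8")
print(charset, payload.decode(charset))
```

A stricter variant could reject the input when the BOM disagrees with
the declared charset instead of switching; as Asmus notes, either way
the check itself still has to be done.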