Re: [I18ndir] I18ndir early review of draft-ietf-dispatch-javascript-mjs-07

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Sat, 09 May 2020 07:36 UTC

To: John C Klensin <john-ietf@jck.com>, Barry Leiba <barryleiba@computer.org>, John Levine <johnl@taugh.com>
Cc: i18ndir@ietf.org
Organization: Aoyama Gakuin University
Message-ID: <015842ec-9a76-f381-1f46-c932224ebaca@it.aoyama.ac.jp>
Date: Sat, 9 May 2020 16:36:31 +0900
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/ZhS1iaf6zuJ5BzczobdjjsAwtAg>


On 09/05/2020 13:01, John C Klensin wrote:

>> Section 4.2 still includes step #3 to deal with the (in
>> practice quite common) case of a missing BOM and the media
>> type missing a charset parameter.  There are also too many
>> servers that set this to "ISO-8859-1" without otherwise
>> examining the sources being served. We'll make it clearer this
>> is a default/fallback case.
> 
> Noting (again) that, absent a BOM (or even with one) UTF-8
> cannot be reliably distinguished from ISO-8859-X (for any
> registered or unregistered value of X),

Sorry, not true. Please see 
https://www.sw.it.aoyama.ac.jp/2012/pub/IUC11-UTF-8.pdf, page 20.


> I don't know what the
> default/fallback case actually is or, more generally, what  the
> above paragraph means.  Unless I have misunderstood something
> important, the reality is that, if there is anywhere on the
> Internet that a web browser or server (including decades-old
> embedded servers) treat ISO/IEC 8859-1 as either the default or
> legitimate, then there are two possibilities: accurate labeling
> of the charset in use or use of heuristics that, by their nature
> and the nature of possible CCSs (not just encoding schemes), may
> fail.

For people not familiar with the details, the following should be
pointed out clearly: Detecting UTF-8 by its very specific bit patterns
is an extremely strong heuristic. There are specific character
combinations in other encodings that happen to look like valid UTF-8
byte sequences, but such coincidences are extremely rare and get rarer
and rarer the longer the data gets. This is already true if one looks
just at the available characters/bit patterns, and even more so if one
considers the languages that use the respective encodings and their
letter patterns.
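
To make the heuristic concrete, here is a minimal sketch in Python. It
is an illustration only, not the procedure from the draft or from the
slides linked above; the function name looks_like_utf8 and the sample
strings are hypothetical. It treats data as UTF-8 only if it contains
at least one non-ASCII byte and every byte sequence is well-formed
UTF-8.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic sketch: True if data has non-ASCII bytes and is valid UTF-8."""
    if data.isascii():
        # Pure ASCII is identical in UTF-8 and ISO-8859-x, so it is no evidence.
        return False
    try:
        # Strict decoding rejects stray lead bytes, missing continuation
        # bytes, overlong forms, and encoded surrogates.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# The same word encoded two ways.
utf8_sample = "D\u00fcrst".encode("utf-8")         # bytes 44 C3 BC 72 73 74
latin1_sample = "D\u00fcrst".encode("iso-8859-1")  # bytes 44 FC 72 73 74

# One of the rare coincidences: the ISO-8859-1 characters U+00C3 U+00BC
# are the bytes C3 BC, which also form a valid UTF-8 sequence.
coincidence = "\u00c3\u00bc".encode("iso-8859-1")

print(looks_like_utf8(utf8_sample))    # UTF-8 input is accepted
print(looks_like_utf8(latin1_sample))  # lone 0xFC is never valid in UTF-8
print(looks_like_utf8(coincidence))    # false positive on a two-byte input
```

The last sample shows why the heuristic is strong rather than
foolproof: a false positive requires every non-ASCII byte in the data
to fall into a valid lead-plus-continuation pattern, which becomes
less and less likely as the input grows.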

Regards,   Martin.

> That calls for at least a health warning in the
> document, not proceeding as if the heuristics are foolproof.
> 
> best,
>     john