Re: Consensus call to include Display Strings in draft-ietf-httpbis-sfbis

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Sun, 28 May 2023 07:29 UTC

Hello Willy, Julian, others,

There was a time (way back) when only the Basic Multilingual Plane
(BMP, i.e. a 16-bit space) had characters assigned. That turned out not
to be enough, but it had the desirable side effect of keeping things
compact. In UTF-8, that space can be covered by 3 bytes max per
character, and some implementations may have been limited to 3 bytes
max because their authors thought there wouldn't be any characters in
the rest of the codespace.

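UTF-8 itself was defined to use up to 6 bytes per character, because it
was covering the full UCS-4 space of the early ISO 10646 drafts
(nominally 32 bits, of which 31 were usable, matching the 6-byte row of
the FSS-UTF table quoted below). There were definitely implementations
that covered all that space.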

After some years, it became clear that a 16-bit space was not enough, 
but a 32-bit space was way too much. ISO and Unicode agreed on 17 planes 
of 16 bits, leading to an overall code space from U+0000 to U+10FFFF. As 
a result, the definition of UTF-8 was restricted to 4 bytes max per 
character (see RFC 3629, e.g. 
https://datatracker.ietf.org/doc/html/rfc3629#section-4, or your 
favorite Unicode version, or ISO 10646).
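
As a rough illustration of the restricted definition, here is a minimal
Python sketch of how many bytes each code point takes under RFC 3629.
The function name utf8_length is made up for this example, and the
surrogate check reflects RFC 3629's exclusion of U+D800..U+DFFF, which
is not discussed above.

```python
def utf8_length(cp: int) -> int:
    """Return the number of bytes RFC 3629 UTF-8 uses for code point cp."""
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("outside the code space U+0000..U+10FFFF")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if cp <= 0x7F:
        return 1  # ASCII
    if cp <= 0x7FF:
        return 2
    if cp <= 0xFFFF:
        return 3  # rest of the BMP
    return 4      # planes 1 to 16


# Cross-check against Python's own encoder for one code point per length.
for cp in (0x41, 0xFC, 0x20AC, 0x1F600):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
```

Everything in the BMP fits in 3 bytes, and the 16 supplementary planes
are what make the fourth byte necessary.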

On 2023-05-28 14:05, Willy Tarreau wrote:
> On Sun, May 28, 2023 at 05:51:49AM +0200, Julian Reschke wrote:

>> AFAIU, the UTF-8 encoding/decoding function (sequence of code points to
>> octets and vice versa) never has changed (see
>> https://datatracker.ietf.org/doc/html/rfc3629#section-3). Am I missing
>> something here?

The actual mapping function, at the places where it matters, indeed
hasn't changed. But the domain and range have changed from the early
max 6 bytes to the current max 4 bytes.
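
To make that concrete, here is a small Python sketch, assuming the bit
layout of the FSS-UTF table quoted further down. The names
encode_legacy and encode_rfc3629 are hypothetical. The second function
is the first one with a narrower domain, nothing more.

```python
# Largest code point for each sequence length (1 to 6 bytes), and the
# lead-byte marker for that length, per the original FSS-UTF layout.
_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]
_LEADS = [0x00, 0xC0, 0xE0, 0xF0, 0xF8, 0xFC]


def encode_legacy(cp: int) -> bytes:
    """Encode with the original mapping: up to 6 bytes, 31-bit domain."""
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("outside the 31-bit code space")
    # Pick the shortest sequence length n that can hold cp.
    n = next(i for i, limit in enumerate(_LIMITS, start=1) if cp <= limit)
    tail = []
    for _ in range(n - 1):
        tail.append(0x80 | (cp & 0x3F))  # continuation byte: 10vvvvvv
        cp >>= 6
    # The remaining high bits go into the lead byte.
    return bytes([_LEADS[n - 1] | cp] + tail[::-1])


def encode_rfc3629(cp: int) -> bytes:
    """Same mapping, restricted to the domain RFC 3629 allows."""
    if cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not encodable under RFC 3629")
    return encode_legacy(cp)


assert encode_rfc3629(0x20AC) == "\u20ac".encode("utf-8")
assert len(encode_legacy(0x7FFFFFFF)) == 6
```

For any code point that RFC 3629 permits, both functions produce the
same bytes; only the set of accepted inputs (and hence the maximum
output length) differs.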

Regards,   Martin.

> No you're indeed right. But I have clear memories of this "common"
> approach of iterating over a string as long as (c & 0xc0) == 0x80
> (which was the main concern) as well as the possibility of larger
> code sequences they didn't want to support (that was in early
> 2000/2001). I'm still seeing traces of this in the FSS-UTF proposal:
> 
>    https://www.cl.cam.ac.uk/~mgk25/ucs/utf-8-history.txt
> 
>       Bits  Hex Min  Hex Max  Byte Sequence in Binary
>    1    7  00000000 0000007f 0vvvvvvv
>    2   11  00000080 000007FF 110vvvvv 10vvvvvv
>    3   16  00000800 0000FFFF 1110vvvv 10vvvvvv 10vvvvvv
>    4   21  00010000 001FFFFF 11110vvv 10vvvvvv 10vvvvvv 10vvvvvv
>    5   26  00200000 03FFFFFF 111110vv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
>    6   31  04000000 7FFFFFFF 1111110v 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv 10vvvvvv
> 
> So maybe back then I only had to implement the 16-bit one and they
> later wanted to support the 21-bit one as well, I don't remember the
> exact details. But there's less risk if the standardized codes have
> a fixed maximum length, I agree. I just don't want to have to validate
> them when forwarding header fields ;-)
> 
> Regards,
> willy
>
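
For readers who have not seen the (c & 0xc0) == 0x80 idiom Willy
mentions, here is a hedged Python sketch of it. The function name
split_sequences is invented for this example. It only groups bytes and
deliberately performs no validation, which is the situation a
forwarding intermediary would be in.

```python
def split_sequences(data: bytes) -> list:
    """Split data into lead byte + continuation bytes, without validating."""
    out = []
    i = 0
    while i < len(data):
        j = i + 1
        # Continuation bytes have the form 10vvvvvv.
        while j < len(data) and (data[j] & 0xC0) == 0x80:
            j += 1
        out.append(data[i:j])
        i = j
    return out


assert split_sequences("a\u20ac".encode("utf-8")) == [b"a", b"\xe2\x82\xac"]
# A legacy 6-byte sequence is grouped just the same: the loop has no
# notion of a maximum length.
assert split_sequences(b"\xfd\xbf\xbf\xbf\xbf\xbf") == [b"\xfd\xbf\xbf\xbf\xbf\xbf"]
```

Because the loop accepts any run of continuation bytes, enforcing the
4-byte maximum has to happen in a separate validation step.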