Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-06.txt
"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Mon, 09 October 2023 09:49 UTC
Message-ID: <9f16c41b-f2c3-8a05-7be6-585cf965fd5d@it.aoyama.ac.jp>
Date: Mon, 09 Oct 2023 18:49:13 +0900
To: Tim Bray <tbray@textuality.com>, "Manger, James" <James.H.Manger@team.telstra.com>
Cc: "i18ndir@ietf.org" <i18ndir@ietf.org>, ART Area <art@ietf.org>
References: <169566019635.41806.9804796677919971070@ietfa.amsl.com> <CAHBU6is-wU2NLXNWL56nSJ4=nKvDzGv_Aw4qJN6N2O8CuM4-yw@mail.gmail.com> <SYBPR01MB59814B3448F5754AAEDA1740E5C7A@SYBPR01MB5981.ausprd01.prod.outlook.com> <CAHBU6iueqtd5T1T-ciYUMWvmo8XqBQqO5LkWbdRaoXQzPYSQOQ@mail.gmail.com> <SYBPR01MB59819A9F0BDD785F74EB2855E5C7A@SYBPR01MB5981.ausprd01.prod.outlook.com> <CAHBU6iu_PUdWXk52UfnoYo7-e0s+tWfiWqy5i+QrrvgJhYOenQ@mail.gmail.com>
From: "Martin J. Dürst" <duerst@it.aoyama.ac.jp>
Organization: Aoyama Gakuin University
In-Reply-To: <CAHBU6iu_PUdWXk52UfnoYo7-e0s+tWfiWqy5i+QrrvgJhYOenQ@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/i18ndir/vv1yt_c-nXkjJijLoc-TvevQRGc>
Subject: Re: [I18ndir] [art] Fwd: New Version Notification for draft-bray-unichars-06.txt
Hello Tim, others,

On 2023-10-02 02:31, Tim Bray wrote:
> On Sep 30, 2023 at 6:53:28 PM, "Manger, James" <
> James.H.Manger@team.telstra.com> wrote:
>> Explaining the 1,081,344 size and the U+D800-U+DFFF gap would be
>> interesting.
>>
>
> Yes! That history is mystery to me.

Well, at the start, Unicode was pure 16-bit only, and ISO 10646 was
officially all 32 bits. They had different encoding principles and
character allocations initially (that was Unicode 1.0 and some ISO 10646
draft), but users told both sides that having two different universal
character standards wasn't really what they wanted (one of the few
examples where https://xkcd.com/927/ didn't hold :-).

So the encoding principles (e.g. precomposed (ISO) vs. decomposed
(Unicode)), character repertoires, and code point assignments got
aligned (*), but the 16-bit/32-bit difference stayed.

(*) Well, for precomposed vs. decomposed, it was a compromise (let's
tolerate both), with NFD to define the correspondence and later NFC for
a more practical "default representation".

The fact that Unicode tried to keep things within 16 bits was actually
in many ways beneficial; it forced careful allocation of characters
where an open 32-bit space could have led to much more wasteful
allocations. (Korean Hangul being the most notable exception; with
something as human as human writing and its encoding, there are always
exceptions.)

At that point, implementations started to show up, e.g. Windows NT and
Java and the like, and they used Unicode, i.e. "16-bit characters".

A bit later, it became clear that a 16-bit space was not enough. But it
also became clear that a 32-bit space was way too much. So people were
looking for ways to carry around a space somewhat wider than 16 bits,
but still encodable in 16-bit code units. The end of the 16-bit space
had already been taken (among other things by the special meanings of
U+FFFF and U+FFFE), but some contiguous spaces were still available.

I assume that after playing around with various ideas (but I have never
heard about these), people came up with the current solution: Reserve
two contiguous blocks of 16-bit code units (now called the high and low
surrogates) of 1024 code units each to encode a total of 2^20 additional
code points and call the result UTF-16.

> Also the nonchars in the Arabic-extended region.

Do you mean U+FDD0..U+FDEF in the Arabic Presentation Forms-A block? If
not, please say what else.

As the description in the code chart says, these are for
process-internal use. Let's say you want to encode some internal
formatting information in an application, or want to implement a clever
text searching algorithm that only works with some special code points
that are not used for actual characters. Then you can use these, if you
make sure they never leak.

I'm not sure exactly why these were introduced, but my guess is that
they were added with the Object Replacement Character (U+FFFC), which
was an example of such an "application-internal" use which, as far as I
remember, was discovered when Microsoft objected to encoding something
in that position. (To be fair, Microsoft was not the only implementer
who used the trick of hanging in-line images and the like off a special
character.)

For more details, please see the Unicode/L2 documents linked from
https://en.wikipedia.org/wiki/Arabic_Presentation_Forms-A.

Regards, Martin.
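
For readers who want to see the surrogate arithmetic from the message above spelled out, here is a minimal Python sketch. It is an illustration only and not taken from the message or from any specification text; the function names (to_surrogates, from_surrogates, is_noncharacter) are made up for this note. It assumes the usual UTF-16 layout, with high surrogates at U+D800..U+DBFF and low surrogates at U+DC00..U+DFFF, and treats as noncharacters U+FDD0..U+FDEF plus the last two code points of each of the 17 planes.

```python
HIGH_START = 0xD800   # first high surrogate
LOW_START = 0xDC00    # first low surrogate
BLOCK_SIZE = 0x400    # 1024 code units in each surrogate block


def to_surrogates(cp):
    """Split a code point in U+10000..U+10FFFF into a (high, low) pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = cp - 0x10000  # 20-bit value: 0 .. 2**20 - 1
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - HIGH_START) << 10) + (low - LOW_START)


def is_noncharacter(cp):
    """True for U+FDD0..U+FDEF and for U+nFFFE / U+nFFFF in every plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    # 1024 high x 1024 low surrogates address 2**20 extra code points.
    print(BLOCK_SIZE * BLOCK_SIZE == 2**20)
    high, low = to_surrogates(0x1F600)
    print(f"U+1F600 -> {high:04X} {low:04X}")
    print(f"back    -> U+{from_surrogates(high, low):X}")
```

The two 10-bit halves of the offset are what make each block exactly 1024 code units wide, and the fixed 0x10000 bias is why the additional space starts right after the original 16-bit range.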