Re: [Multiformats] Multiformats Considered Harmful

"Martin J. Dürst" <duerst@it.aoyama.ac.jp> Mon, 11 September 2023 05:06 UTC

Archived-At: <https://mailarchive.ietf.org/arch/msg/multiformats/-FGj7J_J7Y_S5l4wXauafb9__DE>

On 2023-09-07 21:09, bumblefudge von CASA wrote:
> Dear Prof Bormann and Melvin Carvalho:
> 
> Thank you very much for your bibliographic erratum and expression of interest, respectively. They're incorporated below in the longer response.
> 
> Dear Mike Jones:
> 
> Thank you for taking the time to spell out your feedback. I believe that clarifying the spec in a few places can address most of your concerns.

[snip]

>> 2. The stated purpose of "multibase<https://www.ietf.org/archive/id/draft-multiformats-multibase-08.html>" is "Unfortunately, it's not always clear what base encoding is used; that's where this specification comes in. It answers the question: Given data 'd' encoded into text 's', what base is it encoded with?", which is wholly unnecessary. Successful standards DEFINE what encoding is used where. For instance, https://www.rfc-editor.org/rfc/rfc7518.html#section-6.2.1.2 defines that "x" is base64url encoded. No guesswork or prefixing is necessary or useful.
> 
> Successful standards for self-describing encodings have also been adopted by communities rather than relying on fixed registries and protocols that define encodings in advance and universally. While it could be argued that interoperability with those kinds of protocol could be difficult, inefficient, or, in some cases, even zero-sum, I do not feel a universal claim to harmfulness is warranted, particularly without defining the criterion and context of interoperability.

There are examples both of cases where a single encoding is best and of
cases where allowing some variety is best. And there are examples where
the best
choice varied over time. The difficulty may be to figure out which case 
one is looking at. Just because different people prefer different things 
isn't a reason to invent some "multi"thing; it may just be a sign that 
the consensus process isn't going too well (yet).

[longish example, sorry; you may skip to the end]
As an example, character encodings (usually designated using the 
"charset" parameter) for the Web evolved from a start with a single, 
undeclared character encoding (iso-8859-1, covering most of Western 
Europe). The reason for this was that it was simplest to start this way, 
and the choice of encoding resulted from the time and place the Web was 
created (1989 in Geneva).

Later, the charset parameter on the Content-Type header field was 
introduced. This allowed a wide variety of character encodings covering 
wide swaths of the world's scripts and languages. This was necessary at 
a time when a uniform, global encoding was "in the making", but not yet 
widely available. Browsers supported different subsets of character
encodings, and people may have chosen their browser based, among other
things, on which character encodings it supported.

In RFC 2070 (https://www.rfc-editor.org/rfc/rfc2070), we managed to at 
least nail down numeric character references (e.g. &#xABCD;) in HTML to 
Unicode. Later, Unicode, in the form of UTF-8, became more and more 
popular, to the extent that the WHATWG spec simply calls it "The
encoding" (see https://encoding.spec.whatwg.org/#the-encoding).
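
To illustrate the point about numeric character references: because the
reference itself is written in plain ASCII and always names a Unicode
code point, it resolves to the same character whatever "charset" the
page is served in. The following is only a minimal sketch using Python's
standard library; the helper name decode_page and the sample string are
made up for illustration.

```python
import html


def decode_page(raw: bytes, charset: str) -> str:
    """Decode page bytes with the declared charset, then resolve references."""
    return html.unescape(raw.decode(charset))


# Hypothetical page fragment; "&#xE9;" names the Unicode code point U+00E9.
source = "caf&#xE9;"

for charset in ("iso-8859-1", "utf-8", "shift_jis"):
    raw = source.encode(charset)  # the reference is pure ASCII in each case
    assert decode_page(raw, charset) == "caf\u00e9"
```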

Nowadays, UTF-8 usage is reported at 98% of the Web (see
https://w3techs.com/technologies/overview/character_encoding). Other 
encodings are still in use, and browsers support them because it's not 
too much work to keep old code. But they have mostly made it invisible 
for the user (see e.g. 
https://support.mozilla.org/en-US/kb/text-encoding-no-longer-available-firefox-menu).

As for registration of new "charset"s, effectively nothing is going on, 
because there's no need for anything new. UTF-8 covers everything there 
is, and if it's not in Unicode, it's easier to deploy by adding it to 
Unicode than by defining a new "charset". And many if not all new
protocols/formats restrict themselves to UTF-8 only these days, for 
obvious reasons.
[end of longish example]

Multibase deals with something much simpler than characters, namely 
bytes. The different bases were mostly introduced because the different 
contexts where the data would be carried had different restrictions on 
the set of usable characters. Putting various bases together with
multibase means that such restrictions can't be enforced. So the question
remains: Why multiple bases? The conclusion should be that multiple 
bases are not needed. In context, it should always be clear what base is 
used. And there should be no "out of context". If you are out of 
context, then you might not even know that you are dealing with a
multibase "thingy". If you have to specify that you're dealing with
multibase, you can just as well specify that you are dealing with baseX
(where X is a single
choice).
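
As a rough sketch of the difference (not taken from either draft's
reference code): the first decoder below has to inspect a prefix
character and dispatch on it, while the second relies on the
surrounding specification having fixed the base, as RFC 7518 does for
"x". The function names are hypothetical, and only two entries of the
multibase prefix table ("f" for lowercase base16, "u" for unpadded
base64url) are included.

```python
import base64


def _b64url_nopad_decode(text: str) -> bytes:
    """Decode base64url text that omits '=' padding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Small, illustrative subset of the multibase prefix table.
MULTIBASE_DECODERS = {
    "f": bytes.fromhex,          # base16, lowercase
    "u": _b64url_nopad_decode,   # base64url, no padding
}


def decode_multibase(text: str) -> bytes:
    """Self-describing: the first character says which base follows."""
    if not text:
        raise ValueError("empty multibase string")
    prefix, payload = text[0], text[1:]
    if prefix not in MULTIBASE_DECODERS:
        raise ValueError(f"unsupported multibase prefix: {prefix!r}")
    return MULTIBASE_DECODERS[prefix](payload)


def decode_fixed_field(text: str) -> bytes:
    """Context-defined: the spec for this field says it is base64url."""
    return _b64url_nopad_decode(text)


data = b"hello"
assert decode_multibase("f68656c6c6f") == data
assert decode_multibase("uaGVsbG8") == data
assert decode_fixed_field("aGVsbG8") == data
```

In both cases the caller already has to know which kind of string it
holds; the prefix only moves the choice of base from the specification
into the data.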

Interest in decentralization shouldn't mean that each and every element 
of a protocol is decentralized with many more options than necessary. 
Ideally, the options are available only at those points where they 
matter, but at all points where they matter. For base encodings, they 
don't matter; a single option (which is the same as no option) in each 
specific place is enough.

Regards,    Martin.