[Multiformats] IETF 116 Dispatch Multiformats: Responses to Martin Dürst and Murray Kucherawy

Manu Sporny <msporny@digitalbazaar.com> Mon, 10 April 2023 16:48 UTC

From: Manu Sporny <msporny@digitalbazaar.com>
Date: Mon, 10 Apr 2023 12:47:03 -0400
Message-ID: <CAMBN2CReEU5WD0p-fFLa0jUxMyTiu3ApQcCihh41W7ujGn+V-A@mail.gmail.com>
To: multiformats@ietf.org
Cc: duerst@it.aoyama.ac.jp, "Murray S. Kucherawy" <superuser@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/multiformats/MVnlVc3-8to5YuMEY4VadtpPjmk>

Murray Kucherawy said:
> Does that example string decode into anything?

The example string in the slide deck that I believe you're referring
to is: uaGVsbG8gd29ybGQ

IETF 116: DISPATCH Multiformats Slide Deck:
https://docs.google.com/presentation/d/1WjKoxB-fxkPpaXB4kUu8IsTwQ4qXFmka1ocwEVEeBpk/edit#slide=id.g21fdb344882_0_206

The Multibase prefix 'u' indicates base64url encoding without padding.
Stripping the 'u' prefix and feeding the rest (aGVsbG8gd29ybGQ) into a
base64url decoder, such as https://www.base64url.com/, gives you the
string: "hello world".
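
For anyone who would rather check this locally than use a website,
here is a minimal Python sketch of the same two steps. The function
name is my own and is not part of any Multiformats library, and it
only handles the 'u' prefix, not the rest of the Multibase table.

```python
import base64


def decode_multibase_base64url(value: str) -> bytes:
    """Decode a Multibase string that uses the 'u' (base64url, no padding) prefix."""
    if not value.startswith("u"):
        raise ValueError("this sketch only handles the 'u' Multibase prefix")
    payload = value[1:]  # strip the Multibase prefix character
    # Python's decoder expects padding, so restore the '=' characters
    # that the no-pad variant omits.
    padding = "=" * (-len(payload) % 4)
    return base64.urlsafe_b64decode(payload + padding)


decoded = decode_multibase_base64url("uaGVsbG8gd29ybGQ")
print(decoded.decode("ascii"))  # prints: hello world
```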

Martin Dürst said:
> For the "Hello World" example, can you express the charset?

The "hello world" example used during dispatch was expressed in
Multibase, which is an ASCII format. All other Multiformats are binary
formats, and some of the entries for text-based formats state which
charset to use. For example, the Multicodec entry for "JSON" says that
the data is UTF-8 encoded.

There are no UTF-16 entries that I'm aware of, most likely because
Multiformats are rarely used to express string values. The
"plaintextv2" entry is missing a charset designation (which is likely
a registration error). One presumption we could make is that the
formats that carry the Multiformat value (such as CBOR, YAML, and
JSON) already have clear rules around charset encoding or, if
Multiformats are used as a byte header for a JSON or CBOR object, that
the corresponding format has clear charset rules that should be used.
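
To make the byte-header case concrete, here is a rough Python sketch
of how a reader might pick a charset from a Multicodec header. It
assumes the header is an unsigned varint and that the "JSON" entry is
code 0x0200; the charset lookup table is purely hypothetical, since
nothing in the Multiformats documentation that I know of defines one.

```python
JSON_CODE = 0x0200  # assumed Multicodec code for "JSON"

# Hypothetical mapping from Multicodec code to charset (not a real registry).
HYPOTHETICAL_CHARSETS = {JSON_CODE: "utf-8"}


def read_uvarint(data: bytes) -> tuple[int, int]:
    """Return (value, bytes consumed) for an unsigned varint at the start of data."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift  # low 7 bits carry the payload
        if (byte & 0x80) == 0:  # high bit clear marks the last byte
            return value, index + 1
        shift += 7
    raise ValueError("truncated varint")


def decode_text(data: bytes) -> str:
    """Strip the Multicodec header and decode the rest with the mapped charset."""
    code, header_len = read_uvarint(data)
    charset = HYPOTHETICAL_CHARSETS.get(code)
    if charset is None:
        raise ValueError(f"no charset known for codec 0x{code:x}")
    return data[header_len:].decode(charset)


# 0x0200 encoded as an unsigned varint is the two bytes 0x80 0x04.
message = bytes([0x80, 0x04]) + '{"name": "Dürst"}'.encode("utf-8")
text = decode_text(message)
print(text)
```

An entry without a charset designation, such as "plaintextv2", would
fall through to the error branch, which is exactly the gap described
above.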

That said, your concern is not entirely addressed. There is nothing in
the documentation around Multiformats that I have read to date that
gives a clear and definitive answer to your question.

> These questions need to be answered, not sure what alternative might be used for that.

Your charset question could be a signal that a mini-WG might be a
better option. When you asked your question, my internal dialogue was:
"Oh, that's interesting... why has that not come up in the
Multiformats community at all during the last 5+ years? And why is the
ecosystem able to operate w/o answering that question?". It might be
that I'm unaware of where that discussion happened, but I agree with
you, we should have a definitive answer.

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
https://www.digitalbazaar.com/