Re: [Cbor] Use and development of draft-faltstrom-base45 (was: Re: CBOR in QRcodes)

Doug Ewell <doug@ewellic.org> Fri, 25 June 2021 19:38 UTC

From: "Doug Ewell" <doug@ewellic.org>
To: "'Christian Amsüss'" <christian@amsuess.com>, "'Michael Richardson'" <mcr+ietf@sandelman.ca>
Cc: <cbor@ietf.org>
References: <9704.1624378576@localhost> <YNN05Efh4/8Xyt63@hephaistos.amsuess.com> <000201d76914$53aa4890$fafed9b0$@ewellic.org> <YNWdchJRQRzm3I9j@hephaistos.amsuess.com> <25746.1624636305@localhost> <YNX/gD4ScCFnDmWW@hephaistos.amsuess.com>
In-Reply-To: <YNX/gD4ScCFnDmWW@hephaistos.amsuess.com>
Date: Fri, 25 Jun 2021 13:38:22 -0600
Message-ID: <000b01d769f9$a9dfab20$fd9f0160$@ewellic.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/9seqHd9NqCa_P5Rp7QM-1qQ5L14>

My understanding is also that base45 is intended for the specific alphanumeric encoding for QR codes defined in ISO 18004:2015, sections 7.3.4 and 7.4.4.

Given that, whether an alphabet of size 45 is maximally efficient, or whether a "better" alphabet could have been chosen, is really not at issue. There is an installed base (QR codes have been around for more than two decades), so there is a significant and probably prohibitive cost to changing it.

I do note, however, that the specific application described in draft-faltstrom-base45 — encoding 16-bit integral values in three base45 code units, and encoding a trailing 8-bit value in two — does not appear to be part of the standard, so that part could lend itself to "enhancement," although I'm not clear exactly what that would be.
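
For concreteness, here is a minimal sketch of that mapping in Python. It is not taken from the draft text; the big-endian byte pairing and the least-significant-digit-first output order reflect my reading of draft-faltstrom-base45, and the function name is only illustrative.

```python
# QR alphanumeric-mode character set in index order 0..44
# (digits, upper-case letters, then space $ % * + - . / :).
ALPHABET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def base45_encode(data: bytes) -> str:
    """Illustrative encoder: 2 bytes -> 3 code units, trailing byte -> 2."""
    out = []
    # Full byte pairs: treat each pair as one 16-bit big-endian value
    # (assumption based on my reading of the draft).
    for i in range(0, len(data) - 1, 2):
        n = data[i] * 256 + data[i + 1]
        # Emit three base-45 digits, least significant first.
        for _ in range(3):
            n, digit = divmod(n, 45)
            out.append(ALPHABET[digit])
    # A trailing single byte (0..255) fits in two base-45 digits.
    if len(data) % 2:
        high, low = divmod(data[-1], 45)
        out.append(ALPHABET[low])
        out.append(ALPHABET[high])
    return "".join(out)


print(base45_encode(b"AB"))  # prints BB8
```

The pair "AB" is 65*256 + 66 = 16706 = 11 + 11*45 + 8*45*45, which gives the code units B, B, 8.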

Creating incompatible branches of base45 for different applications (such as URIs) still seems like a decision not well supported by experience.

--
Doug Ewell, CC, ALB | Lakewood, CO, US | ewellic.org


-----Original Message-----
From: Christian Amsüss <christian@amsuess.com> 
Sent: Friday, June 25, 2021 10:09
To: Michael Richardson <mcr+ietf@sandelman.ca>
Cc: Doug Ewell <doug@ewellic.org>; cbor@ietf.org
Subject: Re: Use and development of draft-faltstrom-base45 (was: Re: [Cbor] CBOR in QRcodes)

On Fri, Jun 25, 2021 at 11:51:45AM -0400, Michael Richardson wrote:
> I don't have enough caffeine in me to figure out how much more efficient
> base45 is over base32.  I think it's log(45)/log(32), right?

The efficiency of base45 on its own is pretty irrelevant, because it's something that'd only be used inside an environment where each of these 45 characters is encoded in 5.5 bits of the QR code.

base45 does a 2-to-3 encoding, so each 8-bit byte of binary data gets spread over 24/2 = 12 bits of ASCII base45 string, but those 1.5 characters are encoded in 1.5*5.5 = 8.25 bits, so in the end it's a 3% loss.

(The same would be the case for the 2-to-3 encodings possible with base41 through base44.)
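
The arithmetic above can be checked with a few lines of Python. This is only a sketch: the constant and variable names are made up for illustration, and the 5.5 bits per character is the figure used above.

```python
from fractions import Fraction

# Bits the QR code spends per alphanumeric character (figure used above).
QR_BITS_PER_CHAR = Fraction(11, 2)

# base45: 2 payload bytes become 3 characters.
chars_per_byte = Fraction(3, 2)
qr_bits_per_byte = chars_per_byte * QR_BITS_PER_CHAR   # 33/4 = 8.25 bits
base45_overhead = qr_bits_per_byte / 8 - 1             # 1/32, about 3%

# Smallest alphabet size for which 3 characters can still represent
# every 16-bit value, i.e. for which a 2-to-3 encoding is possible.
smallest_2_to_3 = min(n for n in range(2, 46) if n ** 3 >= 2 ** 16)

print(float(qr_bits_per_byte), float(base45_overhead), smallest_2_to_3)
```

Since 40**3 = 64000 is below 65536 and 41**3 = 68921 is above it, 41 is the smallest size that works.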

> In places where base64 or base85 won't work, it seems to me that base32 is
> probably safer than base45.   As I wrote, I think that the limitation for
> QRcode entry has to do with forms on web pages or 3270 terminals, where the
> scanner is a bump-in-the-cord on the keyboard interface.  Whether it's a
> PS2/PS2 or a second USB input.

The RFC4648 base32 alphabet does work, but it's a 5-to-8 encoding, so each payload byte becomes 1.6 characters, which take 1.6*5.5 = 8.8 bits of QR-internal binary, so it's 10% instead of 3% overhead.
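
The same kind of check for base32, again as a sketch with illustrative names, assuming 5.5 QR bits per character and ignoring any "=" padding (which the QR alphanumeric set could not carry anyway):

```python
from fractions import Fraction

QR_BITS_PER_CHAR = Fraction(11, 2)  # QR bits per alphanumeric character


def qr_overhead(bytes_in: int, chars_out: int) -> Fraction:
    """Relative overhead of a bytes_in-to-chars_out text encoding in a QR code."""
    qr_bits_per_payload_byte = Fraction(chars_out, bytes_in) * QR_BITS_PER_CHAR
    return qr_bits_per_payload_byte / 8 - 1


base32_overhead = qr_overhead(5, 8)   # 1.6 * 5.5 = 8.8 bits per byte -> 1/10
base45_overhead = qr_overhead(2, 3)   # 1.5 * 5.5 = 8.25 bits per byte -> 1/32

print(float(base32_overhead), float(base45_overhead))  # prints 0.1 0.03125
```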

Whether that really matters all that much is, as far as generic recommendations for QR codes go, probably up for debate. (Even in the health certificate use case there was discussion that if the boundary conditions had been different, base32 would have been a good choice.)

> I'm surprised that base45 includes space, + and / :-)

And the percent character; that's part of the trouble (but then again, up to 4 characters can be removed without loss of efficiency, since base41 still allows a 2-to-3 encoding).

BR
Christian

--
There's always a bigger fish.
  -- Qui-Gon Jinn