Re: [81attendees] What is it at the bottom of restaurant receipts?

"Worley, Dale R (Dale)" <> Sat, 20 August 2011 03:47 UTC

From: "Worley, Dale R (Dale)" <>
To: Richard Barnes <>, "John R. Levine" <>
Date: Fri, 19 Aug 2011 23:48:41 -0400

I went through the math, and it looks like there are 256 symbols in
the set of dingbats.

I did a maximum-likelihood calculation, or rather, a Bayesian analysis
assuming that all alphabet sizes that are powers of 2 are a priori
equally likely, then looking at a "large" set of dingbats, seeing
the pattern of duplications among them, and computing the a posteriori
probability of each alphabet size.  (This assumes that the dingbats
are statistically random.)

I had to reconstruct the formula for "If you have N symbols, and draw
from them randomly n times, allowing duplicates, the multiplicities in
the resulting multiset form a partition of n.  What is the probability
of getting a given partition, as a function of N?"

Throwing away the factors that are independent of N, which don't
affect the Bayes Rule calculation, the resulting probability is a
function only of N, n, and the number of parts in the partition, k:
P = descendingfactorial(N, k)/N**n,
where descendingfactorial(N, k) = N*(N-1)*...*(N-k+1).
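
As a minimal sketch of the formula above, here it is in Python, working
in logs so that the big factorials never have to be formed.  The
function name log_likelihood is mine, and using lgamma for the
descending factorial is just a convenience, not necessarily how the
original numbers were produced.

```python
from math import lgamma, log

def log_likelihood(N: int, n: int, k: int) -> float:
    """ln of descendingfactorial(N, k) / N**n (N-independent factors dropped)."""
    if k > N:
        # More distinct symbols observed than the alphabet contains: impossible.
        return float("-inf")
    # ln(N! / (N - k)!) computed with log-gamma: lgamma(x + 1) == ln(x!)
    log_falling = lgamma(N + 1) - lgamma(N - k + 1)
    return log_falling - n * log(N)

# n = 60 draws with k = 53 distinct symbols, alphabet size 256
print(round(log_likelihood(256, 60, 53), 3))
```

The printed value should agree with the N = 256 entry of the table
further down, to the three decimals given there.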

Taking 5 slips that were perfectly readable, there were 60 symbols,
which included 3 triplets, one pair, and 49 unique symbols:

60 = 3 + 3 + 3 + 2 + (49)*1

so n = 60 and k = 3 + 1 + 49 = 53.
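
For anyone who wants to repeat the tally on their own slips, a small
sketch of the counting step follows.  The helper partition_stats is a
name I made up, and the sample uses integer labels as stand-ins for the
real symbols; only the multiplicities match the slips.

```python
from collections import Counter

def partition_stats(symbols):
    """Return (n, k, parts): draws, distinct symbols, sorted multiplicities."""
    counts = Counter(symbols)
    n = sum(counts.values())          # total number of symbols observed
    k = len(counts)                   # number of parts in the partition
    parts = sorted(counts.values(), reverse=True)
    return n, k, parts

# Synthetic stand-in: 3 triplets, 1 pair, 49 singletons
sample = [0] * 3 + [1] * 3 + [2] * 3 + [3] * 2 + list(range(4, 53))
n, k, parts = partition_stats(sample)
print(n, k, parts[:5])
```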

Plugging this into the formula gives:

         N     ln(P)
        64   -61.867
       128   -46.607
       256   -44.610
       512   -46.457
      1024   -49.890
      2048   -54.051
      4096   -58.562
      8192   -63.245
     16384   -68.013
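
To turn those log-likelihoods into a posteriori probabilities, here is
a rough sketch under two assumptions of mine: the prior is uniform over
just the nine sizes in the table (smaller powers of 2 are ruled out
anyway, since 53 distinct symbols were seen), and the function names
are hypothetical.

```python
from math import exp, lgamma, log

def log_likelihood(N, n, k):
    """ln of descendingfactorial(N, k) / N**n."""
    if k > N:
        return float("-inf")
    return lgamma(N + 1) - lgamma(N - k + 1) - n * log(N)

def posterior(sizes, n, k):
    """Posterior over candidate alphabet sizes, uniform prior assumed."""
    logs = {N: log_likelihood(N, n, k) for N in sizes}
    peak = max(logs.values())
    # Subtract the peak before exponentiating to avoid underflow.
    weights = {N: exp(v - peak) for N, v in logs.items()}
    total = sum(weights.values())
    return {N: w / total for N, w in weights.items()}

sizes = [2 ** e for e in range(6, 15)]   # 64 .. 16384, as in the table
post = posterior(sizes, n=60, k=53)
for N in sizes:
    print(f"{N:6d}  {post[N]:.4f}")
```

With that prior, 256 comes out with roughly three quarters of the
probability mass, and 128 and 512 share most of the rest.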

So it looks like there are 256 symbols, and they carry 8 bits of
information each, which isn't surprising.

Checking the Unicode charts, almost all of the symbols are on the page
U+22xx, "Mathematical Operators".  I haven't tracked down the rest, some
of which are seriously obscure, but some seem to be sans-serif Hebrew.
I am guessing that the symbols not in U+22xx are to replace some
symbols on that page that are too much like others.

So the 12 dingbats on a receipt contain 12 * 8 = 96 bits.

I still favor the idea that they are some sort of keyed hash of the
data in the barcode, but I can't figure out how it would be used
operationally, since a Revenue Quebec bar-code reader could just as
easily read a keyed hash and verify the signature.  The dingbats would
have to be meant for comparison by eye with another display in the
same format.

And then there's a receipt whose 12 symbols are two repetitions of a
sequence of 6 symbols.  This is so incredibly unlikely that there must
be something wrong with it.
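
To put a number on "incredibly unlikely": if the 12 symbols were
independent and uniform over 256 values (the same randomness assumption
as before), the chance that the second six exactly repeat the first six
is 1/256**6.  A quick sketch, with repeat_probability being a name of
my own:

```python
from fractions import Fraction

def repeat_probability(N: int, half_len: int) -> Fraction:
    """Chance that a random string of 2*half_len symbols is one block repeated twice."""
    # The first block is free; each of the half_len symbols in the second
    # block must match its counterpart, with probability 1/N each.
    return Fraction(1, N ** half_len)

p = repeat_probability(256, 6)
print(p, float(p))
```

That is 2**-48, a few parts in 10**15 per receipt.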