### Re: [81attendees] What is it at the bottom of restaurant receipts?

"Worley, Dale R (Dale)" <dworley@avaya.com> Sat, 20 August 2011 03:47 UTC

From: "Worley, Dale R (Dale)" <dworley@avaya.com>

Date: Fri, 19 Aug 2011 23:48:41 -0400

I went through the math, and it looks like there are 256 symbols in the set of dingbats.

I did a maximum-likelihood calculation, or rather, a Bayesian analysis: assume that all alphabet sizes that are powers of 2 are a-priori equally likely, look at a "large" set of dingbats, note the pattern of duplications among them, and compute the a-posteriori probabilities of the alphabet size. (This assumes that the dingbats are statistically random.)

I had to reconstruct the formula for "If you have N symbols, and draw from them randomly n times, allowing duplicates, the resulting multiset will be a partition of n. Given that partition, what is the probability of that partition resulting, as a function of N?" Throwing away some factors which are independent of N, and so don't affect the Bayes Rule calculation, the resulting probability is a function only of N, n, and the number of parts in the partition, k:

P = descendingfactorial(N, k)/N**n

Taking 5 slips that were perfectly readable, there were 60 symbols, which included 3 triplets, one pair, and 49 unique symbols:

60 = 3 + 3 + 3 + 2 + (49)*1

Plugging this into the formula (n = 60, k = 53) gives:

| N | ln(P) |
|---|---|
| 64 | -61.867 |
| 128 | -46.607 |
| 256 | -44.610 |
| 512 | -46.457 |
| 1024 | -49.890 |
| 2048 | -54.051 |
| 4096 | -58.562 |
| 8192 | -63.245 |
| 16384 | -68.013 |

So it looks like there are 256 symbols, and they carry 8 bits of information each, which isn't surprising.

Checking the Unicode charts, almost all of the symbols are on the page U+22xx, "Mathematical Operators". I haven't tracked down the rest, some of which are seriously obscure, but some seem to be sans-serif Hebrew. I am guessing that the symbols not in U+22xx are there to replace some symbols on that page that are too much like others.

So the 12 dingbats on a receipt contain 96 bits. I still favor the idea that they are some sort of keyed hash of the data in the barcode, but I can't figure out how it would be used operationally, since a Revenue Quebec bar-code reader could easily read a keyed hash and verify the signature.
The dingbat string has to be compared by eye with another display in the same format.

And then there's a receipt whose 12 symbols are two repetitions of a sequence of 6 symbols. This is so incredibly unlikely that there must be something wrong with it.

Dale
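The calculation described in the message can be sketched in Python. This is a minimal sketch, not the author's actual code: the names `log_likelihood`, `sizes`, `posterior`, and `p_repeat` are made up for illustration, and the range of candidate sizes (64 through 16384) is simply the range shown in the message. Here descendingfactorial(N, k) is taken to mean N(N-1)...(N-k+1) = N!/(N-k)!, and the logarithms are natural logs. Under those assumptions it should reproduce the ln(P) values quoted above. The last line estimates how unlikely the repeated 6-symbol receipt is, assuming 256 equally likely, independent symbols.

```python
import math

def log_likelihood(N, n, k):
    """ln of descendingfactorial(N, k) / N**n (N-independent factors dropped)."""
    if k > N:
        return float("-inf")  # cannot observe more distinct symbols than exist
    # descendingfactorial(N, k) = N! / (N - k)!, evaluated via log-gamma
    return math.lgamma(N + 1) - math.lgamma(N - k + 1) - n * math.log(N)

# Counts from the message: 3 triplets, 1 pair, 49 symbols seen once
parts = [3, 3, 3, 2] + [1] * 49
n = sum(parts)   # 60 symbols drawn in total
k = len(parts)   # 53 distinct symbols (parts of the partition)

sizes = [2 ** e for e in range(6, 15)]  # candidate alphabet sizes 64 .. 16384
loglik = {N: log_likelihood(N, n, k) for N in sizes}

# Uniform prior over the candidates, so the posterior is the normalized likelihood.
# Subtract the peak before exponentiating to avoid underflow.
peak = max(loglik.values())
weights = {N: math.exp(v - peak) for N, v in loglik.items()}
total = sum(weights.values())
posterior = {N: w / total for N, w in weights.items()}

for N in sizes:
    print(f"{N:6d}  {loglik[N]:9.3f}  {posterior[N]:.3f}")

# Chance that symbols 7-12 exactly repeat symbols 1-6 on one random receipt,
# assuming 256 equally likely, independent symbols: 256**-6 = 2**-48
p_repeat = 256.0 ** -6
print(f"repeat probability: {p_repeat:.3e}")
```

Working in logs with `math.lgamma` keeps the factorials from overflowing; the posterior column is just the likelihood renormalized over the listed sizes, so it depends on which candidates are included.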

