[openpgp] Crowdsourcing Base214
Phillip Hallam-Baker <phill@hallambaker.com> Wed, 29 April 2015 14:15 UTC
Return-Path: <hallam@gmail.com>
X-Original-To: openpgp@ietfa.amsl.com
Delivered-To: openpgp@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id C219F1A6F33 for <openpgp@ietfa.amsl.com>; Wed, 29 Apr 2015 07:15:27 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 2.602
X-Spam-Level: **
X-Spam-Status: No, score=2.602 tagged_above=-999 required=5 tests=[BAYES_40=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, FM_FORGED_GMAIL=0.622, FREEMAIL_FROM=0.001, FRT_PROFILE2=1.981, SPF_PASS=-0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PtzK4KxOuRq6 for <openpgp@ietfa.amsl.com>; Wed, 29 Apr 2015 07:15:26 -0700 (PDT)
Received: from mail-la0-x235.google.com (mail-la0-x235.google.com [IPv6:2a00:1450:4010:c03::235]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id D01F81A8932 for <openpgp@ietf.org>; Wed, 29 Apr 2015 07:13:26 -0700 (PDT)
Received: by layy10 with SMTP id y10so21120712lay.0 for <openpgp@ietf.org>; Wed, 29 Apr 2015 07:13:25 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:date:message-id:subject:from:to:cc:content-type; bh=af7N8bYypFLgUfvhhofRfmGXli7WUZdzc6BCkFdsDiQ=; b=ZKYio39Y+Vv5f6oM9VW6syuON0fDR/osG/56L4TsJMtBKrMwyZPQ8M8jY1MEX2sGCt aK9kynRNAUM/Br6n72S3wimBUkZOPyasI3YYomcAccAdOyb9IkRgSh9UTG1sxVp0TKwD yV0DbNYwjFyIDQemm8yqeKtCbIknEz2r+gY0BreN/38Pc8g35I+vA54/M1A4GUni/okF TtFtgmktLEBRbdcFhWNobeO7Upf2T3N69dPuQWVHLFJ7XvHmFcdQIB5RZSgBycPC8dN5 7aIb8USKrDcyJ76cvFuCs7yO96TNzlnI+eXy4wX0wkI8AOiT2gLApv8ayS22eQ5cokT4 oaKQ==
MIME-Version: 1.0
X-Received: by 10.153.7.104 with SMTP id db8mr15086820lad.124.1430316805285; Wed, 29 Apr 2015 07:13:25 -0700 (PDT)
Sender: hallam@gmail.com
Received: by 10.112.203.163 with HTTP; Wed, 29 Apr 2015 07:13:25 -0700 (PDT)
Date: Wed, 29 Apr 2015 10:13:25 -0400
X-Google-Sender-Auth: cfa3cuRo4wKSiMUKXKGytcgIkxo
Message-ID: <CAMm+LwhTidbfpMQYzJ2MNQ7cdLfjGPdAXFmH2O3XLt5eBF2F1g@mail.gmail.com>
From: Phillip Hallam-Baker <phill@hallambaker.com>
To: "Neal H. Walfield" <neal@walfield.org>
Content-Type: text/plain; charset="UTF-8"
Archived-At: <http://mailarchive.ietf.org/arch/msg/openpgp/-tviqA1oVrCcsyaiuwsaDLZHouw>
Cc: Alessandro Barenghi <alessandro.barenghi.polimi@gmail.com>, IETF OpenPGP <openpgp@ietf.org>
Subject: [openpgp] Crowdsourcing Base214
X-BeenThere: openpgp@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Ongoing discussion of OpenPGP issues." <openpgp.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/openpgp>, <mailto:openpgp-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/openpgp/>
List-Post: <mailto:openpgp@ietf.org>
List-Help: <mailto:openpgp-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/openpgp>, <mailto:openpgp-request@ietf.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Apr 2015 14:15:27 -0000
On Wed, Apr 29, 2015 at 6:08 AM, Neal H. Walfield <neal@walfield.org> wrote:
> I wonder if less is not more.
>
> If you look at the diceware list, it has "easy to remember words" like
> "aaaa", "abner" and "adair". And, this list is just 7776 words long.
> These are not only hard for a native speaker to memorize, but also for
> those who speak English as a second language.
>
> If we are going to make a new word list, I would recommend using
> something based on the Voice of America simple word list. This
> includes 1500 simple words, which all English speakers with basic
> proficiency are familiar with.
>
> Alternatively, there is the PGP Biometric word list [1], which aren't
> as simple, but are phonetically distinct.
>
> [1] https://en.wikipedia.org/wiki/Biometric_word_list

The larger the alphabet, the shorter the fingerprint. Since there is no need to keep the images/words on the device, the size of the dictionary is not that critical. Fingerprints with the PGP biometric list are rather too long.

Looking at the options, it seems like somewhere between 13 and 16 bits per glyph (inclusive) is the sweet spot. Above 64K entries, curating the list is just too hard. Back in 1995, memory constraints were very different.

I would very much like to keep the size of the fingerprint within the 7+/-2 working memory limit and provide at least 100 effective bits. That requires each glyph to encode at least 14 bits.

Presenting images in two sets of four seems to work quite well on an Apple Watch. And a smartphone seems to be able to present eight at once without too much hassle.

The big advantage of 14 bits is that it allows a direct mapping to the CJK unified characters in Unicode.

This looks to me to be an excellent opportunity to engage the wider community and to crowdsource parts of the process. There are hundreds of people willing to help. Give each person a part of the image space to curate and we can have the process done pretty quickly.
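As a rough check on the sizing argument above, here is a minimal Python sketch that counts how many glyphs are needed to reach 100 bits for several alphabet sizes. The function name `glyphs_needed` is made up for illustration. The alphabet sizes are the ones mentioned in this thread, plus the 256-word size of each PGP biometric list.

```python
def glyphs_needed(alphabet_size, target_bits=100):
    """Smallest n such that alphabet_size**n >= 2**target_bits (exact integer math)."""
    n = 1
    while alphabet_size ** n < 2 ** target_bits:
        n += 1
    return n


# Alphabet sizes discussed in the thread; labels are for display only.
candidates = [
    ("PGP biometric list (256 words)", 256),
    ("VOA simple word list (1500 words)", 1500),
    ("Diceware (7776 words)", 7776),
    ("2^13 entries", 2 ** 13),
    ("2^14 entries", 2 ** 14),
    ("2^16 entries", 2 ** 16),
]

for label, size in candidates:
    print(f"{label}: {glyphs_needed(size)} glyphs for 100 bits")
```

By this count the biometric list needs 13 words, the 1500-word list needs 10, Diceware needs 8, a 2^14 alphabet needs 8 and a 2^16 alphabet needs 7. Seven 14-bit glyphs carry 98 bits, so eight glyphs (two sets of four) is the first count that clears 100 bits at 14 bits per glyph.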
So let's say someone has 'road motor transport' for 256 entries. She then breaks that down into 'cars', 'trucks', 'buses' and 'motorcycles', and then within each category finds 64 distinctly different examples. Someone else does the same for 'unpowered transport', 'marine transport', etc.

A wiki is probably sufficient for the necessary collaboration.

The purpose of this isn't just to get the best result. Engage the community and they become advocates and early adopters. And we need advocates who are not from the crypto community.

For the word lists, I am thinking that the best approach is to start off with a fairly large dictionary and filter it by putting it through Google Translate and seeing which distinct words survive translation from English to French and back. Then take the dictionary and machine translate it into 16-odd different languages as a starting point and compute Merkle trees over each individual corpus.

Probably the thing to do is begin with a Base 2^14 scheme which could be expanded to 2^16 if desired.
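To make the Base 2^14 idea concrete, here is a minimal Python sketch that splits the leading bits of a digest into 14-bit indices, each of which would select one glyph, word or image. Nothing here is a specified format. The names `to_indices` and `to_cjk`, the use of SHA-256, the choice to take the leading bits, and the mapping of index i to code point U+4E00 + i are all assumptions for illustration.

```python
import hashlib


def to_indices(data, bits_per_glyph=14, glyph_count=8):
    """Split the leading bits of data into glyph_count indices of bits_per_glyph bits."""
    needed = bits_per_glyph * glyph_count
    total = len(data) * 8
    if total < needed:
        raise ValueError("not enough input bits")
    # Keep only the leading `needed` bits of the input.
    value = int.from_bytes(data, "big") >> (total - needed)
    mask = (1 << bits_per_glyph) - 1
    return [
        (value >> (bits_per_glyph * (glyph_count - 1 - i))) & mask
        for i in range(glyph_count)
    ]


def to_cjk(indices):
    """Assumed mapping: index i -> code point U+4E00 + i (start of CJK Unified Ideographs)."""
    return "".join(chr(0x4E00 + i) for i in indices)


# Hypothetical input: any digest of at least 112 bits would do.
digest = hashlib.sha256(b"example key material").digest()
indices = to_indices(digest)
print(indices)
print(to_cjk(indices))
```

Passing `bits_per_glyph=16` gives the 2^16 expansion with no other change to `to_indices`. The same indices could equally select entries in a curated word or image list rather than CJK characters.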