Re: UUID version 6 proposal, initial feedback

Brad Peabody <bradgareth@gmail.com> Sat, 01 February 2020 21:29 UTC

To: IETF discussion list <ietf@ietf.org>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>, "Salz, Rich" <rsalz@akamai.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/ietf/AIzsZ2h7eu1iWK9HGKNlpQLxc3o>

> Another recommendation: publish your proposal as an Internet-Draft,
> rather than expecting people to visit a proprietary site that is known
> to be a serious privacy threat.

Fair point, will do.

> The uuidd daemon is used by the UUID library to generate universally unique identifiers (UUIDs), especially time-based UUIDs, in a secure and guaranteed-unique fashion, even in the face of large numbers of threads running on different CPUs trying to grab UUIDs.

Ted, I'll spend some more time looking through the libuuid source code 
(nice work, btw).

I do need to give more thought to the uniqueness property.  One 
observation is that uniqueness guarantees can be somewhat 
application-specific, and specific cases can significantly reduce the 
complexity of implementation.  Examples: Applications deployed on a 
cluster where each machine already has a unique number can simply 
include that number to ensure uniqueness.  Applications only concerned 
about uniqueness in a single process can easily do this using the clock 
and a counter.
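
As a rough illustration of those two simpler cases, here is a minimal Python sketch. Everything in it is an assumption made for the example and not part of the proposal or of libuuid: the class name `LocalIdGenerator`, the 16-bit counter and 16-bit `node_id` widths, and the bit layout. It combines a nanosecond clock, a per-process counter, and a pre-assigned machine number.

```python
import threading
import time


class LocalIdGenerator:
    """Hypothetical sketch: IDs unique within one process, tagged with a node number."""

    def __init__(self, node_id: int = 0):
        # node_id stands in for the "unique number each machine already has".
        if not 0 <= node_id < (1 << 16):
            raise ValueError("node_id must fit in 16 bits")
        self._node_id = node_id
        self._lock = threading.Lock()
        self._last_ts = 0
        self._counter = 0

    def next_id(self) -> int:
        with self._lock:
            ts = time.time_ns()
            if ts <= self._last_ts:
                # Clock did not advance (or stepped backwards): keep the
                # last timestamp and distinguish IDs with the counter.
                ts = self._last_ts
                self._counter += 1
                if self._counter >= (1 << 16):
                    # Counter exhausted: borrow one tick from the future.
                    ts += 1
                    self._counter = 0
            else:
                self._counter = 0
            self._last_ts = ts
            # Assumed layout: timestamp | 16-bit counter | 16-bit node number.
            return (ts << 32) | (self._counter << 16) | self._node_id


if __name__ == "__main__":
    gen = LocalIdGenerator(node_id=7)
    for _ in range(3):
        print(hex(gen.next_id()))
```

Within one process the IDs come out strictly increasing, and two processes only stay distinct from each other if they are given different `node_id` values, which is exactly the application-specific assumption described above.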

Making identifiers globally unique in a distributed fashion with high 
certainty (while keeping them reasonably short) is a harder problem to 
solve, and only certain applications actually require it.  Some will, 
though, so it needs to be accommodated. I've read various 
concerns about using the network interface MAC address and how leaking 
this information can be bad.  My thought is that for many applications 
it may be workable to have an ID with a timestamp followed by enough random 
data from a CSPRNG that the probability of collision is simply 
low enough to be acceptable.  I need to do the math on what the collision 
probability is for a nanosecond-precise timestamp followed by 64 bits of 
random data (and for larger amounts of random data also).  A lot of 
engineering already goes into making cryptographically strong random 
numbers available on modern computers.  I could be wrong, but it seems 
to me that, knowing the collision probability for a given ID 
configuration (timestamp and random data bit lengths), one can just pick 
one long enough to push the collision probability below whatever one 
considers acceptable, e.g. below the probability of encountering a 
duplicate MAC address in the wild, or of an implementation bug causing a 
duplicate.  There is no true absolute guarantee of uniqueness, so one 
might as well just look at the numbers and accept the risk knowingly 
based on the probability. (I'm thinking this probability information 
could be given in the proposal, so it's clear, like "use this form of ID 
if you can accept XYZ probability of collision", listed out for each 
one, etc.) Again, I'm definitely willing to be wrong here, but that's my 
logic on it. Thoughts?
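
As a starting point for that math, the standard birthday-bound approximation can be sketched in Python. The assumptions: only IDs that share the same timestamp value can collide, so `n` is the number of IDs generated in one timestamp tick across all generators, and the random field is drawn uniformly from a CSPRNG. The function names `collision_probability` and `ids_for_probability` are made up for this example.

```python
import math


def collision_probability(n: int, random_bits: int) -> float:
    """Approximate chance that at least two of n IDs sharing one
    timestamp value also share the same random field.

    Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    if n < 2:
        return 0.0
    space = 2.0 ** random_bits
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(-n * (n - 1) / (2.0 * space))


def ids_for_probability(p: float, random_bits: int) -> float:
    """Approximate number of same-timestamp IDs at which the collision
    probability reaches p (inverse of the approximation above)."""
    return math.sqrt(2.0 * (2.0 ** random_bits) * -math.log1p(-p))


if __name__ == "__main__":
    for bits in (64, 80, 96):
        for n in (10**3, 10**6, 10**9):
            p = collision_probability(n, bits)
            print(f"{bits} random bits, {n:>10} IDs per tick: p ~= {p:.3e}")
```

With a nanosecond-precise timestamp, `n` is the number of IDs minted in the same nanosecond everywhere, which is typically far smaller than the total number of IDs in existence. A coarser timestamp increases `n` per tick and therefore calls for more random bits to hold the same probability.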