Re: [Ideas] Computing collisions

Robert Moskowitz <rgm-ietf@htt-consult.com> Tue, 13 December 2016 19:12 UTC

Return-Path: <rgm-ietf@htt-consult.com>
X-Original-To: ideas@ietfa.amsl.com
Delivered-To: ideas@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 9EB9F129472 for <ideas@ietfa.amsl.com>; Tue, 13 Dec 2016 11:12:50 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -7.097
X-Spam-Level:
X-Spam-Status: No, score=-7.097 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, RP_MATCHES_RCVD=-2.896, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YAXuHio8nqRf for <ideas@ietfa.amsl.com>; Tue, 13 Dec 2016 11:12:48 -0800 (PST)
Received: from z9m9z.htt-consult.com (z9m9z.htt-consult.com [50.253.254.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 73AAE129445 for <ideas@ietf.org>; Tue, 13 Dec 2016 11:12:48 -0800 (PST)
Received: from localhost (localhost [127.0.0.1]) by z9m9z.htt-consult.com (Postfix) with ESMTP id 81E75622FE; Tue, 13 Dec 2016 14:12:46 -0500 (EST)
X-Virus-Scanned: amavisd-new at htt-consult.com
Received: from z9m9z.htt-consult.com ([127.0.0.1]) by localhost (z9m9z.htt-consult.com [127.0.0.1]) (amavisd-new, port 10024) with LMTP id Cxk7-qInsXHH; Tue, 13 Dec 2016 14:12:42 -0500 (EST)
Received: from lx120e.htt-consult.com (unknown [192.168.160.12]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by z9m9z.htt-consult.com (Postfix) with ESMTPSA id E097262301; Tue, 13 Dec 2016 14:12:41 -0500 (EST)
To: Dino Farinacci <farinacci@gmail.com>
References: <fb888f85-22ed-5e87-bf84-2b6244d587e6@htt-consult.com> <CAG-CQxqds9osqTTAi_Gm-VCVKGo49pW5OVALxYiYCrqNQ2yvRA@mail.gmail.com> <83be2a30-1b46-2397-976e-6e2736921e19@htt-consult.com> <6AA7A937-AEC5-4E6F-9D87-87BC257EF9F8@gmail.com>
From: Robert Moskowitz <rgm-ietf@htt-consult.com>
Message-ID: <a6e61ac9-43c4-7acb-9235-bdfc3df52c7b@htt-consult.com>
Date: Tue, 13 Dec 2016 14:12:40 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <6AA7A937-AEC5-4E6F-9D87-87BC257EF9F8@gmail.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/ideas/4XT_LniCbBHS9CEhIdBV9Cebdso>
Cc: Padma Pillay-Esnault <padma.ietf@gmail.com>, ideas@ietf.org
Subject: Re: [Ideas] Computing collisions
X-BeenThere: ideas@ietf.org
X-Mailman-Version: 2.1.17
Precedence: list
List-Id: "Discussions relating to the development, clarification, and implementation of control-plane infrastructures and functionalities in ID enabled networks." <ideas.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/ideas>, <mailto:ideas-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/ideas/>
List-Post: <mailto:ideas@ietf.org>
List-Help: <mailto:ideas-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/ideas>, <mailto:ideas-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 13 Dec 2016 19:12:50 -0000


On 12/13/2016 01:35 PM, Dino Farinacci wrote:
> In LISP, an xTR registers with a 128-bit xTR-ID. This is not the EID. So we could have an xTR register with a random EID it chooses. Then the map-server determines it is a collision because another xTR-ID has registered with that same EID. The map-server could select a random EID and check locally if it collides. If it doesn’t, then the xTR registers with a new EID (and makes appropriate interface address assignments locally).
>
> Note the draft-farinacci-lisp-eid-anonymity-01 specs that ephemeral-EIDs can be used. So when an xTR decided to choose a random IPv6 address as an EID and registers. It can ask the map-server if it collides. Today, the map-server Acks Map-Register messages, but it could Nak and say “go get a new random ephemeral-EID”. I like this approach better because the allocation happens at the sites within their own address prefix range and gives you some obfuscation at the same time.
>
> If people thing the second approach makes sense, I can spec out in the above draft how Map-Servers could NAK when there are collisions. Please comment.

That is basically what I do with Hierarchical HITs.  The device 
registers its HIT with the owner of the Hierarchy point.  The owner 
(HDA) then either accepts the HIT or rejects with a 'Hierarchical HIT 
Already Registered' message  (see sec 6.4).

Since it might be the case that device is registering, there is a 
difference between duplicate HI (Either a register, so something bad is 
happening to make it possible for two devices to select the same HI) and 
HIT collision with different HI.

Note that it is possible for a device to register one HI to two 
different HDAs.  Due to the algorithm for HIT generation, these will not 
only have different HDA values, but the hash component will be 
different.  However a device may well choose different HIs for different 
registrations for privacy reasons.


>
> Dino
>
>> On Dec 13, 2016, at 5:35 AM, Robert Moskowitz <rgm-ietf@htt-consult.com> wrote:
>>
>> Here is some results.
>>
>> On 12/11/2016 12:03 PM, Padma Pillay-Esnault wrote:
>>> Hello Bob
>>>
>>> Thanks for sending this formula.
>>>
>>> If we assume that the population is not necessarily a flat space and that it is in different separate instances, this could help reduce the collision.
>>>
>>> Per the ideas problem statement draft, we have discussed about the different types of devices and whether we could have some policy on what they can do. Eg. a camera should not be shopping for clothes or access bank accounts .... wouldn't it make sense to put them in different instances with different access policies?
>>>
>>> What are your thoughts on this?
>> Let's work with a 24 bit ID space which is a population of 16,777,216
>>
>> Let's then assume a random distribution of 2^14 (16,384) devices.  This is a 99.97% probablity of a collision.  A sure thing.
>>
>> Now let's assume that there are 256 (2^8) classes of devices and that the population is evenly divided.  That is 64 devices per class.  We now have reduced our ID space to 16 bits (65,536), but the probablity of a collision with only 64 devices is 3.08%!
>>
>> But if the division is not equal between the classes and one class has 512 devices the probablity jumps up to 84.67%; again an almost sure thing.
>>
>> So we can see from this little exercise that dividing our ID space into groupings can actually reduce the collision risk.  Two caveats are clear though.  How many classes to use (you can see my effort at this in draft-moskowitz-hierarchical-hip) and what if one class out classes all the rest?
>>
>> You still need to manage collisions.  You just may be telling fewer losers to select a new ID.
>>
>> Now if instead you are talking about reusing your ID space on the assumption that these devices are NEVER interacting and thus never 'seeing' the inevitable collisions, then, yes, reuse the ID space.  But we all know what assume also spells.
>>
>>
>>> Padma
>>>
>>> On Thu, Nov 17, 2016 at 1:48 AM, Robert Moskowitz <rgm-ietf@htt-consult.com> wrote:
>>> Here is the equation:
>>>
>>> probability of collision = 1 - e^{-k^2/(2n)}
>>>
>>> Where n is your max population size (e.g. 2^64)
>>>
>>> and k is your deployed population (e.g. 7B)
>>>
>>> Bob
>>>
>>> _______________________________________________
>>> Ideas mailing list
>>> Ideas@ietf.org
>>> https://www.ietf.org/mailman/listinfo/ideas
>>>
>> _______________________________________________
>> Ideas mailing list
>> Ideas@ietf.org
>> https://www.ietf.org/mailman/listinfo/ideas