Re: [core] Your cool presentation on friday core meeting

Alexander Pelov <a@ackl.io> Wed, 18 November 2015 14:23 UTC

To: consultancy@vanderstok.org
References: <82c7667140aeaa28efab31a778a26204@xs4all.nl>
From: Alexander Pelov <a@ackl.io>
Message-ID: <564C89C6.40903@ackl.io>
Date: Wed, 18 Nov 2015 15:23:02 +0100
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
MIME-Version: 1.0
In-Reply-To: <82c7667140aeaa28efab31a778a26204@xs4all.nl>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Archived-At: <http://mailarchive.ietf.org/arch/msg/core/glp12zoaqIu96mhGSrYPTXYYdgU>
Cc: Core <core@ietf.org>
Subject: Re: [core] Your cool presentation on friday core meeting
X-BeenThere: core@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: "Constrained RESTful Environments \(CoRE\) Working Group list" <core.ietf.org>

Hi Peter,

I'm quite busy these days. I would like to send you a detailed response, 
but unfortunately this will be impossible today.

Please see some initial remarks in-line.

On 18/11/2015 09:09, peter van der Stok wrote:
> Hi Alexander,
>
> I would like to react to your presentation of CoMI during the Friday CoRE 
> meeting. I need this to understand the underlying 
> factual discussion.
>
> Your statement “A hash clash 5 years down the road can break your 
> network” needs some clarification. I interpret that, out of the blue, 
> a hash clash can occur and lead to problems in the network. This is 
> simply not true. A new hash clash may occur when modules are changed 
> and recompiled in a server. At that moment and before the server is 
> made operational hash clashes are detected and remedial actions can be 
> taken.
One of the specific cases I cited is a network that has been running 
for months, maybe several years, before a YANG module update 
introduces a hash clash. This leads to unexpected protocol exchanges (e.g. 
clash-file loading), unexpected memory allocations (each client must 
learn that some of its servers have these new clashes, while others have 
different clashes, and so forth), code paths that have not been tested 
for a long time and may hide bugs, ...

> Your next statement “hashes are underspecified” also needs some 
> clarification. I understood from you that the remark was motivated by 
> the absence of a foolproof process defining the solution of hash 
> clashes. Such a process is not necessary to assure inter-operability. 
> Once solved, the server returns the unique rehashed values and there 
> is no need to specify how the rehash values are reached. Nevertheless, 
> the draft suggests that a tilde is prefixed to the YANG name, after 
> which it is rehashed. There is a very small probability that the new 
> hash clashes with another one. Actually, you said you were afraid that 
> the hashing algorithm may continuously generate the same set of 
> clashing values independent of the prefix. I have no idea if this is 
> true, and do not propose to check it for murmur.
I've thought about how to handle this specific question for a long time, 
and unfortunately you cannot guarantee that re-hashing will not produce a 
new hash clash, which could, in turn, generate further clashes. Each clash 
leads to two new names that need to be rehashed.
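
To illustrate the cascade, here is a minimal sketch. It assumes the 
tilde-prefix scheme described above and uses a truncated SHA-256 as a 
stand-in for the murmur hash (an assumption, chosen only so the example 
runs with the standard library). The names short_hash and 
resolve_clashes, and the bits parameter, are hypothetical and not taken 
from the draft.

```python
import hashlib


def short_hash(name: str, bits: int) -> int:
    """Stand-in hash: the top `bits` bits of SHA-256 (not murmur)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") >> (64 - bits)


def resolve_clashes(names, bits, max_rounds=64):
    """Prefix '~' to every clashing name and rehash until no clash remains.

    Returns (ids, rounds): ids maps each original name to its final hash,
    rounds is the number of rehash rounds that were needed.
    """
    # original name -> the (possibly tilde-prefixed) string actually hashed
    current = {name: name for name in names}
    for rounds in range(max_rounds):
        buckets = {}
        for original, hashed_name in current.items():
            buckets.setdefault(short_hash(hashed_name, bits), []).append(original)
        # Every member of a bucket with more than one name must be rehashed.
        clashing = [o for group in buckets.values() if len(group) > 1 for o in group]
        if not clashing:
            return {o: short_hash(n, bits) for o, n in current.items()}, rounds
        for original in clashing:
            # A rehashed name may land on a previously clash-free hash,
            # which drags that name into the next round as well.
            current[original] = "~" + current[original]
    raise RuntimeError("no clash-free assignment found within max_rounds")


names = [f"/module/leaf{i}" for i in range(40)]
ids, rounds = resolve_clashes(names, bits=10)
print(rounds, "rehash round(s) needed")
```

Nothing bounds the number of rounds in general, which is why the sketch 
needs the max_rounds guard.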

>
> I agree that for efficiency reasons the same rehash process should be 
> followed in all servers such that all servers with the same set of 
> module versions arrive at the same rehash values. An approach 
> different from rehashing is to assign the lowest unassigned natural 
> number to the clashing names in lexicographical order. That will 
> generally mean that two colliding names get the values 1 and 2 
> assigned, and very rarely the values 3 and 4 may be used.

I think that Structured IDs provide a way of handling this in a 
reasonable manner. We've already discussed how a structured ID can 
contain hashes.
>
> Last I should like to make a remark about probabilities. The whole 
> world around us is based on probabilities. For example, there is a 
> finite probability that a fatal fault will occur in an airplane during 
> one hour of flight. Or that during digital transmission bits are 
> toggled without detection by the checksum. These probabilities are 
> calculated and should be smaller than a given probability value. This 
> is a well-established engineering practice. Therefore, the clash 
> probabilities are calculated in appendix E of the CoMI draft. It shows 
> that for the targeted hash size and number of names, the probability 
> that more than one clash occurs is 10^-3 smaller than the probability 
> of one clash. These are quite small values.
>
What constitutes a "small probability" is relative. 10^-3 is 
typically considered a quite elevated (not to say unacceptably high) 
collision probability. Given that you can have hundreds of entries, this 
probability is even worse. Given that each collision requires 
re-hashing, which could provoke other collisions, we end up with a 
problem I consider to be quite dangerous.
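
To put numbers on how the collision probability grows with the number of 
entries, here is a sketch of the standard birthday computation. It 
assumes hashes are uniformly distributed; the 30-bit space used in the 
example is an assumption for illustration and is not a figure from this 
thread. The function name collision_probability is hypothetical.

```python
import math


def collision_probability(n_names: int, space: int) -> float:
    """Probability that at least two of n_names uniform hashes collide
    when each hash takes one of `space` equally likely values."""
    if n_names > space:
        return 1.0  # pigeonhole: more names than hash values
    # log P(no clash) = sum over i of log(1 - i/space), summed in log
    # space so that very small probabilities do not round to zero.
    log_no_clash = sum(math.log1p(-i / space) for i in range(n_names))
    return -math.expm1(log_no_clash)


# Sanity check against the classic birthday problem: 23 people, 365 days.
print(round(collision_probability(23, 365), 3))  # 0.507

# Assumed 30-bit hash space, growing number of names.
for n in (100, 500, 1000):
    print(n, collision_probability(n, 2 ** 30))
```

The probability grows roughly with the square of the number of names, so 
going from 100 to 1000 entries raises it by about a factor of 100.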

> By the way, by relying on identifier assignment, there is also a 
> finite probability that the same identifier is allocated to names on 
> different modules (a hidden clash), due to power failures, undetected 
> transmission errors, or simply copying mistakes.
> I recommend that in the security section of the identifier assignment 
> draft, it is discussed how modules are detected with an identifier 
> that has been assigned without going through the registration process.
>
That's a non-issue.

> Hope this will stimulate further discussion.
>
> Peter
Thanks for the useful remarks. I am glad when we have a constructive 
discussion. Sorry for the brevity.

Best,
Alexander