Re: [core] draft-bierman-core-yid-00.txt questions

Alexander Pelov <a@ackl.io> Thu, 25 August 2016 16:01 UTC

From: Alexander Pelov <a@ackl.io>
In-Reply-To: <CABCOCHQrNrYkC5yZLou33own9q+h0yn9+GogEWcm4q1CP+6jVw@mail.gmail.com>
Date: Thu, 25 Aug 2016 18:01:09 +0200
Message-Id: <2AC557DE-6F82-4278-83A9-875E25DD01DD@ackl.io>
References: <BN6PR06MB2308A46F6A84378832DBC849FEEA0@BN6PR06MB2308.namprd06.prod.outlook.com> <201608250324.u7P3OTAI005926@mainfs.snmp.com> <BN6PR06MB2308826651FF40B1BB08F39CFEED0@BN6PR06MB2308.namprd06.prod.outlook.com> <CABCOCHQrNrYkC5yZLou33own9q+h0yn9+GogEWcm4q1CP+6jVw@mail.gmail.com>
To: Andy Bierman <andy@yumaworks.com>
X-Mailer: Apple Mail (2.3124)
Archived-At: <https://mailarchive.ietf.org/arch/msg/core/BM0vsnME1d4JbFxbfx4QxANUE4w>
Cc: "core@ietf.org" <core@ietf.org>
Subject: Re: [core] draft-bierman-core-yid-00.txt questions

Hi Andy,

> On 25 August 2016 at 17:33, Andy Bierman <andy@yumaworks.com> wrote:
> 
> 
> 
> On Thu, Aug 25, 2016 at 7:15 AM, Michel Veillette <Michel.Veillette@trilliantinc.com> wrote:
> Hi David
> 
> About "Is the sequential numbering auto-assigned or manually assigned?"
> 
> If you use a tool such as the pyang plugin, the numbers are assigned automatically.
> 
> For example, the command:
>    pyang --generate-sid-file 20000:100 toaster@2009-11-20.yang
> 
> generates the file toaster@2009-11-20.sid (attached).
> 
> The command:
>   pyang --update-sid-file toaster@2009-11-20.sid  toaster@2009-12-28.yang
> 
> 
> Where is the algorithm explained in enough detail that multiple
> independent implementations will all produce the exact same results?

You don’t need this explicitly. What you need is the SID file generated by the tool. There is, of course, a description of how the current implementation works.

You can certainly imagine registries whose policies or allocations would let you generate your identifiers deterministically on the fly. But the lowest common denominator should be the SID file.

Example (which I am inventing as I go, so feel free to adapt it to your own vision):
A Hash-based Registry (HREG) is created, and IANA allocates it the SID range 0x00010000-0x000F0000.
HREG publishes as its official policy that all modules registered under it must generate their SID files using the murmur32 hash, and that the hash must be at least 10 bits long (for the sake of the example). The registry MAY also specify that no hash collisions are allowed, i.e. after any modification of a module, every element must still hash to a distinct value. If a hash collision does occur, it is up to the author to change the naming so that the module is collision-free.
HREG may provide its own SID-generating tool, which computes the hashes, verifies that there are no collisions, and outputs a standard SID file (in which the IDs are generated by hashing).
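
A minimal sketch of what such an HREG tool could do is shown below. It is only an illustration under stated assumptions: "murmur32" is taken to mean the 32-bit x86 variant of MurmurHash3 with seed 0, the hash input is the UTF-8 path string, and the function names, the entry point, and the toaster paths are hypothetical. It does not produce a real .sid file.

```python
def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3, x86 32-bit variant, in pure Python (assumed hash choice)."""
    c1, c2, mask = 0xCC9E2D51, 0x1B873593, 0xFFFFFFFF
    h = seed & mask
    body_end = len(data) - (len(data) % 4)
    for i in range(0, body_end, 4):
        k = int.from_bytes(data[i:i + 4], "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask   # rotate left by 15
        k = (k * c2) & mask
        h ^= k
        h = ((h << 13) | (h >> 19)) & mask   # rotate left by 13
        h = (h * 5 + 0xE6546B64) & mask
    tail = data[body_end:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = (k * c1) & mask
        k = ((k << 15) | (k >> 17)) & mask
        k = (k * c2) & mask
        h ^= k
    # Final avalanche step
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & mask
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & mask
    h ^= h >> 16
    return h


def generate_sids(paths, entry_point, bits=10):
    """Map each path to entry_point + (hash truncated to `bits` bits).

    Raises ValueError on a collision, mirroring a no-collision policy.
    """
    size = 1 << bits
    owner = {}   # truncated hash -> path that claimed it
    sids = {}
    for path in paths:
        offset = murmur3_32(path.encode("utf-8")) & (size - 1)
        if offset in owner:
            raise ValueError(
                f"hash collision between {owner[offset]!r} and {path!r}")
        owner[offset] = path
        sids[path] = entry_point + offset
    return sids


if __name__ == "__main__":
    # Hypothetical module entry point inside the range from the example.
    example_paths = ["/toaster/toasterManufacturer", "/toaster/toasterStatus"]
    try:
        for path, sid in generate_sids(example_paths, 0x00010000).items():
            print(f"{sid:#010x}  {path}")
    except ValueError as err:
        print("rename needed:", err)
```

On a collision the sketch simply refuses to produce output, which matches the policy that the author must rename until the module is collision-free.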

Once this is done, depending on HREG's policy, it may publish the SID file for interoperability with the whole world, or potentially keep it private. The metadata published by HREG could then contain only things like: module entry point + range size (10 bits or more; this could be flexible per module) + some metadata (e.g. indicating that this is no_collision_hash_generated).

In my opinion, however, this is out of scope for this draft. We should provide the minimal functions that allow such systems to be built, to coexist, and to interoperate.

Best,
Alexander

> 
> 
> Andy
>  
> generates the file toaster@2009-12-28.sid (attached).
> 
> It is important to note that sequential numbering doesn't mean that two data nodes defined sequentially in a .yang file will receive consecutive IDs.
> As you know, the order may change and groupings may be introduced between versions.
> It means that IDs are assigned sequentially from 1 using some arbitrary order.
> If a .yang module has 14 YANG items, they will be numbered from 1 to 14.
> If the next version adds 5 YANG items, they will be numbered from 15 to 19.
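
The numbering rule described above can be sketched as follows. This is an illustration only, not the actual algorithm of the pyang plugin: the function name and item names are hypothetical, and the only property shown is that existing assignments never change while new items take the next free numbers.

```python
def update_sids(existing, items, first_sid=1):
    """Keep every existing assignment; number new items after the highest one."""
    sids = dict(existing)
    next_sid = max(sids.values(), default=first_sid - 1) + 1
    for item in items:
        if item not in sids:
            sids[item] = next_sid
            next_sid += 1
    return sids


# First revision: 14 items are numbered 1 to 14.
rev1_items = [f"item{i}" for i in range(14)]
rev1 = update_sids({}, rev1_items)

# Next revision: items are reordered and 5 new ones are added.
rev2_items = list(reversed(rev1_items)) + [f"new{i}" for i in range(5)]
rev2 = update_sids(rev1, rev2_items)

print(sorted(rev1.values()))                     # 1 .. 14
print(sorted(rev2[f"new{i}"] for i in range(5))) # 15 .. 19
```

Because the previous assignments are an input, the result depends on the earlier .sid file rather than on the order of definitions in the .yang file.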
> 
> Regards,
> Michel
> 
> -----Original Message-----
> From: core [mailto:core-bounces@ietf.org] On Behalf Of David Reid
> Sent: Wednesday, August 24, 2016 11:24 PM
> To: core@ietf.org
> Subject: Re: [core] draft-bierman-core-yid-00.txt questions
> 
> >>> What is the benefit of assigning values using hashing over
> >>> sequentially assigning values?
> 
> >> There is overhead in reading and processing a giant list of mappings.
> 
> The assignment only happens one-time, so I don't see a problem with a little overhead (which I think would be minimal anyway).
> 
> >> Automatic assignment is risky because both sides need to agree on the
> >> object to number mapping.  Detecting these issues is very difficult.
> >> The YANG Hash algorithm is designed to use the path string, which is
> >> permanent and cannot change no matter how the YANG is refactored.
> >> The murmur hash is a stable algorithm.
> >> There is not a lot that can go wrong for the 2 peers to disagree on
> >> the hashes (except for collisions)
> 
> I would think we could write an algorithm that would guarantee the same number every time. Although I have not thought through all the issues that come up with revisions, maybe this is harder than I think.
> 
> >>> When there is a hash collision, would it be possible to rehash the
> >>> value to get a unique number so that we never have to manually
> >>> assign numbers and thus never need to register the numbers in the registry?
> 
> >> yes, this has been suggested.
> >> The problem with auto-rehashing before was inter-module clashes.
> >> Now those are not possible so automatic rehashing is feasible.
> 
> > [MV] I disagree, see my previous email which explains why all YIDs/SIDs
> > need to be registered even if generated using a hash.
> 
> As long as I have either all revisions of a module or the YIDs from the previous revision, I can generate the numbers for the module. I don't think I need it to be in a central registry.
> 
> >>> Would it make sense to put the module-id inside the yang module with
> >>> a yid extension so that I would not have to go lookup that
> >>> information from a registry?
> 
> >> I suppose -- but how to prevent duplicates and cut-and-paste errors?
> 
> We would still need a registry to prevent duplicates. But only the module writer would need to access the registry. The module users would have the information in the yang module and would not have to look at the registry.
> 
> > [MV] Just adding the module ID in the yang file is not sufficient to
> > use a .yang file, all YIDs/SIDs need to be added.
> 
> If YIDs are generated by hashing with auto rehashing on collisions, the YIDs would not have to be explicitly listed in the module. They could be auto-generated and stored in a different file.
> 
> > [MV] Having YIDs/SIDs in yang files will make their maintenance more
> > complex.
> 
> Yes, it makes it more complex for the module writers. But it is easier on the users of the module.
> 
> > [MV] BTW, a pyang plugin already exists to automatically generate and
> > update a .sid file from a .yang file. [MV] See
> > https://github.com/core-wg/yang-cbor/blob/master/sid.py
> 
> >>> Would it be possible to assign the module-id based on information in
> >>> the module, for example a hash of the namespace and maybe the revision date.
> >>> That way, a module-id would not have to be assigned and maintained
> >>> in a registry.
> 
> >> I think private module-id would be better, using a range reserved for
> >> temporary assignments.
> 
> I think you are right. I was just hoping we could avoid requiring module writers to register a module-id by finding a way to auto assign it.
> 
> >> How do you resolve hash collisions for module-id?
> 
> I don't have a good solution. I was hoping the probability of collision would be low enough to be acceptable, or we could add the first revision in the number to further reduce collisions. But I don't have a way to resolve collisions.
> 
> >>> Who will assign the local-id numbers? Is that done by the working
> >>> group that defines the YANG module?
> 
> >> I would expect IETF modules to use hashes by default but for some
> >> modules that seem useful to constrained devices, then manual
> >> numbering could be done instead by the WG.
> 
> > [MV] To obtain a smaller encoding, I personally believe that the
> > sequential numbering will be used.
> 
> > [MV] With sequential numbering, all current YANG modules defined in
> > RFCs can be assigned within the first 6500 SIDs, which will be encoded
> > as 3 bytes for the first reference and typically as 1 byte for the
> > following ones using delta encoding.
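
The byte counts quoted above can be checked with a short sketch, assuming the SIDs are carried as CBOR unsigned integers (RFC 7049, major type 0) and that each later reference is sent as the difference from the previous one. The function names and the sample SID values are hypothetical.

```python
def cbor_uint_size(value):
    """Number of bytes CBOR needs to encode a non-negative integer."""
    if value < 0:
        raise ValueError("only unsigned integers are handled here")
    if value < 24:
        return 1          # value fits in the initial byte
    if value < 2**8:
        return 2          # initial byte + 1-byte argument
    if value < 2**16:
        return 3          # initial byte + 2-byte argument
    if value < 2**32:
        return 5          # initial byte + 4-byte argument
    if value < 2**64:
        return 9          # initial byte + 8-byte argument
    raise ValueError("value does not fit in a CBOR unsigned integer")


def delta_encoded_sizes(sids):
    """Sizes when the first SID is absolute and the rest are ascending deltas."""
    sizes = [cbor_uint_size(sids[0])]
    for prev, cur in zip(sids, sids[1:]):
        sizes.append(cbor_uint_size(cur - prev))
    return sizes


print(delta_encoded_sizes([6480, 6481, 6485]))  # [3, 1, 1]
print(cbor_uint_size(0x3FFFFFFF))               # 5, e.g. a 30-bit hash value
```

Any SID from 256 to 65535 costs 3 bytes as a first reference, and any delta below 24 costs 1 byte, which is where the "3 bytes, then typically 1 byte" figure comes from.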
> 
> > [MV] With hashes, only one YANG module can be assigned within that
> > range.
> 
> Is the sequential numbering auto-assigned or manually assigned?
> 
> -David Reid
> 
> _______________________________________________
> core mailing list
> core@ietf.org
> https://www.ietf.org/mailman/listinfo/core
> 